Regular expression pattern normalizing output checker¶
The pattern-normalizing output checker from
zope.testing.renormalizing
extends the default output checker with
an option to normalize expected and actual output.
You specify a sequence of patterns and replacements. The replacements are applied to the expected and actual outputs before calling the default outputs checker. Let’s look at an example. In this example, we have some times and addresses:
>>> want = '''\
... <object object at 0xb7f14438>
... completed in 1.234 seconds.
... <BLANKLINE>
... <object object at 0xb7f14440>
... completed in 123.234 seconds.
... <BLANKLINE>
... <object object at 0xb7f14448>
... completed in .234 seconds.
... <BLANKLINE>
... <object object at 0xb7f14450>
... completed in 1.234 seconds.
... <BLANKLINE>
... '''
>>> got = '''\
... <object object at 0xb7f14458>
... completed in 1.235 seconds.
...
... <object object at 0xb7f14460>
... completed in 123.233 seconds.
...
... <object object at 0xb7f14468>
... completed in .231 seconds.
...
... <object object at 0xb7f14470>
... completed in 1.23 seconds.
...
... '''
We may wish to consider these two strings to match, even though they differ in actual addresses and times. The default output checker will consider them different:
>>> import doctest
>>> doctest.OutputChecker().check_output(want, got, 0)
False
We’ll use the zope.testing.renormalizing.OutputChecker
to normalize both the
wanted and gotten strings to ignore differences in times and
addresses:
>>> import re
>>> from zope.testing.renormalizing import OutputChecker
>>> checker = OutputChecker([
... (re.compile('[0-9]*[.][0-9]* seconds'), '<SOME NUMBER OF> seconds'),
... (re.compile('at 0x[0-9a-f]+'), 'at <SOME ADDRESS>'),
... ])
>>> checker.check_output(want, got, 0)
True
Usual doctest.OutputChecker
options work as expected:
>>> want_ellided = '''\
... <object object at 0xb7f14438>
... completed in 1.234 seconds.
... ...
... <object object at 0xb7f14450>
... completed in 1.234 seconds.
... <BLANKLINE>
... '''
>>> checker.check_output(want_ellided, got, 0)
False
>>> checker.check_output(want_ellided, got, doctest.ELLIPSIS)
True
When we get differencs, we output them with normalized text:
>>> source = '''\
... >>> do_something()
... <object object at 0xb7f14438>
... completed in 1.234 seconds.
... ...
... <object object at 0xb7f14450>
... completed in 1.234 seconds.
... <BLANKLINE>
... '''
>>> example = doctest.Example(source, want_ellided)
>>> print_(checker.output_difference(example, got, 0))
Expected:
<object object at <SOME ADDRESS>>
completed in <SOME NUMBER OF> seconds.
...
<object object at <SOME ADDRESS>>
completed in <SOME NUMBER OF> seconds.
Got:
<object object at <SOME ADDRESS>>
completed in <SOME NUMBER OF> seconds.
<object object at <SOME ADDRESS>>
completed in <SOME NUMBER OF> seconds.
<object object at <SOME ADDRESS>>
completed in <SOME NUMBER OF> seconds.
<object object at <SOME ADDRESS>>
completed in <SOME NUMBER OF> seconds.
>>> print_(checker.output_difference(example, got,
... doctest.REPORT_NDIFF))
Differences (ndiff with -expected +actual):
- <object object at <SOME ADDRESS>>
- completed in <SOME NUMBER OF> seconds.
- ...
<object object at <SOME ADDRESS>>
completed in <SOME NUMBER OF> seconds.
+ <object object at <SOME ADDRESS>>
+ completed in <SOME NUMBER OF> seconds.
+ <BLANKLINE>
+ <object object at <SOME ADDRESS>>
+ completed in <SOME NUMBER OF> seconds.
+ <BLANKLINE>
+ <object object at <SOME ADDRESS>>
+ completed in <SOME NUMBER OF> seconds.
+ <BLANKLINE>
If the wanted text is empty, however, we don’t transform the actual output. This is usful when writing tests. We leave the expected output empty, run the test, and use the actual output as expected, after reviewing it.
>>> source = '''\
... >>> do_something()
... '''
>>> example = doctest.Example(source, '\n')
>>> print_(checker.output_difference(example, got, 0))
Expected:
Got:
<object object at 0xb7f14458>
completed in 1.235 seconds.
<object object at 0xb7f14460>
completed in 123.233 seconds.
<object object at 0xb7f14468>
completed in .231 seconds.
<object object at 0xb7f14470>
completed in 1.23 seconds.
If regular expressions aren’t expressive enough, you can use arbitrary Python callables to transform the text. For example, suppose you want to ignore case during comparison:
>>> checker = OutputChecker([
... lambda s: s.lower(),
... lambda s: s.replace('<blankline>', '<BLANKLINE>'),
... ])
>>> want = '''\
... Usage: thundermonkey [options] [url]
... <BLANKLINE>
... Options:
... -h display this help message
... '''
>>> got = '''\
... usage: thundermonkey [options] [URL]
...
... options:
... -h Display this help message
... '''
>>> checker.check_output(want, got, 0)
True
Suppose we forgot that <BLANKLINE>
must be in upper case:
>>> checker = OutputChecker([
... lambda s: s.lower(),
... ])
>>> checker.check_output(want, got, 0)
False
The difference would show us that:
>>> source = '''\
... >>> print_help_message()
... ''' + want
>>> example = doctest.Example(source, want)
>>> print_(checker.output_difference(example, got,
... doctest.REPORT_NDIFF))
Differences (ndiff with -expected +actual):
usage: thundermonkey [options] [url]
- <blankline>
+ <BLANKLINE>
options:
-h display this help message
It is possible to combine OutputChecker
checkers for easy reuse:
>>> address_and_time_checker = OutputChecker([
... (re.compile('[0-9]*[.][0-9]* seconds'), '<SOME NUMBER OF> seconds'),
... (re.compile('at 0x[0-9a-f]+'), 'at <SOME ADDRESS>'),
... ])
>>> lowercase_checker = OutputChecker([
... lambda s: s.lower(),
... ])
>>> combined_checker = address_and_time_checker + lowercase_checker
>>> len(combined_checker.transformers)
3
Combining a checker with something else does not work:
>>> lowercase_checker + 5
Traceback (most recent call last):
...
TypeError: unsupported operand type(s) for +: ...