New submission from Jonathan Goble:

Once a regular expression is compiled with `obj = re.compile()`, it would be 
nice to have access to the raw bytecode, probably as `obj.code` or 
`obj.bytecode`, so it can be explored programmatically. Currently, regex 
bytecode is stored only in a C struct and is not exposed to Python code. The 
only way to examine the compiled version is to pass the `re.DEBUG` flag to 
`re.compile()`, which prints to stdout only. What it prints is not the 
finished bytecode but a "pretty-printed" intermediate representation that is 
useless for programmatic analysis.
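
To illustrate the limitation, here is a minimal sketch of the only workaround I know of today: redirecting stdout while compiling with `re.DEBUG`. The pattern `ab+` and the variable names are just examples, and the exact text of the dump varies between Python versions.

```python
import contextlib
import io
import re

# re.DEBUG writes its dump with print(), so capture stdout while compiling.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    pattern = re.compile(r"ab+", re.DEBUG)

dump = buffer.getvalue()

# The captured result is human-readable text, not a sequence of integer
# code words, so any analysis would have to re-parse this text.
print(type(dump).__name__)
print(dump)
```

Even with the capture, the result is a string in a format that is not documented as stable, which is why an attribute on the pattern object would be preferable.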

This is basically requesting the equivalent of the `co_code` attribute of the 
code object returned by the built-in `compile()`, but for regular expression 
objects instead of Python code objects.
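
For comparison, this is what already works for Python code objects. It is only a sketch of the existing `co_code` behaviour; the source string and the `<example>` filename are arbitrary, and the byte values differ between Python versions.

```python
# Compile a small piece of Python source into a code object.
code_obj = compile("x = 1 + 2", "<example>", "exec")

# co_code exposes the raw bytecode as a bytes object.
raw = code_obj.co_code
print(type(raw).__name__, len(raw))

# Each element is a plain integer that can be inspected programmatically.
print(list(raw[:8]))
```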

Given that individual bytecode entries can be multi-byte integers, 
`regexobj.bytecode` should return a list (perhaps even just the same list 
passed to the C function?) or an `array.array()` instance, rather than a 
bytestring.
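
A small sketch of why a bytestring would not fit. The values in `code_words` are made up for illustration and are not real regex bytecode; the `"L"` typecode (unsigned long, at least 4 bytes per item) is just one possible choice.

```python
from array import array

# Hypothetical code words; one of them does not fit in a single byte.
code_words = [17, 3, 70000, 1]

# bytes() only accepts values in range(256), so this conversion fails.
try:
    bytes(code_words)
    fits_in_bytes = True
except ValueError:
    fits_in_bytes = False

# An array of unsigned longs holds every value without loss.
packed = array("L", code_words)

print(fits_in_bytes)
print(packed.tolist() == code_words)
```

A plain list would behave the same way for inspection; the array only adds a compact, typed representation.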

----------
components: Library (Lib), Regular Expressions
messages: 260072
nosy: Jonathan Goble, ezio.melotti, mrabarnett, pitrou, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Expose regex bytecode as attribute of compiled pattern object
type: enhancement
versions: Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26336>
_______________________________________
_______________________________________________
Python-bugs-list mailing list