[issue26336] Expose regex bytecode as attribute of compiled pattern object

2017-05-07 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: See issue30299 which adds the output of decoded bytecode in debug mode. The format of the bytecode is an implementation detail, it is irregular, new opcodes can be added, and the format of existing opcodes can be changed. Thus it is hard to support third-party
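The debug mode mentioned above can be tried with the public `re.DEBUG` flag. This is a minimal sketch: the helper name `dump_debug` is made up for illustration, and the exact text printed (in particular whether a decoded bytecode listing follows the parse tree) depends on the interpreter version.

```python
import contextlib
import io
import re


def dump_debug(pattern: str) -> str:
    """Compile `pattern` with re.DEBUG and return whatever the compiler prints.

    Hypothetical helper for illustration; the debug text is not a stable
    format and should not be parsed by real tools.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        re.compile(pattern, re.DEBUG)
    return buffer.getvalue()


debug_text = dump_debug(r"a+b")
print(debug_text)
```

Because the output is meant for human inspection only, it is a way to explore the compiled form without relying on an attribute of the pattern object.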

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-06-05 Thread Jelle Zijlstra
Jelle Zijlstra added the comment: Yes, you can get at it with ctypes. I released a small (and virtually untested) library at https://github.com/JelleZijlstra/regdis that provides dis-like capabilities.

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-06-05 Thread Terry J. Reedy
Terry J. Reedy added the comment: I prefer 'rexcode' for the attribute name. I share Serhiy's reservations. When people write code that depends on CPython implementation details, even though documented as such, the existence of such code becomes a drag on change, especially when details

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-06-05 Thread Jelle Zijlstra
Jelle Zijlstra added the comment: Updated patch attached. I don't feel strongly about whether this should be in Python, but it seems potentially useful at least as a tool to learn more about how re is implemented. If I have time I may write a tool using __pattern_code__ and the sre_constants

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-06-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Added comments on Rietveld. I still do not think this is a good idea.

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-06-04 Thread Jelle Zijlstra
Jelle Zijlstra added the comment: Thanks for the feedback. This patch instead exposes the code as a tuple of integers named __pattern_code__. "Bytecode" is technically inaccurate since the code isn't limited to bytes but can contain larger integers.

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-06-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: __code__ is associated with Python bytecode. Regex bytecode can't be represented as a Unicode string since it is a sequence of 32-bit integers that can exceed the sys.maxunicode limit.
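The representation problem can be checked directly. In this sketch the helper `fits_in_str` and the sample tuple are illustrative assumptions, not part of any patch on the issue; it only shows that a 32-bit word may have no corresponding code point, while a tuple of integers holds it without trouble.

```python
import sys


def fits_in_str(code_word: int) -> bool:
    """Return True if `code_word` can be stored as one character of a str."""
    try:
        chr(code_word)
    except (ValueError, OverflowError):
        # chr() rejects anything above sys.maxunicode (0x10FFFF).
        return False
    return True


MAX_WORD = 0xFFFFFFFF  # largest value of an unsigned 32-bit code word

print(fits_in_str(sys.maxunicode))  # highest valid code point
print(fits_in_str(MAX_WORD))        # beyond the Unicode range

# A tuple of ints has no such limit (values here are made up for illustration).
sample_code = (MAX_WORD, 97)
print(sample_code)
```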

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-06-04 Thread Jelle Zijlstra
Changes by Jelle Zijlstra: Removed file: http://bugs.python.org/file43218/issue26336.patch

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-06-04 Thread Jelle Zijlstra
Changes by Jelle Zijlstra: Added file: http://bugs.python.org/file43219/issue26336.patch

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-06-04 Thread Jelle Zijlstra
Jelle Zijlstra added the comment: This patch exposes the bytecode as a __code__ attribute on pattern objects as a Unicode string (consistent with the internal representation as Py_UCS4 instances). -- keywords: +patch nosy: +Jelle Zijlstra Added file:

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-02-20 Thread Jonathan Goble
Jonathan Goble added the comment: Noting for the record that, as I had brought up on python-ideas [1], in addition to simply exposing the raw code, it would be nice to have a public constructor for the compiled pattern type and a 'dis'-like module for support. The former would enable

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-02-17 Thread Jonathan Goble
Jonathan Goble added the comment: It would indeed be marked as a CPython implementation detail, and with no guarantee of backward compatibility. Others (well, at least one other) have suggested the same on python-ideas. So a simple note in the accompanying documentation would suffice.

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-02-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Regex bytecode is an implementation detail. It was 16-bit in narrow builds, but was changed to at least 32-bit in bugfix releases. It can be changed to 64-bit or to pack an argument with an opcode in one word. The implementation cannot use the bytecode at
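The width of a code word can be inspected on a given interpreter. This sketch assumes CPython, where the private `_sre` accelerator module has historically carried a `CODESIZE` constant (bytes per code word); being private, it may be missing or renamed, so the lookup is guarded.

```python
try:
    import _sre  # private CPython module; not part of the public re API
    code_size = getattr(_sre, "CODESIZE", None)
except ImportError:
    # Other implementations may not ship this module at all.
    code_size = None

if code_size is None:
    print("code word size is not available on this interpreter")
else:
    print(f"regex code words are {code_size} bytes ({code_size * 8} bits) wide")
```

That the value has to be read from a private module, and has changed before, illustrates why code depending on the bytecode layout would be fragile.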

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-02-16 Thread Paul Moore
Changes by Paul Moore: keywords: +easy

[issue26336] Expose regex bytecode as attribute of compiled pattern object

2016-02-10 Thread Jonathan Goble
New submission from Jonathan Goble: Once a regular expression is compiled with `obj = re.compile()`, it would be nice to have access to the raw bytecode, probably as `obj.code` or `obj.bytecode`, so it can be explored programmatically. Currently, regex bytecode is only stored in a C struct
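For context, the following sketch lists what a compiled pattern object does expose through its public interface. The sample pattern is arbitrary, and the exact set of names returned by `dir()` may vary between Python versions.

```python
import re

# An arbitrary pattern with one named group, used only for illustration.
obj = re.compile(r"(?P<word>\w+)\s+\d+")

# Public (non-underscore) attributes and methods of the compiled pattern.
public_attrs = sorted(name for name in dir(obj) if not name.startswith("_"))
print(public_attrs)

# Documented introspection attributes: the source text and group metadata.
print(obj.pattern)
print(obj.groups)
print(dict(obj.groupindex))
```

These attributes describe the pattern's source and groups; the request in this issue concerns the compiled code itself, which is a separate piece of state.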