New submission from S Murthy <smur...@pm.me>:
I note that on disassembling a piece of source code (via source strings or code objects), the instruction objects in the resulting bytecode sequence (https://docs.python.org/3/library/dis.html#dis.Instruction) do not always have the `starts_line` attribute set. The storage and display of this line no. seems to depend on whether a given instruction is the first in the block of instructions that implements a given source line.

I think it would be better, for mapping source and logical lines of code to bytecode instruction blocks, to set `starts_line` for every instruction, and to amend the bytecode printing function (`dis._disassemble_bytes`) to keep the existing behaviour by detecting whether an instruction is the first instruction of an instruction block.

ATM `Instruction` objects are created and generated within this loop in `dis._get_instructions_bytes`:

    def _get_instructions_bytes(code, varnames=None, names=None, constants=None,
                                cells=None, linestarts=None, line_offset=0):
        """Iterate over the instructions in a bytecode string.

        Generates a sequence of Instruction namedtuples giving the details of each
        opcode.  Additional information about the code's runtime environment
        (e.g. variable names, constants) can be specified using optional
        arguments.

        """
        labels = findlabels(code)
        starts_line = None
        for offset, op, arg in _unpack_opargs(code):
            if linestarts is not None:
                starts_line = linestarts.get(offset, None)
                ...
            ...

So it's this line

    starts_line = linestarts.get(offset, None)

which currently causes `starts_line` to be set to `None` for every instruction which isn't the first in an instruction block. `linestarts` is a dict mapping the offset of the first instruction in each instruction block to the corresponding source line number.

My idea is to:

(1) change that line above to

    starts_line = linestarts.get(offset, starts_line)

which ensures every instruction will have the corresponding source line no. set,

(2) amend `Instruction._disassemble` to add a new optional argument `print_start_line` with a default of `True` to determine whether to print the source line no., and

(3) amend `dis._disassemble_bytes` to accept a new optional argument `start_line_by_block` with a default of `True`, which can be used to preserve the existing behaviour of printing source line numbers by instruction block.

I was wondering whether this sounds OK; if so, I am happy to submit a PR.

----------
components: Library (Lib)
messages: 363140
nosy: smurthy
priority: normal
severity: normal
status: open
title: Disassembly - improve documentation for bytecode instruction class and set source line no. attribute for every instruction
type: enhancement
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39823>
_______________________________________
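
As a rough illustration of change (1), the carry-forward lookup can be approximated from outside `dis` using only its public functions. This is a sketch, not the proposed patch: the helper name `lines_for_instructions` and the sample source are made up for the example, and it skips `None` entries on the assumption that some Python versions may report offsets without a source line.

```python
import dis


def lines_for_instructions(code):
    """Pair every instruction with the source line of its block.

    Hypothetical helper for illustration; not part of the dis module.
    """
    # Map: offset of the first instruction in a block -> source line number.
    linestarts = dict(dis.findlinestarts(code))
    current_line = None
    result = []
    for instr in dis.get_instructions(code):
        # Same idea as linestarts.get(offset, starts_line): keep the last
        # known line when this offset does not start a new block.
        line = linestarts.get(instr.offset)
        if line is not None:
            current_line = line
        result.append((instr.offset, instr.opname, current_line))
    return result


code = compile("x = 1\ny = x + 2\n", "<example>", "exec")
for offset, opname, line in lines_for_instructions(code):
    print(f"{line!s:>4} {offset:>4} {opname}")
```

Every printed row carries a line number, whereas under the behaviour described above only the first instruction of each block has `starts_line` set.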