Chris Angelico wrote:

> Situation: A terminal application. Requirement: Display nicely-wrapped
> text. With colour codes in it. And that text might be indented to any
> depth.
>
> label = f"{indent}\U0010cc32{code}\U0010cc00 @{tweet['user']['screen_name']}: "
> wrapper = textwrap.TextWrapper(
>     initial_indent=label,
>     subsequent_indent=indent + " " * 12,
>     width=shutil.get_terminal_size().columns,
>     break_long_words=False, break_on_hyphens=False, # Stop URLs from breaking
> )
> for line in tweet["full_text"].splitlines():
>     print(wrapper.fill(line)
>         .replace("\U0010cc32", "\x1b[32m\u2026")
>         .replace("\U0010cc00", "\u2026\x1b[0m")
>     )
>     wrapper.initial_indent = wrapper.subsequent_indent # For subsequent lines, just indent them
>
> The parameter "indent" is always some number of spaces (possibly
> zero). If I simply include the escape codes in the label, their
> characters will be counted, and the first line will be shorter. Rather
> than mess with how textwrap defines text, I just replace the escape
> codes *and one other character* with a placeholder. In the final
> display, \U0010cc32 means "colour code 32 and an ellipsis", and
> \U0010cc00 means "colour code 0 and an ellipsis", so textwrap
> correctly counts them as one character each.
>
> So what do you folks think? Is this a gloriously elegant way to
> collapse nonprinting text, or is it a gross hacky mess
Yes ;)

> that's going to cause problems?

Probably not.

I had a quick look at the TextWrapper class, and it doesn't really lend
itself to clean and elegant customisation. However, my first idea to
approach this problem was to patch the len() builtin:

import re
import textwrap

text = """The parameter "indent" is always some number of spaces (possibly
zero). If I simply include the escape codes in the label, their
characters will be counted, and the first line will be shorter. Rather
than mess with how textwrap defines text, I just replace the escape
codes *and one other character* with a placeholder. In the final
display, \U0010cc32 means "colour code 32 and an ellipsis", and
\U0010cc00 means "colour code 0 and an ellipsis", so textwrap
correctly counts them as one character each.
"""

print(textwrap.fill(text, width=40))

# add some color to the text sample
GREEN = "\x1b[32m"
NORMAL = "\x1b[0m"
parts = text.split(" ")
parts[::2] = [GREEN + p + NORMAL for p in parts[::2]]
ctext = " ".join(parts)

# wrong wrapping: the escape codes are counted as visible characters
print(textwrap.fill(ctext, width=40))

# fixed wrapping: measure strings without their color escape codes
# (raw string, so that \[ and \d reach the regex engine unchanged)
color_code = re.compile(r"\x1b\[\d+m")

def color_len(s):
    return len(color_code.sub("", s))

# a module-level name "len" shadows the builtin inside textwrap only
textwrap.len = color_len

print(textwrap.fill(ctext, width=40))

The output of my ad-hoc test script looks OK. However, I did not try to
understand the word-breaking regexes, so I don't know if the escape codes
can be spread across words, which would confuse color_len(). Likewise, I
have no idea if textwrap can cope with zero-length chunks.

But at least now you have two -- elegant or gross -- hacks to choose from ;)

--
https://mail.python.org/mailman/listinfo/python-list
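
As a follow-up to the len() patching idea above, here is a minimal sketch of how the patch could be limited to a single call instead of being left in place for the whole process. The names visible_len(), patched_textwrap_len() and wrap_coloured() are hypothetical helpers, not part of textwrap. The sketch assumes that textwrap looks up len() through its module namespace, as the experiment above suggests, and that only simple SGR colour sequences of the form ESC [ ... m appear in the text. The fallback for non-string arguments is a precaution, on the assumption that textwrap might also call len() on something other than a string.

```python
import contextlib
import re
import textwrap

# Matches SGR colour sequences such as "\x1b[32m" or "\x1b[1;32m".
ANSI_SGR = re.compile(r"\x1b\[[\d;]*m")


def visible_len(s):
    """Length of s as displayed, ignoring colour escape sequences."""
    if isinstance(s, str):
        return len(ANSI_SGR.sub("", s))
    # Precaution: behave like the builtin for anything that is not a string.
    return len(s)


@contextlib.contextmanager
def patched_textwrap_len(func):
    """Temporarily shadow len() inside the textwrap module (hypothetical helper)."""
    missing = object()
    saved = vars(textwrap).get("len", missing)
    textwrap.len = func
    try:
        yield
    finally:
        # Restore whatever was there before, or remove the shadowing name.
        if saved is missing:
            del textwrap.len
        else:
            textwrap.len = saved


def wrap_coloured(text, width):
    """Wrap text while counting only visible characters (hypothetical helper)."""
    with patched_textwrap_len(visible_len):
        return textwrap.wrap(text, width=width)


GREEN, NORMAL = "\x1b[32m", "\x1b[0m"
words = "the quick brown fox jumps over the lazy dog again and again".split()
plain = " ".join(words)
# Colour every other word, similar to the test script above.
coloured = " ".join(
    GREEN + w + NORMAL if i % 2 == 0 else w for i, w in enumerate(words)
)

lines = wrap_coloured(coloured, 20)
print("\n".join(lines))
```

While the context manager is active, the patch still affects every user of textwrap in the process, so this narrows the window but does not make the hack thread-safe. It also leaves the open questions about escape codes split across words and zero-length chunks unanswered.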