The fix to a bug in Leo's Rust importer requires an alternative to 
*g.splitLines*:


def splitLines(s: str) -> list[str]:
    return s.splitlines(True) if s else []


The problem arises from str.splitlines 
<https://docs.python.org/3.3/library/stdtypes.html#str.splitlines>, which 
splits at many line boundaries besides '\n', including form-feeds. 
Importers should split lines *only* at newlines. A crucial invariant fails 
otherwise.
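
To make the difference concrete, here is a small sketch. The sample string is invented for illustration and is not taken from Leo's importer or its tests:

```python
# A made-up source fragment containing a form-feed (\x0c) in the first line.
s = "fn a() {}\x0cfn b() {}\nfn c() {}\n"

# str.splitlines treats the form-feed as a line boundary,
# so it reports three lines.
by_splitlines = s.splitlines(True)

# The string contains only two newlines, so an importer that
# counts lines by newlines expects two lines.
newline_count = s.count("\n")

print(len(by_splitlines), newline_count)
```

The two counts disagree, which is the kind of mismatch that splitting *only* at newlines avoids.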


A new function is required: *g.splitLinesAtNewline*. This function splits 
lines *only* at newlines. It does *not* split lines at form-feeds and other 
unusual line-ending characters. 


This function is surprisingly complicated. I had to develop it using a new 
unit test.


*Summary*


All of Leo's importers use *g.splitLinesAtNewline*.


*g.splitLines* will remain as it is: any change would be a breaking change 
to Leo's API.


Edward


P.S. Here is g.splitLinesAtNewline:


def splitLinesAtNewline(s: str) -> list[str]:
    """
    Split lines *only* at '\n', preserving form-feeds and other unusual
    line-ending characters.
    """
    if not s:
        return []
    lines = s.split(sep='\n')
    if lines[-1] == '':
        lines.pop()
    lines = [f"{z}\n" for z in lines]
    if not s.endswith('\n'):
        lines[-1] = lines[-1][:-1]
    return lines
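
For anyone who wants to experiment, here is a sketch of the kind of checks a unit test for this function might make. This is *not* Leo's actual unit test. The test cases, the regex cross-check, and the helper name split_with_regex are my own assumptions. The function is copied from above so the snippet runs on its own:

```python
import re

def splitLinesAtNewline(s: str) -> list[str]:
    # Copied from the post so this snippet is self-contained.
    if not s:
        return []
    lines = s.split(sep='\n')
    if lines[-1] == '':
        lines.pop()
    lines = [f"{z}\n" for z in lines]
    if not s.endswith('\n'):
        lines[-1] = lines[-1][:-1]
    return lines

def split_with_regex(s: str) -> list[str]:
    # Hypothetical cross-check: each match is either text ending in '\n'
    # or a final run of text without a trailing '\n'.
    return re.findall(r'[^\n]*\n|[^\n]+', s)

# Illustrative cases: empty string, missing trailing newline, blank lines,
# form-feed, carriage return, and the Unicode line separator.
cases = ["", "\n", "a", "a\n", "a\nb", "a\n\nb\n",
         "a\x0cb\n", "a\r\nb", "a\u2028b\n"]

for s in cases:
    lines = splitLinesAtNewline(s)
    # Joining the lines must reproduce the input exactly.
    assert "".join(lines) == s
    # Every line except possibly the last ends with a newline.
    assert all(z.endswith("\n") for z in lines[:-1])
    # A newline appears only as the last character of a line.
    assert all("\n" not in z[:-1] for z in lines)
    # The regex version agrees on every case.
    assert lines == split_with_regex(s)
```

The round-trip check and the "newline only at the end" check together pin down the behavior described in the docstring.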


*Notes*


The "s" arg *must* be a unicode string; bytes args are not allowed. However, 
all tests pass with g.splitLines defined this way, converting its arg first:


def splitLines(s: str) -> list[str]:
    return splitLinesAtNewline(g.toUnicode(s))

To repeat, I am *not* going to change g.splitLines!


EKR
