[Python-ideas] Re: TextIOWrapper support for null-terminated lines
On Mon, Oct 26, 2020 at 10:47 AM Cameron Simpson wrote:
>
> On 26Oct2020 09:45, Chris Angelico wrote:
> >On Mon, Oct 26, 2020 at 8:44 AM Cameron Simpson wrote:
> >> On 24Oct2020 13:37, Dan Sommers <2qdxy4rzwzuui...@potatochowder.com> wrote:
> >> >Spaces in filenames are just as bad, and much more common:
> >>
> >> But much easier to handle in simple text listings, which are newline
> >> delimited.
> >> You're really running into a horrible behaviour from xargs, which is one
> >> reason why GNU parallel exists.
> >
> >I don't consider the behaviour horrible, and xargs isn't the only
> >thing to do this - other tools can be put into zero-termination mode
> >too.
>
> I'm not talking about -print0 and -0, which I merely dislike as a hack
> to accommodate badly named filenames, but xargs' non-0 behaviour, which
> splits on whitespace. Instead of newlines. That pissed me off enough to
> write my own.
>

Ohh, I see what you mean. Yeah, newlines would be a better default for
a lot of situations. Can't be changed now.

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/OCQ6OTLSWBPOHHVBZFK3Z35RIOSK35PO/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: TextIOWrapper support for null-terminated lines
On 26Oct2020 09:45, Chris Angelico wrote:
>On Mon, Oct 26, 2020 at 8:44 AM Cameron Simpson wrote:
>> On 24Oct2020 13:37, Dan Sommers <2qdxy4rzwzuui...@potatochowder.com> wrote:
>> >Spaces in filenames are just as bad, and much more common:
>>
>> But much easier to handle in simple text listings, which are newline
>> delimited.
>> You're really running into a horrible behaviour from xargs, which is one
>> reason why GNU parallel exists.
>
>I don't consider the behaviour horrible, and xargs isn't the only
>thing to do this - other tools can be put into zero-termination mode
>too.

I'm not talking about -print0 and -0, which I merely dislike as a hack
to accommodate badly named filenames, but xargs' non-0 behaviour, which
splits on whitespace. Instead of newlines. That pissed me off enough to
write my own.

[...]
>If you actually DO need to read null-terminated records from a file
>that's too big for memory, it's probably worth just rolling your own
>buffering, reading a chunk at a time and splitting off the interesting
>parts. It's not hugely difficult, and it's a good exercise to do now
>and then.

Aye. That's what my cs.buffer.CornuCopyBuffer class does for me:

    https://pypi.org/project/cs.buffer/

aimed particularly at parsing binary data easily (it takes any iterable
of bytes, and has a few factories to start from a file etc). Parsing a
NUL terminated string from binary data isn't too bad given such a thing.

Cheers,
Cameron Simpson
___
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/2G5RBMJYUWKFC7R5CO2VKODKJ2GZPA2H/
[Python-ideas] Re: TextIOWrapper support for null-terminated lines
On Sun, Oct 25, 2020, at 18:45, Chris Angelico wrote:
> If you actually DO need to read null-terminated records from a file
> that's too big for memory, it's probably worth just rolling your own
> buffering, reading a chunk at a time and splitting off the interesting
> parts. It's not hugely difficult, and it's a good exercise to do now
> and then. And yes, I can see the temptation to get Python to do it,
> but unfortunately, newline support is such a weird mess of
> cross-platform support that I don't think it needs to be made more
> complicated :)

Maybe a getdelim method that ignores all the newline support complexity
and just reads until it reaches the specified character? It would make
sense on binary files too.

The problem with rolling your own buffering is that there's not really a
good way to put back the unused data after the delimiter if you're
mixing this processing with something else. You'd have to read a
character at a time, which would be very inefficient in pure Python.
___
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/CXZWUKIIJNGP7EDXG7P3CHZKF3XW2P6P/
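For binary files, the "put back" problem described above can be partly sidestepped with io.BufferedReader.peek(), which returns buffered bytes without consuming them. The following is only a sketch of what such a helper might look like: the name getdelim is borrowed from the suggestion in this message and is not an existing Python method, it handles single-byte delimiters only, and it does nothing for TextIOWrapper.

```python
import io


def getdelim(stream: io.BufferedReader, delim: bytes = b"\0") -> bytes:
    """Read up to and including the next `delim` (hypothetical helper).

    Bytes after the delimiter stay in the reader's buffer, so other code
    can keep reading from `stream` afterwards. Returns b"" at EOF, or a
    final unterminated record if the data does not end with `delim`.
    """
    if len(delim) != 1:
        # A multi-byte delimiter could straddle two buffer refills;
        # this sketch does not try to handle that.
        raise ValueError("only single-byte delimiters are supported")
    parts = []
    while True:
        window = stream.peek(1)  # look at buffered bytes, consume nothing
        if not window:
            break  # EOF
        pos = window.find(delim)
        if pos >= 0:
            # Consume exactly through the delimiter and stop.
            parts.append(stream.read(pos + 1))
            break
        # No delimiter in view: consume what we saw and look again.
        parts.append(stream.read(len(window)))
    return b"".join(parts)


if __name__ == "__main__":
    demo = io.BufferedReader(io.BytesIO(b"./foo bar\0remaining data"))
    print(getdelim(demo))  # the first record, delimiter included
    print(demo.read())     # whatever followed is still available
```

Because the search runs over whole buffered windows rather than one byte per call, the per-character overhead mentioned above is avoided, at the cost of requiring a buffered binary stream.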
[Python-ideas] Re: TextIOWrapper support for null-terminated lines
On Mon, Oct 26, 2020 at 8:44 AM Cameron Simpson wrote:
>
> On 24Oct2020 13:37, Dan Sommers <2qdxy4rzwzuui...@potatochowder.com> wrote:
> >On 2020-10-24 at 12:29:01 -0400,
> >Brian Allen Vanderburg II via Python-ideas wrote:
> >
> >> ... Find can output its filenames in null-terminated lines since it
> >> is possible to have newlines in a filename (yuck) ...
> >
> >Spaces in filenames are just as bad, and much more common:
>
> But much easier to handle in simple text listings, which are newline
> delimited.
>
> You're really running into a horrible behaviour from xargs, which is one
> reason why GNU parallel exists.
>

I don't consider the behaviour horrible, and xargs isn't the only
thing to do this - other tools can be put into zero-termination mode
too.

But it's pretty rare to consume huge amounts of data in this way
(normally it'll just be a list of file names), so what I would do is
simply read the entire thing, then split on "\0". It's not like reading
a gigabyte of log file, where you really want to work line by line and
not read in more than you need; it's easily going to fit into memory.

If you actually DO need to read null-terminated records from a file
that's too big for memory, it's probably worth just rolling your own
buffering, reading a chunk at a time and splitting off the interesting
parts. It's not hugely difficult, and it's a good exercise to do now
and then. And yes, I can see the temptation to get Python to do it,
but unfortunately, newline support is such a weird mess of
cross-platform support that I don't think it needs to be made more
complicated :)

ChrisA
___
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5VGYDJ4RZRWQWHBMSQZUD5QJUHVF2J66/
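Both approaches described in this message can be sketched roughly as follows. The function names and the 64 KiB chunk size are illustrative assumptions, not anything proposed in the thread; records are kept as bytes, and decoding (for example with os.fsdecode for file names) is left to the caller.

```python
import io
from typing import BinaryIO, Iterator, List


def read_all_records(stream: BinaryIO) -> List[bytes]:
    """Simple case: read everything, then split on NUL."""
    data = stream.read()
    records = data.split(b"\0")
    # Output such as `find -print0` ends with a NUL, which leaves one
    # empty string at the end of the split result; drop it.
    if records and records[-1] == b"":
        records.pop()
    return records


def iter_nul_records(stream: BinaryIO, chunk_size: int = 65536) -> Iterator[bytes]:
    """Large-input case: read a chunk at a time, yield complete records."""
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last NUL is complete; the remainder may
        # be the start of a record that continues in the next chunk.
        *complete, pending = pending.split(b"\0")
        yield from complete
    if pending:
        # Final record without a trailing NUL.
        yield pending


if __name__ == "__main__":
    sample = b"./foo bar\0./a\nb\0"
    print(read_all_records(io.BytesIO(sample)))
    print(list(iter_nul_records(io.BytesIO(sample), chunk_size=4)))
```

The generator consumes the stream greedily, so it suits the case where NUL-separated records are the only thing being read from that stream.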
[Python-ideas] Re: TextIOWrapper support for null-terminated lines
On 24Oct2020 13:37, Dan Sommers <2qdxy4rzwzuui...@potatochowder.com> wrote:
>On 2020-10-24 at 12:29:01 -0400,
>Brian Allen Vanderburg II via Python-ideas wrote:
>
>> ... Find can output its filenames in null-terminated lines since it
>> is possible to have newlines in a filename (yuck) ...
>
>Spaces in filenames are just as bad, and much more common:

But much easier to handle in simple text listings, which are newline
delimited.

You're really running into a horrible behaviour from xargs, which is one
reason why GNU parallel exists.

Cheers,
Cameron Simpson
___
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/6EO37LQLQWTZDJQA3FRD4FQSC7IOHKYU/
[Python-ideas] Re: TextIOWrapper support for null-terminated lines
On 2020-10-24 at 12:29:01 -0400,
Brian Allen Vanderburg II via Python-ideas wrote:

> ... Find can output its filenames in null-terminated lines since it
> is possible to have newlines in a filename (yuck) ...

Spaces in filenames are just as bad, and much more common:

    $ touch 'foo bar'
    $ find . -name 'foo bar'
    ./foo bar
    $ find . -name 'foo bar' -print | xargs ls -l
    ls: cannot access './foo': No such file or directory
    ls: cannot access 'bar': No such file or directory
    $ find . -name 'foo bar' -print0 | xargs -0 ls -l
    -rw-r--r-- 1 dan dan 0 Oct 24 13:31 './foo bar'
    $ rm 'foo bar'
___
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/F5UX5CL7YQIHEX3MP5R4GUVHIXCS5VQP/
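The difference between the delimiters in the shell session above can be reproduced in Python on in-memory data. This is a rough illustration only: the listings are hypothetical stand-ins for what `find -print` and `find -print0` would emit for files named "foo bar" and "a\nb", and bytes.split() only approximates xargs' default parsing, which also interprets quotes and backslashes.

```python
# Hypothetical listings for two files, "./foo bar" and "./a\nb".
newline_listing = b"./foo bar\n./a\nb\n"   # as from `find -print`
nul_listing = b"./foo bar\0./a\nb\0"       # as from `find -print0`

# Whitespace splitting (roughly xargs' default): both names are broken up.
by_whitespace = newline_listing.split()

# Newline splitting: the space survives, the embedded newline does not.
by_newline = newline_listing.splitlines()

# NUL splitting: both names come through intact.
by_nul = [name for name in nul_listing.split(b"\0") if name]

print(by_whitespace)
print(by_newline)
print(by_nul)
```

Only the NUL-delimited form round-trips every legal file name, since NUL is one of the two bytes (with "/") that cannot appear in a path component.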