Re: tail

Chris Angelico Sat, 23 Apr 2022 15:27:09 -0700

On Sun, 24 Apr 2022 at 08:18, Cameron Simpson <[email protected]> wrote:
>
> On 24Apr2022 07:15, Chris Angelico <[email protected]> wrote:
> >On Sun, 24 Apr 2022 at 07:13, Marco Sulla <[email protected]> wrote:
> >> Emh, why chunks? My function simply reads byte by byte and compares
> >> each byte to b"\n". When it finds one, it stops and does a readline():
> [...]
> >> This is only for one line and in utf8, but it can be generalised.
>
> For some encodings that generalisation might be hard. But mostly, yes.
>
> >Ah. Well, then, THAT is why it's inefficient: you're seeking back one
> >single byte at a time, then reading forwards. That is NOT going to
> >play nicely with file systems or buffers.
>
> An approach I think you both may have missed: mmap the file and use
> mmap.rfind(b'\n') to locate line delimiters.
> https://docs.python.org/3/library/mmap.html#mmap.mmap.rfind


Yeah, I made a vague allusion to mmap, but didn't elaborate because I
have no idea how efficient it would be. Would it be functionally
equivalent to the chunking, but with the chunk size chosen by the
system as whatever is optimal? That would need to be tested.

I've never used mmap for this kind of job, so it's not something I'm
comfortable predicting the performance of.
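
For reference, here is a minimal sketch of what the mmap.rfind approach
might look like. The function name tail_mmap and its signature are
hypothetical (nothing in this thread defines them), it assumes b"\n" line
endings, and it says nothing about speed: that still needs measuring
against the chunked version.

```python
import mmap
import os


def tail_mmap(path, n=1):
    """Return the last n lines of a file as a list of bytes objects.

    Hypothetical helper for illustration; assumes b"\n" line endings.
    """
    # mmap refuses to map an empty file, so handle that case up front.
    if n <= 0 or os.path.getsize(path) == 0:
        return []
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        end = len(mm)
        # Ignore one trailing newline so it is not counted as an
        # empty final line.
        if mm[end - 1:end] == b"\n":
            end -= 1
        pos = end
        # Walk backwards over n newline delimiters.
        for _ in range(n):
            pos = mm.rfind(b"\n", 0, pos)
            if pos == -1:
                break  # fewer than n lines: take the whole file
        # pos is -1 (start of file) or the newline before the first
        # wanted line, so pos + 1 is where the result begins.
        return mm[pos + 1:end].split(b"\n")
```

Calling tail_mmap("some.log", 10) would return up to ten lines as bytes;
decoding is left to the caller, since the search works on raw bytes and
never inspects the encoding.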

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
