On 19May2022 19:50, Marco Sulla wrote:
>On Wed, 18 May 2022 at 23:32, Cameron Simpson wrote:
>> You're measuring different things. timeit() tries hard to measure just
>> the code snippet you provide. It doesn't measure the startup cost of the
>> whole python interpreter. Try:
>>
>> time p
On Wed, 18 May 2022 at 23:32, Cameron Simpson wrote:
>
> On 17May2022 22:45, Marco Sulla wrote:
> >Well, I've done a benchmark.
> timeit.timeit("tail('/home/marco/small.txt')", globals={"tail":tail},
> number=10)
> >1.5963431186974049
> timeit.timeit("tail('/home/marco/lorem.t
On 17May2022 22:45, Marco Sulla wrote:
>Well, I've done a benchmark.
timeit.timeit("tail('/home/marco/small.txt')", globals={"tail":tail},
number=10)
>1.5963431186974049
timeit.timeit("tail('/home/marco/lorem.txt')", globals={"tail":tail},
number=10)
>2.52406043745577
Well, I've done a benchmark.
>>> timeit.timeit("tail('/home/marco/small.txt')", globals={"tail":tail},
... number=10)
1.5963431186974049
>>> timeit.timeit("tail('/home/marco/lorem.txt')", globals={"tail":tail},
... number=10)
2.5240604374557734
>>> timeit.timeit("tail('/home/marco/lorem.
On Fri, 13 May 2022 at 12:49, <2qdxy4rzwzuui...@potatochowder.com> wrote:
>
> On 2022-05-13 at 12:16:57 +0200,
> Marco Sulla wrote:
>
> > On Fri, 13 May 2022 at 00:31, Cameron Simpson wrote:
>
> [...]
>
> > > This is nearly the worst "specification" I have ever seen.
>
> > You're lucky. I've seen
On 2022-05-13 at 12:16:57 +0200,
Marco Sulla wrote:
> On Fri, 13 May 2022 at 00:31, Cameron Simpson wrote:
[...]
> > This is nearly the worst "specification" I have ever seen.
> You're lucky. I've seen much worse (or none).
At least with *no* documentation, the source code stands for itsel
On Fri, 13 May 2022 at 00:31, Cameron Simpson wrote:
> On 12May2022 19:48, Marco Sulla wrote:
> >On Thu, 12 May 2022 at 00:50, Stefan Ram wrote:
> >> There's no spec/doc, so one can't even test it.
> >
> >Excuse me, you're very right.
> >
> >"""
> >A function that "tails" the file. If you don
On 12May2022 19:48, Marco Sulla wrote:
>On Thu, 12 May 2022 at 00:50, Stefan Ram wrote:
>> There's no spec/doc, so one can't even test it.
>
>Excuse me, you're very right.
>
>"""
>A function that "tails" the file. If you don't know what that means,
>google "man tail"
>
>filepath: the file path
On Thu, 12 May 2022 22:45:42 +0200, Marco Sulla
declaimed the following:
>
>Maybe. Maybe not. What if the file ends with no newline?
https://github.com/coreutils/coreutils/blob/master/src/tail.c
Lines 567-569 (also lines 550-557 for "bytes_read" determination)
--
Wulfraed
Thank you very much. This helped me to improve the function:
import os
_lf = b"\n"
_err_n = "Parameter n must be a positive integer number"
_err_chunk_size = "Parameter chunk_size must be a positive integer number"
def tail(filepath, n=10, chunk_size=100):
    if (n <= 0):
        raise ValueErr
On Thu, 12 May 2022 at 00:50, Stefan Ram wrote:
>
> Marco Sulla writes:
> >def tail(filepath, n=10, chunk_size=100):
> >if (n <= 0):
> >raise ValueError(_err_n)
> ...
>
> There's no spec/doc, so one can't even test it.
Excuse me, you're very right.
"""
A function that "tails" the
needed but for smaller
files, KISS.
-Original Message-
From: Dennis Lee Bieber
To: python-list@python.org
Sent: Wed, May 11, 2022 6:15 pm
Subject: Re: tail
On Thu, 12 May 2022 06:07:18 +1000, Chris Angelico
declaimed the following:
>I don't understand why this wants to b
than how to stepwise make changes in a pipeline so reading from the beginning
to end was not an issue.
-Original Message-
From: Marco Sulla
To: Chris Angelico
Cc: python-list@python.org
Sent: Wed, May 11, 2022 5:27 pm
Subject: Re: tail
On Wed, 11 May 2022 at 22:09, Chris Angelico wrote
On Thu, 12 May 2022 06:07:18 +1000, Chris Angelico
declaimed the following:
>I don't understand why this wants to be in the standard library.
>
Especially as any Linux distribution probably includes the compiled
"tail" command, so this would only be of use on Windows.
Under recen
On Thu, 12 May 2022 at 07:27, Marco Sulla wrote:
>
> On Wed, 11 May 2022 at 22:09, Chris Angelico wrote:
> >
> > Have you actually checked those three, or do you merely suppose them to be
> > true?
>
> I only suppose, as I said. I should do some benchmark and some other
> tests, and, frankly, I
On Wed, 11 May 2022 at 22:09, Chris Angelico wrote:
>
> Have you actually checked those three, or do you merely suppose them to be
> true?
I only suppose, as I said. I should do some benchmark and some other
tests, and, frankly, I don't want to. I don't want to because I'm
quite sure the impleme
On Thu, 12 May 2022 at 06:03, Marco Sulla wrote:
> I suppose this function is fast. It reads the bytes from the file in chunks
> and stores them in a bytearray, prepending them to it. The final result is
> read from the bytearray and converted to bytes (to be consistent with the
> read method).
>
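A minimal sketch of the chunked approach described above, under stated assumptions: the name tail_chunks, the 4096-byte default chunk size and the stopping rule are illustrative and are not taken from the implementation posted in the thread. It reads fixed-size chunks backward from the end of the file, prepends each one to a bytearray, and stops once enough b"\n" bytes have been seen.

```python
import os


def tail_chunks(filepath, n=10, chunk_size=4096):
    """Return the last n lines of filepath as bytes (hypothetical helper)."""
    if n <= 0:
        raise ValueError("Parameter n must be a positive integer number")
    if chunk_size <= 0:
        raise ValueError("Parameter chunk_size must be a positive integer number")

    buf = bytearray()
    with open(filepath, "rb") as f:
        pos = f.seek(0, os.SEEK_END)  # in binary mode seek() returns the new offset
        while pos > 0:
            read_size = min(chunk_size, pos)
            pos -= read_size
            f.seek(pos)
            buf[:0] = f.read(read_size)  # prepend the chunk to the buffer
            # The very last byte is excluded from the count, so a trailing
            # newline does not count as the start of an extra empty line.
            if buf.count(b"\n", 0, len(buf) - 1) >= n:
                break

    data = bytes(buf)
    end = len(data) - 1 if data.endswith(b"\n") else len(data)
    cut = end
    for _ in range(n):
        cut = data.rfind(b"\n", 0, cut)
        if cut == -1:
            return data  # fewer than n lines: return everything that was read
    return data[cut + 1:]
```

Only b"\n" is treated as a line terminator here, which matches the binary-mode behaviour discussed elsewhere in the thread; decoding the result is left to the caller.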
On Mon, 9 May 2022 at 23:15, Dennis Lee Bieber
wrote:
>
> On Mon, 9 May 2022 21:11:23 +0200, Marco Sulla
> declaimed the following:
>
> >Nevertheless, tail is a fundamental tool in *nix. It's fast and
> >reliable. Also the tail command can't handle different encodings?
>
> Based upon
> ht
Marco Sulla writes:
On Mon, 9 May 2022 at 19:53, Chris Angelico wrote:
...
Nevertheless, tail is a fundamental tool in *nix. It's fast and
reliable. Also the tail command can't handle different encodings?
It definitely can't. It works for UTF-8, and all the ASCII compatible
single
On Mon, 9 May 2022 21:11:23 +0200, Marco Sulla
declaimed the following:
>Nevertheless, tail is a fundamental tool in *nix. It's fast and
>reliable. Also the tail command can't handle different encodings?
Based upon
https://github.com/coreutils/coreutils/blob/master/src/tail.c the ONLY
th
On Tue, 10 May 2022 at 07:07, Barry wrote:
> POSIX tail just prints the bytes to the output that it finds between \n bytes.
> At no time does it need to care about encodings as that is a problem solved
> by the terminal software. I would not expect utf-16 to work with tail on
> linux systems.
UTF
> On 9 May 2022, at 20:14, Marco Sulla wrote:
>
> On Mon, 9 May 2022 at 19:53, Chris Angelico wrote:
>>
>>> On Tue, 10 May 2022 at 03:47, Marco Sulla
>>> wrote:
>>>
>>> On Mon, 9 May 2022 at 07:56, Cameron Simpson wrote:
The point here is that text is a very different thing. B
> On 9 May 2022, at 17:41, r...@zedat.fu-berlin.de wrote:
>
> Barry Scott writes:
>> Why use tiny chunks? You can read 4KiB as fast as 100 bytes
>
> When optimizing code, it helps to be aware of the orders of
> magnitude
That is true and well known to me; now show how what I said is wrong.
On Tue, 10 May 2022 at 05:12, Marco Sulla wrote:
>
> On Mon, 9 May 2022 at 19:53, Chris Angelico wrote:
> >
> > On Tue, 10 May 2022 at 03:47, Marco Sulla
> > wrote:
> > >
> > > On Mon, 9 May 2022 at 07:56, Cameron Simpson wrote:
> > > >
> > > > The point here is that text is a very different t
On Mon, 9 May 2022 at 19:53, Chris Angelico wrote:
>
> On Tue, 10 May 2022 at 03:47, Marco Sulla
> wrote:
> >
> > On Mon, 9 May 2022 at 07:56, Cameron Simpson wrote:
> > >
> > > The point here is that text is a very different thing. Because you
> > > cannot seek to an absolute number of charact
On 2022-05-08 at 18:52:42 +,
Stefan Ram wrote:
> Remember how recently people here talked about how you cannot copy
> text from a video? Then, how did I do it? Turns out, for my
> operating system, there's a screen OCR program! So I did this OCR
> and then manually corrected a few wro
On Tue, 10 May 2022 at 03:47, Marco Sulla wrote:
>
> On Mon, 9 May 2022 at 07:56, Cameron Simpson wrote:
> >
> > The point here is that text is a very different thing. Because you
> > cannot seek to an absolute number of characters in an encoding with
> > variable sized characters. _If_ you did a
On Mon, 9 May 2022 at 07:56, Cameron Simpson wrote:
>
> The point here is that text is a very different thing. Because you
> cannot seek to an absolute number of characters in an encoding with
> variable sized characters. _If_ you did a seek to an arbitrary number
> you can end up in the middle of
On Sun, 8 May 2022 22:48:32 +0200, Marco Sulla
declaimed the following:
>
>Emh. I re-quote
>
>seek(offset, whence=SEEK_SET)
>Change the stream position to the given byte offset.
>
>And so on. No mention of differences between text and binary mode.
You ignore that, underneath, Python is j
On 9/05/22 7:47 am, Marco Sulla wrote:
It will fail if the contents is not ASCII.
Why?
For some encodings, if you seek to an arbitrary byte position and
then read, it may *appear* to succeed but give you complete gibberish.
Your method might work for a certain subset of encodings (those that
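A small illustration of the "appears to succeed but gives gibberish" case, as a sketch only: the helper name decode_slice is hypothetical, and UTF-16-LE is picked simply because it has two-byte code units. Decoding from an odd byte offset raises no error, yet every character is wrong.

```python
def decode_slice(data, start, stop, encoding):
    """Decode data[start:stop] as if the slice began on a character boundary."""
    return data[start:stop].decode(encoding)


data = "hello".encode("utf-16-le")  # 10 bytes, two per character

aligned = decode_slice(data, 2, 10, "utf-16-le")    # starts on a code unit boundary
misaligned = decode_slice(data, 1, 9, "utf-16-le")  # starts in the middle of one

print(repr(aligned))     # the expected tail of the string
print(repr(misaligned))  # four unrelated CJK characters, and no exception
```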
On 08May2022 22:48, Marco Sulla wrote:
>On Sun, 8 May 2022 at 22:34, Barry wrote:
>> >> In text mode you can only seek to a value returned from f.tell()
>> >> otherwise the behaviour is undefined.
>> >
>> > Why? I don't see any recommendation about it in the docs:
>> > https://docs.python.org/3/li
On Sun, 8 May 2022 at 22:34, Barry wrote:
>
> > On 8 May 2022, at 20:48, Marco Sulla wrote:
> >
> > On Sun, 8 May 2022 at 20:31, Barry Scott wrote:
> >>
> On 8 May 2022, at 17:05, Marco Sulla
> wrote:
> >>>
> >>> def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100):
> On 8 May 2022, at 20:48, Marco Sulla wrote:
>
> On Sun, 8 May 2022 at 20:31, Barry Scott wrote:
>>
On 8 May 2022, at 17:05, Marco Sulla wrote:
>>>
>>> def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100):
>>> n_chunk_size = n * chunk_size
>>
>> Why use tiny chunk
On Sun, 8 May 2022 at 22:02, Chris Angelico wrote:
>
> Absolutely not. As has been stated multiple times in this thread, a
> fully general approach is extremely complicated, horrifically
> unreliable, and hopelessly inefficient.
Well, my implementation is quite general now. It's not complicated a
On Mon, 9 May 2022 at 05:49, Marco Sulla wrote:
> Anyway, apart from my implementation, I'm curious if you think a tail
> method is worth it to be a method of the builtin file objects in
> CPython.
Absolutely not. As has been stated multiple times in this thread, a
fully general approach is extre
On Sun, 8 May 2022 at 20:31, Barry Scott wrote:
>
> > On 8 May 2022, at 17:05, Marco Sulla wrote:
> >
> > def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100):
> >n_chunk_size = n * chunk_size
>
> Why use tiny chunks? You can read 4KiB as fast as 100 bytes as its typically
>
On 2022-05-08 19:15, Barry Scott wrote:
On 7 May 2022, at 22:31, Chris Angelico wrote:
On Sun, 8 May 2022 at 07:19, Stefan Ram wrote:
MRAB writes:
On 2022-05-07 19:47, Stefan Ram wrote:
...
def encoding( name ):
    path = pathlib.Path( name )
    for encoding in( "utf_8", "latin_1", "cp1
> On 8 May 2022, at 17:05, Marco Sulla wrote:
>
> I think I've _almost_ found a simpler, general way:
>
> import os
>
> _lf = "\n"
> _cr = "\r"
>
> def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100):
>n_chunk_size = n * chunk_size
Why use tiny chunks? You can read 4K
On Mon, 9 May 2022 at 04:15, Barry Scott wrote:
>
>
>
> > On 7 May 2022, at 22:31, Chris Angelico wrote:
> >
> > On Sun, 8 May 2022 at 07:19, Stefan Ram wrote:
> >>
> >> MRAB writes:
> >>> On 2022-05-07 19:47, Stefan Ram wrote:
> >> ...
> def encoding( name ):
> path = pathlib.Path(
> On 7 May 2022, at 22:31, Chris Angelico wrote:
>
> On Sun, 8 May 2022 at 07:19, Stefan Ram wrote:
>>
>> MRAB writes:
>>> On 2022-05-07 19:47, Stefan Ram wrote:
>> ...
def encoding( name ):
    path = pathlib.Path( name )
    for encoding in( "utf_8", "latin_1", "cp1252" ):
> On 7 May 2022, at 14:40, Stefan Ram wrote:
>
> Marco Sulla writes:
>> So there's no way to reliably read lines in reverse in text mode using
>> seek and read, but the only option is readlines?
>
> I think, CPython is based on C. I don't know whether
> Python's seek function directly call
I think I've _almost_ found a simpler, general way:
import os
_lf = "\n"
_cr = "\r"
def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100):
    n_chunk_size = n * chunk_size
    pos = os.stat(filepath).st_size
    chunk_line_pos = -1
    lines_not_found = n
    with open(filepath
> On 7 May 2022, at 17:29, Marco Sulla wrote:
>
> On Sat, 7 May 2022 at 16:08, Barry wrote:
>> You need to handle the file in bin mode and do the handling of line endings
>> and encodings yourself. It’s not that hard for the cases you wanted.
>
"\n".encode("utf-16")
> b'\xff\xfe\n\x00'
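The output above includes the BOM (b'\xff\xfe') in front of the newline bytes. One possible way to get just the bytes of "\n" for a given codec is sketched below; newline_bytes is a hypothetical helper, and the trick assumes a stateless encoding in which a BOM, if any, is emitted only once at the start.

```python
def newline_bytes(encoding, newline="\n"):
    """Return the bytes that encode `newline` in `encoding`, without any BOM.

    Assumption: the codec is stateless, so encoding the newline twice adds
    exactly one more copy of its byte sequence at the end.
    """
    one = newline.encode(encoding)
    two = (newline * 2).encode(encoding)
    width = len(two) - len(one)  # size of a single encoded newline
    return two[-width:]


for name in ("utf-8", "utf-8-sig", "utf-16-le", "utf-16-be", "utf-32-be"):
    print(name, newline_bytes(name))
```

For plain "utf-16" the result depends on the platform's byte order, so the sketch does not hard-code it.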
On Sun, 8 May 2022 at 07:19, Stefan Ram wrote:
>
> MRAB writes:
> >On 2022-05-07 19:47, Stefan Ram wrote:
> ...
> >>def encoding( name ):
> >>path = pathlib.Path( name )
> >>for encoding in( "utf_8", "latin_1", "cp1252" ):
> >>try:
> >>with path.open( encoding=encoding
On Sun, 8 May 2022 at 04:37, Marco Sulla wrote:
>
> On Sat, 7 May 2022 at 19:02, MRAB wrote:
> >
> > On 2022-05-07 17:28, Marco Sulla wrote:
> > > On Sat, 7 May 2022 at 16:08, Barry wrote:
> > >> You need to handle the file in bin mode and do the handling of line
> > >> endings and encodings yo
On 2022-05-07 19:47, Stefan Ram wrote:
Marco Sulla writes:
Well, ok, but I need a generic method to get LF and CR for any
encoding a user can input.
"LF" and "CR" come from US-ASCII. It is theoretically
possible that there might be some encodings out there
(not for Unicode) that are
On 2022-05-07 19:35, Marco Sulla wrote:
On Sat, 7 May 2022 at 19:02, MRAB wrote:
>
> On 2022-05-07 17:28, Marco Sulla wrote:
> > On Sat, 7 May 2022 at 16:08, Barry wrote:
> >> You need to handle the file in bin mode and do the handling of line
endings and encodings yourself. It’s not that hard
On Sat, 7 May 2022 20:35:34 +0200, Marco Sulla
declaimed the following:
>Well, ok, but I need a generic method to get LF and CR for any
>encoding a user can input.
Other than EBCDIC, <LF> and <CR> AS BYTES should appear as x0A and x0D
in any of the 8-bit encodings (ASCII, ISO-8859-x, CP, UT
On Sat, 7 May 2022 at 19:02, MRAB wrote:
>
> On 2022-05-07 17:28, Marco Sulla wrote:
> > On Sat, 7 May 2022 at 16:08, Barry wrote:
> >> You need to handle the file in bin mode and do the handling of line
> >> endings and encodings yourself. It’s not that hard for the cases you
> >> wanted.
> >
On 2022-05-07 17:28, Marco Sulla wrote:
On Sat, 7 May 2022 at 16:08, Barry wrote:
You need to handle the file in bin mode and do the handling of line endings and
encodings yourself. It’s not that hard for the cases you wanted.
"\n".encode("utf-16")
b'\xff\xfe\n\x00'
"".encode("utf-16")
b
I believe I'd do something like:
#!/usr/local/cpython-3.10/bin/python3
"""
Output the last 10 lines of a potentially-huge file.
O(n). But technically so is scanning backward from the EOF.
It'd be faster to use a dict, but this has the advantage of working for
huge num_lines.
"""
import d
On Sat, 7 May 2022 at 16:08, Barry wrote:
> You need to handle the file in bin mode and do the handling of line endings
> and encodings yourself. It’s not that hard for the cases you wanted.
>>> "\n".encode("utf-16")
b'\xff\xfe\n\x00'
>>> "".encode("utf-16")
b'\xff\xfe'
>>> "a\nb".encode("utf-16
> On 7 May 2022, at 14:24, Marco Sulla wrote:
>
> On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber wrote:
>>
>>Windows also uses <CR><LF> for the EOL marker, but Python's I/O system
>> condenses that to just <LF> internally (for TEXT mode) -- so using the
>> length of a string so read to compute a
general purpose tool,
internationalization from ASCII has created a challenge for lots of such tools.
-Original Message-
From: Marco Sulla
To: Dennis Lee Bieber
Cc: python-list@python.org
Sent: Sat, May 7, 2022 9:21 am
Subject: Re: tail
On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber wrote:
>
>
On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber wrote:
>
> Windows also uses <CR><LF> for the EOL marker, but Python's I/O system
> condenses that to just <LF> internally (for TEXT mode) -- so using the
> length of a string so read to compute a file position may be off-by-one for
> each EOL in the stri
On Fri, 6 May 2022 21:19:48 +0100, MRAB
declaimed the following:
>Is the file UTF-8? That's a variable-width encoding, so are any of the
>characters > U+007F?
>
>Which OS? On Windows, it's common/normal for UTF-8 files to start with a
>BOM/signature, which is 3 bytes/1 codepoint.
Windo
On 2022-05-06 20:21, Marco Sulla wrote:
I have a little problem.
I tried to extend the tail function, so it can read lines from the bottom
of a file object opened in text mode.
The problem is it does not work. It gets a starting position that is lower
than the expected by 3 characters. So the f
I have a little problem.
I tried to extend the tail function, so it can read lines from the bottom
of a file object opened in text mode.
The problem is it does not work. It gets a starting position that is lower
than the expected by 3 characters. So the first line is read only for 2
chars, and th
On Mon, 2 May 2022 at 00:20, Cameron Simpson wrote:
>
> On 01May2022 18:55, Marco Sulla wrote:
> >Something like this is OK?
> [...]
> >def tail(f):
> >chunk_size = 100
> >size = os.stat(f.fileno()).st_size
>
> I think you want os.fstat().
It's the same from py 3.3
> >chunk_line_pos
Ok, I suppose \n and \r are enough:
readline(size=- 1, /)
Read and return one line from the stream. If size is specified, at
most size bytes will be read.
The line terminator is always b'\n' for binary files; for text files,
the newline argument to open() can be used to select the line
On Tue, 3 May 2022 at 04:38, Marco Sulla wrote:
>
> On Mon, 2 May 2022 at 18:31, Stefan Ram wrote:
> >
> > |The Unicode standard defines a number of characters that
> > |conforming applications should recognize as line terminators:[7]
> > |
> > |LF:Line Feed, U+000A
> > |VT:Vertical Tab,
On Mon, 2 May 2022 at 18:31, Stefan Ram wrote:
>
> |The Unicode standard defines a number of characters that
> |conforming applications should recognize as line terminators:[7]
> |
> |LF:Line Feed, U+000A
> |VT:Vertical Tab, U+000B
> |FF:Form Feed, U+000C
> |CR:Carriage Return, U+0
On Mon, 2 May 2022 at 11:54, Cameron Simpson wrote:
>
> On 01May2022 23:30, Stefan Ram wrote:
> >Dan Stromberg writes:
> >>But what about Unicode? Are all 10 bytes newlines in Unicode encodings?
> > It seems in UTF-8, when a value is above U+007F, it will be
> > encoded with bytes that always
On 01May2022 23:30, Stefan Ram wrote:
>Dan Stromberg writes:
>>But what about Unicode? Are all 10 bytes newlines in Unicode encodings?
> It seems in UTF-8, when a value is above U+007F, it will be
> encoded with bytes that always have their high bit set.
Aye. Design feature enabling easy resy
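The UTF-8 property mentioned here can be checked directly. The sketch below uses two hypothetical helper names (all_high_bit, has_lf_byte) and U+010A as a sample character chosen for illustration; it also shows that the same guarantee does not hold for UTF-16, where a byte with value 10 can be half of another character.

```python
def all_high_bit(ch):
    """True if every byte of ch's UTF-8 encoding has its high bit set."""
    return all(b & 0x80 for b in ch.encode("utf-8"))


def has_lf_byte(text, encoding):
    """True if the encoded form of text contains a byte with value 10."""
    return b"\n" in text.encode(encoding)


# Characters above U+007F never produce ASCII-range bytes in UTF-8,
# so a 0x0A byte in a UTF-8 file is always a real line feed.
print(all(all_high_bit(ch) for ch in "\u00e9\u20ac\U0001d11e"))

# U+010A contains no newline, yet its UTF-16-LE encoding has a 0x0A byte.
print(has_lf_byte("\u010a", "utf-8"))
print(has_lf_byte("\u010a", "utf-16-le"))
```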
On Mon, 2 May 2022 at 09:19, Dan Stromberg wrote:
>
> On Sun, May 1, 2022 at 3:19 PM Cameron Simpson wrote:
>
> > On 01May2022 18:55, Marco Sulla wrote:
> > >Something like this is OK?
> >
>
> Scanning backward for a byte == 10 in ASCII or ISO-8859 seems fine.
>
> But what about Unicode? Are al
On Sun, May 1, 2022 at 3:19 PM Cameron Simpson wrote:
> On 01May2022 18:55, Marco Sulla wrote:
> >Something like this is OK?
>
Scanning backward for a byte == 10 in ASCII or ISO-8859 seems fine.
But what about Unicode? Are all 10 bytes newlines in Unicode encodings?
If not, and you have a hu
On 01May2022 18:55, Marco Sulla wrote:
>Something like this is OK?
[...]
>def tail(f):
>chunk_size = 100
>size = os.stat(f.fileno()).st_size
I think you want os.fstat().
>positions = iter(range(size, -1, -chunk_size))
>next(positions)
I was wondering about the iter, but this mak
Something like this is OK?
import os
def tail(f):
    chunk_size = 100
    size = os.stat(f.fileno()).st_size
    positions = iter(range(size, -1, -chunk_size))
    next(positions)
    chunk_line_pos = -1
    pos = 0
    for pos in positions:
        f.seek(pos)
        chars = f.read(chunk_si
On 26/04/2022 10.54, Cameron Simpson wrote:
> On 25Apr2022 08:08, DL Neil wrote:
>> Thus, the observation that the OP may find that a serial,
>> read-the-entire-file approach is faster in some situations (relatively
>> short files). Conversely, with longer files, some sort of 'last chunk'
>> appro
On 25Apr2022 08:08, DL Neil wrote:
>Thus, the observation that the OP may find that a serial,
>read-the-entire-file approach is faster in some situations (relatively
>short files). Conversely, with longer files, some sort of 'last chunk'
>approach would be superior.
If you make the chunk big enou
On 25/04/2022 04.21, pjfarl...@earthlink.net wrote:
>> -Original Message-
>> From: dn
>> Sent: Saturday, April 23, 2022 6:05 PM
>> To: python-list@python.org
>> Subject: Re: tail
>>
>
>> NB quite a few of IBM's (extensively researched) a
On Sun, 24 Apr 2022 12:21:36 -0400, declaimed the
following:
>
>WRT the mentioned IBM utility program[me]s, the non-Posix part of the IBM
>mainframe file system has always provided record-managed storage since the
>late 1960's (as opposed to the byte-managed storage of *ix systems) so
>searchi
> -Original Message-
> From: dn
> Sent: Saturday, April 23, 2022 6:05 PM
> To: python-list@python.org
> Subject: Re: tail
>
> NB quite a few of IBM's (extensively researched) algorithms which formed
> utility
> program[me]s on mainframes, made similar
On Sun, 24 Apr 2022 at 11:21, Roel Schroeven wrote:
> dn schreef op 24/04/2022 om 0:04:
> > Disagreeing with @Chris in the sense that I use tail very frequently,
> > and usually in the context of server logs - but I'm talking about the
> > Linux implementation, not Python code!
> If I understand
On Mon, 25 Apr 2022 at 01:47, Marco Sulla wrote:
>
>
>
> On Sat, 23 Apr 2022 at 23:18, Chris Angelico wrote:
>>
>> Ah. Well, then, THAT is why it's inefficient: you're seeking back one
>> single byte at a time, then reading forwards. That is NOT going to
>> play nicely with file systems or buffer
On Sun, 24 Apr 2022 at 00:19, Cameron Simpson wrote:
> An approach I think you both may have missed: mmap the file and use
> mmap.rfind(b'\n') to locate line delimiters.
> https://docs.python.org/3/library/mmap.html#mmap.mmap.rfind
>
Ah, I played very little with mmap, I didn't know about this.
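For reference, a minimal sketch of the mmap idea, assuming a file whose lines end in b"\n"; the function name tail_mmap and its handling of a trailing newline are illustrative choices, not code from the thread. It walks backward with mmap.rfind() until n line delimiters have been found.

```python
import mmap
import os


def tail_mmap(filepath, n=10):
    """Return the last n lines of filepath as bytes (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    with open(filepath, "rb") as f:
        # An empty file cannot be memory-mapped, so handle it up front.
        if f.seek(0, os.SEEK_END) == 0:
            return b""
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            end = len(mm)
            # Skip one trailing newline so it is not counted as an empty line.
            if mm[end - 1:end] == b"\n":
                end -= 1
            pos = end
            for _ in range(n):
                pos = mm.rfind(b"\n", 0, pos)
                if pos == -1:
                    return mm[:]  # fewer than n lines: return the whole file
            return mm[pos + 1:]
```

The operating system pages in only the parts of the file that rfind() touches, so the explicit chunking logic goes away; the result is still raw bytes that the caller has to decode.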
On Sat, 23 Apr 2022 at 23:18, Chris Angelico wrote:
> Ah. Well, then, THAT is why it's inefficient: you're seeking back one
> single byte at a time, then reading forwards. That is NOT going to
> play nicely with file systems or buffers.
>
> Compare reading line by line over the file with readline
thon-list@python.org
Sent: Sun, Apr 24, 2022 5:19 am
Subject: Re: tail
dn schreef op 24/04/2022 om 0:04:
> Disagreeing with @Chris in the sense that I use tail very frequently,
> and usually in the context of server logs - but I'm talking about the
> Linux implementation, not Python co
On Sun, 24 Apr 2022 at 21:11, Antoon Pardon wrote:
>
>
>
> Op 23/04/2022 om 20:57 schreef Chris Angelico:
> > On Sun, 24 Apr 2022 at 04:37, Marco Sulla
> > wrote:
> >> What about introducing a method for text streams that reads the lines
> >> from the bottom? Java has also a ReversedLinesFileRea
Op 23/04/2022 om 20:57 schreef Chris Angelico:
On Sun, 24 Apr 2022 at 04:37, Marco Sulla wrote:
What about introducing a method for text streams that reads the lines
from the bottom? Java has also a ReversedLinesFileReader with Apache
Commons IO.
1) Read the entire file and decode bytes to
dn schreef op 24/04/2022 om 0:04:
Disagreeing with @Chris in the sense that I use tail very frequently,
and usually in the context of server logs - but I'm talking about the
Linux implementation, not Python code!
If I understand Marco correctly, what he wants is to read the lines from
bottom to t
On Sun, 24 Apr 2022 at 10:04, Cameron Simpson wrote:
>
> On 24Apr2022 08:21, Chris Angelico wrote:
> >On Sun, 24 Apr 2022 at 08:18, Cameron Simpson wrote:
> >> An approach I think you both may have missed: mmap the file and use
> >> mmap.rfind(b'\n') to locate line delimiters.
> >> https://docs.
On 24Apr2022 08:21, Chris Angelico wrote:
>On Sun, 24 Apr 2022 at 08:18, Cameron Simpson wrote:
>> An approach I think you both may have missed: mmap the file and use
>> mmap.rfind(b'\n') to locate line delimiters.
>> https://docs.python.org/3/library/mmap.html#mmap.mmap.rfind
>
>Yeah, I made a v
On Sun, 24 Apr 2022 at 08:18, Cameron Simpson wrote:
>
> On 24Apr2022 07:15, Chris Angelico wrote:
> >On Sun, 24 Apr 2022 at 07:13, Marco Sulla
> >wrote:
> >> Emh, why chunks? My function simply reads byte per byte and compares
> >> it to b"\n". When it finds it, it stops and does a readline():
>
On Sun, 24 Apr 2022 at 08:06, dn wrote:
>
> On 24/04/2022 09.15, Chris Angelico wrote:
> > On Sun, 24 Apr 2022 at 07:13, Marco Sulla
> > wrote:
> >>
> >> On Sat, 23 Apr 2022 at 23:00, Chris Angelico wrote:
> > This is quite inefficient in general.
>
> Why inefficient? I think that
On 24Apr2022 07:15, Chris Angelico wrote:
>On Sun, 24 Apr 2022 at 07:13, Marco Sulla wrote:
>> Emh, why chunks? My function simply reads byte per byte and compares
>> it to b"\n". When it finds it, it stops and does a readline():
[...]
>> This is only for one line and in utf8, but it can be general
On Sun, 24 Apr 2022 at 08:03, Peter J. Holzer wrote:
>
> On 2022-04-24 04:57:20 +1000, Chris Angelico wrote:
> > On Sun, 24 Apr 2022 at 04:37, Marco Sulla
> > wrote:
> > > What about introducing a method for text streams that reads the lines
> > > from the bottom? Java has also a ReversedLinesFi
On 24/04/2022 09.15, Chris Angelico wrote:
> On Sun, 24 Apr 2022 at 07:13, Marco Sulla
> wrote:
>>
>> On Sat, 23 Apr 2022 at 23:00, Chris Angelico wrote:
> This is quite inefficient in general.
Why inefficient? I think that readlines() will be much slower, not
only more time c
On 2022-04-24 04:57:20 +1000, Chris Angelico wrote:
> On Sun, 24 Apr 2022 at 04:37, Marco Sulla
> wrote:
> > What about introducing a method for text streams that reads the lines
> > from the bottom? Java has also a ReversedLinesFileReader with Apache
> > Commons IO.
>
> It's fundamentally diffi
On Sun, 24 Apr 2022 at 07:13, Marco Sulla wrote:
>
> On Sat, 23 Apr 2022 at 23:00, Chris Angelico wrote:
> > > > This is quite inefficient in general.
> > >
> > > Why inefficient? I think that readlines() will be much slower, not
> > > only more time consuming.
> >
> > It depends on which is more
On Sat, 23 Apr 2022 at 23:00, Chris Angelico wrote:
> > > This is quite inefficient in general.
> >
> > Why inefficient? I think that readlines() will be much slower, not
> > only more time consuming.
>
> It depends on which is more costly: reading the whole file (cost
> depends on size of file) o
On Sun, 24 Apr 2022 at 06:41, Marco Sulla wrote:
>
> On Sat, 23 Apr 2022 at 20:59, Chris Angelico wrote:
> >
> > On Sun, 24 Apr 2022 at 04:37, Marco Sulla
> > wrote:
> > >
> > > What about introducing a method for text streams that reads the lines
> > > from the bottom? Java has also a Reversed
On Sat, 23 Apr 2022 at 20:59, Chris Angelico wrote:
>
> On Sun, 24 Apr 2022 at 04:37, Marco Sulla
> wrote:
> >
> > What about introducing a method for text streams that reads the lines
> > from the bottom? Java has also a ReversedLinesFileReader with Apache
> > Commons IO.
>
> It's fundamentally
On Sun, 24 Apr 2022 at 04:37, Marco Sulla wrote:
>
> What about introducing a method for text streams that reads the lines
> from the bottom? Java has also a ReversedLinesFileReader with Apache
> Commons IO.
It's fundamentally difficult to get precise. In general, there are
three steps to reading
On 10/08/2013 02:22 AM, Steven D'Aprano wrote:
On Mon, 07 Oct 2013 20:27:13 -0700, Mark Janssen wrote:
But even putting that aside, even if somebody wrote such a
description, it would be reductionism gone mad. What possible light
on the problem would be shined by a long, long list of machine co
Alain Ketterlin writes:
> Antoon Pardon writes:
>
> > Op 07-10-13 19:15, Alain Ketterlin schreef:
>
> [...]
> >> That's fine. My point was: you can't at the same time have full
> >> dynamicity *and* procedural optimizations (like tail call opt).
> >> Everybody should be clear about the trade-off.
Op 07-10-13 23:27, random...@fastmail.us schreef:
> On Sat, Oct 5, 2013, at 3:39, Antoon Pardon wrote:
>> What does this mean?
>>
>> Does it mean that a naive implementation would arbitrarily mess up
>> stack traces and he wasn't interested in investigating more
>> sophisticated implementations?
>>
On Mon, 07 Oct 2013 20:27:13 -0700, Mark Janssen wrote:
But even putting that aside, even if somebody wrote such a
description, it would be reductionism gone mad. What possible light
on the problem would be shined by a long, long list of machine code
operations, even if written
Op 08-10-13 01:50, Steven D'Aprano schreef:
> On Mon, 07 Oct 2013 15:47:26 -0700, Mark Janssen wrote:
>
>> I challenge you to get
>> down to the machine code in scheme and formally describe how it's doing
>> both.
>
> For which machine?
>
> Or are you assuming that there's only one machine code
Antoon Pardon writes:
> Op 07-10-13 19:15, Alain Ketterlin schreef:
[...]
>> That's fine. My point was: you can't at the same time have full
>> dynamicity *and* procedural optimizations (like tail call opt).
>> Everybody should be clear about the trade-off.
>
> You're wrong. Full dynamics is not i