On 2022-05-06 20:21, Marco Sulla wrote:
I have a little problem. I tried to extend the tail function, so it can read lines from the bottom of a file object opened in text mode.

The problem is it does not work. It gets a starting position that is lower than the expected by 3 characters. So the first line is read only for 2 chars, and the last line is missing.

    import os

    _lf = "\n"
    _cr = "\r"
    _lf_ord = ord(_lf)

    def tail(f, n=10, chunk_size=100):
        n_chunk_size = n * chunk_size
        pos = os.stat(f.fileno()).st_size
        chunk_line_pos = -1
        lines_not_found = n
        binary_mode = "b" in f.mode
        lf = _lf_ord if binary_mode else _lf

        while pos != 0:
            pos -= n_chunk_size

            if pos < 0:
                pos = 0

            f.seek(pos)
            chars = f.read(n_chunk_size)

            for i, char in enumerate(reversed(chars)):
                if char == lf:
                    lines_not_found -= 1

                    if lines_not_found == 0:
                        chunk_line_pos = len(chars) - i - 1
                        print(chunk_line_pos, i)
                        break

            if lines_not_found == 0:
                break

        line_pos = pos + chunk_line_pos + 1

        f.seek(line_pos)

        res = b"" if binary_mode else ""

        for i in range(n):
            res += f.readline()

        return res

Maybe the problem is 1 char != 1 byte?
Is the file UTF-8? That's a variable-width encoding, so are any of the characters > U+007F?
Which OS? On Windows, it's common/normal for UTF-8 files to start with a BOM/signature, which is 3 bytes/1 codepoint.
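
A quick way to check both questions is to compare the size in bytes (what os.stat reports) with the number of characters the text layer hands back. The sketch below is only illustrative: byte_and_char_sizes is a made-up helper name, and the demo file (a UTF-8 signature followed by two ASCII lines) is an assumption, not the actual file from the report.

```python
import codecs
import os
import tempfile


def byte_and_char_sizes(path, encoding="utf-8"):
    """Return (size in bytes, size in characters, starts_with_bom).

    Hypothetical helper for illustration; not part of the original tail().
    """
    byte_size = os.stat(path).st_size
    with open(path, "rb") as f:
        has_bom = f.read(3) == codecs.BOM_UTF8
    # newline="" disables newline translation so characters are counted as stored.
    with open(path, encoding=encoding, newline="") as f:
        char_size = len(f.read())
    return byte_size, char_size, has_bom


# Assumed demo data: UTF-8 signature (3 bytes) plus two ASCII lines.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(codecs.BOM_UTF8 + b"first\nsecond\n")
    demo_path = tmp.name

sizes = byte_and_char_sizes(demo_path)                  # BOM decoded as one character
sizes_sig = byte_and_char_sizes(demo_path, "utf-8-sig")  # BOM stripped by the codec
os.remove(demo_path)

print(sizes)      # (16, 14, True)
print(sizes_sig)  # (16, 13, True)
```

If the two counts differ for the real file, then a position computed from st_size cannot be used as a character offset. Note also that in text mode, seek() is only documented to accept 0 or a value previously returned by tell(), so arithmetic on text-mode positions is not reliable in general.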
