Re: [Tutor] Finding line number by offset value.
Steven D'Aprano wrote:

> On Mon, Feb 22, 2016 at 01:41:42AM +, Alan Gauld wrote:
>> On 21/02/16 19:32, Cody West wrote:
>
>> > I'm trying to take 48L, which I believe is the character number, and
>> > get the line number from that.

The documentation isn't explicit, but

"""
with open('/foo/bar/my_file', 'rb') as f:
    matches = rules.match(data=f.read())
"""

suggests that the library operates on bytes, not characters.

>> I'm not totally clear what you mean but, if it is that 48L
>> is the character count from the start of the file and you
>> want to know the line number then you need to count the
>> number of \n characters between the first and 48th
>> characters.
>>
>> But that's depending on your line-end system of course,
>> there may be two characters on each EOL...
>
> Provided your version of Python is built with "universal newline
> support", and nearly every Python is, then if you open the file in text
> mode, all end-of-lines are automatically converted to \n on reading.

Be careful: *if* the numbers are byte offsets and you open the file in
universal newlines mode or text mode, your results will be unreliable.

> If the file is small enough to read all at once, you can do this:
>
> offset = 48
> text = the_file.read(offset)
> print text.count('\n')

It's the offset that matters, not the file size; the first 48 bytes of a
terabyte file will easily fit into the memory of your Apple II ;)

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
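
A minimal sketch of the byte-offset point made above, assuming Python 3 (the snippets in the thread are Python 2) and assuming the offset really is a byte offset into the data that was scanned. The function name line_number_at and the sample bytes are made up for illustration; nothing here is part of yara-python.

```python
def line_number_at(data: bytes, offset: int) -> int:
    """Return the 0-based line number of the byte at `offset` in `data`."""
    if not 0 <= offset <= len(data):
        raise ValueError("offset outside data")
    # bytes.count() accepts start/end, so no slice copy is needed.
    return data.count(b"\n", 0, offset)


# Hypothetical sample with Windows (CRLF) line endings.
sample = b"<?php\r\n$x = 1;\r\neval(base64_decode('AAAA'));\r\n"
offset = sample.index(b"eval(")        # byte offset of the match
print(line_number_at(sample, offset))  # prints 2 (third line, counting from 0)
```

Because the data stays as bytes, each CRLF still contains exactly one b"\n", so the count is the same for Unix and Windows line endings. Files that use a bare "\r" as the line terminator would need b"\r" counted instead.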
Re: [Tutor] Finding line number by offset value.
On Mon, Feb 22, 2016 at 01:41:42AM +, Alan Gauld wrote:
> On 21/02/16 19:32, Cody West wrote:
> > I'm trying to take 48L, which I believe is the character number, and get
> > the line number from that.
>
> I'm not totally clear what you mean but, if it is that 48L
> is the character count from the start of the file and you
> want to know the line number then you need to count the
> number of \n characters between the first and 48th
> characters.
>
> But that's depending on your line-end system of course,
> there may be two characters on each EOL...

Provided your version of Python is built with "universal newline
support", and nearly every Python is, then if you open the file in text
mode, all end-of-lines are automatically converted to \n on reading.

If the file is small enough to read all at once, you can do this:

    offset = 48
    text = the_file.read(offset)
    print text.count('\n')

to print a line number starting from 0.

If the file is too big to read all at once, you can do this:

    # untested
    running_total = 0
    line_num = -1
    offset = 4800  # say
    for line in the_file:
        running_total += len(line)
        line_num += 1
        # '>' rather than '>=': an offset that falls exactly on the
        # start of a line belongs to that line, not the previous one
        if running_total > offset:
            print line_num
            break

-- 
Steve
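
A hedged sketch of the "too big to read all at once" case, assuming Python 3 and a stream opened in binary mode so that the offset is counted in bytes. The function name line_number_streaming and the chunk_size parameter are illustrative choices, not anything from the thread. It only ever reads the first `offset` bytes, in chunks, and counts newline bytes as it goes.

```python
import io


def line_number_streaming(stream, offset, chunk_size=64 * 1024):
    """Return the 0-based line number of the byte at `offset`.

    `stream` must be a binary stream positioned at the start of the data.
    """
    remaining = offset
    newlines = 0
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            raise ValueError("offset is past the end of the stream")
        newlines += chunk.count(b"\n")
        remaining -= len(chunk)
    return newlines


# In-memory stand-in for a file opened with open(path, 'rb').
print(line_number_streaming(io.BytesIO(b"ab\ncd\nef\n"), 3))  # prints 1
```

An offset of 3 points at the "c" that starts the second line, so the result is 1, which matches what counting "\n" in the first 3 bytes gives.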
Re: [Tutor] Finding line number by offset value.
On 21/02/16 19:32, Cody West wrote:
> I'm using yara-python for some file scanning and I'm trying to take the
> offset in the 'strings' field and turn it into a line number.

I know nothing about yara except that it's some kind of
pattern matching engine. However...

> (48L, '$execution', 'eval(base64_decode')
>
> I'm trying to take 48L, which I believe is the character number, and get
> the line number from that.

I'm not totally clear what you mean but, if it is that 48L
is the character count from the start of the file and you
want to know the line number then you need to count the
number of \n characters between the first and 48th
characters.

But that's depending on your line-end system of course,
there may be two characters on each EOL... It depends on
your OS/version and possibly the character encoding too.
And if it's a binary file, who knows, as there won't
really be any line endings. And, as I understand it, yara
is targeted at reading byte patterns from binary files?

Are you sure you really need the line number?

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
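
A small illustration of the line-ending and encoding caveat above, assuming Python 3 and UTF-8; the sample text is made up. It shows that the "same" position has a different offset depending on whether you count characters, bytes with Unix line endings, or bytes with Windows line endings.

```python
text = "h\u00e9llo\nworld"          # "héllo", a newline, then "world"

unix_bytes = text.encode("utf-8")                        # LF line ending
dos_bytes = text.replace("\n", "\r\n").encode("utf-8")   # CRLF line ending

char_offset = text.index("world")
unix_offset = unix_bytes.index(b"world")
dos_offset = dos_bytes.index(b"world")

print(char_offset)  # 6: counted in characters
print(unix_offset)  # 7: the accented letter takes two bytes in UTF-8
print(dos_offset)   # 8: CRLF adds one more byte
```

So an offset is only meaningful together with the representation it was measured in, which is why it matters whether the tool reports bytes or characters.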