[issue20992] reading individual bytes of multiple binary files using the Python module fileinput

2014-03-24 Thread Josh Rosenberg

Josh Rosenberg added the comment:

And of course, missed another typo. open's first arg should be file, not 
filename.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue20992] reading individual bytes of multiple binary files using the Python module fileinput

2014-03-24 Thread Josh Rosenberg

Josh Rosenberg added the comment:

On memory: Yeah, it could be if the file didn't include any newline characters. 
Same problem could apply if a text input file relied on word wrap in an editor 
and included very few or no newlines itself.

There are non-fileinput ways of doing this, like I said; if you want consistent 
performance, you'd probably use one of them. For example, using the two arg 
form of iter:

from functools import partial

def bytefileinput(files):
    for file in files:
        # "file", not "filename": typo corrected per the follow-up comment
        with open(file, "rb") as f:
            yield from iter(partial(f.read, 1), b'')

Still kind of slow, but predictable on memory usage and not too complex.
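
If the one-byte reads turn out to be the bottleneck, one possible variation is to read bounded chunks and slice single bytes out of them. This is only a sketch: the name iter_bytes and the chunk_size parameter are made up for illustration, and the 64 KiB default is an arbitrary assumption, not a measured optimum.

```python
def iter_bytes(paths, chunk_size=64 * 1024):
    """Yield each byte of each file as a length-1 bytes object."""
    for path in paths:
        with open(path, "rb") as f:
            # Read a bounded chunk at a time, so memory use does not depend
            # on where (or whether) newlines occur in the data.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                for i in range(len(chunk)):
                    # Slicing keeps the result a bytes object; indexing
                    # (chunk[i]) would give an int instead.
                    yield chunk[i:i + 1]
```

Memory use is bounded by chunk_size per open file, and the files are still opened one at a time in the order given.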

--




[issue20992] reading individual bytes of multiple binary files using the Python module fileinput

2014-03-24 Thread Tommy Carstensen

Tommy Carstensen added the comment:

I read the fileinput code and realized how heavily tied it is to line input.

Will reading individual bytes as suggested not be very memory intensive, if 
each line is billions of characters?

def bytefileinput():
    return (bytes((b,)) for line in fileinput.input() for b in line)

I posted my workaround on stackoverflow (see link earlier in thread), which does 
not make use of the fileinput module at all. After having read through the 
fileinput code I agree that the module should only support reading lines and 
this enhancement request should be closed.

--
resolution:  -> rejected
status: open -> closed




[issue20992] reading individual bytes of multiple binary files using the Python module fileinput

2014-03-24 Thread Josh Rosenberg

Josh Rosenberg added the comment:

That example should have included mode="rb" when using fileinput.input(); oops. 
Pretend I didn't forget it.

--




[issue20992] reading individual bytes of multiple binary files using the Python module fileinput

2014-03-24 Thread Josh Rosenberg

Josh Rosenberg added the comment:

fileinput's semantics are heavily tied to lines, not bytes. And processing 
binary files byte by byte is rather inefficient; can you explain why this 
feature would be of general utility such that it would be worth including it in 
the standard library?

It's not hard to just get a byte at a time using existing parts:

def bytefileinput():
    # mode="rb" added per the follow-up comment; needs "import fileinput"
    return (bytes((b,)) for line in fileinput.input(mode="rb") for b in line)

There are ways to do similar things without using fileinput at all. But it 
really depends on your use case.

Giving fileinput a read() method isn't a bad idea assuming some reasonable 
behavior is defined for the various line oriented methods, but making it 
iterate binary mode input byte by byte would be a breaking change of limited 
utility in my view.
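
For reference, here is a self-contained sketch of the same line-based idea, with the file list passed explicitly instead of being taken from sys.argv. The name byte_file_input and the paths parameter are illustrative only and not part of fileinput. Note that each whole line is still held in memory while its bytes are yielded.

```python
import fileinput

def byte_file_input(paths):
    """Yield every byte of the given files as length-1 bytes objects."""
    # mode="rb" makes FileInput yield bytes lines instead of str lines.
    with fileinput.FileInput(files=paths, mode="rb") as lines:
        for line in lines:
            # Iterating over a bytes object produces ints, so wrap each one
            # in a 1-tuple and convert it back to a single-byte bytes object.
            for b in line:
                yield bytes((b,))
```

An empty paths sequence should be avoided here, because fileinput then falls back to sys.argv[1:] and, failing that, to standard input.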

--
nosy: +josh.rosenberg




[issue20992] reading individual bytes of multiple binary files using the Python module fileinput

2014-03-24 Thread Berker Peksag

Changes by Berker Peksag:


--
stage:  -> needs patch
versions: +Python 3.5 -Python 3.3, Python 3.4




[issue20992] reading individual bytes of multiple binary files using the Python module fileinput

2014-03-20 Thread Tommy Carstensen

New submission from Tommy Carstensen:

This is my first post on bugs.python.org. I hope I abide by the rules. It was 
suggested to me on stackoverflow.com that I request an enhancement to the 
module fileinput here:
http://stackoverflow.com/questions/22510123/reading-individual-bytes-of-multiple-binary-files-using-the-python-module-filein

I can read the first byte of a binary file like this:

with open(my_binary_file,'rb') as f:
    f.read(1)

But when I run this code:

import fileinput
# mode is passed by keyword; the second positional argument is inplace
with fileinput.FileInput(my_binary_file, mode='rb') as f:
    f.read(1)

then I get this error:

AttributeError: 'FileInput' object has no attribute 'read'

I would like to propose an enhancement to fileinput, which makes it possible to 
read binary files byte by byte.

I posted this solution to my problem:

def process_binary_files(list_of_binary_files):

    for file in list_of_binary_files:
        with open(file,'rb') as f:
            yield f.read(1)

    return

list_of_binary_files = ['f1', 'f2']
generate_byte = process_binary_files(list_of_binary_files)
byte = next(generate_byte)

--
components: Library (Lib)
messages: 214195
nosy: Tommy.Carstensen
priority: normal
severity: normal
status: open
title: reading individual bytes of multiple binary files using the Python 
module fileinput
type: enhancement
versions: Python 3.3, Python 3.4
