Re: Checking for EOF in stream

2007-02-27 Thread kousue
On Feb 19, 6:58 pm, GiBo <[EMAIL PROTECTED]> wrote:
> Hi!
>
> Classic situation - I have to process an input stream of unknown length
> until I reach its end (EOF, End Of File). How do I check for EOF? The
> input stream can be anything from opened file through sys.stdin to a
> network socket. And it's binary and potentially huge (gigabytes), thus
> "for line in stream.readlines()" isn't really a way to go.

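Could you use xreadlines()? It's a lazily-evaluated line reader
(deprecated since Python 2.3 in favour of iterating over the file
itself), though a line-based reader may not suit binary data.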

> For now I have roughly:
>
> stream = sys.stdin
> while True:
>     data = stream.read(1024)
>     process_data(data)
>     if len(data) < 1024: ## (*)
>         break
>
> I smell a fragile point at (*) because as far as I know e.g. network
> sockets streams may return less data than requested even when the socket
> is still open.

Well, it depends on a lot of things. Is the stream blocking or
non-blocking (on sockets and some other sorts of streams, you can pick
this yourself)? What are the underlying semantics (reliable, ordered
TCP or lossy, unordered UDP)? Unfortunately, you really need to know
what you're working with. There's no better solution: trying to hide
the underlying semantics under a prescribed, overlaid set of semantics
can only lead to badness in the long run.
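
A minimal sketch of the short-read point, assuming Python 3 and a
blocking stream socket. socket.socketpair() stands in for a real
network connection, and drain() is a hypothetical helper name. It is
only meant to show that a short recv() does not mean EOF, while an
empty result does:

```python
import socket

def drain(sock, bufsize=1024):
    """Read from a blocking stream socket until the peer closes it."""
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:         # empty result: the peer has closed its end
            break
        chunks.append(data)  # a chunk shorter than bufsize is normal
    return b"".join(chunks)

# socketpair() gives two connected sockets without needing a network.
sender, receiver = socket.socketpair()
sender.sendall(b"hello")
first = receiver.recv(1024)   # returns what is available, not 1024 bytes
sender.sendall(b"world")
sender.close()                # only now does the receiver see EOF
rest = drain(receiver)
receiver.close()
```

Non-blocking sockets and datagram (UDP) sockets behave differently, so
this sketch should not be read as covering them.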

> I'd better like something like:
>
> while not stream.eof():
>     ...
>
> but there is no eof() method :-(
>
> This is probably a trivial problem but I haven't found a decent solution.

For your case, it's not so hard:
http://pyref.infogami.com/EOFError says "read() and readline() methods
of file objects return an empty string when they hit EOF." So you can
assume that anything claiming to be a file-like object will work this
way.

> Any hints?

So:

import sys

stream = sys.stdin
while True:
    data = stream.read(1024)
    if data == "":  # an empty string means EOF
        break
    process_data(data)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Checking for EOF in stream

2007-02-20 Thread Nathan
On 2/19/07, Gabriel Genellina <[EMAIL PROTECTED]> wrote:
> En Mon, 19 Feb 2007 21:50:11 -0300, GiBo <[EMAIL PROTECTED]> escribió:
>
> > Grant Edwards wrote:
> >> On 2007-02-19, GiBo <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Classic situation - I have to process an input stream of unknown length
> >>> until I reach its end (EOF, End Of File). How do I check for EOF? The
> >>> input stream can be anything from opened file through sys.stdin to a
> >>> network socket. And it's binary and potentially huge (gigabytes), thus
> >>> "for line in stream.readlines()" isn't really a way to go.
> >>>
> >>> For now I have roughly:
> >>>
> >>> stream = sys.stdin
> >>> while True:
> >>>     data = stream.read(1024)
> >>      if len(data) == 0:
> >>          break  #EOF
> >>>     process_data(data)
> >
> > Right, not a big difference though. Isn't there a cleaner / more
> > intuitive way? Like using some wrapper objects around the streams or
> > something?
>
> Read the documentation... For a true file object:
> read([size]) ... An empty string is returned when EOF is encountered
> immediately.
> All the other "file-like" objects (like StringIO, socket.makefile, etc)
> maintain this behavior.
> So this is the way to check for EOF. If you don't like how it was spelled,
> try this:
>
>if data=="": break
>
> If your data is made of lines of text, you can use the file as its own
> iterator, yielding lines:
>
> for line in stream:
>  process_line(line)
>
> --
> Gabriel Genellina
>

Not to beat a dead horse, but I often do this:

data = f.read(bufsize)
while data:
    # ... process data.
    data = f.read(bufsize)


The only annoying bit is the duplicated line.  I find I often follow
this pattern, and I realize Python doesn't plan to have any sort of
do-while construct, but even so I prefer this idiom.  What's the
consensus here?

What about creating a standard binary-file iterator:

def blocks_of(infile, bufsize = 1024):
    data = infile.read(bufsize)
    if data:
        yield data


The use would look like this:

for block in blocks_of(myfile, bufsize = 2**16):
    process_data(block) # len(block) <= bufsize...


Re: Checking for EOF in stream

2007-02-20 Thread Nathan
On 2/20/07, Nathan <[EMAIL PROTECTED]> wrote:
> What about creating a standard binary-file iterator:
>
> def blocks_of(infile, bufsize = 1024):
>     data = infile.read(bufsize)
>     if data:
>         yield data
>
>
> The use would look like this:
>
> for block in blocks_of(myfile, bufsize = 2**16):
>     process_data(block) # len(block) <= bufsize...
>


(ahem), make that iterator something that works, like:

def blocks_of(infile, bufsize = 1024):
    data = infile.read(bufsize)
    while data:
        yield data
        data = infile.read(bufsize)
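
The duplicated read() call can also be avoided with the two-argument
form of the built-in iter(), which keeps calling a function until it
returns a sentinel value. This is a sketch of an alternative, not the
poster's code, and it assumes Python 3, where a binary stream signals
EOF with b"" rather than "":

```python
import io
from functools import partial

def blocks_of(infile, bufsize=1024):
    """Return an iterator over blocks of a binary file-like object."""
    # iter(callable, sentinel) calls the callable until it returns the sentinel.
    return iter(partial(infile.read, bufsize), b"")

blocks = list(blocks_of(io.BytesIO(b"abcdefg"), bufsize=3))
print(blocks)  # [b'abc', b'def', b'g']
```

For a text-mode stream the sentinel would be "" instead of b"".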


Re: Checking for EOF in stream

2007-02-19 Thread Jon Ribbens
In article <[EMAIL PROTECTED]>, Gabriel Genellina wrote:
> So this is the way to check for EOF. If you don't like how it was spelled,  
> try this:
> 
>if data=="": break

How about:

  if not data: break

? ;-)


Re: Checking for EOF in stream

2007-02-19 Thread Grant Edwards
On 2007-02-20, GiBo <[EMAIL PROTECTED]> wrote:

>>> stream = sys.stdin
>>> while True:
>>>     data = stream.read(1024)
>>      if len(data) == 0:
>>          break  #EOF
>>>     process_data(data)
>
> Right, not a big difference though. Isn't there a cleaner /
> more intuitive way?

A file is at EOF when read() returns ''.  The above is the
cleanest, simplest, most direct way to do what you specified.
Everybody does it that way, and everybody recognizes what's
being done.

It's also the "standard, Pythonic" way to do it.

> Like using some wrapper objects around the streams or
> something?

You can do that, but then you're mostly just obfuscating
things.

-- 
Grant Edwards   grante at visi.com
Yow!  Vote for ME -- I'm well-tapered, half-cocked, ill-conceived
and TAX-DEFERRED!


Re: New-style classes (was Re: Checking for EOF in stream)

2007-02-19 Thread Steven Bethard
GiBo wrote:
> One more question - is it likely that StringIO will be turned into
> a new-style class in the future? The reason I ask is whether I should try
> to deal with detection of new-/old-style classes or take the
> old-styleness for granted and set in stone instead.

In Python 3.0, everything will be new-style classes. In Python 2.X, it 
is extremely unlikely that anything will be upgraded to a new-style 
class. That would raise a number of backwards incompatibilities with 
relatively little benefit.
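
A small illustration of the first statement, assuming it is run on
Python 3 (Plain is a hypothetical class name). A class declared with no
base still inherits from object, so the old-style/new-style distinction
no longer needs detecting there:

```python
import io

class Plain:  # no explicit base class
    pass

print(issubclass(Plain, object))        # True
print(Plain.__mro__ == (Plain, object)) # True
print(issubclass(io.StringIO, object))  # True
```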

STeVe


Re: New-style classes (was Re: Checking for EOF in stream)

2007-02-19 Thread GiBo
Gabriel Genellina wrote:
> En Mon, 19 Feb 2007 22:30:59 -0300, GiBo <[EMAIL PROTECTED]> escribió:
> 
>> Is there a reason why some classes distributed with Python 2.5 are not
>> new-style classes? For instance, StringIO is apparently an "old-style"
>> class, i.e. it does not inherit from "object". Can I somehow turn an
>> existing old-style class into a new-style one? I tried for example:
>>     class MyStreamIO(StreamIO, object):
>>         pass
>> but got an error.
> 
> Try again (and look carefully at the error text)

Ahhh, thanks! ;-)

Just for the record:

import StringIO
class MyStringIO(StringIO, object):  # wrong: StringIO here is the module
    pass

must of course be:

class MyStringIO(StringIO.StringIO, object):  # the class inside the module
    pass

One more question - is it likely that StringIO will be turned into
a new-style class in the future? The reason I ask is whether I should try
to deal with detection of new-/old-style classes or take the
old-styleness for granted and set in stone instead.

GiBo


Re: Checking for EOF in stream

2007-02-19 Thread Gabriel Genellina
En Mon, 19 Feb 2007 21:50:11 -0300, GiBo <[EMAIL PROTECTED]> escribió:

> Grant Edwards wrote:
>> On 2007-02-19, GiBo <[EMAIL PROTECTED]> wrote:
>>>
>>> Classic situation - I have to process an input stream of unknown length
>>> until I reach its end (EOF, End Of File). How do I check for EOF? The
>>> input stream can be anything from opened file through sys.stdin to a
>>> network socket. And it's binary and potentially huge (gigabytes), thus
>>> "for line in stream.readlines()" isn't really a way to go.
>>>
>>> For now I have roughly:
>>>
>>> stream = sys.stdin
>>> while True:
>>>     data = stream.read(1024)
>>      if len(data) == 0:
>>          break  #EOF
>>>     process_data(data)
>
> Right, not a big difference though. Isn't there a cleaner / more
> intuitive way? Like using some wrapper objects around the streams or
> something?

Read the documentation... For a true file object:
read([size]) ... An empty string is returned when EOF is encountered  
immediately.
All the other "file-like" objects (like StringIO, socket.makefile, etc)  
maintain this behavior.
So this is the way to check for EOF. If you don't like how it was spelled,  
try this:

   if data=="": break

If your data is made of lines of text, you can use the file as its own  
iterator, yielding lines:

for line in stream:
 process_line(line)
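
As a small self-contained illustration of that iterator form, assuming
Python 3 and using io.StringIO as a stand-in for a real text stream:

```python
import io

stream = io.StringIO("alpha\nbeta\ngamma\n")
seen = []
for line in stream:                 # iteration stops by itself at EOF
    seen.append(line.rstrip("\n"))  # each line keeps its trailing newline

print(seen)  # ['alpha', 'beta', 'gamma']
```

No explicit EOF test is needed here, but this only suits line-oriented
text, not the binary data in the original question.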

-- 
Gabriel Genellina



Re: New-style classes (was Re: Checking for EOF in stream)

2007-02-19 Thread Gabriel Genellina
En Mon, 19 Feb 2007 22:30:59 -0300, GiBo <[EMAIL PROTECTED]> escribió:

> Is there a reason why some classes distributed with Python 2.5 are not
> new-style classes? For instance, StringIO is apparently an "old-style"
> class, i.e. it does not inherit from "object". Can I somehow turn an
> existing old-style class into a new-style one? I tried for example:
>     class MyStreamIO(StreamIO, object):
>         pass
> but got an error.

Try again (and look carefully at the error text)

-- 
Gabriel Genellina



Re: Checking for EOF in stream

2007-02-19 Thread GiBo
Grant Edwards wrote:
> On 2007-02-19, GiBo <[EMAIL PROTECTED]> wrote:
>> Hi!
>>
>> Classic situation - I have to process an input stream of unknown length
>> until I reach its end (EOF, End Of File). How do I check for EOF? The
>> input stream can be anything from opened file through sys.stdin to a
>> network socket. And it's binary and potentially huge (gigabytes), thus
>> "for line in stream.readlines()" isn't really a way to go.
>>
>> For now I have roughly:
>>
>> stream = sys.stdin
>> while True:
>>     data = stream.read(1024)
>      if len(data) == 0:
>          break  #EOF
>>     process_data(data)

Right, not a big difference though. Isn't there a cleaner / more
intuitive way? Like using some wrapper objects around the streams or
something?

GiBo



New-style classes (was Re: Checking for EOF in stream)

2007-02-19 Thread GiBo
GiBo wrote:
> Hi!
> 
> Classic situation - I have to process an input stream of unknown length
> until I reach its end (EOF, End Of File). How do I check for EOF?
> [...]
> I'd better like something like:
> 
> while not stream.eof():
>   ...

Is there a reason why some classes distributed with Python 2.5 are not
new-style classes? For instance, StringIO is apparently an "old-style"
class, i.e. it does not inherit from "object". Can I somehow turn an
existing old-style class into a new-style one? I tried for example:

    class MyStreamIO(StreamIO, object):
        pass

but got an error. Is my only option to take StringIO.py from the Python
tree, add "(object)" to the class declaration and distribute it with my
source?

FWIW I created a wrapper class that handles the EOF problem in quite an
elegant way, but due to the __getattribute__() stuff it only works with
new-style classes :-(

class MyStreamer(object):
    def __init__(self, stream):
        self._stream = stream
        self.eof = False

    def read(self, amount = None):
        data = self._stream.read(amount)
        if len(data) == 0:
            self.eof = True
            raise EOFError
        return data

    def __getattribute__(self, attribute):
        try:
            return object.__getattribute__(self, attribute)
        except AttributeError:
            pass
        return self._stream.__getattribute__(attribute)
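
One possible variant of such a wrapper, sketched under the assumption
of Python 3: delegating through __getattr__ and the built-in getattr()
avoids calling __getattribute__ on the wrapped object, so it does not
depend on what kind of class that object is. EOFTrackingStream is a
hypothetical name, and unlike the class above this version sets the
flag without raising EOFError:

```python
import io

class EOFTrackingStream:
    """Wrap a file-like object and remember when read() has hit EOF."""

    def __init__(self, stream):
        self._stream = stream
        self.eof = False

    def read(self, size=-1):
        data = self._stream.read(size)
        if not data and size != 0:  # empty result for a real request: EOF
            self.eof = True
        return data

    def __getattr__(self, name):
        # Only called when normal lookup fails, so read() and eof win;
        # everything else is forwarded to the wrapped stream.
        return getattr(self._stream, name)

wrapped = EOFTrackingStream(io.BytesIO(b"abc"))
while not wrapped.eof:
    chunk = wrapped.read(2)
```

Note that the flag only becomes true after a read() has come back
empty, so the loop body still has to cope with one empty chunk.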


GiBo




Re: Checking for EOF in stream

2007-02-19 Thread Grant Edwards
On 2007-02-19, GiBo <[EMAIL PROTECTED]> wrote:
> Hi!
>
> Classic situation - I have to process an input stream of unknown length
> until I reach its end (EOF, End Of File). How do I check for EOF? The
> input stream can be anything from opened file through sys.stdin to a
> network socket. And it's binary and potentially huge (gigabytes), thus
> "for line in stream.readlines()" isn't really a way to go.
>
> For now I have roughly:
>
> stream = sys.stdin
> while True:
>   data = stream.read(1024)
    if len(data) == 0:
      break  #EOF
>   process_data(data)

-- 
Grant Edwards   grante at visi.com
Yow!  CALIFORNIA is where people from IOWA or NEW YORK go to
subscribe to CABLE TELEVISION!!


Checking for EOF in stream

2007-02-19 Thread GiBo
Hi!

Classic situation - I have to process an input stream of unknown length
until I reach its end (EOF, End Of File). How do I check for EOF? The
input stream can be anything from opened file through sys.stdin to a
network socket. And it's binary and potentially huge (gigabytes), thus
"for line in stream.readlines()" isn't really a way to go.

For now I have roughly:

stream = sys.stdin
while True:
    data = stream.read(1024)
    process_data(data)
    if len(data) < 1024:  ## (*)
        break

I smell a fragile point at (*) because as far as I know e.g. network
sockets streams may return less data than requested even when the socket
is still open.
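
A sketch of why the length test at (*) is fragile, assuming Python 3.
TrickleReader is a hypothetical stand-in for a stream that hands back
fewer bytes than requested while more data is still pending, and the
two function names are made up for the comparison:

```python
import io

class TrickleReader:
    """File-like object returning at most `limit` bytes per read()."""

    def __init__(self, payload, limit):
        self._buf = io.BytesIO(payload)
        self._limit = limit

    def read(self, size):
        # size is assumed to be a positive integer in this sketch.
        return self._buf.read(min(size, self._limit))

def read_by_length(stream, bufsize):
    """Stop at the first short read, as in the loop marked (*)."""
    out = []
    while True:
        data = stream.read(bufsize)
        out.append(data)
        if len(data) < bufsize:
            break
    return b"".join(out)

def read_until_empty(stream, bufsize):
    """Stop only when read() returns an empty result."""
    out = []
    while True:
        data = stream.read(bufsize)
        if not data:
            break
        out.append(data)
    return b"".join(out)

short = read_by_length(TrickleReader(b"abcdefgh", limit=3), bufsize=4)
full = read_until_empty(TrickleReader(b"abcdefgh", limit=3), bufsize=4)
print(short)  # b'abc'
print(full)   # b'abcdefgh'
```

The length-based loop gives up after the first three bytes, while the
empty-result test reads the whole payload.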

I'd better like something like:

while not stream.eof():
    ...

but there is no eof() method :-(

This is probably a trivial problem but I haven't found a decent solution.

Any hints?

Thanks!

GiBo