Re: Flush stdin

2014-10-22 Thread Marko Rauhamaa
random...@fastmail.us:

> Yes, and 90% of the time, when someone says they want to "flush
> stdin", what they really want to do is go to the next line after
> they've sloppily read part of the line they're on (and the behavior
> they are seeing that they object to is that their next read function
> reads the rest of the current line). The appropriate course of action
> in these cases is to actually read to the next newline and discard the
> data, not to do any kind of flush.

I'm not sure I have seen that. However, somewhat analogously, there are
Linux text utilities that read a number of lines and leave the rest of
the input intact for the next consumer. Since there is no efficient way
to read exactly one line at a time, the utilities routinely read past
the designated endpoint and then seek back to the end of the line.

For example, consider this script:

seq 2000 >test.dat
{
    head -n 5 >/dev/null
    head -n 5
} <test.dat

Reading from the regular file, the second "head" picks up at line 6,
because "head" seeks back to the end of the fifth line before it exits.
If the same commands are fed from a pipe instead:

seq 2000 | {
    head -n 5 >/dev/null
    head -n 5
}

I get:

1861
1862
1863
1864

because you can't seek back a pipe. The first "head" command has
greedily read in the first 1860 lines and the second one continues where
the first one left off.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-22 Thread random832
On Tue, Oct 21, 2014, at 19:16, Dan Stromberg wrote:
> Actually, doesn't line buffering sometimes exist inside an OS kernel?
> stty/termios/termio/sgtty relate here, for *ix examples.  Supporting
> code: http://stromberg.dnsalias.org/~strombrg/ttype/  It turns on
> character-at-a-time I/O in the tty driver via a variety of methods for
> portability.  I wrote it in C before I took an interest in Python.

Yes, and 90% of the time, when someone says they want to "flush stdin",
what they really want to do is go to the next line after they've
sloppily read part of the line they're on (and the behavior they are
seeing that they object to is that their next read function reads the
rest of the current line). The appropriate course of action in these
cases is to actually read to the next newline and discard the data, not
to do any kind of flush.
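
For illustration, a minimal Python sketch of that approach (the
4-character field width is made up; the point is the readline() that
discards the remainder of the line):

    import sys

    field = sys.stdin.read(4)   # sloppy fixed-width read, leaves the rest of the line behind
    sys.stdin.readline()        # read up to and including the next newline, then discard it
    # the next read now starts cleanly on the following line
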
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-22 Thread Marko Rauhamaa
Dan Stromberg :

> On Mon, Oct 20, 2014 at 9:41 PM, Marko Rauhamaa  wrote:
>> Terminal devices support line buffering on write.
> Yes, though that's not the only place it's useful.
>
>> Line buffering on read is an illusion created by higher-level libraries.
>> The low-level read function reads in blocks of bytes.
>
> Actually, doesn't line buffering sometimes exist inside an OS kernel?
> stty/termios/termio/sgtty relate here, for *ix examples.  Supporting
> code: http://stromberg.dnsalias.org/~strombrg/ttype/  It turns on
> character-at-a-time I/O in the tty driver via a variety of methods for
> portability.  I wrote it in C before I took an interest in Python.

I was being sloppy in my TTY terminology. The TTY device lives inside
the kernel and thus "writes" by copying bytes from its kernel buffer
into user space when the user-space process calls read(2).


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-21 Thread Chris Angelico
On Wed, Oct 22, 2014 at 4:38 PM, Marko Rauhamaa  wrote:
> Dan Stromberg :
>
>> On Mon, Oct 20, 2014 at 9:41 PM, Marko Rauhamaa  wrote:
>>> Nagle affects the communication between the peer OS kernels and isn't
>>> directly related to anything the application does.
>>
>> Actually, Nagle can cause two or more small packets to be merged,
>> which is something an application must be able to deal with, because
>> they could show up in the receiving application as one or more (but
>> anyway: fewer) merged recv()'s.
>
> Packets have barely anything to do with TCP sockets since they provide
> an octet stream abstraction.

TCP does abstract over the individual packets, but they are still important.

>> Of course, but who's doing one byte per second?  You and I in our
>> tests, and perhaps some application developers with remarkably
>> undemanding I/O.  That doesn't really mean we should _recommend_ a
>> series of os.read(0, 1)'s.
>
> No, here's my statement: if you need to process input as soon as it
> becomes available, you can't use sys.stdin. Instead, you need to use
> os.read().
>
> You typically ask os.read() for a kilobyte or more at a time. The key
> is that os.read() returns right away if fewer bytes are available.

Then your statement is false. Maybe it's not *efficient* if you always
use sys.stdin.read(1), but you certainly can do it. It's not that you
*need to* use something else.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-21 Thread Marko Rauhamaa
Dan Stromberg :

> On Mon, Oct 20, 2014 at 9:41 PM, Marko Rauhamaa  wrote:
>> Nagle affects the communication between the peer OS kernels and isn't
>> directly related to anything the application does.
>
> Actually, Nagle can cause two or more small packets to be merged,
> which is something an application must be able to deal with, because
> they could show up in the receiving application as one or more (but
> anyway: fewer) merged recv()'s.

Packets have barely anything to do with TCP sockets since they provide
an octet stream abstraction.

> Of course, but who's doing one byte per second?  You and I in our
> tests, and perhaps some application developers with remarkably
> undemanding I/O.  That doesn't really mean we should _recommend_ a
> series of os.read(0, 1)'s.

No, here's my statement: if you need to process input as soon as it
becomes available, you can't use sys.stdin. Instead, you need to use
os.read().

You typically ask os.read() for a kilobyte or more at a time. The key
is that os.read() returns right away if fewer bytes are available.
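
Something like this, as a sketch (the 4096 and the process() callback
are placeholders, not part of anything stated above):

    import os

    while True:
        chunk = os.read(0, 4096)   # returns as soon as *any* bytes are available
        if not chunk:              # empty bytes object means EOF
            break
        process(chunk)             # process() stands in for whatever parsing you do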


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-21 Thread Dan Stromberg
On Tue, Oct 21, 2014 at 7:49 PM, Nobody  wrote:
> On Sat, 18 Oct 2014 18:42:00 -0700, Dan Stromberg wrote:
>
>> On Sat, Oct 18, 2014 at 6:34 PM, Dan Stromberg  wrote:
>>>> Once the "nc" process actually write()s the data to its standard
>>>> output (i.e. descriptor 1, not the "stdout" FILE*)
>>> I'm not sure why you're excluding stdout, but even if nc is using
>>> filedes 1 instead of FILE * stdout, isn't it kind of irrelevant?
>>
>> On further reflection, isn't it stdio that does the varied buffering,
>> and filedes 1 that's always unbuffered?  IOW, the OP might wish nc was
>> using 1, but it probably can't be given what they're seeing.
>
> Yes. stdio does buffering. Writing to stdout stores data in a buffer; that
> data should eventually be written to descriptor 1, although perhaps not
> until immediately prior to termination.
>
> Which is probably the cause of the OP's problem.

Huh.  And here I thought I demonstrated elsewhere in this thread that
the buffering between nc and python didn't appear to be the problem.

'found it, here it is again:

If I run the following in one tty:
nc -l localhost 9000 | /tmp/z

...where /tmp/z has just:
#!/usr/bin/python3

import sys

for line in sys.stdin.buffer:
print(line)

And then run the following in another tty on the same computer:
while read line; do echo $line; sleep 1; done < /etc/passwd | nc
localhost 9000

...then everything acts line buffered, or perhaps even character
buffered (the two are pretty indistinguishable in this test).  What I
see is my /etc/passwd file popping out of the nc -l side, one line at
a time, each line one second apart.

I suppose this suggests that it's the client sending the TCP data
that is doing the buffering.

That, or we're using two different versions of netcat (there are at
least two available).
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-21 Thread Nobody
On Sat, 18 Oct 2014 18:42:00 -0700, Dan Stromberg wrote:

> On Sat, Oct 18, 2014 at 6:34 PM, Dan Stromberg  wrote:
>>> Once the "nc" process actually write()s the data to its standard
>>> output (i.e. descriptor 1, not the "stdout" FILE*)
>> I'm not sure why you're excluding stdout, but even if nc is using
>> filedes 1 instead of FILE * stdout, isn't it kind of irrelevant?
> 
> On further reflection, isn't it stdio that does the varied buffering,
> and filedes 1 that's always unbuffered?  IOW, the OP might wish nc was
> using 1, but it probably can't be given what they're seeing.

Yes. stdio does buffering. Writing to stdout stores data in a buffer; that
data should eventually be written to descriptor 1, although perhaps not
until immediately prior to termination.

Which is probably the cause of the OP's problem.

If it is, using a pseudo-tty would probably fix it. At startup,
stdin and stdout are line-buffered if they are associated with a tty and
fully-buffered otherwise (file, pipe, ...); stderr is unbuffered.

At least, this is the case on Unix and Windows. The exact requirements of
the C standard are:

As initially opened, the standard error stream is not fully
buffered; the standard input and standard output streams are
fully buffered if and only if the stream can be determined not
to refer to an interactive device.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-21 Thread Cameron Simpson

On 21Oct2014 16:16, Dan Stromberg  wrote:
[...snip...]

This is tremendously inefficient.  It demands a context switch for
every character.


Inefficiency isn't an issue when you generate one byte a second.


Of course, but who's doing one byte per second?  You and I in our
tests, and perhaps some application developers with remarkably
undemanding I/O.  That doesn't really mean we should _recommend_ a
series of os.read(0, 1)'s.


Indeed not. But there is one glaring exception: the shell's read builtin.  
Because it can be interspersed in a script between other input-consuming 
commands, it _must_ read no more than one line, and therefore it has to read in 
increments of 1 character.
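
If you were to imitate that in Python it would look roughly like this
(a sketch of the idea, not how any shell is actually implemented):

    import os

    def read_one_line(fd=0):
        # Read a single line without consuming anything past the newline.
        chunks = []
        while True:
            b = os.read(fd, 1)        # one byte per system call, like sh's read builtin
            if not b or b == b"\n":   # stop at EOF or at the end of the line
                break
            chunks.append(b)
        return b"".join(chunks)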


Of course, that says nothing about the upstream write() granularity.

I now return y'all to your regularly scheduled nit picking.

Cheers,
Cameron Simpson 

If it ain't broken, keep playing with it.
--
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-21 Thread Dan Stromberg
On Mon, Oct 20, 2014 at 9:41 PM, Marko Rauhamaa  wrote:
> Dan Stromberg :
>
>> Often with TCP protocols, line buffered is preferred to character
>> buffered,
>
> Terminal devices support line buffering on write.

Yes, though that's not the only place it's useful.

> Line buffering on read is an illusion created by higher-level libraries.
> The low-level read function reads in blocks of bytes.

Actually, doesn't line buffering sometimes exist inside an OS kernel?
stty/termios/termio/sgtty relate here, for *ix examples.  Supporting
code: http://stromberg.dnsalias.org/~strombrg/ttype/  It turns on
character-at-a-time I/O in the tty driver via a variety of methods for
portability.  I wrote it in C before I took an interest in Python.

Also, here's some supporting documentation:
http://man7.org/linux/man-pages/man3/stdout.3.html - excerpt:
Indeed, normally terminal input is line buffered in the kernel.

But even if line buffering (or even character buffering) were never in
the kernel, calling it an illusion is perhaps going a little far.
It's useful sometimes, irrespective of where it comes from.
"Illusion" has a bit of an undeserved pejorative connotation.

>> Also, it's a straightforward way of framing your data, to avoid
>> getting messed up by Nagle or fragmentation.
>
> Nagle affects the communication between the peer OS kernels and isn't
> directly related to anything the application does.

Actually, Nagle can cause two or more small packets to be merged,
which is something an application must be able to deal with, because
they could show up in the receiving application as one or more (but
anyway: fewer) merged recv()'s.  That's one reason why something like
http://stromberg.dnsalias.org/~strombrg/bufsock.html can be helpful.

> Also, Nagle doesn't
> play any role with pipes.

Yes, but pipes aren't the only thing involved in the OP's question.
You "simplified" the problem down to pipes, but that doesn't really
capture the complete essence of the matter.  Nagle is one of the
reasons.

>>> 
>>> $ bash ./test.sh | strace python3 ./test.py
>>> ...
>>> read(0, "x", 4096)  = 1
>>> read(0, "x", 4096)  = 1
>>> read(0, "x", 4096)  = 1
>>> read(0, "x", 4096)  = 1
>>> read(0, "x", 4096)  = 1
>>> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
>>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
>>> 0) = 0x7f3143bab000
>>> write(1, "120\n", 4120
>>> )= 4
>>> ...
>> 
>>
>> This is tremendously inefficient.  It demands a context switch for
>> every character.
>
> Inefficiency isn't an issue when you generate one byte a second.

Of course, but who's doing one byte per second?  You and I in our
tests, and perhaps some application developers with remarkably
undemanding I/O.  That doesn't really mean we should _recommend_ a
series of os.read(0, 1)'s.

> If data
> were generated at a brisker pace, "read(0, ..., 4096)" could get more
> bytes at a time. Notice that even if the Python code requests 5 bytes,
> CPython requests up to 4096 bytes in a single read.

Not if you use os.read(0, 1), for example, which was what you appeared
to be recommending.  os.read(0, 1) (when on a pipe) makes a call into
kernel space via a context switch, once for each os.read(0, 1).

I guess I should add that when you do an os.read(0, 1), and see it
show up in strace, strace is showing kernel<->userspace interactions,
not library stuff, and not stuff in an application that sits above
libraries.  ltrace shows some of the library stuff, but probably not
all of it - I haven't studied ltrace as much as I have strace.

Just wondering: Are we helping the OP?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-20 Thread Marko Rauhamaa
Dan Stromberg :

> Often with TCP protocols, line buffered is preferred to character
> buffered,

Terminal devices support line buffering on write.

Line buffering on read is an illusion created by higher-level libraries.
The low-level read function reads in blocks of bytes.

> Also, it's a straightforward way of framing your data, to avoid
> getting messed up by Nagle or fragmentation.

Nagle affects the communication between the peer OS kernels and isn't
directly related to anything the application does. Also, Nagle doesn't
play any role with pipes.

>> 
>> $ bash ./test.sh | strace python3 ./test.py
>> ...
>> read(0, "x", 4096)  = 1
>> read(0, "x", 4096)  = 1
>> read(0, "x", 4096)  = 1
>> read(0, "x", 4096)  = 1
>> read(0, "x", 4096)  = 1
>> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
>> 0) = 0x7f3143bab000
>> write(1, "120\n", 4120
>> )= 4
>> ...
> 
>
> This is tremendously inefficient.  It demands a context switch for
> every character.

Inefficiency isn't an issue when you generate one byte a second. If data
were generated at a brisker pace, "read(0, ..., 4096)" could get more
bytes at a time. Notice that even if the Python code requests 5 bytes,
CPython requests up to 4096 bytes in a single read.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-20 Thread Dan Stromberg
On Mon, Oct 20, 2014 at 4:18 PM, Marko Rauhamaa  wrote:
> Dan Stromberg :
>> ...then everything acts line buffered, or perhaps even character
>> buffered [...]
>>
>> That, or we're using two different versions of netcat (there are at
>> least two available).
>
> Let's unconfuse the issue a bit. I'll take line buffering, netcat and
> the OS out of the picture.
>
> Here's a character generator (test.sh):
> 
> while : ; do
> echo -n x
> sleep 1
> done
> 
>
> and here's a character sink (test.py):
> 
> import sys
> while True:
> c = sys.stdin.read(1)
> if not c:
> break
> print(ord(c[0]))
> 
>
> Then, I run:
> 
> $ bash ./test.sh | python3 ./test.py
> 120
> 120
> 120
> 120
> 
>
> The lines are output at one-second intervals.
>
> That demonstrates that sys.stdin.read(1) does not block for more than
> one character. IOW, there is no buffering whatsoever.

Aren't character-buffered and unbuffered synonymous?

Often with TCP protocols, line buffered is preferred to character
buffered, both for performance and for simplicity: it doesn't suffer
from tinygrams (as much), and telnet becomes a useful test client.

Also, it's a straightforward way of framing your data, to avoid
getting messed up by Nagle or fragmentation.  One might find
http://stromberg.dnsalias.org/~strombrg/bufsock.html worth a glance.
It's buffered, but it keeps things framed, and doesn't fall prey to
tinygrams nearly as much as character buffering.
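
The framing idea itself is small; here is a hedged sketch of
newline-delimited framing on top of recv(), independent of bufsock:

    def lines_from(sock):
        # Yield complete newline-terminated lines no matter how Nagle or
        # fragmentation merged or split the underlying packets.
        pending = b""
        while True:
            data = sock.recv(4096)
            if not data:              # peer closed the connection
                break
            pending += data
            while b"\n" in pending:
                line, pending = pending.split(b"\n", 1)
                yield line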

> If I change the sink a bit: "c = sys.stdin.read(5)", I get the same
> output but at five-second intervals indicating that sys.stdin.read()
> calls the underlying os.read() function five times before returning. In
> fact, that conclusion is made explicit by running:
>
> 
> $ bash ./test.sh | strace python3 ./test.py
> ...
> read(0, "x", 4096)  = 1
> read(0, "x", 4096)  = 1
> read(0, "x", 4096)  = 1
> read(0, "x", 4096)  = 1
> read(0, "x", 4096)  = 1
> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
> 0x7f3143bab000
> write(1, "120\n", 4120
> )= 4
> ...


This is tremendously inefficient.  It demands a context switch for
every character.

> If I modify test.py to call os.read():
> 
> import os
> while True:
> c = os.read(0, 5)
> if not c:
> break
> print(ord(c[0]))
> 
>
> The output is again printed at one-second intervals: no buffering.
>
> Thus, we are back at my suggestion: use os.read() if you don't want
> Python to buffer stdin for you.

It's true that Python won't buffer (or will be character-buffered)
then, but that takes some potentially-salient elements out of the
picture.  IOW, I don't think Python reading unbuffered is necessarily
the whole issue, and may even be going too far.

I have a habit of saying "necessary, but not necessarily sufficient",
but in this case I believe it's more of a "not necessarily necessary,
and not necessarily sufficient".  A lot depends on the other pieces of
the puzzle that you've chosen to "unconfuse" away.  Yes, you can make
Python unbuffered/character-buffered, but that's not the whole story.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-20 Thread Marko Rauhamaa

Dan Stromberg :

> ...then everything acts line buffered, or perhaps even character
> buffered [...]
>
> That, or we're using two different versions of netcat (there are at
> least two available).

Let's unconfuse the issue a bit. I'll take line buffering, netcat and
the OS out of the picture.

Here's a character generator (test.sh):

while : ; do
echo -n x
sleep 1
done


and here's a character sink (test.py):

import sys
while True:
c = sys.stdin.read(1)
if not c:
break
print(ord(c[0]))


Then, I run:

$ bash ./test.sh | python3 ./test.py
120
120
120
120


The lines are output at one-second intervals.

That demonstrates that sys.stdin.read(1) does not block for more than
one character. IOW, there is no buffering whatsoever.

If I change the sink a bit: "c = sys.stdin.read(5)", I get the same
output but at five-second intervals indicating that sys.stdin.read()
calls the underlying os.read() function five times before returning. In
fact, that conclusion is made explicit by running:


$ bash ./test.sh | strace python3 ./test.py
...
read(0, "x", 4096)  = 1
read(0, "x", 4096)  = 1
read(0, "x", 4096)  = 1
read(0, "x", 4096)  = 1
read(0, "x", 4096)  = 1
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f3143bab000
write(1, "120\n", 4120
)= 4
...


If I modify test.py to call os.read():

import os
while True:
c = os.read(0, 5)
if not c:
break
print(ord(c[0]))


The output is again printed at one-second intervals: no buffering.

Thus, we are back at my suggestion: use os.read() if you don't want
Python to buffer stdin for you.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-20 Thread Marko Rauhamaa
Dan Stromberg :

> On Sun, Oct 19, 2014 at 9:45 PM, Marko Rauhamaa  wrote:
>> I found this comment in CPython's source code (pythonrun.c):
>>
>> /* stdin is always opened in buffered mode, first because it shouldn't
>>make a difference in common use cases, second because TextIOWrapper
>>depends on the presence of a read1() method which only exists on
>>buffered streams.
>> */
>>
>> The solution is to use os.read().
>
> Seriously?

I wasn't joking.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-20 Thread Dan Stromberg
If I run the following in one tty:
nc -l localhost 9000 | /tmp/z

...where /tmp/z has just:
#!/usr/bin/python3

import sys

for line in sys.stdin.buffer:
print(line)

And then run the following in another tty on the same computer:
while read line; do echo $line; sleep 1; done < /etc/passwd | nc
localhost 9000

...then everything acts line buffered, or perhaps even character
buffered (the two are pretty indistinguishable in this test).  What I
see is my /etc/passwd file popping out of the nc -l side, one line at
a time, each line one second apart.

I suppose this suggests that it's the client sending the TCP data
that is doing the buffering.

That, or we're using two different versions of netcat (there are at
least two available).
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-20 Thread Dan Stromberg
On Sun, Oct 19, 2014 at 9:45 PM, Marko Rauhamaa  wrote:
> I found this comment in CPython's source code (pythonrun.c):
>
> /* stdin is always opened in buffered mode, first because it shouldn't
>make a difference in common use cases, second because TextIOWrapper
>depends on the presence of a read1() method which only exists on
>buffered streams.
> */
>
> The solution is to use os.read().

Seriously?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-19 Thread Cameron Simpson

On 18Oct2014 18:42, Dan Stromberg  wrote:

On Sat, Oct 18, 2014 at 6:34 PM, Dan Stromberg  wrote:

Once the "nc" process actually write()s the data to its standard
output (i.e. descriptor 1, not the "stdout" FILE*)

I'm not sure why you're excluding stdout, but even if nc is using
filedes 1 instead of FILE * stdout, isn't it kind of irrelevant?


On further reflection, isn't it stdio that does the varied buffering,
and filedes 1 that's always unbuffered?  IOW, the OP might wish nc was
using 1, but it probably can't be given what they're seeing.


Traditionally, fd 1 (standard output, _generally_ associated with FILE 
*stdout), gets stdio buffering; line buffered for a terminal, block buffered 
otherwise. fd 2 (standard error, _generally_ associated with FILE *stderr) gets 
an unbuffered stdio stream by default.


However, nc may well be behaving like "tail -f": always unbuffered.

However, as I recall the OP seemed to want to "flush" the stream from nc to 
python. Even if nc itself does no buffering (handing data to the OS as soon as 
received, highly desirable for a tool like nc), the OS keeps a buffer for the 
pipeline between nc and python, and python itself keeps a buffer for sys.stdin.


Both of those are candidates for some kind of flush/discard. IF (a big IF) that 
is what the OP really needs.
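
If discard really is what's wanted, one way to drain whatever is
currently readable (a Unix-only sketch; it works on the underlying
descriptor, so anything sys.stdin has already buffered inside Python is
untouched):

    import os, select

    def drain_fd(fd=0):
        # Throw away whatever is currently sitting in the OS pipe buffer.
        while select.select([fd], [], [], 0)[0]:   # zero timeout: poll, don't block
            if not os.read(fd, 4096):              # empty read means EOF
                break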


Have we heard anything from the OP since this discussion took off?

I think we need to better understand his/her use case.

Cheers,
Cameron Simpson 

Do you even know anything about perl? - AC replying to Tom Christiansen post
--
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-19 Thread Marko Rauhamaa
Cameron Simpson :

> Even if nc itself does no buffering (handing data to the OS as soon as
> received, highly desirable for a tool like nc), the OS keeps a buffer
> for the pipeline between nc and python,

Yes, there is a buffer associated with the pipe, but linux/unix never
withholds any data from the reader. As soon as there is a single byte in
the pipe buffer, the reader process becomes ready to run and read(2) on
the pipe returns immediately.

> and python itself keeps a buffer for sys.stdin.

I found this comment in CPython's source code (pythonrun.c):

/* stdin is always opened in buffered mode, first because it shouldn't
   make a difference in common use cases, second because TextIOWrapper
   depends on the presence of a read1() method which only exists on
   buffered streams.
*/

The solution is to use os.read().
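
As an aside, the read1() method that comment refers to lives on the
buffered binary stream, so something like the following should behave
much like os.read() while still going through sys.stdin (a sketch, not
a recommendation over os.read()):

    import sys

    while True:
        chunk = sys.stdin.buffer.read1(4096)   # returns whatever is available right now
        if not chunk:                          # EOF
            break
        print(repr(chunk))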


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-18 Thread Dan Stromberg
On Sat, Oct 18, 2014 at 6:34 PM, Dan Stromberg  wrote:
>> Once the "nc" process actually write()s the data to its standard
>> output (i.e. descriptor 1, not the "stdout" FILE*)
> I'm not sure why you're excluding stdout, but even if nc is using
> filedes 1 instead of FILE * stdout, isn't it kind of irrelevant?

On further reflection, isn't it stdio that does the varied buffering,
and filedes 1 that's always unbuffered?  IOW, the OP might wish nc was
using 1, but it probably can't be given what they're seeing.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-18 Thread Dan Stromberg
On Sat, Oct 18, 2014 at 6:11 PM, Nobody  wrote:
> On Sat, 18 Oct 2014 12:32:07 -0500, Tim Chase wrote:
>
>> On 2014-10-18 17:55, Nobody wrote:
>>> On Fri, 17 Oct 2014 12:38:54 +0100, Empty Account wrote:
>>>
>>> > I am using netcat to listen to a port and python to read stdin and
>>> > print to the console.
>>> >
>>> > nc -l 2003 | python print_metrics.py
>>> >
>>> > sys.stdin.flush() doesn’t seem to flush stdin,
>>>
>>> You can't "flush" an input stream.
>>
>> You can't flush it, but you can make it unbuffered.  You can either force
>> python to use unbuffered stdio:
>
> [snipped]
>
> None of this helps in any way, as it's not the behaviour of the Python
> script which is causing the problem, but that "nc" is (probably) buffering
> its output, so the data isn't passed to the OS (let alone to the Python
> script) in a timely manner.

Agreed.

> Once the "nc" process actually write()s the data to its standard
> output (i.e. descriptor 1, not the "stdout" FILE*)
I'm not sure why you're excluding stdout, but even if nc is using
filedes 1 instead of FILE * stdout, isn't it kind of irrelevant?

> it will be available to
> the Python script immediately thereafter without requiring any low-level
> tweaks.
Which, on a pipe, generally means either the buffer filled and needed
to be passed along to make room, or the process exited.

I'd probably rewrite just enough nc in Python to make it so you don't
need to depend on a pipe (example:
http://stromberg.dnsalias.org/~strombrg/pnetcat.html), but if you're
on *ix you could try
http://ftp.sunet.se/pub/usenet/ftp.uu.net/comp.sources.unix/volume23/pty/
in an effort to persuade nc to think that it's on a tty and hence
should output line buffered data instead of block buffered - despite
being on a pipe in reality.

HTH.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-18 Thread Chris Angelico
On Fri, Oct 17, 2014 at 10:38 PM, Empty Account  wrote:
> I am using netcat to listen to a port and python to read stdin and print to
> the console.
>
> nc -l 2003 | python print_metrics.py

After lengthy discussion about what it means to flush stdin, I think
it's high time someone asked the question: Why not skip nc altogether,
and have your Python program do its own socket work? Then you don't
have to worry about stream flushing at all.
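
For instance, a minimal listener along those lines (the port is the one
from the original command; error handling and reconnection are left
out):

    import socket

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", 2003))
    srv.listen(1)
    conn, _addr = srv.accept()
    buf = b""
    while True:
        data = conn.recv(4096)
        if not data:                          # client disconnected
            break
        buf += data
        while b"\n" in buf:                   # handle each complete line as it arrives
            line, buf = buf.split(b"\n", 1)
            print(line.decode(errors="replace"))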

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-18 Thread Nobody
On Sat, 18 Oct 2014 12:32:07 -0500, Tim Chase wrote:

> On 2014-10-18 17:55, Nobody wrote:
>> On Fri, 17 Oct 2014 12:38:54 +0100, Empty Account wrote:
>> 
>> > I am using netcat to listen to a port and python to read stdin and
>> > print to the console.
>> > 
>> > nc -l 2003 | python print_metrics.py
>> > 
>> > sys.stdin.flush() doesn’t seem to flush stdin,
>> 
>> You can't "flush" an input stream.
> 
> You can't flush it, but you can make it unbuffered.  You can either force
> python to use unbuffered stdio:

[snipped]

None of this helps in any way, as it's not the behaviour of the Python
script which is causing the problem, but that "nc" is (probably) buffering
its output, so the data isn't passed to the OS (let alone to the Python
script) in a timely manner.

Once the "nc" process actually write()s the data to its standard
output (i.e. descriptor 1, not the "stdout" FILE*), it will be available to
the Python script immediately thereafter without requiring any low-level
tweaks.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-18 Thread Terry Reedy

On 10/18/2014 5:01 PM, Cameron Simpson wrote:

On 18Oct2014 17:55, Nobody  wrote:

On Fri, 17 Oct 2014 12:38:54 +0100, Empty Account wrote:

I am using netcat to listen to a port and python to read stdin and print
to the console.

nc -l 2003 | python print_metrics.py

sys.stdin.flush() doesn’t seem to flush stdin,


You can't "flush" an input stream.



Sure you can.


You are collectively confusing three different meanings of 'flush'. 
Python and its docs are the best authority on what Python can and cannot 
do.  One can "call .flush() on an input stream" (meaning 1).


>>> import sys
>>> sys.stdin.flush()
>>>

However, one cannot "empty the stream buffer by calling .flush()" 
(meaning 2).


" class IOBase
...
flush()
Flush the write buffers of the stream if applicable. This does 
nothing for read-only and non-blocking streams."



Most streams are read through an API which buffers. That
buffer can be discarded.


But one can "empty and discard the buffer" (meaning 3) with
stream.read()

And, of course, an 'input stream' 'down here' is an 'output stream' 'up 
there', wherever that is, and one can 'flush' pending output into the stream so 
that it can be read here.


--
Terry Jan Reedy


--
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-18 Thread Cameron Simpson

On 18Oct2014 17:55, Nobody  wrote:

On Fri, 17 Oct 2014 12:38:54 +0100, Empty Account wrote:

I am using netcat to listen to a port and python to read stdin and print
to the console.

nc -l 2003 | python print_metrics.py

sys.stdin.flush() doesn’t seem to flush stdin,


You can't "flush" an input stream.


Sure you can. Most streams are read through an API which buffers. That buffer 
can be discarded.


I'm not sure it is what the OP needs, but it is not a nonsensical idea.

Cheers,
Cameron Simpson 

I've seen things you people wouldn't believe.  Attack ships on fire off the
shoulder of Orion. I've watched C-beams glitter in the dark near the
Tannhauser Gate.  All these memories will be lost in time, like tears in rain.
- Roy Baty, _Blade Runner_
--
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-18 Thread MRAB

On 2014-10-18 17:55, Nobody wrote:

On Fri, 17 Oct 2014 12:38:54 +0100, Empty Account wrote:


I am using netcat to listen to a port and python to read stdin and print
to the console.

nc -l 2003 | python print_metrics.py

sys.stdin.flush() doesn’t seem to flush stdin,


You can't "flush" an input stream.


[snip]

Flushing an input stream means (or could mean) discarding any data
that's currently in the buffer.

--
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-18 Thread Tim Chase
On 2014-10-18 17:55, Nobody wrote:
> On Fri, 17 Oct 2014 12:38:54 +0100, Empty Account wrote:
> 
> > I am using netcat to listen to a port and python to read stdin
> > and print to the console.
> > 
> > nc -l 2003 | python print_metrics.py
> > 
> > sys.stdin.flush() doesn’t seem to flush stdin,
> 
> You can't "flush" an input stream.

You can't flush it, but you can make it unbuffered.  You can either
force python to use unbuffered stdio:

  python -u myfile.py

or you can get an unbuffered handle to the file

 import os, sys
 buffer_size = 1
 new_stdin = os.fdopen(sys.stdin.fileno(), 'r', buffer_size)
 for c in new_stdin:
   do_something(c)

though based on my reading, the first method works with both Py2
and Py3k while the second method doesn't reliably work in Py3k.
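
For Python 3, a variant that should work is to ask for an unbuffered
*binary* stream instead (Py3 refuses unbuffered text-mode I/O):

    import os, sys

    raw_stdin = os.fdopen(sys.stdin.fileno(), 'rb', 0)    # buffering=0 requires binary mode
    for chunk in iter(lambda: raw_stdin.read(1), b''):    # stop at EOF (empty bytes)
        do_something(chunk)                               # same placeholder as above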

-tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-18 Thread Nobody
On Fri, 17 Oct 2014 12:38:54 +0100, Empty Account wrote:

> I am using netcat to listen to a port and python to read stdin and print
> to the console.
> 
> nc -l 2003 | python print_metrics.py
> 
> sys.stdin.flush() doesn’t seem to flush stdin,

You can't "flush" an input stream.

> so I am using the termios module.

> I am receiving this exception
> termios.error: (25, 'Inappropriate ioctl for device')

termios only works on terminals, not pipes.

It's a safe bet that your problem is that "nc" isn't flushing its stdout
after each line (this is the default behaviour for stdio streams which
don't correspond to a terminal).

Check whether "nc" has a flag to line-buffer its output. If it doesn't,
the simplest solution is probably to write a Python script which creates a
pseudo-tty (using the "pty" module) and executes "nc" with its stdout
associated with the pty.
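
A rough sketch of such a wrapper, assuming that simply copying the pty
master to our own stdout is enough (signal handling and cleanup
omitted):

    import os, pty, subprocess, sys

    master, slave = pty.openpty()                  # nc will see the slave end as a terminal
    proc = subprocess.Popen(["nc", "-l", "2003"], stdout=slave)
    os.close(slave)                                # keep only the master end in this process
    while True:
        try:
            data = os.read(master, 4096)           # nc's now line-buffered output
        except OSError:                            # reading the master raises EIO once nc exits
            break
        if not data:
            break
        sys.stdout.buffer.write(data)
        sys.stdout.buffer.flush()
    proc.wait()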

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-18 Thread Cameron Simpson

On 17Oct2014 12:38, Empty Account  wrote:

I am using netcat to listen to a port and python to read stdin and print to
the console.

nc -l 2003 | python print_metrics.py

sys.stdin.flush() doesn’t seem to flush stdin, so I am using the termios
module. 


You're aware that a stdio flush and a termios flush operate on two totally 
unrelated buffers?



while True: 
   input = sys.stdin.readline()
   # do some parsing 
   …  
   sys.stdout.write(parsed_data)
   time.sleep(3)
   termios.tcflush(sys.stdin, termios.TCIOFLUSH)

I am receiving this exception
termios.error: (25, 'Inappropriate ioctl for device')


That is because stdin is attached to the pipe from netcat. A pipe is not a 
terminal.



I will be using this script on Unix based systems and I wondered what
approach I could use 
to flush stdin?


Like Chris, I think you need to explain why you even want to flush stdin.  
There's probably something better you can do.


Cheers,
Cameron Simpson 

I strongly suspect so.  Practically everyone on sci.physics has a theory that
is far superior to special relativity, general relativity, quantum mechanics
*and* the standard model.  Around here, it's only a small clique of arrogant
young members of the physics establishment who fail to recognize these
revolutionary theories.  I'd explain why, but I have to go finish designing
my faster-than-light vacuum energy perpetual motion telekinetic
aether-powered time machine.- John Baez
--
https://mail.python.org/mailman/listinfo/python-list


Re: Flush stdin

2014-10-17 Thread Chris Angelico
On Fri, Oct 17, 2014 at 10:38 PM, Empty Account  wrote:
> I will be using this script on Unix based systems and I wondered what
> approach I could use
> to flush stdin?

Why exactly do you need to flush stdin? If you've written a small
amount of data to the console, it's stdout that you need to flush.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Flush stdin

2014-10-17 Thread Empty Account
Hi,

I am using netcat to listen to a port and python to read stdin and print to
the console.

nc -l 2003 | python print_metrics.py

sys.stdin.flush() doesn’t seem to flush stdin, so I am using the termios
module.

while True:
   input = sys.stdin.readline()
   # do some parsing
   …
   sys.stdout.write(parsed_data)
   time.sleep(3)
   termios.tcflush(sys.stdin, termios.TCIOFLUSH)


I am receiving this exception
termios.error: (25, 'Inappropriate ioctl for device')


I will be using this script on Unix based systems and I wondered what
approach I could use
to flush stdin?

Many Thanks

Aidy
-- 
https://mail.python.org/mailman/listinfo/python-list