[Python-Dev] hg.python.org is slow

2013-08-26 Thread Charles-François Natali
Hi,

I'm trying to check out a pristine clone from
ssh://h...@hg.python.org/cpython, and it's taking forever:
"""
07:45:35.605941 IP 192.168.0.23.43098 >
virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22081460, win 14225,
options [nop,nop,TS val 368519 ecr 2401783356], length 0
07:45:38.558348 IP virt-7yvsjn.psf.osuosl.org.ssh >
192.168.0.23.43098: Flags [.], seq 22081460:22082908, ack 53985, win
501, options [nop,nop,TS val 2401784064 ecr 368519], length 1448
07:45:38.558404 IP 192.168.0.23.43098 >
virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22082908, win 14225,
options [nop,nop,TS val 369257 ecr 2401784064], length 0
07:45:39.649995 IP virt-7yvsjn.psf.osuosl.org.ssh >
192.168.0.23.43098: Flags [.], seq 22082908:22084356, ack 53985, win
501, options [nop,nop,TS val 2401784367 ecr 369257], length 1448
"""

See the time to just get an ACK?
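
To put numbers on those delays, here is a minimal sketch that computes the gaps between the four packets in the capture above. The timestamps are copied from the trace; the variable names are illustrative only.

```python
from datetime import datetime

# Timestamps of the four packets shown in the tcpdump excerpt, in order.
stamps = [
    "07:45:35.605941",  # client -> server, ACK
    "07:45:38.558348",  # server -> client, 1448 bytes
    "07:45:38.558404",  # client -> server, ACK
    "07:45:39.649995",  # server -> client, 1448 bytes
]

times = [datetime.strptime(stamp, "%H:%M:%S.%f") for stamp in stamps]

# Seconds elapsed between each packet and the next one.
gaps = [(later - earlier).total_seconds()
        for earlier, later in zip(times, times[1:])]

for gap in gaps:
    print("%.6f s" % gap)
```

In this excerpt the client acknowledges within microseconds; the waits of roughly three seconds and one second both precede a 1448-byte segment from the server.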

Am I the only one experiencing this?

Cheers,

cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Scott Dial
On 8/26/2013 8:51 AM, Antoine Pitrou wrote:
> On Mon, 26 Aug 2013 08:24:58 -0400,
> Tres Seaver wrote:
>> On 08/26/2013 04:36 AM, Antoine Pitrou wrote:
>>> event-driven processing using network libraries
>>
>> Maybe I missed something:  why should considerations from that topic
>> influence the design of an API for XML processing?
> 
> Because this API is mostly useful when the data is received (*) at a
> slow enough speed - which usually means from the network, not from a
> hard drive.
...
> The whole *point* of adding IncrementalParser was to parse incoming
> XML data in a way that is friendly with event-driven network
> programming, other use cases being *already* covered by existing
> APIs. This is why it's far from nonsensical to re-use an existing
> terminology from that world.

Since when is Tulip the OOWTDI? If this was Twisted, it would be "write"
and "finish"[1]. Tulip's Protocol ABC isn't even a good match for the
application. There is a reason that Twisted has a separate
Consumer/Producer interface from the network I/O interface. I'm sure
there is other existing practice in this specific area too (e.g.,
XMLParser).

[1]
http://twistedmatrix.com/documents/13.1.0/api/twisted.protocols.ftp.IFinishableConsumer.html

-- 
Scott Dial
sc...@scottdial.com


Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Greg Ewing

Ryan wrote:
> Nonblocking sounds too Internet-related. How about...flow?

AsyncParser?

--
Greg


Re: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review

2013-08-26 Thread Guido van Rossum
Wow, that was quick! I propose that we wait for one more day for any
feedback from others in response to this post, and then accept the PEP.


On Mon, Aug 26, 2013 at 3:19 PM, Victor Stinner wrote:

> 2013/8/26 Guido van Rossum :
> > I have reviewed the PEP and I think it is good. Thank you so much for
> > pushing this topic and for your very thorough review of all the feedback,
> > related issues and so on. It is an exemplary PEP!
>
> Thanks :-) I updated the PEP:
> http://hg.python.org/peps/rev/edd8250f6893
>
> > I've made a bunch of small edits (mostly to improve grammar slightly, hope
> > you don't mind) and committed these to the repo.
>
> Thanks, I'm not a native English speaker, so no problem with such edits.
>
> > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode437
> > pep-0446.txt:437: condition.
> > As C-F Natali pointed out, this is not actually a problem, because after
> > fork() only the main thread survives.  Maybe just delete this paragraph?
>
> OK, I didn't know that only one thread survives fork(). (I read
> Charles' email, but I forgot to update the PEP.) I simply deleted the
> paragraph.
>
> > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode450
> > pep-0446.txt:450: parameter is a non-empty list of file descriptors.
> > Well, it could pass closefrom() the max of the given list and manually
> > close the rest. This would be useful if the system max is large but none
> > of the FDs given in the list is. (This would be more complex code but it
> > would address the issue for most programs.)
>
> This was related to the multi-thread issue, which does not exist, so I
> also removed this paragraph.
>
> Using closefrom() to optimize subprocess is unrelated to this PEP.
>
> (And yes, the maximum file descriptor can be huge!)
>
> > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode538
> > pep-0446.txt:538: descriptors).
> > I would say it should not be changed because the default is still better.
> > :-)
>
> (The PEP does not propose to change the default value.)
>
> Under Linux, recent versions of glibc use non-inheritable FDs for
> internal files. Slowly, more and more libraries and programs will do
> the same. This PEP is a step in this direction ;-)
>
> Victor
>



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review

2013-08-26 Thread Victor Stinner
2013/8/26 Guido van Rossum :
> I have reviewed the PEP and I think it is good. Thank you so much for
> pushing this topic and for your very thorough review of all the feedback,
> related issues and so on. It is an exemplary PEP!

Thanks :-) I updated the PEP:
http://hg.python.org/peps/rev/edd8250f6893

> I've made a bunch of small edits (mostly to improve grammar slightly, hope
> you don't mind) and committed these to the repo.

Thanks, I'm not a native English speaker, so no problem with such edits.

> https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode437
> pep-0446.txt:437: condition.
> As C-F Natali pointed out, this is not actually a problem, because after
> fork() only the main thread survives.  Maybe just delete this paragraph?

OK, I didn't know that only one thread survives fork(). (I read
Charles' email, but I forgot to update the PEP.) I simply deleted the
paragraph.

> https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode450
> pep-0446.txt:450: parameter is a non-empty list of file descriptors.
> Well, it could pass closefrom() the max of the given list and manually close
> the rest. This would be useful if the system max is large but none of the FDs
> given in the list is. (This would be more complex code but it would address
> the issue for most programs.)

This was related to the multi-thread issue, which does not exist, so I
also removed this paragraph.

Using closefrom() to optimize subprocess is unrelated to this PEP.

(And yes, the maximum file descriptor can be huge!)
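
For reference, the closefrom() idea quoted above can be sketched as a small planning function. This is only an illustration under stated assumptions: plan_close and first_fd are hypothetical names, nothing is actually closed, and the result merely describes which descriptors would be closed one by one and where a single closefrom() call would take over.

```python
def plan_close(pass_fds, first_fd=3):
    """Plan how to close everything except pass_fds (hypothetical helper).

    Returns (manual, closefrom_start): the descriptors below the highest
    kept one that must be closed individually, and the first descriptor
    that one closefrom() call would cover.
    """
    keep = set(pass_fds)
    if not keep:
        raise ValueError("pass_fds must be a non-empty list")
    highest = max(keep)
    # Descriptors in the gaps between the kept ones are closed one by one.
    manual = [fd for fd in range(first_fd, highest) if fd not in keep]
    # Everything above the highest kept descriptor goes in one call.
    return manual, highest + 1


manual, closefrom_start = plan_close([5, 9])
print(manual, closefrom_start)
```

The amount of manual work depends on the largest descriptor in the list rather than on the system maximum, which is the point of the suggestion.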

> https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode538
> pep-0446.txt:538: descriptors).
> I would say it should not be changed because the default is still better.
> :-)

(The PEP does not propose to change the default value.)

Under Linux, recent versions of glibc use non-inheritable FDs for
internal files. Slowly, more and more libraries and programs will do
the same. This PEP is a step in this direction ;-)
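
A minimal sketch of the behaviour the PEP describes, assuming Python 3.4 or later (where os.get_inheritable() and os.set_inheritable() are available). The variable names are illustrative only.

```python
import os

read_fd, write_fd = os.pipe()
try:
    # Newly created descriptors are non-inheritable by default.
    default_flags = (os.get_inheritable(read_fd), os.get_inheritable(write_fd))

    # Inheritance is opt-in, one descriptor at a time.
    os.set_inheritable(write_fd, True)
    after_opt_in = os.get_inheritable(write_fd)
finally:
    os.close(read_fd)
    os.close(write_fd)

print(default_flags, after_opt_in)
```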

Victor


Re: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review

2013-08-26 Thread Guido van Rossum
Hi Victor,

I have reviewed the PEP and I think it is good. Thank you so much for
pushing this topic and for your very thorough review of all the feedback,
related issues and so on. It is an exemplary PEP!

I've made a bunch of small edits (mostly to improve grammar slightly, hope
you don't mind) and committed these to the repo. I've also got a few more
comments on the text that I didn't want to commit behind your back; I've
written these up in a Rietveld review of the PEP (which you can also use to
see exactly what I did already commit).
https://codereview.appspot.com/13240043/

Here's a summary of those review changes:

https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt
File pep-0446.txt (right):
https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode25
pep-0446.txt:25: descriptors.
I'd add at this point:

We are aware of the code breakage this is likely to cause, and doing it anyway
for the good of mankind. (Details in the section "Backward Compatibility"
below.)
https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode80
pep-0446.txt:80: inheritable handles are inherited by the child process.
Maybe mention here that this also affects the subprocess module?  (You mention
it later, but it's important to realize at this point.)
https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode437
pep-0446.txt:437: condition.
As C-F Natali pointed out, this is not actually a problem, because after fork()
only the main thread survives.  Maybe just delete this paragraph?
https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode450
pep-0446.txt:450: parameter is a non-empty list of file descriptors.
Well, it could pass closefrom() the max of the given list and manually close the
rest. This would be useful if the system max is large but none of the FDs given
in the list is. (This would be more complex code but it would address the issue
for most programs.)
https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode486
pep-0446.txt:486: * ``socket.socketpair()``
I would call out that dup2() is intentionally not in this list, and add a
rationale for that omission below.
https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode528
pep-0446.txt:528: by default, but non-inheritable if *inheritable* is ``False``.
This might be a good place to explain the rationale for this exception.
https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode538
pep-0446.txt:538: descriptors).
I would say it should not be changed because the default is still better. :-)


-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review

2013-08-26 Thread Guido van Rossum
On Fri, Aug 23, 2013 at 1:30 PM, Charles-François Natali
 wrote:
>> About your example: I'm not sure that it is reliable/portable. I saw
>> daemon libraries closing *all* file descriptors and then expecting new
>> file descriptors to become 0, 1 and 2. Your example is different
>> because w is still open. On Windows, I have seen cases with only fd 0,
>> 1, 2 open, and the next open() call gives the fd 10 or 13...
>
> Well, my example uses fork(), so obviously doesn't apply to Windows.
> It's perfectly safe on Unix.

But relying on this in UNIX has also been discouraged ever since the
dup2() system call was introduced. (I can't easily find a reference
about its history but IIRC it is probably as old as UNIX v7 or
otherwise BSD 4.x.)
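
As a hedged illustration of the alternative, the sketch below names the target descriptor explicitly with os.dup2() instead of relying on which number the next open() happens to return. The helper name redirect_fd and the temporary files are assumptions made for the example.

```python
import os
import tempfile


def redirect_fd(source_fd, target_fd):
    """Make target_fd refer to the same open file as source_fd."""
    os.dup2(source_fd, target_fd)


with tempfile.TemporaryFile() as first, tempfile.TemporaryFile() as second:
    # After this call, second's descriptor number points at first's file.
    redirect_fd(first.fileno(), second.fileno())
    os.write(second.fileno(), b"hello")

    # The bytes written through the redirected descriptor land in "first".
    first.seek(0)
    landed_in_first = first.read()

print(landed_in_first)
```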

>> I'm optimistic and I expect that most Python applications and
>> libraries already use the subprocess module. The subprocess module
>> closes all file descriptors (except 0, 1, 2) since Python 3.2.
>> Developers relying on the FD inheritance and using the subprocess with
>> Python 3.2 or later already had to use the pass_fds parameter.
>
> As long as the PEP makes it clear that this breaks backward
> compatibility, that's fine. IMO the risk of breakage outweighs the
> modicum of benefit.

I know this will break code. But it is for the good of mankind.
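
A minimal sketch of the pass_fds usage mentioned in the quoted text, assuming a POSIX system and Python 3.4 or later. CHILD_CODE and send_through_inherited_fd are hypothetical names introduced for the example.

```python
import os
import subprocess
import sys

# Hypothetical child program: write argv[2] to the descriptor named in argv[1].
CHILD_CODE = "import os, sys; os.write(int(sys.argv[1]), sys.argv[2].encode())"


def send_through_inherited_fd(message):
    """Run a child that writes message to a pipe it inherits via pass_fds."""
    read_fd, write_fd = os.pipe()
    try:
        # Without pass_fds the child would not see write_fd at all.
        subprocess.check_call(
            [sys.executable, "-c", CHILD_CODE, str(write_fd), message],
            pass_fds=(write_fd,),
        )
    finally:
        os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        return reader.read().decode()


received = send_through_inherited_fd("hello")
print(received)
```

Code that instead relied on descriptors leaking into children implicitly is the kind that would need updating.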

(I will now review the full PEP, finally.)

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Ryan
Nonblocking sounds too Internet-related. How about...flow?

Ah, I'll probably still end up using Expat regardless.

Eli Bendersky  wrote:

>On Mon, Aug 26, 2013 at 10:40 AM, Paul Moore wrote:
>
>> On 26 August 2013 17:40, Eli Bendersky wrote:
>>
>>> Yes, exactly :-) "Incremental", though, seems to support the conjecture
>>> that it's the input. Which is true, but, since XMLParser is also
>>> "incremental" in this sense, slightly confusing.
>>
>> As a data point, until you explained the difference between the two
>> classes earlier in this thread, I too had been completely confused as both
>> the existing and the new classes are "incremental" (on the input side -
>> that's what I interpret "incremental" as meaning). It never even occurred
>> to me that the difference was in the *output* side. Maybe "NonBlocking"
>> would imply that to me. Or maybe "Generator". But regardless, I think the
>> changes you've made sound good, and I'm certainly less concerned with the
>> new version (as someone who will likely never use the new API, and therefore
>> doesn't really have a vote).
>
>Thanks for the data point; it is useful.
>
>> How about StreamParser?
>
>The problem with StreamParser is similar to IncrementalParser. "Stream"
>carries the impression that it refers to the input. But the input of ET
>parsers is *always* streaming, in a way (the feed/close interface). I want
>a name that conveys that the *output* is also
>nonblocking/streaming/yielding/generating/etc. Therefore Nonblocking (I'll
>let better English experts decide whether B should be capitalized)
>sounds better to me, because it helps convey that both sides of the parser
>are asynchronous.
>
>Eli

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Stefan Behnel
Paul Moore, 26.08.2013 19:40:
> On 26 August 2013 17:40, Eli Bendersky wrote:
> 
>> Yes, exactly :-) "Incremental", though, seems to support the conjecture
>> that it's the input. Which is true, but, since XMLParser is also
>> "incremental" in this sense, slightly confusing.
> 
> As a data point, until you explained the difference between the two classes
> earlier in this thread, I too had been completely confused as both the
> existing and the new classes are "incremental" (on the input side - that's
> what I interpret "incremental" as meaning). It never even occurred to me
> that the difference was in the *output* side.

The fix I'm proposing is to not make it two separate classes. But those who
are interested in the details should really participate in the ticket
discussion rather than here.

Stefan




Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Eli Bendersky
On Mon, Aug 26, 2013 at 10:40 AM, Paul Moore  wrote:

> On 26 August 2013 17:40, Eli Bendersky  wrote:
>
>> Yes, exactly :-) "Incremental", though, seems to support the conjecture
>> that it's the input. Which is true, but, since XMLParser is also
>> "incremental" in this sense, slightly confusing.
>
>
> As a data point, until you explained the difference between the two
> classes earlier in this thread, I too had been completely confused as both
> the existing and the new classes are "incremental" (on the input side -
> that's what I interpret "incremental" as meaning). It never even occurred
> to me that the difference was in the *output* side. Maybe "NonBlocking"
> would imply that to me. Or maybe "Generator". But regardless, I think the
> changes you've made sound good, and I'm certainly less concerned with the
> new version (as someone who will likely never use the new API, and therefore
> doesn't really have a vote).
>

Thanks for the data point; it is useful.

>  How about StreamParser?

The problem with StreamParser is similar to IncrementalParser. "Stream"
carries the impression that it refers to the input. But the input of ET
parsers is *always* streaming, in a way (the feed/close interface). I want
a name that conveys that the *output* is also
nonblocking/streaming/yielding/generating/etc. Therefore Nonblocking (I'll
let better English experts decide whether B should be capitalized)
sounds better to me, because it helps convey that both sides of the parser
are asynchronous.
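
To make the "both sides" point concrete, here is a minimal sketch assuming Python 3.4 or later, where the class under discussion shipped as xml.etree.ElementTree.XMLPullParser with feed(), close() and read_events(). The class name was still undecided when this message was written, and the sample document is made up for the example.

```python
from xml.etree.ElementTree import XMLPullParser

parser = XMLPullParser(events=("start", "end"))
seen = []

# Input side: data arrives in arbitrary chunks, as it would from a socket.
for chunk in ("<root><item>one</item>", "<item>two</item></root>"):
    parser.feed(chunk)
    # Output side: drain whatever events are ready, without blocking.
    seen.extend((event, elem.tag) for event, elem in parser.read_events())

parser.close()
# Events still pending at close() can be read afterwards.
seen.extend((event, elem.tag) for event, elem in parser.read_events())

print(seen)
```

How many events are available after each individual feed() can depend on the underlying Expat version, so the sketch only relies on the combined sequence.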

Eli


Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Ryan
How about StreamParser? I mean, even if it isn't quite the same, that name 
would still make sense.

Eli Bendersky  wrote:

>On Mon, Aug 26, 2013 at 8:57 AM, Antoine Pitrou wrote:
>
>> On Mon, 26 Aug 2013 17:44:41 +0200,
>> Simon Cross wrote:
>> > On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou wrote:
>> > > Because this API is mostly useful when the data is received (*) at a
>> > > slow enough speed - which usually means from the network, not from a
>> > > hard drive.
>> >
>> > It looks like all the events have to be ready before one can start
>> > iterating over .events() in the new API? That doesn't seem that useful
>> > from an asynchronous programming perspective and .data_received() and
>> > .eof_received() appear to be thin wrappers over .feed() and .close()?
>>
>> What do you mean, "all events have to be ready"?
>> If you look at the unit tests, the events are generated on-the-fly,
>> not at the end of the document.
>> (exactly the same as iterparse(), except that iterparse() is blocking)
>>
>> Implementation-wise, data_received() and eof_received() are not thin
>> wrappers over feed() and close(), they rely on an internal API to get
>> at the generated events (which justifies putting the functionality
>> inside the etree module, by the way).
>
>Antoine, you opted out of the tracker issue but I feel it's fair to let you
>know that after a lot of discussion with Nick and Stefan (*), we've settled
>on renaming the input methods to feed & close, and the output method to
>read_events. We are also considering a different name for the class.
>
>I've posted with more detail and rationale in
>http://bugs.python.org/issue17741, but to summarize:
>
>The input-side of IncrementalParser is the same as the plain XMLParser. The
>latter can also be given data incrementally by means of "feed". By default
>it would collect the whole tree and return it in close(), but in reality
>you can rig a custom target that does something more fluid (though not to
>the full extent of IncrementalParser). Therefore it was deemed confusing to
>have different names for this. Another reason is consistency with
>xml.sax.xmlreader.IncrementalParser, which also has feed() and close().
>
>As for the output method name, Nick suggested that read_events conveys the
>destructive nature of the method better (by analogy to file/stream APIs),
>and others agreed.
>
>As for the class name, IncrementalParser is ambiguous because it's not
>immediately clear which side is incremental. Input or output? For the
>input, it's no more incremental than XMLParser itself, as stated above. The
>output is what's different here, so we're considering a few candidates for
>a better name that conveys the meaning more precisely.
>
>And to reiterate, I realize that it's unpleasant for you to have this dug
>up after it has already been committed. I assume the blame for not
>reviewing it in more detail originally. However, I feel it would still be
>better to revise this now than just leave it be. APIs added to stdlib are
>cooked in there for a *long time*. Alternatively, Nick suggested granting
>this API a "provisional" status (PEP 411), and that's an option if we don't
>manage to reach some sort of consensus.
>
>Eli
>
>(*) Well, to be completely precise, Stefan is still opposed to the whole
>idea.

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Paul Moore
On 26 August 2013 17:40, Eli Bendersky  wrote:

> Yes, exactly :-) "Incremental", though, seems to support the conjecture
> that it's the input. Which is true, but, since XMLParser is also
> "incremental" in this sense, slightly confusing.


As a data point, until you explained the difference between the two classes
earlier in this thread, I too had been completely confused as both the
existing and the new classes are "incremental" (on the input side - that's
what I interpret "incremental" as meaning). It never even occurred to me
that the difference was in the *output* side. Maybe "NonBlocking" would
imply that to me. Or maybe "Generator". But regardless, I think the changes
you've made sound good, and I'm certainly less concerned with the new
version (as someone who will likely never use the new API, and therefore
doesn't really have a vote).

Paul


Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Eli Bendersky
On Mon, Aug 26, 2013 at 9:21 AM, Antoine Pitrou  wrote:

> On Mon, 26 Aug 2013 09:14:38 -0700,
> Eli Bendersky wrote:
> >
> > Antoine, you opted out of the tracker issue but I feel it's fair to
> > let you know that after a lot of discussion with Nick and Stefan (*),
> > we've settled on renaming the input methods to feed & close, and the
> > output method to read_events. We are also considering a different
> > name for the class.
>
> Fair enough.
>
> > As for the class name, IncrementalParser is ambiguous because it's not
> > immediately clear which side is incremental. Input or output?
>
> Both are :-)
>
>
Yes, exactly :-) "Incremental", though, seems to support the conjecture
that it's the input. Which is true, but, since XMLParser is also
"incremental" in this sense, slightly confusing.

As a more anecdotal piece of evidence: when the issue was reopened, I
myself at first got confused by exactly this point because I forgot all
about this in the months that passed since the commit. And I recalled that
when I initially reviewed your patch, I got confused too :-) That would
suggest one of two things: (1) The name is indeed confusing or (2) I'm
stupid. The fact that Nick also got confused when trying a cursory
understanding of the documentation calms me down w.r.t. (2).

Back to the discussion, my new favorite is NonblockingParser. Because its
input side is exactly similar to XMLParser, the "Nonblocking" in the name
points to the difference in output, which is correct.

As the popular quote says, "There are only two hard problems in Computer
Science: cache invalidation and naming things."

Eli


Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Antoine Pitrou
On Mon, 26 Aug 2013 09:14:38 -0700,
Eli Bendersky wrote:
> 
> Antoine, you opted out of the tracker issue but I feel it's fair to
> let you know that after a lot of discussion with Nick and Stefan (*),
> we've settled on renaming the input methods to feed & close, and the
> output method to read_events. We are also considering a different
> name for the class.

Fair enough.

> As for the class name, IncrementalParser is ambiguous because it's not
> immediately clear which side is incremental. Input or output?

Both are :-)

(which makes sense, really: an incremental input without output will
only yield a slight memory consumption benefit - only slight, since the
object tree representation should be much more costly than its bytes
serialization -; an incremental output without input doesn't seem to
have any point at all)

Regards

Antoine.




Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Eli Bendersky
On Mon, Aug 26, 2013 at 8:57 AM, Antoine Pitrou  wrote:

> On Mon, 26 Aug 2013 17:44:41 +0200,
> Simon Cross wrote:
> > On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou 
> > wrote:
> > > Because this API is mostly useful when the data is received (*) at a
> > > slow enough speed - which usually means from the network, not from a
> > > hard drive.
> >
> > It looks like all the events have to be ready before one can start
> > iterating over .events() in the new API? That doesn't seem that useful
> > from an asynchronous programming perspective and .data_received() and
> > .eof_received() appear to be thin wrappers over .feed() and .close()?
>
> What do you mean, "all events have to be ready"?
> If you look at the unit tests, the events are generated on-the-fly,
> not at the end of the document.
> (exactly the same as iterparse(), except that iterparse() is blocking)
>
> Implementation-wise, data_received() and eof_received() are not thin
> wrappers over feed() and close(), they rely on an internal API to get
> at the generated events (which justifies putting the functionality
> inside the etree module, by the way).
>

Antoine, you opted out of the tracker issue but I feel it's fair to let you
know that after a lot of discussion with Nick and Stefan (*), we've settled
on renaming the input methods to feed & close, and the output method to
read_events. We are also considering a different name for the class.

I've posted with more detail and rationale in
http://bugs.python.org/issue17741, but to summarize:

The input-side of IncrementalParser is the same as the plain XMLParser. The
latter can also be given data incrementally by means of "feed". By default
it would collect the whole tree and return it in close(), but in reality
you can rig a custom target that does something more fluid (though not to
the full extent of IncrementalParser). Therefore it was deemed confusing to
have different names for this. Another reason is consistency with
xml.sax.xmlreader.IncrementalParser, which also has feed() and close().
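
The "custom target" approach mentioned above can be sketched as follows. This is only an illustration: TagCollector is a hypothetical class, and the sample document is made up for the example.

```python
from xml.etree.ElementTree import XMLParser


class TagCollector:
    """Hypothetical target that records start tags instead of building a tree."""

    def __init__(self):
        self.tags = []

    def start(self, tag, attrib):
        # Called by the parser as soon as an opening tag has been parsed.
        self.tags.append(tag)

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def close(self):
        # Whatever the target returns here becomes the result of parser.close().
        return self.tags


parser = XMLParser(target=TagCollector())
parser.feed("<root><item>one</item>")
parser.feed("<item>two</item></root>")
tags = parser.close()
print(tags)
```

The input side is the same feed()/close() pair; only the way results come out differs, and here it is driven by callbacks rather than by pulling events.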

As for the output method name, Nick suggested that read_events conveys the
destructive nature of the method better (by analogy to file/stream APIs),
and others agreed.

As for the class name, IncrementalParser is ambiguous because it's not
immediately clear which side is incremental. Input or output? For the
input, it's no more incremental than XMLParser itself, as stated above. The
output is what's different here, so we're considering a few candidates for
a better name that conveys the meaning more precisely.

And to reiterate, I realize that it's unpleasant for you to have this dug
up after it has already been committed. I assume the blame for not
reviewing it in more detail originally. However, I feel it would still be
better to revise this now than just leave it be. APIs added to stdlib are
cooked in there for a *long time*. Alternatively, Nick suggested granting
this API a "provisional" status (PEP 411), and that's an option if we don't
manage to reach some sort of consensus.

Eli

(*) Well, to be completely precise, Stefan is still opposed to the whole
idea.


Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Antoine Pitrou
On Mon, 26 Aug 2013 17:44:41 +0200,
Simon Cross wrote:
> On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou 
> wrote:
> > Because this API is mostly useful when the data is received (*) at a
> > slow enough speed - which usually means from the network, not from a
> > hard drive.
> 
> It looks like all the events have to be ready before one can start
> iterating over .events() in the new API? That doesn't seem that useful
> from an asynchronous programming perspective and .data_received() and
> .eof_received() appear to be thin wrappers over .feed() and .close()?

What do you mean, "all events have to be ready"?
If you look at the unit tests, the events are generated on-the-fly,
not at the end of the document.
(exactly the same as iterparse(), except that iterparse() is blocking)

Implementation-wise, data_received() and eof_received() are not thin
wrappers over feed() and close(), they rely on an internal API to get
at the generated events (which justifies putting the functionality
inside the etree module, by the way).

Regards

Antoine.




Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Simon Cross
On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou  wrote:
> Because this API is mostly useful when the data is received (*) at a
> slow enough speed - which usually means from the network, not from a
> hard drive.

It looks like all the events have to be ready before one can start
iterating over .events() in the new API? That doesn't seem that useful
from an asynchronous programming perspective and .data_received() and
.eof_received() appear to be thin wrappers over .feed() and .close()?

Am I misunderstanding something?


Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Antoine Pitrou
On Mon, 26 Aug 2013 08:24:58 -0400,
Tres Seaver wrote:
> On 08/26/2013 04:36 AM, Antoine Pitrou wrote:
> > event-driven processing using network libraries
> 
> Maybe I missed something:  why should considerations from that topic
> influence the design of an API for XML processing?

Because this API is mostly useful when the data is received (*) at a
slow enough speed - which usually means from the network, not from a
hard drive.

((*) "data" ... "received"; does it ring a bell? ;-))

If you want iterative processing from a fast data source, you can
already use iterparse(): it's blocking, but it's not a problem with
disk I/O (not to mention that non-blocking disk I/O doesn't really
exist under Linux, AFAIK: I haven't been able to get EAGAIN with
os.read() on a non-blocking file, even when reading from a huge
uncached file).
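
For comparison, a minimal sketch of the blocking iterparse() style referred to above. An in-memory buffer stands in for a fast local file, and the sample document is made up for the example.

```python
import io
from xml.etree.ElementTree import iterparse

# A fast, local data source; iterparse() reads from it itself and blocks
# until enough data is available to produce the next event.
source = io.BytesIO(b"<root><item>one</item><item>two</item></root>")

closed_tags = [elem.tag for event, elem in iterparse(source, events=("end",))]
print(closed_tags)
```

Here the parser pulls data from the source, whereas with feed() the caller pushes data in as it arrives.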

The whole *point* of adding IncrementalParser was to parse incoming
XML data in a way that is friendly with event-driven network
programming, other use cases being *already* covered by existing
APIs. This is why it's far from nonsensical to re-use an existing
terminology from that world.

If you don't do any non-blocking network I/O, then fine - you won't
even need the API, and can safely ignore its existence.




Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Tres Seaver
On 08/26/2013 04:36 AM, Antoine Pitrou wrote:
> event-driven processing using network libraries

Maybe I missed something:  why should considerations from that topic
influence the design of an API for XML processing?  'feed' and 'close'
make much more sense for a parser API, as well as having the benefit of
long usage.



Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   "Excellence by Design"   http://palladion.com



Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2

2013-08-26 Thread Antoine Pitrou
On Sat, 24 Aug 2013 14:42:24 -0400,
Terry Reedy wrote:
> >
> > And these for IncrementalParser:
> >
> >  data_received(data)
> >  Feed the given bytes data to the incremental parser.
> 
> Longer, awkward, and to me ugly in comparison to 'feed'. Since it
> seems to mean more or less the same thing, why not reuse 'feed' and
> continue to build on people prior knowledge of Python?

Just because *your* prior knowledge of Python doesn't include
event-driven processing using network libraries, doesn't mean it's a
completely new and unknown thing to other people. There are reasons why
"data_received" is better (less ambiguous) than "feed".

If you want to influence tulip's design, however, this is the wrong
mailing-list to do so.

