[Python-Dev] hg.python.org is slow
Hi,

I'm trying to check out a pristine clone from ssh://h...@hg.python.org/cpython, and it's taking forever:

"""
07:45:35.605941 IP 192.168.0.23.43098 > virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22081460, win 14225, options [nop,nop,TS val 368519 ecr 2401783356], length 0
07:45:38.558348 IP virt-7yvsjn.psf.osuosl.org.ssh > 192.168.0.23.43098: Flags [.], seq 22081460:22082908, ack 53985, win 501, options [nop,nop,TS val 2401784064 ecr 368519], length 1448
07:45:38.558404 IP 192.168.0.23.43098 > virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22082908, win 14225, options [nop,nop,TS val 369257 ecr 2401784064], length 0
07:45:39.649995 IP virt-7yvsjn.psf.osuosl.org.ssh > 192.168.0.23.43098: Flags [.], seq 22082908:22084356, ack 53985, win 501, options [nop,nop,TS val 2401784367 ecr 369257], length 1448
"""

See the time to just get an ACK?

Am I the only one experiencing this?

Cheers,

cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On 8/26/2013 8:51 AM, Antoine Pitrou wrote:
> On Mon, 26 Aug 2013 08:24:58 -0400, Tres Seaver wrote:
>> On 08/26/2013 04:36 AM, Antoine Pitrou wrote:
>>> event-driven processing using network libraries
>>
>> Maybe I missed something: why should considerations from that topic
>> influence the design of an API for XML processing?
>
> Because this API is mostly useful when the data is received (*) at a
> slow enough speed - which usually means from the network, not from a
> hard drive.
...
> The whole *point* of adding IncrementalParser was to parse incoming
> XML data in a way that is friendly with event-driven network
> programming, other use cases being *already* covered by existing
> APIs. This is why it's far from nonsensical to re-use an existing
> terminology from that world.

Since when is Tulip the OOWTDI? If this was Twisted, it would be "write" and "finish"[1]. Tulip's Protocol ABC isn't even a good match for the application. There is a reason that Twisted has a separate Consumer/Producer interface from the network I/O interface. I'm sure there is other existing practice in this specific area too (e.g., XMLParser).

[1] http://twistedmatrix.com/documents/13.1.0/api/twisted.protocols.ftp.IFinishableConsumer.html

--
Scott Dial
sc...@scottdial.com
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
Ryan wrote:
> Nonblocking sounds too Internet-related. How about...flow?

AsyncParser?

--
Greg
Re: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review
Wow, that was quick! I propose that we wait for one more day for any feedback from others in response to this post, and then accept the PEP. On Mon, Aug 26, 2013 at 3:19 PM, Victor Stinner wrote: > 2013/8/26 Guido van Rossum : > > I have reviewed the PEP and I think it is good. Thank you so much for > > pushing this topic and for your very thorough review of all the feedback, > > related issues and so on. It is an exemplary PEP! > > Thanks :-) I updated the PEP: > http://hg.python.org/peps/rev/edd8250f6893 > > > I've made a bunch of small edits (mostly to improve grammar slightly, > hope > > you don't mind) and committed these to the repo. > > Thanks, I'm not a native english speaker, so not problem for such edit. > > > > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode437 > > pep-0446.txt:437: condition. > > As C-F Natali pointed out, this is not actually a problem, because after > > fork() > > only the main thread survives. Maybe just delete this paragraph? > > Ok, I didn't know that only one thread survives to fork(). (I read > Charles' email, but I forgot to update the PEP.) I simply deleted the > paragraph. > > > > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode450 > > pep-0446.txt:450: parameter is a non-empty list of file descriptors. > > Well, it could pass closefrom() the max of the given list and manually > close > > the > > rest. This would be useful if the system max is large but none of the FDs > > given > > in the list is. (This would be more complex code but it would address the > > issue > > for most programs.) > > This was related to the multi-thread issue, which does not exist, so I > also removed this paragraph. > > Using closefrom() to optimize subprocess is unrelated to this PEP. > > (And yes, the maximum file descriptor can be huge!) > > > > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode538 > > pep-0446.txt:538: descriptors). 
> > I would say it should not be changed because the default is still better. > > :-) > > (The PEP does not propose to change the default value.) > > Under Linux, recent versions of the glibc uses non-inheritable FD for > internal files. Slowly, more and more libraries and programs will do > the same. This PEP is a step in this direction ;-) > > Victor > -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review
2013/8/26 Guido van Rossum : > I have reviewed the PEP and I think it is good. Thank you so much for > pushing this topic and for your very thorough review of all the feedback, > related issues and so on. It is an exemplary PEP! Thanks :-) I updated the PEP: http://hg.python.org/peps/rev/edd8250f6893 > I've made a bunch of small edits (mostly to improve grammar slightly, hope > you don't mind) and committed these to the repo. Thanks, I'm not a native english speaker, so not problem for such edit. > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode437 > pep-0446.txt:437: condition. > As C-F Natali pointed out, this is not actually a problem, because after > fork() > only the main thread survives. Maybe just delete this paragraph? Ok, I didn't know that only one thread survives to fork(). (I read Charles' email, but I forgot to update the PEP.) I simply deleted the paragraph. > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode450 > pep-0446.txt:450: parameter is a non-empty list of file descriptors. > Well, it could pass closefrom() the max of the given list and manually close > the > rest. This would be useful if the system max is large but none of the FDs > given > in the list is. (This would be more complex code but it would address the > issue > for most programs.) This was related to the multi-thread issue, which does not exist, so I also removed this paragraph. Using closefrom() to optimize subprocess is unrelated to this PEP. (And yes, the maximum file descriptor can be huge!) > https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode538 > pep-0446.txt:538: descriptors). > I would say it should not be changed because the default is still better. > :-) (The PEP does not propose to change the default value.) Under Linux, recent versions of the glibc uses non-inheritable FD for internal files. Slowly, more and more libraries and programs will do the same. 
This PEP is a step in this direction ;-) Victor
Re: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review
Hi Victor, I have reviewed the PEP and I think it is good. Thank you so much for pushing this topic and for your very thorough review of all the feedback, related issues and so on. It is an exemplary PEP! I've made a bunch of small edits (mostly to improve grammar slightly, hope you don't mind) and committed these to the repo. I've also got a few more comments on the text that I didn't want to commit behind your back; I've written these up in a Rietveld review of the PEP (which you can also use to see exactly what I did already commit). https://codereview.appspot.com/13240043/ Here's a summary of those review changes: https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt File pep-0446.txt (right): https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode25 pep-0446.txt:25: descriptors. I'd add at this point: We are aware of the code breakage this is likely to cause, and doing it anyway for the good of mankind. (Details in the section "Backward Compatibility" below.) https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode80 pep-0446.txt:80: inheritable handles are inherited by the child process. Maybe mention here that this also affects the subprocess module? (You mention it later, but it's important to realize at this point.) https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode437 pep-0446.txt:437: condition. As C-F Natali pointed out, this is not actually a problem, because after fork() only the main thread survives. Maybe just delete this paragraph? https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode450 pep-0446.txt:450: parameter is a non-empty list of file descriptors. Well, it could pass closefrom() the max of the given list and manually close the rest. This would be useful if the system max is large but none of the FDs given in the list is. (This would be more complex code but it would address the issue for most programs.) 
https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode486 pep-0446.txt:486: * ``socket.socketpair()`` I would call out that dup2() is intentionally not in this list, and add a rationale for that omission below. https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode528 pep-0446.txt:528: by default, but non-inheritable if *inheritable* is ``False``. This might be a good place to explain the rationale for this exception. https://codereview.appspot.com/13240043/diff/3001/pep-0446.txt#newcode538 pep-0446.txt:538: descriptors). I would say it should not be changed because the default is still better. :-) -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
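The behaviour under review can be checked from Python itself. The following is a minimal sketch, assuming Python 3.4 or later (where PEP 446 landed) on a POSIX system; the variable names are illustrative only. It shows that descriptors from os.pipe() and os.dup() start out non-inheritable, and that os.dup2() is the exception discussed in the review comments: it stays inheritable unless *inheritable* is False.

```python
import os

# New descriptors are non-inheritable by default under PEP 446.
r, w = os.pipe()
pipe_default = os.get_inheritable(r)

# The flag can be flipped explicitly when a child really needs the descriptor.
os.set_inheritable(r, True)
after_set = os.get_inheritable(r)

# os.dup() follows the new default as well.
dup_fd = os.dup(w)
dup_default = os.get_inheritable(dup_fd)

# os.dup2() is the exception: the target is inheritable by default,
# and only non-inheritable when inheritable=False is passed.
target = os.dup(w)  # reserve a spare descriptor number to overwrite
os.dup2(w, target)
dup2_default = os.get_inheritable(target)
os.dup2(w, target, inheritable=False)
dup2_off = os.get_inheritable(target)

for fd in (r, w, dup_fd, target):
    os.close(fd)

print(pipe_default, after_set, dup_default, dup2_default, dup2_off)
```

The dup2() exception exists because its typical use is to place a descriptor on 0, 1 or 2 just before exec, where inheritance is the whole point.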
Re: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review
On Fri, Aug 23, 2013 at 1:30 PM, Charles-François Natali wrote:
>> About your example: I'm not sure that it is reliable/portable. I saw
>> daemon libraries closing *all* file descriptors and then expecting new
>> file descriptors to become 0, 1 and 2. Your example is different
>> because w is still open. On Windows, I have seen cases with only fd 0,
>> 1, 2 open, and the next open() call gives the fd 10 or 13...
>
> Well, my example uses fork(), so obviously doesn't apply to Windows.
> It's perfectly safe on Unix.

But relying on this in UNIX has also been discouraged ever since the dup2() system call was introduced. (I can't easily find a reference about its history but IIRC it is probably as old as UNIX v7 or otherwise BSD 4.x.)

>> I'm optimistic and I expect that most Python applications and
>> libraries already use the subprocess module. The subprocess module
>> closes all file descriptors (except 0, 1, 2) since Python 3.2.
>> Developers relying on the FD inheritance and using the subprocess with
>> Python 3.2 or later already had to use the pass_fds parameter.
>
> As long as the PEP makes it clear that this breaks backward
> compatibility, that's fine. IMO the risk of breakage outweighs the
> modicum of benefit.

I know this will break code. But it is for the good of mankind.

(I will now review the full PEP, finally.)

--
--Guido van Rossum (python.org/~guido)
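The pass_fds point can be illustrated with a short sketch. It assumes a POSIX system and Python 3.5 or later (for subprocess.run); the helper child_can_see() is hypothetical and exists only for this example. The child tries to fstat() a descriptor number given on its command line, which only succeeds when the parent listed that descriptor in pass_fds.

```python
import os
import subprocess
import sys


def child_can_see(fd, pass_it):
    """Return True if a child process can fstat() descriptor number fd."""
    # The child exits with a non-zero status if fstat() raises OSError.
    code = "import os, sys; os.fstat(int(sys.argv[1]))"
    result = subprocess.run(
        [sys.executable, "-c", code, str(fd)],
        pass_fds=(fd,) if pass_it else (),
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


r, w = os.pipe()
try:
    leaked = child_can_see(r, pass_it=False)  # closed in the child
    passed = child_can_see(r, pass_it=True)   # kept open via pass_fds
finally:
    os.close(r)
    os.close(w)

print(leaked, passed)
```

Code that relied on implicit inheritance has to opt in this way, which is the backward-compatibility cost being weighed in the message above.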
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
Nonblocking sounds too Internet-related. How about...flow?

Ah, I'll probably still end up using Expat regardless.

Eli Bendersky wrote:
>On Mon, Aug 26, 2013 at 10:40 AM, Paul Moore wrote:
>
>> On 26 August 2013 17:40, Eli Bendersky wrote:
>>
>>> Yes, exactly :-) "Incremental", though, seems to support the conjecture
>>> that it's the input. Which is true, but, since XMLParser is also
>>> "incremental" in this sense, slightly confusing.
>>
>> As a data point, until you explained the difference between the two
>> classes earlier in this thread, I too had been completely confused as both
>> the existing and the new classes are "incremental" (on the input side -
>> that's what I interpret "incremental" as meaning). It never even occurred
>> to me that the difference was in the *output* side. Maybe "NonBlocking"
>> would imply that to me. Or maybe "Generator". But regardless, I think the
>> changes you've made sound good, and I'm certainly less concerned with the
>> new version(as someone who will likely never use the new API, and therefore
>> doesn't really have a vote).
>
>Thanks for the data point; it is useful.
>
>> How about StreamParser?
>
>The problem with StreamParser is similar to IncrementalParser. "Stream"
>carries the impression that it refers to the input. But the input of ET
>parsers is *always* streaming, in a way (the feed/close interface). I want
>a name that conveys that the *output* is also
>nonblocking/streaming/yielding/generating/etc. Therefore Nonblocking (I'll
>let better English experts to decide whether B should be capitalized)
>sounds better to me, because it helps convey that both sides of the parser
>are asynchronous.
>
>Eli

--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
Paul Moore, 26.08.2013 19:40:
> On 26 August 2013 17:40, Eli Bendersky wrote:
>
>> Yes, exactly :-) "Incremental", though, seems to support the conjecture
>> that it's the input. Which is true, but, since XMLParser is also
>> "incremental" in this sense, slightly confusing.
>
> As a data point, until you explained the difference between the two classes
> earlier in this thread, I too had been completely confused as both the
> existing and the new classes are "incremental" (on the input side - that's
> what I interpret "incremental" as meaning). It never even occurred to me
> that the difference was in the *output* side.

The fix I'm proposing is to not make it two separate classes. But those who are interested in the details should really participate in the ticket discussion rather than here.

Stefan
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On Mon, Aug 26, 2013 at 10:40 AM, Paul Moore wrote:
> On 26 August 2013 17:40, Eli Bendersky wrote:
>
>> Yes, exactly :-) "Incremental", though, seems to support the conjecture
>> that it's the input. Which is true, but, since XMLParser is also
>> "incremental" in this sense, slightly confusing.
>
> As a data point, until you explained the difference between the two
> classes earlier in this thread, I too had been completely confused as both
> the existing and the new classes are "incremental" (on the input side -
> that's what I interpret "incremental" as meaning). It never even occurred
> to me that the difference was in the *output* side. Maybe "NonBlocking"
> would imply that to me. Or maybe "Generator". But regardless, I think the
> changes you've made sound good, and I'm certainly less concerned with the
> new version (as someone who will likely never use the new API, and therefore
> doesn't really have a vote).

Thanks for the data point; it is useful.

> How about StreamParser?

The problem with StreamParser is similar to IncrementalParser. "Stream" carries the impression that it refers to the input. But the input of ET parsers is *always* streaming, in a way (the feed/close interface). I want a name that conveys that the *output* is also nonblocking/streaming/yielding/generating/etc. Therefore Nonblocking (I'll let better English experts decide whether B should be capitalized) sounds better to me, because it helps convey that both sides of the parser are asynchronous.

Eli
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
How about StreamParser? I mean, even if it isn't quite the same, that name would still make sense. Eli Bendersky wrote: >On Mon, Aug 26, 2013 at 8:57 AM, Antoine Pitrou >wrote: > >> Le Mon, 26 Aug 2013 17:44:41 +0200, >> Simon Cross a écrit : >> > On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou > >> > wrote: >> > > Because this API is mostly useful when the data is received (*) >at a >> > > slow enough speed - which usually means from the network, not >from a >> > > hard drive. >> > >> > It looks like all the events have to be ready before one can start >> > iterating over .events() in the new API? That doesn't seem that >useful >> > from an asynchronous programming perspective and .data_received() >and >> > .eof_received() appear to be thin wrappers over .feed() and >.close()? >> >> What do you mean, "all events have to be ready"? >> If you look at the unit tests, the events are generated on-the-fly, >> not at the end of the document. >> (exactly the same as iterparse(), except that iterparse() is >blocking) >> >> Implementation-wise, data_received() and eof_received() are not thin >> wrappers over feed() and close(), they rely on an internal API to get >> at the generated events (which justifies putting the functionality >> inside the etree module, by the way). >> > >Antoine, you opted out of the tracker issue but I feel it's fair to let >you >know that after a lot of discussion with Nick and Stefan (*), we've >settled >on renaming the input methods to feed & close, and the output method to >read_events. We are also considering a different name for the class. > >I've posted with more detail and rationale in >http://bugs.python.org/issue17741, but to summarize: > >The input-side of IncrementalParser is the same as the plain XMLParser. >The >latter can also be given data incrementally by means of "feed". 
By >default >it would collect the whole tree and return it in close(), but in >reality >you can rig a custom target that does something more fluid (though not >to >the full extent of IncrementalParser). Therefore it was deemed >confusing to >have different names for this. Another reason is consistency with >xml.sax.xmlreader.IncrementalParser, which also has feed() and close(). > >As for the output method name, Nick suggested that read_events conveys >the >destructive nature of the method better (by analogy to file/stream >APIs), >and others agreed. > >As for the class name, IncrementalParser is ambiguous because it's not >immediately clear which side is incremental. Input or output? For the >input, it's no more incremental than XMLParser itself, as stated above. >The >output is what's different here, so we're considering a few candidates >for >a better name that conveys the meaning more precisely. > >And to reiterate, I realize that it's unpleasant for you to have this >dug >up after it has already been committed. I assume the blame for not >reviewing it in more detail originally. However, I feel it would still >be >better to revise this now than just leave it be. APIs added to stdlib >are >cooked in there for a *long time*. Alternatively, Nick suggested >granting >this API a "provisional" status (PEP 411), and that's an option if we >don't >manage to reach some sort of consensus. > >Eli > >(*) Well, to be completely precise, Stefan is still opposed to the >whole >idea. > > > > >___ >Python-Dev mailing list >Python-Dev@python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: >http://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com -- Sent from my Android phone with K-9 Mail. Please excuse my brevity.___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On 26 August 2013 17:40, Eli Bendersky wrote:
> Yes, exactly :-) "Incremental", though, seems to support the conjecture
> that it's the input. Which is true, but, since XMLParser is also
> "incremental" in this sense, slightly confusing.

As a data point, until you explained the difference between the two classes earlier in this thread, I too had been completely confused as both the existing and the new classes are "incremental" (on the input side - that's what I interpret "incremental" as meaning). It never even occurred to me that the difference was in the *output* side. Maybe "NonBlocking" would imply that to me. Or maybe "Generator". But regardless, I think the changes you've made sound good, and I'm certainly less concerned with the new version (as someone who will likely never use the new API, and therefore doesn't really have a vote).

Paul
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On Mon, Aug 26, 2013 at 9:21 AM, Antoine Pitrou wrote:
> On Mon, 26 Aug 2013 09:14:38 -0700, Eli Bendersky wrote:
> >
> > Antoine, you opted out of the tracker issue but I feel it's fair to
> > let you know that after a lot of discussion with Nick and Stefan (*),
> > we've settled on renaming the input methods to feed & close, and the
> > output method to read_events. We are also considering a different
> > name for the class.
>
> Fair enough.
>
> > As for the class name, IncrementalParser is ambiguous because it's not
> > immediately clear which side is incremental. Input or output?
>
> Both are :-)

Yes, exactly :-) "Incremental", though, seems to support the conjecture that it's the input. Which is true, but, since XMLParser is also "incremental" in this sense, slightly confusing.

As a more anecdotal piece of evidence: when the issue was reopened, I myself at first got confused by exactly this point because I forgot all about this in the months that passed since the commit. And I recalled that when I initially reviewed your patch, I got confused too :-) That would suggest one of two things: (1) The name is indeed confusing or (2) I'm stupid. The fact that Nick also got confused when trying a cursory understanding of the documentation calms me down w.r.t. (2).

Back to the discussion, my new favorite is NonblockingParser. Because its input side is exactly similar to XMLParser, the "Nonblocking" in the name points to the difference in output, which is correct.

As the popular quote says, "There are only two hard problems in Computer Science: cache invalidation and naming things."

Eli
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On Mon, 26 Aug 2013 09:14:38 -0700, Eli Bendersky wrote:
>
> Antoine, you opted out of the tracker issue but I feel it's fair to
> let you know that after a lot of discussion with Nick and Stefan (*),
> we've settled on renaming the input methods to feed & close, and the
> output method to read_events. We are also considering a different
> name for the class.

Fair enough.

> As for the class name, IncrementalParser is ambiguous because it's not
> immediately clear which side is incremental. Input or output?

Both are :-)

(which makes sense, really: an incremental input without output will only yield a slight memory consumption benefit - only slight, since the object tree representation should be much more costly than its bytes serialization -; an incremental output without input doesn't seem to have any point at all)

Regards

Antoine.
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On Mon, Aug 26, 2013 at 8:57 AM, Antoine Pitrou wrote:
> On Mon, 26 Aug 2013 17:44:41 +0200, Simon Cross wrote:
> > On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou wrote:
> > > Because this API is mostly useful when the data is received (*) at a
> > > slow enough speed - which usually means from the network, not from a
> > > hard drive.
> >
> > It looks like all the events have to be ready before one can start
> > iterating over .events() in the new API? That doesn't seem that useful
> > from an asynchronous programming perspective and .data_received() and
> > .eof_received() appear to be thin wrappers over .feed() and .close()?
>
> What do you mean, "all events have to be ready"?
> If you look at the unit tests, the events are generated on-the-fly,
> not at the end of the document.
> (exactly the same as iterparse(), except that iterparse() is blocking)
>
> Implementation-wise, data_received() and eof_received() are not thin
> wrappers over feed() and close(), they rely on an internal API to get
> at the generated events (which justifies putting the functionality
> inside the etree module, by the way).

Antoine, you opted out of the tracker issue but I feel it's fair to let you know that after a lot of discussion with Nick and Stefan (*), we've settled on renaming the input methods to feed & close, and the output method to read_events. We are also considering a different name for the class.

I've posted with more detail and rationale in http://bugs.python.org/issue17741, but to summarize:

The input-side of IncrementalParser is the same as the plain XMLParser. The latter can also be given data incrementally by means of "feed". By default it would collect the whole tree and return it in close(), but in reality you can rig a custom target that does something more fluid (though not to the full extent of IncrementalParser). Therefore it was deemed confusing to have different names for this. Another reason is consistency with xml.sax.xmlreader.IncrementalParser, which also has feed() and close().

As for the output method name, Nick suggested that read_events conveys the destructive nature of the method better (by analogy to file/stream APIs), and others agreed.

As for the class name, IncrementalParser is ambiguous because it's not immediately clear which side is incremental. Input or output? For the input, it's no more incremental than XMLParser itself, as stated above. The output is what's different here, so we're considering a few candidates for a better name that conveys the meaning more precisely.

And to reiterate, I realize that it's unpleasant for you to have this dug up after it has already been committed. I assume the blame for not reviewing it in more detail originally. However, I feel it would still be better to revise this now than just leave it be. APIs added to stdlib are cooked in there for a *long time*. Alternatively, Nick suggested granting this API a "provisional" status (PEP 411), and that's an option if we don't manage to reach some sort of consensus.

Eli

(*) Well, to be completely precise, Stefan is still opposed to the whole idea.
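For readers who want to see the feed / close / read_events shape in practice, here is a minimal sketch. It assumes Python 3.4 or later, where the class discussed in this thread was released as xml.etree.ElementTree.XMLPullParser; the chunk boundaries are arbitrary and only simulate data arriving in pieces. Exactly which events are available after each feed() can vary with the underlying Expat version, so the sketch only relies on the overall sequence.

```python
import xml.etree.ElementTree as ET

# Input side: same feed()/close() interface as XMLParser.
parser = ET.XMLPullParser(events=("start", "end"))

# Simulated network chunks; note the tag split across two chunks.
chunks = [b"<root><item>one</it", b"em><item>two</item>", b"</root>"]

seen = []
for chunk in chunks:
    parser.feed(chunk)
    # Output side: read_events() never blocks; it yields whatever
    # events are ready and removes them from the internal queue.
    for event, elem in parser.read_events():
        seen.append((event, elem.tag))

parser.close()
for event, elem in parser.read_events():
    seen.append((event, elem.tag))

# Reading is destructive: a second pass finds nothing left.
leftover = list(parser.read_events())

print(seen)
print(leftover)
```

The empty second pass is what the name read_events is meant to convey, by analogy with reading from a file or stream.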
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On Mon, 26 Aug 2013 17:44:41 +0200, Simon Cross wrote:
> On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou wrote:
> > Because this API is mostly useful when the data is received (*) at a
> > slow enough speed - which usually means from the network, not from a
> > hard drive.
>
> It looks like all the events have to be ready before one can start
> iterating over .events() in the new API? That doesn't seem that useful
> from an asynchronous programming perspective and .data_received() and
> .eof_received() appear to be thin wrappers over .feed() and .close()?

What do you mean, "all events have to be ready"?
If you look at the unit tests, the events are generated on-the-fly, not at the end of the document.
(exactly the same as iterparse(), except that iterparse() is blocking)

Implementation-wise, data_received() and eof_received() are not thin wrappers over feed() and close(), they rely on an internal API to get at the generated events (which justifies putting the functionality inside the etree module, by the way).

Regards

Antoine.
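To make the naming question concrete, here is a hypothetical adapter that exposes the network-style method names on top of the parser as it was eventually released. It assumes Python 3.4 or later, where Tulip became asyncio and the parser became xml.etree.ElementTree.XMLPullParser; the class XMLEventProtocol is invented for this sketch and is not part of the standard library. In this form the callbacks really are thin wrappers, because the released class already exposes the event queue through read_events(); the patch under discussion did that work through an internal API instead.

```python
import asyncio
import xml.etree.ElementTree as ET


class XMLEventProtocol(asyncio.Protocol):
    """Hypothetical protocol that parses XML as bytes arrive."""

    def __init__(self):
        self._parser = ET.XMLPullParser(events=("end",))
        self.tags = []  # tags of the elements completed so far

    def data_received(self, data):
        # Called by the event loop with each chunk of bytes.
        self._parser.feed(data)
        self._drain()

    def eof_received(self):
        # Called by the event loop when the peer closes its side.
        self._parser.close()
        self._drain()

    def _drain(self):
        for _event, elem in self._parser.read_events():
            self.tags.append(elem.tag)


# Driving the callbacks by hand, without a running event loop.
proto = XMLEventProtocol()
proto.data_received(b"<a><b/>")
proto.data_received(b"<c/></a>")
proto.eof_received()
tags = proto.tags
print(tags)
```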
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On Mon, Aug 26, 2013 at 2:51 PM, Antoine Pitrou wrote:
> Because this API is mostly useful when the data is received (*) at a
> slow enough speed - which usually means from the network, not from a
> hard drive.

It looks like all the events have to be ready before one can start iterating over .events() in the new API? That doesn't seem that useful from an asynchronous programming perspective and .data_received() and .eof_received() appear to be thin wrappers over .feed() and .close()?

Am I misunderstanding something?
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On Mon, 26 Aug 2013 08:24:58 -0400, Tres Seaver wrote:
> On 08/26/2013 04:36 AM, Antoine Pitrou wrote:
> > event-driven processing using network libraries
>
> Maybe I missed something: why should considerations from that topic
> influence the design of an API for XML processing?

Because this API is mostly useful when the data is received (*) at a slow enough speed - which usually means from the network, not from a hard drive.

((*) "data" ... "received"; does it ring a bell? ;-))

If you want iterative processing from a fast data source, you can already use iterparse(): it's blocking, but it's not a problem with disk I/O (not to mention that non-blocking disk I/O doesn't really exist under Linux, AFAIK: I haven't been able to get EAGAIN with os.read() on a non-blocking file, even when reading from a huge uncached file).

The whole *point* of adding IncrementalParser was to parse incoming XML data in a way that is friendly with event-driven network programming, other use cases being *already* covered by existing APIs. This is why it's far from nonsensical to re-use an existing terminology from that world.

If you don't do any non-blocking network I/O, then fine - you won't even need the API, and can safely ignore its existence.
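The existing blocking alternative mentioned above looks roughly like this. It is a minimal sketch using xml.etree.ElementTree.iterparse(); the in-memory io.BytesIO source and the sample document stand in for a file on disk and are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for a local file; iterparse() reads from it itself and
# blocks inside the loop until more data is available.
source = io.BytesIO(b"<log><entry id='1'/><entry id='2'/></log>")

ids = []
for event, elem in ET.iterparse(source, events=("end",)):
    if elem.tag == "entry":
        ids.append(elem.get("id"))
        elem.clear()  # drop the element's contents to keep memory flat

print(ids)
```

Because iterparse() pulls from the source on its own, the caller cannot interleave it with other callbacks, which is fine for a disk file but awkward inside an event loop.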
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On 08/26/2013 04:36 AM, Antoine Pitrou wrote:
> event-driven processing using network libraries

Maybe I missed something: why should considerations from that topic influence the design of an API for XML processing? 'feed' and 'close' make much more sense for a parser API, as well as having the benefit of long usage.

Tres.
--
Tres Seaver +1 540-429-0999 tsea...@palladion.com
Palladion Software "Excellence by Design" http://palladion.com
Re: [Python-Dev] please back out changeset f903cf864191 before alpha-2
On Sat, 24 Aug 2013 14:42:24 -0400, Terry Reedy wrote:
> >
> > And these for IncrementalParser:
> >
> > data_received(data)
> > Feed the given bytes data to the incremental parser.
>
> Longer, awkward, and to me ugly in comparison to 'feed'. Since it
> seems to mean more or less the same thing, why not reuse 'feed' and
> continue to build on people prior knowledge of Python?

Just because *your* prior knowledge of Python doesn't include event-driven processing using network libraries, doesn't mean it's a completely new and unknown thing to other people. There are reasons why "data_received" is better (less ambiguous) than "feed". If you want to influence tulip's design, however, this is the wrong mailing-list to do so.