Eli Bendersky added the comment:

On Sun, Aug 25, 2013 at 10:40 PM, Stefan Behnel <rep...@bugs.python.org> wrote:
>
> Stefan Behnel added the comment:
>
> Hmm, did you look at my last comment at all? It solves both the technical
> issues and the API issues very nicely and avoids any problems of potential
> future changes. Let me quickly explain why.
>
> The feature in question depends on two existing parts of the API: the
> event generation of the parser, and the return values of the parser target
> (e.g. a tree builder). So there are really only three places where this
> feature makes sense, both technically and API-wise.
>
> 1) in the target
> 2) in the parser
> 3) between parser and target
>
> Note how a separate class is ruled out right from the start by the fact
> that the feature lives somewhere between parser and target. It's an
> inherent part of the existing design already (and of the implementation,
> BTW), so I don't see how adding a separate thing to control it makes any
> sense.
>
> 1) is impossible because the target is user provided and we do not control
> it
> 2) works fine because the parser controls both the call to the target and
> its return value
> 3) would be nice (and was my original favourite) but is hard to do with
> the current implementation and requires further changes to the API of
> parser targets
>
> So 2) is the choice that remains.
>

I think folding it all into XMLParser is a bad idea. XMLParser is a fairly
simple API and I don't want to complicate it. But more importantly,
XMLParser knows nothing about Elements, at least in the direct API of
today. The one constructing Elements is the target. The "read_events"
method proposed for the new class (currently IncrementalParser.events)
already returns Elements, having used a TreeBuilder to build them.
XMLParser emits start/end/data calls into the target, but these only carry
tag names, attributes and chunks of data. The hierarchical element
construction is done by TreeBuilder.
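To make the split between parser and target concrete, here is a minimal
sketch using the public xml.etree.ElementTree API. RecordingTarget is a
hypothetical name made up for this illustration, not something in the
library. It only records the calls XMLParser makes into it, which shows
that those calls carry tag names, attribute dicts and text chunks rather
than Elements. The second parser uses a TreeBuilder as the target, which
is where the Elements actually get built.

```python
import xml.etree.ElementTree as ET


class RecordingTarget:
    """Hypothetical target that records each call XMLParser makes into it."""

    def __init__(self):
        self.calls = []

    def start(self, tag, attrib):
        # Only a tag name and an attribute mapping arrive here, not an Element.
        self.calls.append(("start", tag, dict(attrib)))

    def end(self, tag):
        self.calls.append(("end", tag))

    def data(self, text):
        # Character data may be delivered in more than one chunk.
        self.calls.append(("data", text))

    def close(self):
        # Whatever the target returns here is returned by parser.close().
        return self.calls


SOURCE = '<root a="1"><child>text</child></root>'

# A parser routing its calls into the recording target.
parser = ET.XMLParser(target=RecordingTarget())
parser.feed(SOURCE)
calls = parser.close()

# The same input with a TreeBuilder target yields an Element tree instead.
tree_parser = ET.XMLParser(target=ET.TreeBuilder())
tree_parser.feed(SOURCE)
root = tree_parser.close()
```

After this runs, calls is a flat list of tuples describing the parser's
invocations, while root is an Element with a single child, so the
hierarchy exists only on the TreeBuilder side.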
What I actually think would be better for the long term is to add new
target invocations in XMLParser - start-ns and end-ns. So XMLParser would
just keep *parsing*, leaving the interpretation of the parsed data to the
target. Today's TreeBuilder is free to ignore these calls. A custom
"EventCollectingTreeBuilder" can collect an event list, having all the
information at its disposal. Thus, XMLParser would remain what it is today
(minus the _setevents hack) - a router for pyexpat events.

These discussions of the future API are interesting, but what's more
important today is to have an API for IncrementalParser (using this name
before a new one is agreed upon) that doesn't block future implementation
changes. And I believe the API proposed here fits the bill.

> > The class will be named EventParser.
>
> Obviously because it's parsing Events, as opposed to the XMLParser, which
> parses XML, or the HTMLParser, which parses HTML, right?
>

The name is not perfect, and proposals for a better one are welcome. FWIW,
since it already lives in the xml.etree namespace, "XML" does not
necessarily have to be part of the name. So, some alternatives:

* EventStreamer - proposed by Nick. I have to admit I don't feel good with
  it, because I still want to be crystal clear it's a *parser* we're
  talking about.
* EventBasedParser
* EventCollectingParser
* NonblockingParser
* ... other ideas?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17741>
_______________________________________