[mwlib] Re: mwlib for NLP, and cleaning up the API

Joel Nothman Wed, 12 Aug 2009 22:22:36 -0700


The current parser also creates excess Nodes:


Article
     Paragraph tagname='p'
         u'\n'
         Style"'''"
             u'Thomas Cruise Mapother IV'
         Node
             u', better known by his '
             ArticleLink target=u'Stage name' ns=0
                 u'screen name'
             u' of '
         Style"'''"
             u'Tom Cruise'
         Node
             u', is an '
             ArticleLink target=u'United States' ns=0
                 u'American'
             u' actor and '
             ArticleLink target=u'film producer' ns=0
             u'. '

I would have expected these Text, ArticleLink, etc. children to sit directly
under Paragraph, as they did with the earlier parser.

The flattening can obviously be done by a postprocessor, but I don't see why
these intermediate Nodes are necessary in the basic parse.
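
For illustration, such a postprocessor might look roughly like the sketch below. The classes here are hypothetical stand-ins defined only so the example runs on its own; they are not mwlib's actual classes. The sketch assumes each node keeps its children in a `children` list, and the name `flatten_plain_nodes` is made up for this example.

```python
class Node(object):
    """Stand-in for a generic parse-tree node (hypothetical, not mwlib's class)."""
    def __init__(self, *children):
        self.children = list(children)

class Paragraph(Node):
    pass

class Style(Node):
    pass

class ArticleLink(Node):
    pass

class Text(Node):
    def __init__(self, caption):
        Node.__init__(self)
        self.caption = caption

def flatten_plain_nodes(node):
    """Splice the children of bare Node wrappers into their parent, in place."""
    flat = []
    for child in node.children:
        flatten_plain_nodes(child)   # clean up the subtree first
        if type(child) is Node:      # exact base class only, never a subclass
            flat.extend(child.children)
        else:
            flat.append(child)
    node.children = flat
    return node

# A cut-down version of the paragraph shown above.
para = Paragraph(
    Style(Text(u'Tom Cruise')),
    Node(Text(u', is an '), ArticleLink(Text(u'American')), Text(u' actor and ')),
)
flatten_plain_nodes(para)
print([type(c).__name__ for c in para.children])
# prints ['Style', 'Text', 'ArticleLink', 'Text']
```

The exact-type check (`type(child) is Node`) is deliberate: Style, ArticleLink and the other subclasses carry meaning and stay where they are, and only the anonymous wrappers are dissolved.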

- Joel

