Hi, when I use StandardTokenizer to parse, for example, "I.B.M", the type of the Token is HOST and not ACRONYM.
Why?

In StandardTokenizer.jj:

  // acronyms: U.S.A., I.B.M., etc.
  // use a post-filter to remove dots
| <ACRONYM: <ALPHA> "." (<ALPHA> ".")+ >

  // hostname
| <HOST: <ALPHANUM> ("." <ALPHANUM>)+ >

"I.B.M" can be a host or an acronym, so there is a problem, no?

----- Original Message -----
From: "petite_abeille" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Thursday, September 04, 2003 3:19 PM
Subject: Re: Lucene app to index Java code

> Hi Erik,
>
> On Thursday, Sep 4, 2003, at 15:03 Europe/Zurich, Erik Hatcher wrote:
>
> > - XDoclet could be used to sweep through Java code and build a
> > text/XML file as richly as you'd like from the information there
> > (complete with JavaDoc tags, which Zapata will miss :)),
>
> Correct. This happen to be on purpose :) Does XDoclet build an
> "intertwingled" object graph of your code along the way? Performing a
> plain search on a code base is pretty trivial... what seems to be more
> interesting would be to put that in context.
>
> Zapata does something along the line of what MagicHat does for
> Objective-C:
>
> http://homepage.mac.com/petite_abeille/MagicHat/
>
> But from the sound of what Otis is saying this is not what you guys are
> looking for... back to the pampa then...
>
> Cheers,
>
> PA.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
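
To make the two rules quoted from StandardTokenizer.jj concrete, here is a rough sketch in Java that approximates them with regular expressions. It is not Lucene's tokenizer: the class name TokenTypeSketch and the method classify are hypothetical, ALPHA and ALPHANUM are assumed to mean runs of ASCII letters and runs of ASCII letters or digits, and whole-string matching stands in for the way the real grammar picks tokens out of a stream. Read literally, the ACRONYM rule wants a dot after every letter group, including the last one, so "I.B.M" without a trailing dot would only fit the HOST rule.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Lucene.
class TokenTypeSketch {
    // Assumption: ALPHA ~ one or more ASCII letters.
    // ACRONYM: <ALPHA> "." (<ALPHA> ".")+
    private static final Pattern ACRONYM =
            Pattern.compile("[A-Za-z]+\\.(?:[A-Za-z]+\\.)+");

    // Assumption: ALPHANUM ~ one or more ASCII letters or digits.
    // HOST: <ALPHANUM> ("." <ALPHANUM>)+
    private static final Pattern HOST =
            Pattern.compile("[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)+");

    // Returns which of the two approximated rules matches the whole string.
    static String classify(String text) {
        if (ACRONYM.matcher(text).matches()) {
            return "ACRONYM";
        }
        if (HOST.matcher(text).matches()) {
            return "HOST";
        }
        return "OTHER";
    }

    public static void main(String[] args) {
        System.out.println("I.B.M  -> " + classify("I.B.M"));
        System.out.println("I.B.M. -> " + classify("I.B.M."));
    }
}
```

Under these assumptions the trailing dot is what separates the two cases, which would fit the comment in the grammar that lists "U.S.A." and "I.B.M." with their final dots.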