Hi,

Oh.  Ok, thanks!  I'll give that a try.

Jim
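
For anyone reading this thread later, here is a rough sketch of the difference the suggestion below is pointing at. It is plain Java and does not call Lucene at all: the class IpTermSketch and both methods are made-up names, and the splitting rule is only an assumed stand-in for what an analyzer may do, not Lucene's actual tokenizer. In the terms of the original question, the change would presumably mean passing Field.Index.NOT_ANALYZED instead of Field.Index.ANALYZED in the doc.add(...) call, most likely on a separate field that holds only the IP address, since a not-analyzed field is indexed as one single term.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene.
class IpTermSketch {

    // Assumed stand-in for an ANALYZED field: break the value wherever a
    // character is not a letter or digit, so "." behaves like a separator.
    static List<String> analyzedTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String piece : value.split("[^A-Za-z0-9]+")) {
            if (!piece.isEmpty()) {
                terms.add(piece);
            }
        }
        return terms;
    }

    // Assumed stand-in for a NOT_ANALYZED field: the whole value is kept
    // as exactly one term, dots included.
    static List<String> notAnalyzedTerms(String value) {
        return Collections.singletonList(value);
    }

    public static void main(String[] args) {
        String ip = "192.168.10.25";
        System.out.println(analyzedTerms(ip));    // prints [192, 168, 10, 25]
        System.out.println(notAnalyzedTerms(ip)); // prints [192.168.10.25]
    }
}
```

With the second behaviour, a search only matches on the complete address string, so the per-octet terms seen in Luke should no longer appear for that field.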


---- Armasu wrote:
> Keyword: Field.Index.NOT_ANALYZED
> 
> -----Original Message-----
> From: oh...@cox.net [mailto:oh...@cox.net] 
> Sent: Thursday, July 30, 2009 4:36 PM
> To: java-user@lucene.apache.org
> Subject: How to index IP addresses?
> 
> Hi,
> 
> I am trying to index information in some proprietary-formatted files.  
> 
> In particular, these files contain some IP addresses in dotted notation, 
> e.g., aa.bb.cc.dd.
> 
> For my initial test, I have a Document implementation, and after I extract 
> what I need into a String named "Info", I do:
> 
> doc.add(new Field("contents", Info, Field.Store.YES, Field.Index.ANALYZED));
> 
> From looking at the resulting index using Luke, it appears that I am getting 
> terms for the full IP address string (e.g., "aa.bb.cc.dd"), but I am also 
> getting terms for each octet of each IP address string, e.g.:
> 
> aa
> bb
> cc
> dd
> 
> I'm still just getting started with Lucene, but from the research that I've 
> done, it seems like Lucene is treating the "." in the dotted notation strings 
> as "noise".  Is that correct?
> 
> If so, is there a way to get it not to do that?
> 
> Thanks,
> Jim
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
> 
> 





