Alan Tanaman wrote:
> Arnaud,
>
> Absolutely. As Nutch comes, the url field is searchable (and tokenized).
> You predicate the search to a specific field using a colon, for example by
> typing
>
> url:motherboard or url:"unix shell"
>
> The default search field (when no predicate is specified) is content.
>
> Generally the Lucene search syntax is supported (although I believe there
> are Nutch specific issues):
> http://lucene.apache.org/java/docs/queryparsersyntax.html
>
> Best regards,
> Alan
> _________________________
> Alan Tanaman
> iDNA Solutions
> Tel: +44 (20) 7257 6125
> Mobile: +44 (7796) 932 362
> http://blog.idna-solutions.com
>
> -----Original Message-----
> From: Arnaud Goupil [mailto:[EMAIL PROTECTED]
> Sent: 10 January 2007 10:04
> To: [email protected]
> Subject: How to index and return files names ?
>
> Hi,
>
> I would like Nutch to return results when search terms
> are found in the name of files known by the index.
>
> For example, my http location indexed by nutch
> contains various files, named :
>
> computer security.pdf
> unix shell.pdf
> motherboard specifications.pdf
>
> If I search "motherboard", I want Nutch to return a
> result pointing to my third document, even if this
> document does not contain the word "motherboard", only
> because it's in the name of the file.
>
> Is there a way to do this ?
>
> Thanks

As Alan suggested, you should search the url field. To search it
explicitly with the url: prefix, you need to include the query-url plugin.
The query-basic plugin also queries the url field, so a plain query
(without the url: prefix) will match terms in the URL as well.

I also suggest using the URLTokenizer from
http://issues.apache.org/jira/browse/NUTCH-389, which tokenizes URLs
better.
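To illustrate why a tokenized url field makes file names searchable, here is a minimal Python sketch. It is not Nutch's tokenizer and not the URLTokenizer from NUTCH-389; the function names (url_tokens, matches) and the example.com URLs are made up for illustration, and the splitting rule (break on anything that is not a letter or digit) is an assumption.

```python
import re
from urllib.parse import unquote, urlsplit


def url_tokens(url):
    """Split a URL into lowercase word tokens (illustration only, not Nutch code)."""
    parts = urlsplit(url)
    # Decode percent-escapes such as %20 so "unix%20shell.pdf" yields "unix" and "shell".
    text = unquote(parts.netloc + parts.path)
    # Assumed rule: every run of non-alphanumeric characters separates two tokens.
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def matches(url, term):
    """Return True if the search term equals one of the URL's tokens."""
    return term.lower() in url_tokens(url)


# Hypothetical URLs for the three files named in the original question.
urls = [
    "http://example.com/docs/computer%20security.pdf",
    "http://example.com/docs/unix%20shell.pdf",
    "http://example.com/docs/motherboard%20specifications.pdf",
]

# Only the third URL contains the token "motherboard".
hits = [u for u in urls if matches(u, "motherboard")]
```

With this kind of tokenization, a search for "motherboard" finds the third file from its URL alone, even if the document body never contains the word.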
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
