Hi Guys,
The problem I've found with the url field is that if you try to search for a Word document with url:doc, it will return not only foo.doc but also things like /doc/text.html.

So is there an easy way to search on file type? I don't believe it's indexed out of the box, but if it were, Arnaud could do searches such as:

filetype:pdf motherboard

Regards,
Karl.

Enis Soztutar wrote:
>
> Alan Tanaman wrote:
>> Arnaud,
>>
>> Absolutely. As Nutch comes, the url field is searchable (and tokenized).
>> You predicate the search to a specific field using a colon, for example
>> by typing
>>
>> url:motherboard or url:"unix shell"
>>
>> The default search field (when no predicate is specified) is content.
>>
>> Generally the Lucene search syntax is supported (although I believe there
>> are Nutch specific issues):
>> http://lucene.apache.org/java/docs/queryparsersyntax.html
>>
>> Best regards,
>> Alan
>
>

--
View this message in context: http://www.nabble.com/How-to-index-and-return-files-names---tf2951610.html#a8257753
Sent from the Nutch - User mailing list archive at Nabble.com.
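
To illustrate the url:doc problem described in this message, here is a minimal Python sketch. It is not Nutch's actual tokenizer; it only assumes that a tokenized url field splits on non-alphanumeric characters, which is enough to show why both foo.doc and /doc/text.html match the term "doc". The helper names url_tokens and file_type and the example.com URLs are made up for illustration. A separate field holding just the extension (like the hypothetical filetype: field above) would tell the two apart, but it would have to be populated at index time.

```python
import posixpath
import re
from urllib.parse import urlparse


def url_tokens(url):
    """Split a URL into lowercase alphanumeric tokens.

    Rough stand-in for a tokenized url field (an assumption,
    not Nutch's real analyzer).
    """
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", url) if t]


def file_type(url):
    """Return the lowercase extension of the URL path, or '' if none."""
    path = urlparse(url).path
    ext = posixpath.splitext(path)[1]  # e.g. '.doc'
    return ext[1:].lower()


# Hypothetical URLs mirroring the two cases in the message.
urls = [
    "http://example.com/files/foo.doc",
    "http://example.com/doc/text.html",
]

for u in urls:
    # Both URLs contain the token 'doc', but only one has that extension.
    print(u, "doc" in url_tokens(u), file_type(u))
```

Both URLs contain the token "doc", so a term search on the tokenized field cannot distinguish them, while the extracted extension differs ("doc" versus "html").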
