Are you sure this is the case? Do I need to change any configuration somewhere (either to make Solr search for special characters, or to make sure Nutch indexes them)? The wiki says that Nutch 1.3 treats special characters as whitespace; is this wrong?

I indexed some pages and am testing the search privately using http://127.0.0.1:8983/solr/admin/ as shown on the Nutch tutorial page https://wiki.apache.org/nutch/NutchTutorial. Special characters seem to be ignored. For example, I can search for "tree" and get certain results, then try "tree\+\+\+" and get exactly the same results, even though the string "tree+++" does not appear anywhere. Shouldn't I get no results, just as if I had searched for "treennn"?
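
To make the behaviour I am describing concrete, here is a minimal Python sketch. It is not Solr's or Nutch's actual analyzer. It assumes a hypothetical tokenizer that lowercases text and treats every non-alphanumeric character as a separator, which is what the wiki's "treated as spaces" wording suggests, and a hypothetical unescape_query helper that drops the backslash escapes before analysis.

```python
import re


def unescape_query(query):
    """Hypothetical helper: turn escaped characters such as \\+ back into +."""
    return re.sub(r"\\(.)", r"\1", query)


def tokenize(text):
    """Hypothetical analyzer: lowercase, then split on any run of
    characters that are not ASCII letters or digits."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


if __name__ == "__main__":
    # Compare the terms each query would reduce to under these assumptions.
    for query in ["tree", r"tree\+\+\+", "treennn"]:
        print(query, "->", tokenize(unescape_query(query)))
```

Under these assumptions "tree" and "tree\+\+\+" reduce to the same single term, while "treennn" stays a different term, which would explain identical results for the first two queries.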

On Mon, Aug 22, 2011 at 4:27 AM, Markus Jelsma <[email protected]> wrote:
> In 1.3 search is delegated to Solr. It can happily search (or ignore)
> `special` chars.
>
> > I downloaded and played around a bit with 1.3 but I don't really have
> > anything invested in it (so if this is easier using another version, I
> > would gladly use that instead).
> >
> > On Sun, Aug 21, 2011 at 7:23 AM, Markus Jelsma <[email protected]> wrote:
> > > What version of Nutch are you using?
> > >
> > > > Hi,
> > > > On your the wiki (here: https://wiki.apache.org/nutch/Features ) it
> > > > says that special characters and punctuation are treated as spaces,
> > > > but it does not say where in the code this is or how to configure
> > > > it. How can I configure nutch not to ignore special characters?

