How about having the query parser respect backslash escaping? I need free-text input, with no syntax at all. Right now, I'm escaping every Lucene special character in the front end. I just figured out that this breaks for the colon: I can't search for "12:01" with "12\:01".
wunder

On 2/7/08 11:06 AM, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:

> : I confirmed this behavior in trunk with the following query:
> :
> : http://localhost:8983/solr/select?qt=dismax&q=6'2"&debugQuery=on&qf=cat&pf=cat
> :
> : The result is that the double quote is dropped:
> : +DisjunctionMaxQuery((cat:6'2)~0.01) DisjunctionMaxQuery((cat:6'2)~0.01)
> :
> : This seems like it's a bug (rather than by design), but I could be
> : wrong... Hoss?
>
> It was by design ... but it could be handled better. the idea is that if
> the input has balanced quotes (ie: an even number) then leave them alone
> so they are dealt with as phrase delimiters. If there is an uneven number
> strip them out since we don't know whether they are a mistake (ie: unclosed
> phrase) or intended to be literal.
>
> auto-escaping them probably would have been a better way to go (ie: let
> the analyzer decide whether or not to strip them) ... i'm not sure why i
> didn't do that in the first place (I think at the time the lucene
> QueryParser didn't deal with escaped quotes very well)
>
> the thing to keep in mind, is that even if it did escape them, this still
> wouldn't work if the user input were...
>
>    the 6'2" man dating the 5'3" woman
>
> ...because it would assume the even number of double-quote characters means
> that " man dating the 5'3" is a phrase. i remember spending a day
> going over query logs trying to figure out a good set of heuristic rules
> for guessing when quote characters in user input should be interpreted as
> phrase delims vs "inch" markers before a coworker smacked me and made me
> realize it was a fairly intractable problem and simple rules would be
> easier to understand anyway.
>
> FYI: this is all happening in
> SolrPluginUtils.stripUnbalancedQuotes(CharSequence) which
> DisMax(RequestHandler) calls before passing the string to
> DisjunctionMaxQueryParser.
>
> -Hoss
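
For anyone following along, the balanced-quote rule Hoss describes can be sketched roughly as below. This is a from-scratch illustration of the rule as stated in his message, not the actual Solr source; the class name QuoteHeuristic is made up for the example, and only the method name and CharSequence parameter mirror what he mentions.

```java
public class QuoteHeuristic {

    // Hypothetical re-creation of the rule described in the thread:
    // an even number of double quotes is left alone (treated as phrase
    // delimiters); an odd number means every double quote is stripped.
    // This is an assumption-based sketch, not SolrPluginUtils itself.
    public static String stripUnbalancedQuotes(CharSequence input) {
        int quoteCount = 0;
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) == '"') {
                quoteCount++;
            }
        }

        // Balanced: hand the string on unchanged.
        if (quoteCount % 2 == 0) {
            return input.toString();
        }

        // Unbalanced: drop all double quotes, keep everything else.
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c != '"') {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // One quote (odd): the quote is stripped, as in the q=6'2" report.
        System.out.println(stripUnbalancedQuotes("6'2\""));

        // Two quotes (even): left untouched, so a parser downstream would
        // see the text between the two inch marks as a phrase.
        System.out.println(stripUnbalancedQuotes("the 6'2\" man dating the 5'3\" woman"));
    }
}
```

The second call shows why the problem is hard: counting quotes cannot tell an inch mark from a phrase delimiter, so the two-quote input passes through unchanged.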