Steve:

In short, no. There's no good way for Solr to solve this problem in
the _general_ case. Well, actually we could create parsers with rules
like "if the colon is inside a paren, escape it". But that would
completely break someone who wants to form queries like

q=field1:whatever AND (a AND field:b) OR (field2:c AND "d: is a letter
followed by a colon (:)").

You say: "A better solution would be to have Solr support a new
parameter that I can pass to Solr as part of the URL."

How would Solr know _which_ parts of the URL to escape in the case above?

You have to do this at the app layer as that's the only place that has
a clue what the peculiarities of the situation are.

But if you're using SolrJ in your app layer, you can use
ClientUtils.escapeQueryChars() for user-entered data to do the
escaping without you having to maintain a separate list.
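
To make the idea concrete, here is a minimal, self-contained sketch of
that kind of escaping. It is NOT the SolrJ source: the class name, the
titleQuery() helper, and the exact character list are my own
illustration of what I understand ClientUtils.escapeQueryChars() to
handle, so check it against the SolrJ version you actually run. In a
real app you would just call the SolrJ method.

```java
// Hypothetical stand-in for SolrJ's ClientUtils.escapeQueryChars();
// the character list is an assumption, not copied from the library.
final class QueryEscapeSketch {

    // Characters assumed to be special to the lucene/edismax parsers.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    // Prefix every special character and every whitespace character
    // with a backslash so the parser sees them as literal text.
    static String escapeQueryChars(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // The app layer decides which colon is syntax (the one after
    // "title") and which text is user data (everything it escapes).
    static String titleQuery(String userInput) {
        return "title:(" + escapeQueryChars(userInput) + ")";
    }

    public static void main(String[] args) {
        System.out.println(titleQuery("Apache: Solr Notes"));
    }
}
```

Only the user-entered text is escaped; the "title:(" wrapper is added by
the application afterwards, which is exactly the knowledge Solr doesn't
have. Note that whitespace is escaped too, so as far as I understand it
the whole input reaches the parser as a single term.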

Best,
Erick

On Mon, Apr 20, 2015 at 8:39 AM, Steven White <swhite4...@gmail.com> wrote:
> Hi Shawn,
>
> If the user types "title:(Apache: Solr Notes)" (without quotes) then I want
> Solr to treat the whole string as a raw text string, as if I had escaped
> ":", "(" and ")" and any other reserved Solr keywords / tokens.  Using dismax it
> worked for the ":" case, but I still get SyntaxError if I pass it the
> following "title:(Apache: Solr Notes) AND" (here is the full URL):
>
>
> http://localhost:8983/solr/db/select?q=title:(Apache:%20Solr%20Notes)%20AND&fl=id%2Cscore%2Ctitle&wt=xml&indent=true&q.op=AND&defType=dismax&qf=title
>
> So far, the only solution I can find is for my application to escape all
> Solr operators before sending the string to Solr.  This is fine, but it
> means my application will have to adapt to Solr's reserved operators as
> Solr grows (if Solr 5.x / 6.x adds a new operator, I have to add that to my
> application's escape list).  A better solution would be to have Solr support
> a new parameter that I can pass to Solr as part of the URL.
> This parameter will tell Solr to do the escaping for me or not (missing
> means the same as don't do the escaping).
>
> Thanks
>
> Steve
>
> On Mon, Apr 20, 2015 at 10:05 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 4/20/2015 7:41 AM, Steven White wrote:
>> > In my application, a user types "Apache Solr Notes".  I take that text
>> > and send it over to Solr like so:
>> >
>> > http://localhost:8983/solr/db/select?q=title:(Apache%20Solr%20Notes)&fl=id%2Cscore%2Ctitle&wt=xml&indent=true&q.op=AND
>> >
>> > And I get a hit on "Apache Solr Release Notes".  This is all good.
>> >
>> > Now if the same user types "Apache: Solr Notes" (notice the ":" after
>> > "Apache") I will get a SyntaxError.  The fix is to escape ":" before I
>> > send it to Solr.  What I want to figure out is how can I tell Solr /
>> > Lucene to ignore ":" and escape it for me?  In this example, I used ":"
>> > but my need is for all other operators and reserved Solr / Lucene
>> > characters.
>>
>> If we assume that what you did for the first query is what you will do
>> for the second query, then this is what you would have sent:
>>
>> q=title:(Apache: Solr Notes)
>>
>> How is the parser supposed to know that only the second colon should be
>> escaped, and not the first one?  If you escape them both (or treat the
>> entire query string as query text), then the fact that you are searching
>> the "title" field is lost.  The text "title" becomes an actual part of
>> the query, and may not match, depending on what you have done with other
>> parameters, such as the default operator.
>>
>> If you use the dismax parser (*NOT* the edismax parser, which parses
>> field:value queries and boolean operator syntax just like the lucene
>> parser), you may be able to achieve what you're after.
>>
>> https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser
>> https://wiki.apache.org/solr/DisMaxQParserPlugin
>>
>> With dismax, you would use the qf and possibly the pf parameter to tell
>> it which fields to search and send this as the query:
>>
>> q=Apache: Solr Notes
>>
>> Thanks,
>> Shawn
>>
>>
