Is the problem here that the error occurs sometimes or that it doesn't
occur all of the time? I mean, it is clearly a bug in the client if it is
sending a raw circumflex rather than a URL-encoded circumflex.
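
For illustration, here is a minimal Python sketch of the encoding I would
expect a client to do. The bq value is the one from the access log quoted
below; the other parameter values are only examples, and this is not
necessarily how your client builds its requests.

```python
from urllib.parse import quote, urlencode

# Boost query as it appears (unencoded) in the access log in this thread.
raw_bq = "hasThumbnailImage:true^2.0"

# With safe="", quote() percent-encodes every character outside the
# unreserved set, so "^" becomes %5E and ":" becomes %3A.
encoded_bq = quote(raw_bq, safe="")

# urlencode() applies the same kind of encoding to a whole parameter dict
# (spaces become "+").
params = {"q": "canon pixma printer", "defType": "edismax", "bq": raw_bq}
query_string = urlencode(params)

print(encoded_bq)    # hasThumbnailImage%3Atrue%5E2.0
print(query_string)  # q=canon+pixma+printer&defType=edismax&bq=hasThumbnailImage%3Atrue%5E2.0
```

If the request is built this way, no raw circumflex reaches Tomcat or Solr.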

Also, some browsers automatically URL-encode characters as needed, but I
have heard that some browsers don't always encode all of the characters.

Question: You mention the URL, but how are you sending that URL to Solr -
via a browser address box, curl, or... what?

If using curl, you also have to cope with some characters having a shell
meaning and needing to be escaped.
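
As a rough sketch of the shell-quoting point, assuming a POSIX shell and a
placeholder host and collection (localhost:8983 and collection1 are made up,
not taken from this thread), Python's shlex.quote can wrap an already-encoded
URL so that characters such as & and ? reach curl untouched:

```python
import shlex
from urllib.parse import urlencode

# Hypothetical endpoint for illustration only; substitute your own node.
base = "http://localhost:8983/solr/collection1/select"

# Encode the parameters first so the circumflex is sent as %5E.
params = {"q": "canon pixma printer", "bq": "hasThumbnailImage:true^2.0"}
url = base + "?" + urlencode(params)

# shlex.quote() single-quotes the URL because it contains characters
# ("?", "&") that a POSIX shell would otherwise interpret.
command = "curl " + shlex.quote(url)
print(command)
```

The printed command can be pasted into a shell as-is; the quoting only
protects the URL from the shell and does not replace the URL encoding.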

Whether it is Tomcat or Solr that gives the error, the main point is that
the raw circumflex shouldn't be sent to either.


-- Jack Krupansky

On Wed, Dec 24, 2014 at 4:32 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> OK, then I don't think it's a Solr problem. I think 5 of your Tomcats are
> configured in such a way that they consider ^ to be an illegal character.
>
> There have been recurring problems with Servlet containers being
> configured to allow/disallow various characters, and I think that's
> what's happening here. But this is totally outside Solr.
>
> Solr, when it successfully distributes a query, sends the query on to one
> replica of each shard, and I was wondering if that process wasn't
> working correctly somehow, although boosting is so common that it
> would be a huge shock since it would have broken almost every
> Tomcat installation out there. By sending the query directly to each
> node, you've bypassed any forwarding by Solr so it looks like the
> problem is before Solr even sees it.
>
> So my guess is that somehow 5 of your servers are configured to
> expect a different character than the server that works. I'm afraid
> I don't know Tomcat well enough to direct you there, but take a
> look here:
> https://wiki.apache.org/solr/SolrTomcat
>
> Sorry I can't be more help
> Erick
>
> On Wed, Dec 24, 2014 at 1:33 AM, S.L <simpleliving...@gmail.com> wrote:
> > Erik,
> >
> > The scenario 1, that you have listed is what seems to be the case.
> >
> > When I add distrib=false to query each one of the 6 servers, only 1 of them
> > returns results (partial) and the rest of them give the illegal character
> > error.
> >
> > I have not set up any special logging. I do not see any info in
> > catalina.out, but in a file called localhost_access_log.2014-12-24.txt in the
> > tomcat/logs directory I see the following log message when the invalid
> > character error occurs.
> >
> > [24/Dec/2014:09:25:54 +0000] "GET /solr/dyCollection1_shard2_replica1/xxxxxxxx?fl=*,score&q=canon+pixma+printer&sort=score+desc,productNameLength%20asc&wt=json&indent=true&rows=100&defType=edismax&qf=productName&mm=2&pf=productName&ps=1&pf2=productName&pf3=productName&stopwords=true&lowercaseOperators=true&bq=hasThumbnailImage:true^2.0&distrib=false HTTP/1.1" 500 7781
> >
> > I am using Tomcat 7.0.42, SolrCloud 4.10.1, and the Oracle JDK.
> >
> > java version "1.7.0_71"
> > Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
> > Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)
> >
> > Thanks.
> >
> > On Tue, Dec 23, 2014 at 11:46 AM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> Hmmm, so you are pinging the servers directly, right?
> >> Here's a couple of things to try:
> >> 1> add &distrib=false to the query and try each of the 6 servers.
> >> What I'm wondering is if this is happening on the sub-query sent
> >> out or on the primary server. Adding &distrib=false will just execute
> >> on the node you're sending it to, and will NOT send sub-queries out
> >> to any other node so you'll get partial results back.
> >>
> >> If one server continues to work but the other 5 fail, then your servlet
> >> container is probably not set up with the right character sets. Although
> >> why that would manifest itself on the ^ character mystifies me.
> >>
> >> 2> Let's assume that all 6 servers handle the raw query. Next thing that
> >> would be really helpful is to see the sub-queries. Take &distrib=false
> >> off and tail the logs on all the servers. What we're looking for here is
> >> whether the sub-queries even make it to Solr or whether the problem
> >> is in your container.
> >>
> >> 3> If the sub-queries do NOT make it to the Solr logs, what is the query
> >> that the container sees? Is it recognizable or has Solr somehow munged
> >> the sub-query?
> >>
> >> What is your environment like? Tomcat? Jetty? Other? What JVM
> >> etc?
> >>
> >> Best,
> >> Erick
> >>
> >> On Tue, Dec 23, 2014 at 3:23 AM, S.L <simpleliving...@gmail.com> wrote:
> >> > Hi All,
> >> >
> >> > I am using SolrCloud 4.10.1 and I have 3 shards with a replication factor
> >> > of 2, i.e. 6 nodes altogether.
> >> >
> >> > When I query server1 out of the 6 nodes in the cluster with the query
> >> > below, it works fine, but any other node in the cluster, when queried with
> >> > the same query, results in a *HTTP Status 500 - {msg=Illegal character in
> >> > query at index 181:*
> >> > error.
> >> >
> >> > The character at index 181 is the boost character ^. I have seen a Jira,
> >> > SOLR-5971 <https://issues.apache.org/jira/browse/SOLR-5971>, for a
> >> > similar issue. How can I overcome this issue?
> >> >
> >> > The query I use is below. Thanks in Advance!
> >> >
> >> >
> >> > http://xxxxxx2.xxxxxxxx.com:8081/solr/dyCollection1_shard2_replica1/xxxxxxxx?q=xxxxx+xxxxx+xxxxxx&sort=score+desc&wt=json&indent=true&debugQuery=true&defType=edismax&qf=productName^1.5+productDescription&mm=1&pf=productName+productDescription&ps=1&pf2=productName+productDescription&pf3=productName+productDescription&stopwords=true&lowercaseOperators=true
