Re: 'Illegal character in query' on Solr cloud 4.10.1
Jack,

I am using this query to test from the browser, and the error occurs consistently on 5 of the 6 servers in the cluster. The actual API I use is pysolr, so from the front end the query is sent using pysolr.

I see the same issue in both Firefox and Google Chrome. The existing Jira for a similar issue made me think this is a Solr issue, but I am still not clear on how I can work around it.
Re: 'Illegal character in query' on Solr cloud 4.10.1
Erick,

Scenario 1 that you listed is what seems to be the case. When I add distrib=false and query each of the 6 servers, only 1 of them returns (partial) results and the rest give the illegal character error.

I have not set up any special logging, and I do not see anything in catalina.out. But in a file called localhost_access_log.2014-12-24.txt in the tomcat/logs directory, I see the following entry when the illegal character error occurs:

[24/Dec/2014:09:25:54 +] GET /solr/dyCollection1_shard2_replica1/?fl=*,score&q=canon+pixma+printer&sort=score+desc,productNameLength%20asc&wt=json&indent=true&rows=100&defType=edismax&qf=productName&mm=2&pf=productName&ps=1&pf2=productName&pf3=productName&stopwords=true&lowercaseOperators=true&bq=hasThumbnailImage:true^2.0&distrib=false HTTP/1.1 500 7781

I am using Tomcat 7.0.42, SolrCloud 4.10.1, and the Oracle JDK:

java version 1.7.0_71
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)

Thanks.
Re: 'Illegal character in query' on Solr cloud 4.10.1
OK, then I don't think it's a Solr problem. I think 5 of your Tomcats are configured in such a way that they consider ^ to be an illegal character. There have been recurring problems with servlet containers being configured to allow or disallow various characters, and I think that's what's happening here. But this is totally outside Solr.

When Solr successfully distributes a query, it sends the query on to one replica of each shard, and I was wondering whether that process wasn't working correctly somehow. That would be a huge shock, though: boosting is so common that such a bug would have broken almost every Tomcat installation out there.

By sending the query directly to each node, you've bypassed any forwarding by Solr, so it looks like the problem occurs before Solr even sees the query. My guess is that somehow 5 of your servers are configured to expect a different character set than the server that works.

I'm afraid I don't know Tomcat well enough to direct you there, but take a look here: https://wiki.apache.org/solr/SolrTomcat

Sorry I can't be more help,
Erick
Re: 'Illegal character in query' on Solr cloud 4.10.1
On Wed, Dec 24, 2014 at 4:32 PM, Erick Erickson erickerick...@gmail.com wrote:
OK, then I don't think it's a Solr problem. I think 5 of your Tomcats are configured in such a way that they consider ^ to be an illegal character.

Hmmm, the stack trace in SOLR-5971 shows a different user (who gets the same error message) running in Jetty. Without looking into it further, I thought it most likely an issue with the proxying code. I don't think distrib=false will prevent a node from proxying a query to another node that can actually handle that query.

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
Re: 'Illegal character in query' on Solr cloud 4.10.1
Is the problem here that the error occurs sometimes, or that it doesn't occur all of the time? I mean, it is clearly a bug in the client if it is sending a raw circumflex rather than a URL-encoded circumflex. Also, some browsers automatically URL-encode characters as needed, but I have heard that some browsers don't always encode all of the characters.

Question: You mention the URL, but how are you sending that URL to Solr: via a browser address box, curl, or... what? If using curl, you also have to cope with some characters having a shell meaning and needing to be escaped.

Whether it is Tomcat or Solr that gives the error, the main point is that the raw circumflex shouldn't be sent to either.

-- Jack Krupansky
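To illustrate the point about sending a URL-encoded circumflex rather than a raw one, here is a minimal Python sketch using only the standard library. The parameter values are taken from the queries quoted in this thread; how pysolr or any particular browser encodes the same parameters is not shown here and is not something this sketch verifies.

```python
from urllib.parse import urlencode

# Query parameters as a client would hold them before encoding.
# The boosts (^1.5, ^2.0) contain the raw circumflex discussed above.
params = {
    "q": "canon pixma printer",
    "defType": "edismax",
    "qf": "productName^1.5 productDescription",
    "bq": "hasThumbnailImage:true^2.0",
    "wt": "json",
}

# urlencode percent-encodes reserved and unsafe characters:
# "^" becomes "%5E", ":" becomes "%3A", and spaces become "+".
query_string = urlencode(params)
print(query_string)
```

Building the query string this way, instead of pasting the boost characters into the URL by hand, keeps a raw ^ from ever reaching Tomcat or Solr.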
'Illegal character in query' on Solr cloud 4.10.1
Hi All,

I am using SolrCloud 4.10.1 and I have 3 shards with a replication factor of 2, i.e. 6 nodes altogether.

When I query server1 of the 6 nodes in the cluster with the query below, it works fine, but any other node in the cluster, when queried with the same query, returns an *HTTP Status 500 - {msg=Illegal character in query at index 181:* error.

The character at index 181 is the boost character ^. I have seen a Jira issue, SOLR-5971 (https://issues.apache.org/jira/browse/SOLR-5971), for a similar problem. How can I overcome this issue?

The query I use is below.

Thanks in advance!

http://xx2..com:8081/solr/dyCollection1_shard2_replica1/?q=x+x+xx&sort=score+desc&wt=json&indent=true&debugQuery=true&defType=edismax&qf=productName ^1.5+productDescription&mm=1&pf=productName+productDescription&ps=1&pf2=productName+productDescription&pf3=productName+productDescription&stopwords=true&lowercaseOperators=true
Re: 'Illegal character in query' on Solr cloud 4.10.1
Hmmm, so you are pinging the servers directly, right? Here are a couple of things to try:

1. Add distrib=false to the query and try each of the 6 servers. What I'm wondering is whether this is happening on the sub-query sent out or on the primary server. Adding distrib=false will execute the query only on the node you're sending it to, and will NOT send sub-queries out to any other node, so you'll get partial results back. If one server continues to work but the other 5 fail, then your servlet container is probably not set up with the right character sets. Although why that would manifest itself on the ^ character mystifies me.

2. Let's assume that all 6 servers handle the raw query. The next thing that would be really helpful is to see the sub-queries. Take distrib=false off and tail the logs on all the servers. What we're looking for here is whether the sub-queries even make it to Solr or whether the problem is in your container.

3. If the sub-queries do NOT make it to the Solr logs, what is the query that the container sees? Is it recognizable, or has Solr somehow munged the sub-query?

What is your environment like? Tomcat? Jetty? Other? What JVM etc?

Best,
Erick
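Step 1 above (adding distrib=false and trying each node in turn) can be sketched in Python as follows. This only builds the per-node URLs; the host names, the core names, and the /select handler path are placeholders assumed for illustration, not values from this cluster.

```python
from urllib.parse import urlencode


def node_urls(core_urls, params):
    """Build one non-distributed query URL per core, with encoded parameters."""
    # distrib=false asks each node to answer from its own index only.
    local_params = dict(params, distrib="false")
    query_string = urlencode(local_params)
    return [f"{base.rstrip('/')}/select?{query_string}" for base in core_urls]


# Hypothetical core URLs; substitute the real six nodes.
cores = [
    "http://host1.example.com:8081/solr/collection1_shard1_replica1",
    "http://host2.example.com:8081/solr/collection1_shard1_replica2",
]
params = {
    "q": "canon pixma printer",
    "defType": "edismax",
    "qf": "productName^1.5 productDescription",
    "wt": "json",
}

urls = node_urls(cores, params)
for url in urls:
    print(url)
```

Each printed URL can then be fetched (for example with urllib.request.urlopen or curl) and the HTTP status compared across nodes. Because the parameters are encoded here, a node that still returns the 500 error would point away from the client-side encoding explanation.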