Re: 'Illegal character in query' on Solr cloud 4.10.1

2014-12-25 Thread S.L
Jack,

I am sending this query from the browser as a test, and the error occurs
consistently on 5 of the 6 servers in the cluster. The actual API that I
use is pysolr, so from the front end the query is sent using pysolr.

I see the same issue in both Firefox and Google Chrome. The fact that
there is an existing Jira for a similar issue made me think this is a
Solr issue, but I am still not clear on how I can work around it.





On Wed, Dec 24, 2014 at 4:57 PM, Jack Krupansky jack.krupan...@gmail.com
wrote:

 Is the problem here that the error occurs sometimes or that it doesn't
 occur all of the time? I mean, it is clearly a bug in the client if it is
 sending a raw circumflex rather than a URL-encoded circumflex.

 Also, some browsers automatically URL-encode characters as needed, but I
 have heard that some browsers don't always encode all of the characters.

 Question: You mention the URL, but how are you sending that URL to Solr -
 via a browser address box, curl, or... what?

 If using curl, you also have to cope with some characters having a shell
 meaning and needing to be escaped.

 Whether it is Tomcat or Solr that gives the error, the main point is that
 the raw circumflex shouldn't be sent to either.


 -- Jack Krupansky


Re: 'Illegal character in query' on Solr cloud 4.10.1

2014-12-24 Thread S.L
Erick,

Scenario 1 from your list seems to be the case.

When I add distrib=false and query each of the 6 servers, only 1 of them
returns (partial) results; the rest give the illegal character error.

I have not set up any special logging, and I do not see anything in
catalina.out. However, in a file called localhost_access_log.2014-12-24.txt
in the tomcat/logs directory, I see the following entry when the illegal
character error occurs.

[24/Dec/2014:09:25:54 +] GET
/solr/dyCollection1_shard2_replica1/?fl=*,score&q=canon+pixma+printer&sort=score+desc,productNameLength%20asc&wt=json&indent=true&rows=100&defType=edismax&qf=productName&mm=2&pf=productName&ps=1&pf2=productName&pf3=productName&stopwords=true&lowercaseOperators=true&bq=hasThumbnailImage:true^2.0&distrib=false
HTTP/1.1 500 7781
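
An access log like this one can be scanned for failing requests that carry a raw circumflex. The following is only a rough sketch: the sample lines, the client address, and the function name are hypothetical, and the regular expression assumes the request line is followed directly by the status code and byte count, as in the entry above.

```python
import re

# Matches: METHOD target HTTP/x.y STATUS, with optional quotes around the request.
REQUEST_RE = re.compile(
    r'"?(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+"? (?P<status>\d{3}) '
)

def failed_requests_with_caret(lines):
    """Return request targets that got a 500 and contain a raw '^'."""
    hits = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("status") == "500" and "^" in match.group("target"):
            hits.append(match.group("target"))
    return hits

# Hypothetical sample entries, loosely modelled on the log line in this message.
sample_lines = [
    '10.0.0.1 - - [24/Dec/2014:09:25:54 +0000] '
    '"GET /solr/c1/?q=x&bq=hasThumbnailImage:true^2.0&distrib=false HTTP/1.1" 500 7781',
    '10.0.0.1 - - [24/Dec/2014:09:26:10 +0000] '
    '"GET /solr/c1/?q=x&bq=hasThumbnailImage:true%5E2.0&distrib=false HTTP/1.1" 200 5120',
]

for target in failed_requests_with_caret(sample_lines):
    print(target)
```

Running this against the logs of each node would show whether every 500 response corresponds to an unencoded `^` in the request.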

I am using Tomcat 7.0.42 and SolrCloud 4.10.1 and the Oracle JDK.

java version 1.7.0_71
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)

Thanks.

On Tue, Dec 23, 2014 at 11:46 AM, Erick Erickson erickerick...@gmail.com
wrote:

 Hmmm, so you are pinging the servers directly, right?
 Here's a couple of things to try:
 1. Add distrib=false to the query and try each of the 6 servers.
 What I'm wondering is if this is happening on the sub-query sent
 out or on the primary server. Adding distrib=false will just execute
 on the node you're sending it to, and will NOT send sub-queries out
 to any other node so you'll get partial results back.

 If one server continues to work but the other 5 fail, then your servlet
 container is probably not set up with the right character sets. Although
 why that would manifest itself on the ^ character mystifies me.

 2. Let's assume that all 6 servers handle the raw query. Next thing that
 would be really helpful is to see the sub-queries. Take distrib=false
 off and tail the logs on all the servers. What we're looking for here is
 whether the sub-queries even make it to Solr or whether the problem
 is in your container.

 3. If the sub-queries do NOT make it to the Solr logs, what is the query
 that the container sees? Is it recognizable or has Solr somehow munged
 the sub-query?

 What is your environment like? Tomcat? Jetty? Other? What JVM
 etc?

 Best,
 Erick




Re: 'Illegal character in query' on Solr cloud 4.10.1

2014-12-24 Thread Erick Erickson
OK, then I don't think it's a Solr problem. I think 5 of your Tomcats are
configured in such a way that they consider ^ to be an illegal character.

There have been recurring problems with Servlet containers being
configured to allow/disallow various characters, and I think that's
what's happening here. But this is totally outside Solr.

Solr, when it successfully distributes a query, sends the query on to one
replica of each shard, and I was wondering if that process wasn't
working correctly somehow, although boosting is so common that it
would be a huge shock since it would have broken almost every
Tomcat installation out there. By sending the query directly to each
node, you've bypassed any forwarding by Solr so it looks like the
problem is before Solr even sees it.

So my guess is that somehow 5 of your servers are configured to
accept a different set of characters than the server that works. I'm afraid
I don't know Tomcat well enough to direct you there, but take a
look here:
https://wiki.apache.org/solr/SolrTomcat

Sorry I can't be more help
Erick

On Wed, Dec 24, 2014 at 1:33 AM, S.L simpleliving...@gmail.com wrote:
 Erick,

 Scenario 1 from your list seems to be the case.

 When I add distrib=false and query each of the 6 servers, only 1 of them
 returns (partial) results; the rest give the illegal character error.

 I have not set up any special logging, and I do not see anything in
 catalina.out. However, in a file called localhost_access_log.2014-12-24.txt
 in the tomcat/logs directory, I see the following entry when the illegal
 character error occurs.

 [24/Dec/2014:09:25:54 +] GET
 /solr/dyCollection1_shard2_replica1/?fl=*,score&q=canon+pixma+printer&sort=score+desc,productNameLength%20asc&wt=json&indent=true&rows=100&defType=edismax&qf=productName&mm=2&pf=productName&ps=1&pf2=productName&pf3=productName&stopwords=true&lowercaseOperators=true&bq=hasThumbnailImage:true^2.0&distrib=false
 HTTP/1.1 500 7781

 I am using Tomcat 7.0.42 and SolrCloud 4.10.1 and the Oracle JDK.

 java version 1.7.0_71
 Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
 Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)

 Thanks.




Re: 'Illegal character in query' on Solr cloud 4.10.1

2014-12-24 Thread Yonik Seeley
On Wed, Dec 24, 2014 at 4:32 PM, Erick Erickson erickerick...@gmail.com wrote:
 OK, then I don't think it's a Solr problem. I think 5 of your Tomcats are
 configured in such a way that they consider ^ to be an illegal character.

Hmmm, the stack trace in SOLR-5971 shows a different user (who gets
the same error message) running in Jetty.

Without looking into it further, I thought it most likely an issue
with the proxying code.

I don't think distrib=false will prevent a node from proxying a query
to another node that can actually handle that query.

-Yonik
http://heliosearch.org - native code faceting, facet functions,
sub-facets, off-heap data


Re: 'Illegal character in query' on Solr cloud 4.10.1

2014-12-24 Thread Jack Krupansky
Is the problem here that the error occurs sometimes or that it doesn't
occur all of the time? I mean, it is clearly a bug in the client if it is
sending a raw circumflex rather than a URL-encoded circumflex.

Also, some browsers automatically URL-encode characters as needed, but I
have heard that some browsers don't always encode all of the characters.

Question: You mention the URL, but how are you sending that URL to Solr -
via a browser address box, curl, or... what?

If using curl, you also have to cope with some characters having a shell
meaning and needing to be escaped.

Whether it is Tomcat or Solr that gives the error, the main point is that
the raw circumflex shouldn't be sent to either.


-- Jack Krupansky
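
As a rough illustration of the encoding point above, a client can build the query string with Python's standard library so that the circumflex is percent-encoded before it reaches the container. The parameter values, host, and port below are assumptions made for the example; only the core name comes from the thread.

```python
import shlex
from urllib.parse import urlencode

# Hypothetical parameters modelled on the queries quoted in this thread.
params = {
    "q": "canon pixma printer",
    "defType": "edismax",
    "qf": "productName^1.5 productDescription",
    "bq": "hasThumbnailImage:true^2.0",
    "wt": "json",
}

# urlencode() percent-encodes each value, so "^" becomes "%5E"
# and spaces become "+".
query_string = urlencode(params)

# Hypothetical host and port; the core name is the one from the thread.
url = "http://localhost:8081/solr/dyCollection1_shard2_replica1/?" + query_string

# shlex.quote() wraps the URL in quotes so a shell does not interpret "&" or "?".
curl_command = "curl " + shlex.quote(url)

print(query_string)
print(curl_command)
```

The same encoded URL can be pasted into a browser address bar, which avoids relying on the browser to encode the character itself.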

On Wed, Dec 24, 2014 at 4:32 PM, Erick Erickson erickerick...@gmail.com
wrote:

 OK, then I don't think it's a Solr problem. I think 5 of your Tomcats are
 configured in such a way that they consider ^ to be an illegal character.

 There have been recurring problems with Servlet containers being
 configured to allow/disallow various characters, and I think that's
 what's happening here. But this is totally outside Solr.

 Solr, when it successfully distributes a query, sends the query on to one
 replica of each shard, and I was wondering if that process wasn't
 working correctly somehow, although boosting is so common that it
 would be a huge shock since it would have broken almost every
 Tomcat installation out there. By sending the query directly to each
 node, you've bypassed any forwarding by Solr so it looks like the
 problem is before Solr even sees it.

 So my guess is that somehow 5 of your servers are configured to
 expect a different character than the server that works. I'm afraid
 I don't know Tomcat well enough to direct you there, but take a
 look here:
 https://wiki.apache.org/solr/SolrTomcat

 Sorry I can't be more help
 Erick


'Illegal character in query' on Solr cloud 4.10.1

2014-12-23 Thread S.L
Hi All,

I am using SolrCloud 4.10.1 and I have 3 shards with a replication factor of
2, i.e. 6 nodes altogether.

When I query server1 of the 6 nodes in the cluster with the query below,
it works fine, but any other node in the cluster, when queried with the
same query, returns a *HTTP Status 500 - {msg=Illegal character in query
at index 181:*
error.

The character at index 181 is the boost character ^. I have seen a Jira,
SOLR-5971 https://issues.apache.org/jira/browse/SOLR-5971, for a similar
issue. How can I overcome this issue?

The query I use is below. Thanks in Advance!

http://xx2..com:8081/solr/dyCollection1_shard2_replica1/?q=x+x+xx&sort=score+desc&wt=json&indent=true&debugQuery=true&defType=edismax&qf=productName^1.5+productDescription&mm=1&pf=productName+productDescription&ps=1&pf2=productName+productDescription&pf3=productName+productDescription&stopwords=true&lowercaseOperators=true
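
To see which character an "index N" message is pointing at, the URL can be scanned for characters that RFC 3986 does not allow unencoded in a query. This is a minimal sketch under that assumption; the function name and the sample URL are made up for illustration, and the error in this thread may be produced by a stricter or looser check.

```python
import string

# Characters RFC 3986 permits raw in a query: unreserved, sub-delims,
# ":", "@", "/", "?", plus "%" as the start of a percent-escape.
ALLOWED = set(
    string.ascii_letters + string.digits + "-._~" + "!$&'()*+,;=" + ":@/?%"
)

def illegal_query_chars(url):
    """Return (index, char) pairs for disallowed characters in the query part.

    Indexes are relative to the whole URL. If the URL has no "?",
    the whole string is scanned.
    """
    start = url.find("?") + 1
    return [
        (i, c) for i, c in enumerate(url)
        if i >= start and c not in ALLOWED
    ]

# Hypothetical URL with a raw boost character.
sample_url = "http://host.example.com:8081/solr/c1/?q=x&qf=productName^1.5"
for index, char in illegal_query_chars(sample_url):
    print(index, repr(char))
```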


Re: 'Illegal character in query' on Solr cloud 4.10.1

2014-12-23 Thread Erick Erickson
Hmmm, so you are pinging the servers directly, right?
Here's a couple of things to try:
1. Add distrib=false to the query and try each of the 6 servers.
What I'm wondering is if this is happening on the sub-query sent
out or on the primary server. Adding distrib=false will just execute
on the node you're sending it to, and will NOT send sub-queries out
to any other node so you'll get partial results back.

If one server continues to work but the other 5 fail, then your servlet
container is probably not set up with the right character sets. Although
why that would manifest itself on the ^ character mystifies me.

2. Let's assume that all 6 servers handle the raw query. Next thing that
would be really helpful is to see the sub-queries. Take distrib=false
off and tail the logs on all the servers. What we're looking for here is
whether the sub-queries even make it to Solr or whether the problem
is in your container.

3. If the sub-queries do NOT make it to the Solr logs, what is the query
that the container sees? Is it recognizable or has Solr somehow munged
the sub-query?

What is your environment like? Tomcat? Jetty? Other? What JVM
etc?

Best,
Erick
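
The first step above (the same query sent to each node with distrib=false) could be scripted along these lines. This is only a sketch: the node addresses, core names, and the use of the `/select` handler path are assumptions for illustration, and the code only builds the URLs without sending them.

```python
from urllib.parse import urlencode

def direct_query_urls(handler_urls, params):
    """Return one non-distributed query URL per node."""
    # Copy the parameters and add distrib=false without mutating the input.
    local_params = dict(params, distrib="false")
    query_string = urlencode(local_params)
    return [f"{handler}?{query_string}" for handler in handler_urls]

# Hypothetical node addresses and core names.
handlers = [
    "http://node1.example.com:8081/solr/dyCollection1_shard1_replica1/select",
    "http://node2.example.com:8081/solr/dyCollection1_shard2_replica1/select",
]
params = {"q": "canon pixma printer", "bq": "hasThumbnailImage:true^2.0"}

for url in direct_query_urls(handlers, params):
    print(url)
```

Each printed URL can then be requested with a browser or curl, and the responses compared node by node.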

On Tue, Dec 23, 2014 at 3:23 AM, S.L simpleliving...@gmail.com wrote:
 Hi All,

 I am using SolrCloud 4.10.1 and I have 3 shards with a replication factor of
 2, i.e. 6 nodes altogether.

 When I query server1 of the 6 nodes in the cluster with the query below,
 it works fine, but any other node in the cluster, when queried with the
 same query, returns a *HTTP Status 500 - {msg=Illegal character in query
 at index 181:*
 error.

 The character at index 181 is the boost character ^. I have seen a Jira,
 SOLR-5971 https://issues.apache.org/jira/browse/SOLR-5971, for a similar
 issue. How can I overcome this issue?

 The query I use is below. Thanks in Advance!

 http://xx2..com:8081/solr/dyCollection1_shard2_replica1/?q=x+x+xx&sort=score+desc&wt=json&indent=true&debugQuery=true&defType=edismax&qf=productName^1.5+productDescription&mm=1&pf=productName+productDescription&ps=1&pf2=productName+productDescription&pf3=productName+productDescription&stopwords=true&lowercaseOperators=true