Chris,

You are the best.  Switching to POST solved the problem.  I hadn't noticed that 
option earlier, but after finding 
https://issues.apache.org/jira/browse/SOLR-612 I found it in the code.

Thank you, you just made my day.

Secondly, in an effort to narrow down whether this was a glassfish issue or 
not, here is what I found.

Starting with glassfishv3 (I think), UTF-8 is the default URI encoding.  You 
can see this in the admin site: click Network Config | Network Listeners, then 
select the listener.  On the "HTTP" tab, about halfway down, you will see 
URI Encoding: UTF-8.

HOWEVER, that setting doesn't appear to take effect.  Following Abdelhamid 
Abid's advice, I followed the directions here: 
http://wiki.apache.org/solr/SolrTomcat to force Tomcat to UTF-8 for URIs.  Then 
I deployed Solr to Tomcat and, using CommonsHttpSolrServer, connected to that 
Tomcat-served instance.  It worked the first time.

So, it appears that there is a problem with glassfishv3 and UTF-8 URIs, at 
least for the apache-solr-1.4.0.war.  I wonder whether adding that sun-web.xml 
file to the war to force UTF-8 would make it work... not sure.  However, the 
workaround is to change the method to POST as Chris suggested.  You can do that 
in SolrJ like this:

server.query(solrQuery, METHOD.POST);

and it works as you'd expect.
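
The kind of mismatch suspected above can be reproduced with the JDK alone, 
without Solr or a container.  This is only a minimal sketch: it assumes the 
client percent-encodes the query as UTF-8 and the server decodes the URI as 
ISO-8859-1, which has not been confirmed as what glassfishv3 actually does.  
The class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UriEncodingMismatch {

    // Percent-encode with the client's charset, then decode with the
    // server's charset, as a mismatched client/server pair would.
    static String roundTrip(String value, String clientCharset, String serverCharset) {
        try {
            String encoded = URLEncoder.encode(value, clientCharset);
            return URLDecoder.decode(encoded, serverCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset", e);
        }
    }

    public static void main(String[] args) {
        // "numéro", written with a Unicode escape so the source file's own
        // encoding cannot affect the result.
        String q = "num\u00e9ro";

        // Both sides agree on UTF-8: the term survives the round trip.
        System.out.println(roundTrip(q, "UTF-8", "UTF-8").equals(q));

        // Assumed mismatch: the two UTF-8 bytes of "é" are each read as a
        // separate ISO-8859-1 character, so the term no longer matches.
        System.out.println(roundTrip(q, "UTF-8", "ISO-8859-1").equals(q));
    }
}
```

If a mismatch like this is the cause, the term arriving at Solr is no longer 
the one that was indexed, which would be consistent with hits=0 in the log.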

Thanks for the advice/tips,

Tim

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org] 
Sent: Thursday, May 20, 2010 2:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Non-English query via Solr Example Admin corrupts text


: I am using apache-solr-1.4.0.war deployed to glassfishv3 on my 
        ...
: INFO: [] webapp=/apache-solr-1.4.0 path=/select 
: params={indent=on&version=2.2&q=numéro&fq=&start=0&rows=10&fl=*,score&qt=standard&wt=standard&explainOther=&hl.fl=} 
: hits=0 status=0 QTime=16
        ...
: In my SolrJ using application, I have a test case which queries for 
: "numéro" and succeeds if I use Embedded and fails if I use 
: CommonsHttpSolrServer... I don't want to use embedded for a number of 
        ...
: I am sorry if you'd dealt with this issue in the past, I've spent a few 
: hours googling for solr utf-8 query and glassfishv3 utf-8 uri plus other 
: permutations/combinations but there were seemingly endless amounts of 
: chaff that I couldn't find anything useful after scouring it for a few 
: hours.  I can't decide whether it's a glassfish issue or not so I am not 
: sure where to direct my energy.  Any tips or advice are appreciated!

I suspect if you switched to using POST instead of GET your problem would 
go away -- this stems from ambiguity in the way HTTP servers/browsers deal 
with encoding UTF-8 in URLs.  A quick search for "glassfish url encoding" 
turns up this thread...

  http://forums.java.net/jive/thread.jspa?threadID=38020

which references...

http://wiki.glassfish.java.net/Wiki.jsp?page=FaqHttpRequestParameterEncoding

...it looks like you want to modify the "default-charset attribute of the 
<parameter-encoding>"


-Hoss
