Chris,

You are the best. Switching to POST solved the problem. I hadn't noticed that option earlier, but after finding https://issues.apache.org/jira/browse/SOLR-612 I found the option in the code.
Thank you, you just made my day.

Secondly, in an effort to narrow down whether this was a GlassFish issue or not, here is what I found. Starting with glassfishv3 (I think), UTF-8 is the default encoding for URIs. You can see this by going to the admin site and clicking Network Config | Network Listeners, then selecting the listener. Select the "HTTP" tab and, about halfway down, you will see URI Encoding: UTF-8.

HOWEVER, that doesn't appear to be correct. Following Abdelhamid Abid's advice, I deployed Solr to Tomcat and followed the directions here: http://wiki.apache.org/solr/SolrTomcat to force Tomcat to use UTF-8 for URIs. Then, using CommonsHttpSolrServer, I connected to that Tomcat-served instance. It worked the first time.

So it appears that there is a problem with glassfishv3 and UTF-8 URIs, at least for apache-solr-1.4.0.war. I wonder whether adding that sun-web.xml file to the war to force UTF-8 would make it work... not sure.

However, the workaround is to change the method to POST, as Chris suggested. You can do that in SolrJ here:

    server.query(solrQuery, METHOD.POST);

and it works as you'd expect.

Thanks for the advice/tips,
Tim

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Thursday, May 20, 2010 2:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Non-English query via Solr Example Admin corrupts text

: I am using apache-solr-1.4.0.war deployed to glassfishv3 on my
...
: INFO: [] webapp=/apache-solr-1.4.0 path=/select
: params={indent=on&version=2.2&q=numéro&fq=&start=0&rows=10&fl=*,score&qt=standard&wt=standard&explainOther=&hl.fl=}
: hits=0 status=0 QTime=16
...
: In my SolrJ using application, I have a test case which queries for
: "numéro" and succeeds if I use Embedded and fails if I use
: CommonsHttpSolrServer... I don't want to use embedded for a number of
...
: I am sorry if you'd dealt with this issue in the past, I've spent a few
: hours googling for solr utf-8 query and glassfishv3 utf-8 uri plus other
: permutations/combinations but there were seemingly endless amounts of
: chaff that I couldn't find anything useful after scouring it for a few
: hours. I can't decide whether it's a glassfish issue or not so I am not
: sure where to direct my energy. Any tips or advice are appreciated!

I suspect if you switched to using POST instead of GET your problem would go away -- this stems from ambiguity in the way HTTP servers/browsers deal with encoding UTF8 in URLs.

A quick search for "glassfish url encoding" turns up this thread...

http://forums.java.net/jive/thread.jspa?threadID=38020

which references...

http://wiki.glassfish.java.net/Wiki.jsp?page=FaqHttpRequestParameterEncoding

...it looks like you want to modify the "default-charset attribute of the <parameter-encoding>"

-Hoss
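
For anyone who finds this thread later, here is a rough sketch of the URL-encoding ambiguity mentioned above. It uses only the JDK's URLEncoder/URLDecoder, not Solr or GlassFish, so it only approximates what a servlet container does when it decodes a GET query string. The class name UriEncodingDemo and its helper methods are made up for illustration, and the assumption that the server side falls back to ISO-8859-1 is just that, an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UriEncodingDemo {

    // Percent-encode a query value with the given charset, as a client might for a GET URL.
    public static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Decode a percent-encoded value with the charset the receiving side assumes.
    public static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "num\u00e9ro" is the query term from this thread; the escape keeps the
        // example independent of the source file's own encoding.
        String query = "num\u00e9ro";

        // The client sends the term percent-encoded as UTF-8 bytes.
        String onTheWire = encode(query, "UTF-8");

        // A receiver that assumes UTF-8 recovers the original term.
        String asUtf8 = decode(onTheWire, "UTF-8");

        // A receiver that assumes ISO-8859-1 (hypothetical fallback) turns the
        // two UTF-8 bytes of the accented letter into two separate characters.
        String asLatin1 = decode(onTheWire, "ISO-8859-1");

        System.out.println("sent:               " + onTheWire);
        System.out.println("decoded as UTF-8:   " + query.equals(asUtf8));
        System.out.println("decoded as Latin-1: " + query.equals(asLatin1));
    }
}
```

The percent-encoded URL carries no indication of which charset produced it, so the receiver has to guess. With POST the parameters travel in the request body instead, which would explain why switching methods sidesteps the problem here.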