Re: Probleme with unicode query

2012-02-24 Thread Frederic Bouchery
Thanks !!

This is a Tomcat issue, not a Solr one: URIEncoding="UTF-8" was missing
from the Connector in Tomcat's server.xml.
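
For anyone hitting the same symptom, here is a minimal Python sketch of the likely mechanism. It is an illustration, not Tomcat's actual code: the client percent-encodes the query as UTF-8, and a container that decodes those bytes as ISO-8859-1 (Tomcat 6's default when URIEncoding is not set) turns each accented letter into two characters. The helper name decode_query is hypothetical.

```python
from urllib.parse import quote, unquote_to_bytes


def decode_query(percent_encoded: str, charset: str) -> str:
    """Decode a percent-encoded query value using the given charset."""
    return unquote_to_bytes(percent_encoded).decode(charset)


# What a UTF-8 client sends on the wire for the word "sécurité".
sent = quote("sécurité", encoding="utf-8")

# Decoded with the wrong charset: each "é" becomes the two characters "Ã©".
wrong = decode_query(sent, "iso-8859-1")

# Decoded with the charset the client actually used.
right = decode_query(sent, "utf-8")

print(sent)
print(wrong)
print(right)
```

The "©" in the wrong result is a symbol rather than a letter, so a tokenizer will probably split the word there. That would be consistent with the two terms sa and curit in the debug output once accent folding and stemming have run.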

Frederic

-- 
*Frédéric BOUCHERY*
OuestFranceMultimédi@
*BU - Emploi* : 0.22.33.55.88.9


Probleme with unicode query

2012-02-23 Thread Frederic Bouchery
Hello,

I'm using Solr 3.5 on Tomcat 6 and I have a problem with Unicode queries.

Here is my text field configuration:
<analyzer type="index">
  <charFilter class="solr.HTMLStripCharFilterFactory"/>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StandardFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.ElisionFilterFactory" articles="elisions.txt"/>
  <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.SnowballPorterFilterFactory" language="French"/>
</analyzer>
<analyzer type="query">
  <charFilter class="solr.HTMLStripCharFilterFactory"/>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StandardFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.ElisionFilterFactory" articles="elisions.txt"/>
  <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.SnowballPorterFilterFactory" language="French"/>
</analyzer>

When I perform this request: select/?q=hygiene sécurité&debugQuery=true
Here is the debug info:
<str name="rawquerystring">hygiene sécurité</str>
<str name="querystring">hygiene sécurité</str>
<str name="parsedquery">searchText:hygien (searchText:sa searchText:curit)</str>
<str name="parsedquery_toString">searchText:hygien (searchText:sa searchText:curit)</str>

As you can see, the Unicode request failed: I get
searchText:sa searchText:curit instead of searchText:securite.
I've tried ISOLatin1AccentFilterFactory and changed the filter order, but
it made no difference :(

Any ideas ?

Thanks

Frederic


Re: Probleme with unicode query

2012-02-23 Thread Em
Hi Frederic,

I have seen similar issues when such a request is sent without proper
URL-encoding. Note that the string must already be UTF-8 at the point
where it is URL-encoded.
What happens if you send that query via Solr's admin panel?
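
As a rough illustration of the encoding step described above, the following Python sketch builds the request URL with the query percent-encoded as UTF-8. The host, port, and /solr path are assumptions for the example, and build_select_url is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, query: str) -> str:
    """Return a select URL whose parameters are percent-encoded as UTF-8."""
    params = {"q": query, "debugQuery": "true"}
    # urlencode encodes each value to bytes with the given charset,
    # then percent-encodes those bytes (spaces become "+").
    return base_url + "/select/?" + urlencode(params, encoding="utf-8")


# Assumed local Solr location; adjust to the real deployment.
url = build_select_url("http://localhost:8080/solr", "hygiene sécurité")
print(url)
```

Encoding on the client side only helps if the server decodes with the same charset, so both ends need to agree on UTF-8.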

Have a look at this page for troubleshooting:
http://wiki.apache.org/solr/SolrTomcat

Kind regards,
Em
