Re: Problems querying Russian content

2007-06-28 Thread funtick
I know Russian better than Russians ;) I currently use the default configuration for "dismax" provided by SOLR 1.1; I can add a few URLs to the crawler tonight to see what happens. As far as I know, Lucene/Nutch can even detect the language of a web page (pdf, txt, html) by checking the raw byte array (raw HTTP Respon

Re: Problems querying Russian content

2007-06-28 Thread Daniel Alheiros
Thanks. Yes I will do it. So you may be the best person to talk about the Russian content indexing. :) My indexing process follows:
1. RussianTokenizer
2. RussianLowerCaseFilter
3. RussianStopFilter
4. RussianStemFilter
Seems OK to me as I'm using the same structure used by the

Re: Problems querying Russian content

2007-06-28 Thread Daniel Alheiros
Thanks a lot! Now it is working. It was the Tomcat connector setup. Regards, Daniel On 28.06.2007 17:19, "Chris Hostetter" <[EMAIL PROTECTED]> wrote: > > : You can also ensure the browser sends an utf8 encoded post by > : : It works even if the page the form is in is not an UTF-8 page. >

Re: Problems querying Russian content

2007-06-28 Thread funtick
Hi Daniel, Ensure that UTF-8 is used everywhere: SOLR, WebServer, AppServer, HTTP headers, etc. Do not use q=Бамбарбиа Киркуду; use the encoded URL form instead: q=%D0%91%D0%B0%D0%BC%D0%B1%D0%B0%D1%80%D0%B1%D0%B8%D0%B0+%D0%9A%D0%B8%D1%80%D0%BA%D1%83%D0%B4%D1%83 http://www.tokenizer.org is
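
A minimal sketch of the encoding step described in the message above. Python is used purely for illustration; the thread does not say which client language was in use. The standard library's quote_plus encodes the text as UTF-8 and percent-encodes each byte:

```python
from urllib.parse import quote_plus, unquote_plus

# The raw query text from the message above.
raw_query = "Бамбарбиа Киркуду"

# quote_plus encodes the text as UTF-8, percent-encodes every byte,
# and replaces the space with '+', as is usual in a query string.
encoded = quote_plus(raw_query, encoding="utf-8")
query_string = "q=" + encoded
print(query_string)

# Decoding with the same charset recovers the original text.
decoded = unquote_plus(encoded, encoding="utf-8")
print(decoded == raw_query)
```

The server side still has to decode the request as UTF-8 for this to work, which is why the advice is to check every layer.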

Re: Problems querying Russian content

2007-06-28 Thread Chris Hostetter
: You can also ensure the browser sends an utf8 encoded post by : http://www.nabble.com/Cyrillic-characters-t1963293.html#a5402562 http://wiki.apache.org/solr/SolrTomcat (see URI charset section) -Hoss

Re: Problems querying Russian content

2007-06-28 Thread Jérôme Etévé
On 6/28/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 6/28/07, Daniel Alheiros <[EMAIL PROTECTED]> wrote: > I'm having trouble issuing queries against Solr when my "q" > parameter contains content in Russian (the same applies to Chinese and Arabic). > > The problem is I can't send any Ru

Re: Problems querying Russian content

2007-06-28 Thread Yonik Seeley
On 6/28/07, Daniel Alheiros <[EMAIL PROTECTED]> wrote: I'm having trouble issuing queries against Solr when my "q" parameter contains content in Russian (the same applies to Chinese and Arabic). The problem is I can't send any Russian special character in URLs because they don't fit in

Problems querying Russian content

2007-06-28 Thread Daniel Alheiros
Hi, I'm having trouble issuing queries against Solr when my "q" parameter contains content in Russian (the same applies to Chinese and Arabic). The problem is I can't send any Russian special character in URLs because they don't fit in the ASCII domain, so I'm doing a POST to accomplish that.
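
For illustration only, here is a rough sketch of what such a POST could look like with the Python standard library. The endpoint URL and the helper name are assumptions made for this example, not details from the thread, and the request is only built, not sent:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical Solr endpoint; the thread does not give the real one.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_request(query_text: str) -> Request:
    """Build (but do not send) a form-encoded POST carrying the query as UTF-8."""
    # urlencode percent-encodes the UTF-8 bytes, so the body is pure ASCII.
    body = urlencode({"q": query_text}, encoding="utf-8").encode("ascii")
    headers = {
        # Declare the charset so the server knows how the form was encoded.
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
    }
    return Request(SOLR_SELECT_URL, data=body, headers=headers, method="POST")


request = build_search_request("Бамбарбиа Киркуду")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

Passing the request to urllib.request.urlopen would send it. Whether the query is then interpreted correctly depends on the container decoding the body as UTF-8.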