Re: wild card search and lower-casing

2011-11-23 Thread Dmitry Kan
Yes, it should be ok, as currently we are on the English side. If that's beneficial for the effort, I could do a field test on 3.4 after you close the jira. Best, Dmitry On Wed, Nov 23, 2011 at 2:52 PM, Erick Erickson wrote: > Ah, I see what you're doing, go for it. > > I intend to commit it tod

Re: wild card search and lower-casing

2011-11-23 Thread Erick Erickson
Ah, I see what you're doing, go for it. I intend to commit it today, but things happen. About changing the setLowercaseExpandedTerms(true), yes that'll take care of this issue, although it has some locale-specific assumptions (i.e. String.toLowerCase() uses the default locale). That may not m

Re: wild card search and lower-casing

2011-11-22 Thread Dmitry Kan
Thanks, Erick. I was in fact reading the patch (the one attached as a file to the aforementioned jira) you updated sometime yesterday. I'll watch the issue, but as said the change of a hard-coded boolean to its opposite worked just fine for me. Best, Dmitry On 11/22/11, Erick Erickson wrote: >

Re: wild card search and lower-casing

2011-11-22 Thread Erick Erickson
No, no, no. That's something buried in Lucene, it has nothing to do with the patch! The patch has NOT yet been applied to any released code. You could pull the patch from the JIRA and apply it to trunk locally if you wanted. But there's no patch for 3.x, I'll probably put that up over the holid

Re: wild card search and lower-casing

2011-11-22 Thread Dmitry Kan
I guess I have found your comment, thanks. For our current needs I have just set: setLowercaseExpandedTerms(true); // changed from default false in the SolrQueryParser's constructor, and that seems to work so far. In order not to start a separate thread on wildcards: is it so that for the trail

Re: wild card search and lower-casing

2011-11-21 Thread Erick Erickson
It may be. The tricky bit is that there is a constant governing the behavior of this that restricts it to 3.6 and above. You'll have to change it after applying the patch for this to work for you. Should be trivial, I'll leave a note in the code about this, look for SOLR-2438 in the 3x code line fo

Re: wild card search and lower-casing

2011-11-20 Thread Dmitry Kan
Thanks Erick. Do you think the patch you are working on will be applicable as well to 3.4? Best, Dmitry On Mon, Nov 21, 2011 at 5:06 AM, Erick Erickson wrote: > As it happens I'm working on SOLR-2438 which should address this. This > patch > will provide two things: > > The ability to define a

Re: wild card search and lower-casing

2011-11-20 Thread Erick Erickson
As it happens I'm working on SOLR-2438 which should address this. This patch will provide two things: The ability to define a new analysis chain in your schema.xml, currently called "multiterm" that will be applied to queries of various sorts, including wildcard, prefix, range. This will be somewh

Re: wild card search and lower-casing

2011-11-18 Thread Ahmet Arslan
> You're right: > > public SolrQueryParser(IndexSchema schema, String > defaultField) { > ... > setLowercaseExpandedTerms(false); > ... > } Please note that lowercaseExpandedTerms uses String.toLowerCase() (with the default Locale), which is a Locale-sensitive operation. In Lucene AnalyzingQueryP
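
The locale caveat mentioned here can be reproduced with plain JDK calls, independent of Lucene or Solr. The following is only a minimal sketch: the class name `LocaleLowercaseDemo` and the helper `lower` are illustrative and do not come from the thread. It lower-cases the same term under `Locale.ROOT` and under a Turkish locale, where an upper-case `I` maps to the dotless `ı` (U+0131), so the two results differ.

```java
import java.util.Locale;

public class LocaleLowercaseDemo {
    // Hypothetical helper (not part of Lucene or Solr): lower-case a term
    // under an explicitly supplied locale instead of the JVM default.
    static String lower(String term, Locale locale) {
        return term.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        String rootResult = lower("TITLE", Locale.ROOT);
        String turkishResult = lower("TITLE", turkish);

        // Under Turkish rules 'I' becomes the dotless 'ı' (U+0131),
        // so the two lower-cased strings are not equal.
        System.out.println("same result: " + rootResult.equals(turkishResult));
    }
}
```

If the default locale of the JVM running the query parser differed from the one in effect when the index was built, a lower-cased wildcard term could miss tokens it would otherwise match; passing an explicit locale is one way to avoid depending on the default.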

Re: wild card search and lower-casing

2011-11-18 Thread Dmitry Kan
You're right: public SolrQueryParser(IndexSchema schema, String defaultField) { ... setLowercaseExpandedTerms(false); ... } OK, thanks for pointing that out. On Fri, Nov 18, 2011 at 4:12 PM, Ahmet Arslan wrote: > > Actually I have just checked the source code of Lucene's > > QueryParser and > > lowerca

Re: wild card search and lower-casing

2011-11-18 Thread Ahmet Arslan
> Actually I have just checked the source code of Lucene's > QueryParser and > lowercaseExpandedTerms there is set to true by default > (version 3.4). The > code there does lower-casing by default. So in that sense I > don't need to > do anything in the client code. Is something wrong here? But So

Re: wild card search and lower-casing

2011-11-18 Thread Dmitry Kan
OK. Actually I have just checked the source code of Lucene's QueryParser and lowercaseExpandedTerms there is set to true by default (version 3.4). The code there does lower-casing by default. So in that sense I don't need to do anything in the client code. Is something wrong here? On Fri, Nov 18,

Re: wild card search and lower-casing

2011-11-18 Thread Ahmet Arslan
> Hi Ahmet, > > Thanks for the link. > > I'm a bit puzzled with the explanation found there > regarding lower casing: > > These queries are case-insensitive anyway because > QueryParser makes them > lowercase. > > that's exactly what I want to achieve, but somehow the > queries *are* > case-sen

Re: wild card search and lower-casing

2011-11-18 Thread Dmitry Kan
Hi Ahmet, Thanks for the link. I'm a bit puzzled with the explanation found there regarding lower casing: These queries are case-insensitive anyway because QueryParser makes them lowercase. that's exactly what I want to achieve, but somehow the queries *are* case-sensitive. Probably I should pl

Re: wild card search and lower-casing

2011-11-18 Thread Ahmet Arslan
> Here is one puzzle I couldn't yet find a key for: > > for the wild-card query: > > *ocvd > > SOLR 3.4 returns hits. But for > > *OCVD > > it doesn't This is a FAQ. Please see http://wiki.apache.org/lucene-java/LuceneFAQ#Are_Wildcard.2C_Prefix.2C_and_Fuzzy_queries_case_sensitive.3F

wild card search and lower-casing

2011-11-18 Thread Dmitry Kan
Hello, Here is one puzzle I couldn't yet find a key for: for the wild-card query: *ocvd SOLR 3.4 returns hits. But for *OCVD it doesn't. On the indexing side, the following two tokenizers/filters are defined: On the query side: The SOLR analysis tool shows that OCVD gets lower-cased to ocvd
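
As the replies in this thread explain, SolrQueryParser in 3.4 does not lower-case wildcard terms, so `*OCVD` is matched as typed against lower-cased index tokens. One possible workaround, short of changing Solr itself, is to lower-case the wildcard term on the client before sending it. The sketch below assumes that the field is lower-cased at index time and that locale-independent lower-casing is acceptable (as for English text); the class `WildcardTermNormalizer` and its `normalize` method are hypothetical names, not part of any Solr client API.

```java
import java.util.Locale;

public class WildcardTermNormalizer {
    // Hypothetical client-side helper: lower-case a single wildcard term
    // before it is sent to Solr. The wildcard characters '*' and '?' have
    // no case, so they pass through unchanged.
    static String normalize(String wildcardTerm) {
        return wildcardTerm.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("*OCVD")); // prints "*ocvd"
    }
}
```

This is meant for a single term only; applying it to a whole query string would also lower-case operators such as AND and OR, which changes how the query is parsed.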