RE: KeywordAnalyzer still getting tokenized on spaces

2014-09-09 Thread Milind
I simplified the program to show this. I actually use a multiterm query parser and a join query across 2 Lucene Indexes. It's already complicated. I can understand the logic of parsing the query first (I need that in fact because I'm using different analyzers for different fields), but I don't un

KeywordAnalyzer still getting tokenized on spaces

2014-09-08 Thread Milind
tput class org.apache.lucene.search.TermQuery, sn:1023 4567 8765 Number of results found for '"1023 4567 8765"': 1 class org.apache.lucene.search.BooleanQuery, sn:1023 sn:4567 sn:8765 Number of results found for '1023 4567 8765': 0 -- Regards Milind
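The two results above are consistent with a query parser that splits the query text on unquoted whitespace before any analyzer sees it, so a keyword-style analyzer only ever receives one chunk at a time. The following is a minimal plain-Java sketch of that behaviour, not Lucene code: the class `WhitespaceSplitSketch` and its methods `parse` and `keywordAnalyze` are hypothetical names invented for illustration, and the `field:term` strings only imitate the printed query form shown in the output.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: imitates "parse first, analyze each chunk afterwards". Not Lucene code. */
public class WhitespaceSplitSketch {

    /** Hypothetical stand-in for a keyword analyzer: the whole input becomes one token. */
    public static String keywordAnalyze(String text) {
        return text;
    }

    /**
     * Hypothetical parser: splits the query on whitespace that is outside
     * double quotes, then passes each chunk to the analyzer separately.
     */
    public static List<String> parse(String field, String query) {
        List<String> clauses = new ArrayList<String>();
        StringBuilder chunk = new StringBuilder();
        boolean inQuotes = false;
        for (char c : query.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;          // quotes group text; they are not part of the term
            } else if (Character.isWhitespace(c) && !inQuotes) {
                flush(field, chunk, clauses);  // unquoted whitespace ends the current clause
            } else {
                chunk.append(c);               // quoted whitespace stays inside the chunk
            }
        }
        flush(field, chunk, clauses);
        return clauses;
    }

    /** Analyzes the pending chunk, if any, and records it as one clause. */
    private static void flush(String field, StringBuilder chunk, List<String> clauses) {
        if (chunk.length() > 0) {
            clauses.add(field + ":" + keywordAnalyze(chunk.toString()));
            chunk.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("sn", "\"1023 4567 8765\"")); // [sn:1023 4567 8765]
        System.out.println(parse("sn", "1023 4567 8765"));     // [sn:1023, sn:4567, sn:8765]
    }
}
```

In this sketch the quoted query yields a single clause, while the unquoted one yields three, which mirrors the TermQuery versus BooleanQuery output reported in the message.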

Re: Why does this search fail?

2014-08-27 Thread Milind
; > It also seems to support "**" in a quoted phrase to mean one or more > arbitrary terms. This isn't documented, but seems to work. > > > -- Jack Krupansky > > -Original Message- From: Milind > Sent: Wednesday, August 27, 2014 10:51 AM > To: java-user@

Re: Why does this search fail?

2014-08-27 Thread Milind
its component parts. There are some weird > side effects to do with term frequencies and phrase-like queries, but it > would make all these wildcard queries work I think. > > -Mike > > On 08/27/2014 09:54 AM, Milind wrote: > >> I see. This is going to be extremely difficult t

Re: Why does this search fail?

2014-08-27 Thread Milind
ort "*"? > > > > On Wed, Aug 27, 2014 at 9:54 AM, Milind wrote: > > > I see. This is going to be extremely difficult to explain to end users. > > It doesn't work as they would expect. Some of the tokenizing rules are > > already somewhat confusing. Their e

Re: Why does this search fail?

2014-08-27 Thread Milind
, but the standard tokenizer is not being called, so > the dot remains and this whole term is treated as one term, unlike the > index analysis. > > -- Jack Krupansky > > -Original Message- From: Milind > Sent: Tuesday, August 26, 2014 12:24 PM > To: java-user

Re: Why does this search fail?

2014-08-26 Thread Milind
ohnapplesee* Hits found: 1 On Tue, Aug 26, 2014 at 12:30 PM, Ralf Heyde wrote: > Can you post the result of the query parser for the other queries too? > > Sent from my BlackBerry 10 smartphone. > Original Message > From: Milind > Sent: Tuesday, August 26, 2014 18:24 >

Why does this search fail?

2014-08-26 Thread Milind
ss org.apache.lucene.search.PrefixQuery, Name:c0001.devnm00* Hits found: 0 -- Regards Milind

Re: Can't get case insensitive keyword analyzer to work

2014-08-12 Thread Milind
Christoph Kaser < christoph.ka...@iconparc.de> wrote: > Hello Milind, > > if you don't set the field to be tokenized, no analyzer will be used and > the field's contents will be stored "as-is", i.e. case sensitive. > It's the analyzer's job to toke

Re: Can't get case insensitive keyword analyzer to work

2014-08-11 Thread Milind
. But it seems that I don't need to do that. The LowerCaseKeywordAnalyzer works if the field is tokenized, but not if it's un-tokenized! How can that be? On Mon, Aug 11, 2014 at 1:49 PM, Milind wrote: > It does look like the lowercase is working. > > The following code

Re: Can't get case insensitive keyword analyzer to work

2014-08-11 Thread Milind
g 9, 2014 at 4:39 PM, Milind wrote: > I looked at a couple of examples on how to get keyword analyzer to be case > insensitive but I think I missed something since it's not working for me. > > In the code below, I'm indexing text in upper case and searching in lower > case. Bu

Can't get case insensitive keyword analyzer to work

2014-08-09 Thread Milind
cher = new IndexSearcher(theIndexReader); TopScoreDocCollector theCollector = TopScoreDocCollector.create(10, true); theSearcher.search(theQuery, theCollector); ScoreDoc[] theHits = theCollector.topDocs().scoreDocs; System.out.println("Number of results found: " + theHits.length); } -- Regards Milind

Re: Incorrect tokenizing in the UAX29URLEmailAnalyzer analyzer?

2014-07-24 Thread Milind
Thanks again Steve. It was the version number. I hadn't noticed the deprecated warning. Changing to use Version.LUCENE_47 fixed the problem. On Wed, Jul 23, 2014 at 8:20 PM, Steve Rowe wrote: > On Jul 23, 2014, at 7:43 PM, Milind wrote: > > >>> input=esl2.gbr >

Re: Incorrect tokenizing in the UAX29URLEmailAnalyzer analyzer?

2014-07-23 Thread Milind
d 4.7 since it seems that from 4.8 onwards Lucene is compiled against Java 7, and I'm still on Java 6. Hopefully this will be a non-issue with PerFieldAnalyzerWrapper, but I just wanted to point that out. On Wed, Jul 23, 2014 at 7:34 PM, Milind wrote: > Brilliant. Thanks! > >

Re: Incorrect tokenizing in the UAX29URLEmailAnalyzer analyzer?

2014-07-23 Thread Milind
e > > On Jul 23, 2014, at 6:00 PM, Milind wrote: > > > Thanks Steve, that helped. I had forgotten about the URL part of the > > Analyzer since I was using it for the email field. I need to see if it's > > possible to use different analyzers for different fields. If s

Re: Incorrect tokenizing in the UAX29URLEmailAnalyzer analyzer?

2014-07-23 Thread Milind
ld and use StandardAnalyzer for everything else. I'm not sure if that would work though, since I'm using the MultiFieldQueryParser, which takes a single Analyzer. On Wed, Jul 23, 2014 at 3:29 PM, Steve Rowe wrote: > Hi Milind, > > On Jul 23, 2014, at 1:49 PM, Milind wrote: > >

Incorrect tokenizing in the UAX29URLEmailAnalyzer analyzer?

2014-07-23 Thread Milind
=[esl2][gbr] Any insights would be appreciated -- Regards Milind