Yes. If you search Google for alphare and then for alphare*, you get two different results. Sorry for the contrived example; I just tried searching for alpharetta and then deleted characters from the end.

On Wed, Aug 27, 2014 at 10:01 AM, Benson Margulies <ben...@basistech.com> wrote:

> Does Google actually support "*"?
>
> On Wed, Aug 27, 2014 at 9:54 AM, Milind <mili...@gmail.com> wrote:
>
> > I see. This is going to be extremely difficult to explain to end users.
> > It doesn't work as they would expect. Some of the tokenizing rules are
> > already somewhat confusing. Their expectation is that it should work the
> > way their searches work in Google.
> >
> > It's difficult enough to recognize that because the period is surrounded
> > by a digit and alphabet (as opposed to 2 digits or 2 alphabets), it gets
> > tokenized. So I'd have expected that C0001.DevNm00* would effectively
> > become a search for C0001 OR DevNm00*. But now, because of the presence
> > of the wildcard, it's considered as 1 term and the period is not a
> > tokenizer. That's actually good, but now the fact that it's still
> > considered as 2 terms for wildcard searches makes it very unintuitive.
> > I don't suppose that I can do anything about making wildcard search use
> > multiple terms if joined together with a tokenizer. But is there any way
> > that I can force it to go through an analyzer prior to doing the search?
> >
> > On Tue, Aug 26, 2014 at 4:21 PM, Jack Krupansky <j...@basetechnology.com> wrote:
> >
> > > Sorry, but you can only use a wildcard on a single term.
> > > "C0001.DevNm001" gets indexed as two terms, "c0001" and "devnm001",
> > > so your wildcard won't match any term (at least in this case.)
> > >
> > > Also, if your query term includes a wildcard, it will not be fully
> > > analyzed. Some filters such as lower case are defined as "multi-term",
> > > so they will be performed, but the standard tokenizer is not being
> > > called, so the dot remains and this whole term is treated as one term,
> > > unlike the index analysis.
> > >
> > > -- Jack Krupansky
> > >
> > > -----Original Message-----
> > > From: Milind
> > > Sent: Tuesday, August 26, 2014 12:24 PM
> > > To: java-user@lucene.apache.org
> > > Subject: Why does this search fail?
> > >
> > > I have a field with the value C0001.DevNm001. If I search for
> > >
> > >     C0001.DevNm001 --> Get Hit
> > >     DevNm00*       --> Get Hit
> > >     C0001.DevNm00* --> Get No Hit
> > >
> > > The field gets tokenized on the period since it's surrounded by a
> > > letter and a number. The query gets evaluated as a prefix query. I'd
> > > have thought that this should have found the document. Any clues on
> > > why this doesn't work?
> > >
> > > The full code is below.
> > >
> > >     Directory theDirectory = new RAMDirectory();
> > >     Version theVersion = Version.LUCENE_47;
> > >     Analyzer theAnalyzer = new StandardAnalyzer(theVersion);
> > >     IndexWriterConfig theConfig =
> > >         new IndexWriterConfig(theVersion, theAnalyzer);
> > >     IndexWriter theWriter = new IndexWriter(theDirectory, theConfig);
> > >
> > >     String theFieldName = "Name";
> > >     String theFieldValue = "C0001.DevNm001";
> > >     Document theDocument = new Document();
> > >     theDocument.add(new TextField(theFieldName, theFieldValue,
> > >         Field.Store.YES));
> > >     theWriter.addDocument(theDocument);
> > >     theWriter.close();
> > >
> > >     String theQueryStr = theFieldName + ":C0001.DevNm00*";
> > >     Query theQuery =
> > >         new QueryParser(theVersion, theFieldName,
> > >             theAnalyzer).parse(theQueryStr);
> > >     System.out.println(theQuery.getClass() + ", " + theQuery);
> > >     IndexReader theIndexReader = DirectoryReader.open(theDirectory);
> > >     IndexSearcher theSearcher = new IndexSearcher(theIndexReader);
> > >     TopScoreDocCollector collector =
> > >         TopScoreDocCollector.create(10, true);
> > >     theSearcher.search(theQuery, collector);
> > >     ScoreDoc[] theHits = collector.topDocs().scoreDocs;
> > >     System.out.println("Hits found: " + theHits.length);
> > >
> > > Output:
> > >
> > >     class org.apache.lucene.search.PrefixQuery, Name:c0001.devnm00*
> > >     Hits found: 0
> > >
> > > --
> > > Regards
> > > Milind
> >
> > --
> > Regards
> > Milind

--
Regards
Milind
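
For anyone following the thread, the mismatch Jack describes can be reproduced without Lucene. The sketch below is a plain-Java simulation, not Lucene code: `indexTerms` is a deliberately simplified stand-in for index-time analysis (it is not the StandardTokenizer), and `prefixQueryAfterAnalysis` is a hypothetical workaround that tokenizes the wildcard text first, requires the leading pieces to exist as exact terms, and uses only the last piece as the prefix. All names here are made up for illustration, and term positions are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class WildcardDemo {

    // Simplified stand-in for index-time analysis (NOT Lucene's StandardTokenizer):
    // lower-case the text and split on a period unless it sits between
    // two letters or between two digits.
    static List<String> indexTerms(String text) {
        String s = text.toLowerCase(Locale.ROOT);
        List<String> terms = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '.' && splitsHere(s, i)) {
                if (current.length() > 0) {
                    terms.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            terms.add(current.toString());
        }
        return terms;
    }

    // A period splits unless both neighbours are letters or both are digits.
    static boolean splitsHere(String s, int i) {
        if (i == 0 || i == s.length() - 1) {
            return true;
        }
        char before = s.charAt(i - 1);
        char after = s.charAt(i + 1);
        boolean bothLetters = Character.isLetter(before) && Character.isLetter(after);
        boolean bothDigits = Character.isDigit(before) && Character.isDigit(after);
        return !(bothLetters || bothDigits);
    }

    static String stripStar(String query) {
        return query.endsWith("*") ? query.substring(0, query.length() - 1) : query;
    }

    // Mirrors the behaviour described in the thread: the wildcard text is
    // lower-cased but not tokenized, so the whole string is one prefix.
    static boolean prefixQueryAsParsed(List<String> terms, String query) {
        String prefix = stripStar(query).toLowerCase(Locale.ROOT);
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical workaround: tokenize the wildcard text with the same rule,
    // require every leading piece as an exact term, and prefix-match the last piece.
    static boolean prefixQueryAfterAnalysis(List<String> terms, String query) {
        List<String> pieces = indexTerms(stripStar(query));
        if (pieces.isEmpty()) {
            return false;
        }
        for (int i = 0; i < pieces.size() - 1; i++) {
            if (!terms.contains(pieces.get(i))) {
                return false;
            }
        }
        String last = pieces.get(pieces.size() - 1);
        for (String term : terms) {
            if (term.startsWith(last)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> terms = indexTerms("C0001.DevNm001");
        System.out.println("Indexed terms: " + terms);
        System.out.println("DevNm00* as parsed: " + prefixQueryAsParsed(terms, "DevNm00*"));
        System.out.println("C0001.DevNm00* as parsed: " + prefixQueryAsParsed(terms, "C0001.DevNm00*"));
        System.out.println("C0001.DevNm00* after analysis: " + prefixQueryAfterAnalysis(terms, "C0001.DevNm00*"));
    }
}
```

Requiring every leading piece is a design choice in this sketch; the thread talks about OR semantics, which would mean accepting a match on any one piece instead. In a real application the equivalent step would presumably be to run the user's text through the field's analyzer and build the query programmatically, but that part is an assumption and is not shown here.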