Re: Why does this search succeed with web app, but not Luke?

ohaya Thu, 06 Aug 2009 21:36:35 -0700

Hi Phil,

Well, kind of... but...


Then, why, when I do the search in Luke, do I get the results I cited:

xxxx  ==> succeeds

xxxx.yyy  ==> fails (no results)

I guess that I've been assuming that the search in Luke is "correct" and I've 
been using that to "test my understanding", but maybe that's an invalid 
assumption?

Jim





---- Phil Whelan <phil...@gmail.com> wrote: 
> Hi Jim,
> 
> > As I said, based on the terms in Luke, I would have expected a web app 
> > query on:
> >
> > path:file-1-2
> >
> > to succeed, and a query on:
> >
> > path:file-1-2.dat
> > to fail.
> >
> > But, instead both of those succeed when I do a web query.
> 
> This query will also pass through the same (hopefully) Analyzer and
> will be broken into terms. So the query will actually be for
> "file-1-2" and "dat" where "file-1-2" is followed immediately by
> "dat".
> 
> During indexing, each term's position is stored, so
> "C:\dir1\dir2\file-1-1.dat" becomes...
> [0] c
> [1] dir1
> [2] dir2
> [3] file-1-1
> [4] dat
> 
> "file-1-1" is followed by "dat", so there is a match.
> 
> Does that make sense?
> 
> Cheers,
> Phil
> 
> >
> > Jim
> >
> >
> > ---- oh...@cox.net wrote:
> >> Phil,
> >>
> >> Both my indexer and the webapp are basically from the Lucene demos, the 
> >> indexer starting with the IndexFiles.java demo code, so I think they're 
> >> both using the StandardAnalyzer.
> >>
> >> What appears in Luke, when I select "path" is just the filename part, 
> >> without the extension, i.e., the "xxxx" part.
> >>
> >> That's why I said in my original post that I was kind of surprised that 
> >> doing a web query for "path:xxxx.yyy" succeeded, i.e., in the path field in 
> >> the index, there is no "xxxx.yyy", just "xxxx".
> >>
> >> Jim
> >>
> >> ---- Phil Whelan <phil...@gmail.com> wrote:
> >> > Hi Jim,
> >> >
> >> > Are you using the same Analyzer for indexing and searching? xxxx.yyy
> >> > will be seen as a HOSTNAME by StandardAnalyzer, which will keep it as
> >> > one term, whereas another analyzer might split it into 2 terms. This
> >> > should not matter either way as long as you are using the same
> >> > Analyzer for both indexing and searching.
> >> >
> >> > I would expect this query to pass:
> >> >     path:xxxx.yyy
> >> > unless you are using NOT_ANALYZED, or the WhitespaceAnalyzer, or
> >> > something else that would not split on "/".
> >> >
> >> > In Luke, do you see 2 terms "xxxx" and "yyy", or just "xxxx.yyy", or
> >> > something else?
> >> >
> >> > Thanks,
> >> > Phil
> >> >
> >> > On Thu, Aug 6, 2009 at 1:03 PM, <oh...@cox.net> wrote:
> >> > > Hi,
> >> > >
> >> > > In my indexer app (based on the IndexFiles.java demo), I am adding the 
> >> > > "path" field:
> >> > >
> >> > >    doc.add(new Field("path", f.getPath(), Field.Store.YES, 
> >> > > Field.Index.ANALYZED));
> >> > >
> >> > > Per Luke, the full path (e.g., "c:\....\xxxx.yyy") gets parsed, and 
> >> > > one of the terms (again, per Luke) is "xxxx", i.e., the actual file 
> >> > > name, but without the extension.
> >> > >
> >> > > Then, when I search with Luke for "path:xxxx", that succeeds, as 
> >> > > expected, and when I search with Luke for "path:xxxx.yyy", that fails, 
> >> > > as expected.
> >> > >
> >> > > But, if I search using the demo web app, for "path:xxxx.yyy", it 
> >> > > succeeds.
> >> > >
> >> > > Since the Luke search for "path:xxxx.yyy" fails, I don't understand 
> >> > > why the web app search for "path:xxxx.yyy" would succeed?
> >> > >
> >> > > Thanks,
> >> > > Jim
> >> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
> 
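
Phil's explanation above (terms stored with positions, and the query analyzed
into terms that must be adjacent) can be sketched in plain Java. This is only
an illustration and does not use Lucene: the class PhraseMatchSketch and the
methods analyze, analyzeWhitespace, and phraseMatches are hypothetical names,
and the splitting rule in analyze is an assumption chosen to reproduce the
term list quoted above, not the real StandardAnalyzer grammar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PhraseMatchSketch {

    // Assumed stand-in for the analyzer: lower-case the text and split on
    // anything that is not a letter, digit, or hyphen. NOT StandardAnalyzer.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9-]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Assumed stand-in for an analyzer that splits on whitespace only.
    static List<String> analyzeWhitespace(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // True if the query terms occur at consecutive positions in the document.
    static boolean phraseMatches(List<String> docTerms, List<String> queryTerms) {
        if (queryTerms.isEmpty()) {
            return false;
        }
        int n = queryTerms.size();
        for (int start = 0; start + n <= docTerms.size(); start++) {
            if (docTerms.subList(start, start + n).equals(queryTerms)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("C:\\dir1\\dir2\\file-1-1.dat");
        for (int i = 0; i < indexed.size(); i++) {
            System.out.println("[" + i + "] " + indexed.get(i));
        }
        // Same analysis on both sides: "file-1-1" then "dat" are adjacent.
        System.out.println(phraseMatches(indexed, analyze("file-1-1.dat")));
        // Different analysis at query time: a single term "file-1-1.dat".
        System.out.println(phraseMatches(indexed, analyzeWhitespace("file-1-1.dat")));
    }
}
```

With the same analysis applied at index and query time, the query for
file-1-1.dat becomes the two adjacent terms and matches. When the query side
is only split on whitespace, it stays one term that is not in the index and
the match fails, which is why using the same Analyzer on both sides matters.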


