Re: Why does this search succeed with web app, but not Luke?

Phil Whelan Thu, 06 Aug 2009 19:10:16 -0700

Hi Jim,

> As I said, based on the terms in Luke, I would have expected a web app query 
> on:
>
> path:file-1-2
>
> to succeed, and a query on:
>
> path:file-1-2.dat
> to fail.
>
> But, instead both of those succeed when I do a web query.


The query "path:file-1-2.dat" also passes through (hopefully) the same
Analyzer and is broken into terms. So the query is actually for
"file-1-2" and "dat", effectively a phrase query in which "file-1-2"
must be followed immediately by "dat".

During indexing, each term's position is stored, so
"C:\dir1\dir2\file-1-1.dat" becomes...
[0] c
[1] dir1
[2] dir2
[3] file-1-1
[4] dat

Here "file-1-1" is followed immediately by "dat", so a query for
"file-1-1.dat" matches. The same reasoning applies to your file-1-2.dat.
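
To make the position idea concrete, here is a rough sketch in plain Java.
It does not use Lucene at all: tokenize() is a hypothetical stand-in that
only reproduces the term list shown above (it is not StandardAnalyzer's
real grammar), and phraseMatches() is my own simplified illustration of
"these terms at consecutive positions", not Lucene's PhraseQuery code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class PhraseMatchSketch {

    // Hypothetical, simplified tokenizer (an assumption, NOT StandardAnalyzer):
    // lower-cases the text and splits on anything that is not a letter,
    // a digit, or '-'. The list index of each term is its position.
    public static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9-]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Returns true if the query terms occur at consecutive positions,
    // in order, somewhere in the indexed terms.
    public static boolean phraseMatches(List<String> indexed, List<String> query) {
        if (query.isEmpty()) {
            return false;
        }
        for (int start = 0; start + query.size() <= indexed.size(); start++) {
            if (indexed.subList(start, start + query.size()).equals(query)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = tokenize("C:\\dir1\\dir2\\file-1-1.dat");
        System.out.println(indexed); // [c, dir1, dir2, file-1-1, dat]

        // "file-1-1" then "dat" at adjacent positions: match
        System.out.println(phraseMatches(indexed, tokenize("file-1-1.dat"))); // true
        // A single term is trivially a match as well
        System.out.println(phraseMatches(indexed, tokenize("file-1-1")));     // true
        // Same terms in the wrong order: no match
        System.out.println(phraseMatches(indexed, tokenize("dat.file-1-1"))); // false
    }
}
```

So with the same analysis on both sides, a query for the name with or
without the extension can match, even though no single indexed term
contains the ".dat".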

Does that make sense?

Cheers,
Phil

>
> Jim
>
>
> ---- oh...@cox.net wrote:
>> Phil,
>>
>> Both my indexer and the webapp are basically from the Lucene demos, the 
>> indexer starting with the IndexFiles.java demo code, so I think they're both 
>> using the StandardAnalyzer.
>>
>> What appears in Luke, when I select "path" is just the filename part, 
>> without the extension, i.e., the "xxxx" part.
>>
>> That's why I said in my original post that I was kind of surprised that 
>> doing a web query for "path:xxxx.yyy" succeeded, i.e., in the path field in 
>> the index, there is no "xxxx.yyy", just "xxxx".
>>
>> Jim
>>
>> ---- Phil Whelan <phil...@gmail.com> wrote:
>> > Hi Jim,
>> >
>> > Are you using the same Analyzer for indexing and searching? xxxx.yyy
>> > will be seen as a HOSTNAME by StandardAnalyzer, which will keep it as
>> > one term, whereas another analyzer might split it into 2 terms. This
>> > should not matter either way as long as you are using the same
>> > Analyzer for both indexing and searching.
>> >
>> > I would expect this query to pass unless you are using NOT_ANALYZED, or
>> > the WhitespaceAnalyzer, or something else that would not split on "/":
>> >     path:xxxx.yyy
>> >
>> > In Luke, do you see 2 terms "xxxx" and "yyy", or just "xxxx.yyy", or
>> > something else?
>> >
>> > Thanks,
>> > Phil
>> >
>> > On Thu, Aug 6, 2009 at 1:03 PM, <oh...@cox.net> wrote:
>> > > Hi,
>> > >
>> > > In my indexer app (based on the IndexFiles.java demo), I am adding the 
>> > > "path" field:
>> > >
>> > >    doc.add(new Field("path", f.getPath(), Field.Store.YES,
>> > >                      Field.Index.ANALYZED));
>> > >
>> > > Per Luke, the full path (e.g., "c:\....\xxxx.yyy") gets parsed, and one 
>> > > of the terms (again, per Luke) is "xxxx", i.e., the actual file name, 
>> > > but without the extension.
>> > >
>> > > Then, when I search with Luke for "path:xxxx", that succeeds, as 
>> > > expected, and when I search with Luke for "path:xxxx.yyy", that fails, 
>> > > as expected.
>> > >
>> > > But, if I search using the demo web app, for "path:xxxx.yyy", it 
>> > > succeeds.
>> > >
>> > > Since the Luke search for "path:xxxx.yyy" fails, I don't understand why 
>> > > the web app search for "path:xxxx.yyy" would succeed.
>> > >
>> > > Thanks,
>> > > Jim
>> >
