It is a good general assumption that Luke is correct.

Can you confirm that you are using StandardAnalyzer everywhere, for
indexing and searching?  This sort of issue is often caused by using
different analyzers.

What does Luke show as the indexed terms for path?  In a little index
I've just created with StandardAnalyzer and file paths, Luke shows
xxx.yyy as a term and not xxx, which is the opposite of what you have.

There was a thread yesterday about acronyms which might be relevant.
It might also help to write a tiny self-contained program that indexes
a few paths, displays the terms that have been indexed, and runs a few
searches.
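
As a rough illustration of the position-based matching Phil describes
in the quoted text below, here is a plain-Java sketch that does not use
Lucene at all.  The class name PathPhraseSketch, the tokenize() and
phraseMatches() helpers, and the splitting rule (lowercase, then split
on anything that is not an ASCII letter, digit or hyphen) are all
assumptions chosen to reproduce Phil's example terms.  It is not what
StandardAnalyzer actually does, so check the real terms in Luke.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class PathPhraseSketch {

    // Toy tokenizer (an assumption, NOT StandardAnalyzer): lowercases the
    // input and splits on every character that is not an ASCII letter,
    // a digit, or a hyphen. A term's list index is its position.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^a-z0-9-]+")) {
            if (!piece.isEmpty()) {
                terms.add(piece);
            }
        }
        return terms;
    }

    // Returns true if the query's terms occur in the document at
    // consecutive positions, in the same order (a phrase-style match).
    static boolean phraseMatches(String document, String query) {
        List<String> docTerms = tokenize(document);
        List<String> queryTerms = tokenize(query);
        if (queryTerms.isEmpty()) {
            return false;
        }
        for (int start = 0; start + queryTerms.size() <= docTerms.size(); start++) {
            if (docTerms.subList(start, start + queryTerms.size()).equals(queryTerms)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String path = "C:\\dir1\\dir2\\file-1-1.dat";

        // Show each term with its position, as in Phil's listing.
        List<String> terms = tokenize(path);
        for (int i = 0; i < terms.size(); i++) {
            System.out.println("[" + i + "] " + terms.get(i));
        }

        // Run a few searches against the single "indexed" path.
        for (String query : new String[] {"file-1-1", "file-1-1.dat", "dat.file-1-1"}) {
            System.out.println(query + " => " + phraseMatches(path, query));
        }
    }
}
```

With this toy rule a query for file-1-1.dat is itself split into two
terms, so it matches even though no single term "file-1-1.dat" exists.
A real test should use the same Lucene Analyzer for both indexing and
searching.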


--
Ian.


On Fri, Aug 7, 2009 at 5:36 AM, <oh...@cox.net> wrote:
> Hi Phil,
>
> Well, kind of... but...
>
> Then, why, when I do the search in Luke, do I get the results I cited:
>
> xxxx  ==> succeeds
>
> xxxx.yyy  ==> fails (no results)
>
> I guess that I've been assuming that the search in Luke is "correct" and I've 
> been using that to "test my understanding", but maybe that's an invalid 
> assumption?
>
> Jim
>
> ---- Phil Whelan <phil...@gmail.com> wrote:
>> Hi Jim,
>>
>> > As I said, based on the terms in Luke, I would have expected a web app 
>> > query on:
>> >
>> > path:file-1-2
>> >
>> > to succeed, and a query on:
>> >
>> > path:file-1-2.dat
>> > to fail.
>> >
>> > But, instead both of those succeed when I do a web query.
>>
>> This query will also pass through the same (hopefully) Analyzer and
>> will be broken into terms. So the query will actually be for
>> "file-1-2" and "dat" where "file-1-2" is followed immediately by
>> "dat".
>>
>> During indexing, each term's position is stored, so
>> "C:\dir1\dir2\file-1-1.dat" becomes...
>> [0] c
>> [1] dir1
>> [2] dir2
>> [3] file-1-1
>> [4] dat
>>
>> "file-1-1" is followed by "dat", so there is a match.
>>
>> Does that make sense?
>>
>> Cheers,
>> Phil
>>
>> >
>> > Jim
>> >
>> >
>> > ---- oh...@cox.net wrote:
>> >> Phil,
>> >>
>> >> Both my indexer and the webapp are basically from the Lucene demos, the 
>> >> indexer starting with the IndexFiles.java demo code, so I think they're 
>> >> both using the StandardAnalyzer.
>> >>
>> >> What appears in Luke, when I select "path" is just the filename part, 
>> >> without the extension, i.e., the "xxxx" part.
>> >>
>> >> That's why I said in my original post that I was kind of surprised that 
>> >> doing a web query for "path:xxxx.yyy" succeeded, i.e., in the path field 
>> >> in the index, there is no "xxxx.yyy", just "xxxx".
>> >>
>> >> Jim
>> >>
>> >> ---- Phil Whelan <phil...@gmail.com> wrote:
>> >> > Hi Jim,
>> >> >
>> >> > Are you using the same Analyzer for indexing and searching? xxxx.yyy
>> >> > will be seen as a HOSTNAME by StandardAnalyzer, which will keep it as
>> >> > one term, whereas another analyzer might split this into 2 terms. This
>> >> > should not matter either way as long as you are using the same
>> >> > Analyzer for both indexing and searching.
>> >> >
>> >> > I would expect this to pass unless you are using NOT_ANALYZED, or the
>> >> > WhitespaceAnalyzer, or something else that would not split on "/".
>> >> >     path:xxxx.yyy
>> >> >
>> >> > In Luke, do you see 2 terms "xxxx" and "yyy", or just "xxxx.yyy", or
>> >> > something else?
>> >> >
>> >> > Thanks,
>> >> > Phil
>> >> >
>> >> > On Thu, Aug 6, 2009 at 1:03 PM, <oh...@cox.net> wrote:
>> >> > > Hi,
>> >> > >
>> >> > > In my indexer app (based on the IndexFiles.java demo), I am adding 
>> >> > > the "path" field:
>> >> > >
>> >> > >    doc.add(new Field("path", f.getPath(), Field.Store.YES, 
>> >> > > Field.Index.ANALYZED));
>> >> > >
>> >> > > Per Luke, the full path (e.g., "c:\....\xxxx.yyy") gets parsed, and 
>> >> > > one of the terms (again, per Luke) is "xxxx", i.e., the actual file 
>> >> > > name, but without the extension.
>> >> > >
>> >> > > Then, when I search with Luke for "path:xxxx", that succeeds, as 
>> >> > > expected, and when I search with Luke for "path:xxxx.yyy", that 
>> >> > > fails, as expected.
>> >> > >
>> >> > > But, if I search using the demo web app, for "path:xxxx.yyy", it 
>> >> > > succeeds.
>> >> > >
>> >> > > Since the Luke search for "path:xxxx.yyy" fails, I don't understand 
>> >> > > why the web app search for "path:xxxx.yyy" would succeed?
>> >> > >
>> >> > > Thanks,
>> >> > > Jim
>> >> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>
>
>
>

