Hi Phil,

Well, kind of... but...

Then, why, when I do the search in Luke, do I get the results I cited:

xxxx ==> succeeds
xxxx.yyy ==> fails (no results)

I guess that I've been assuming that the search in Luke is "correct" and
I've been using that to "test my understanding", but maybe that's an
invalid assumption?

Jim

---- Phil Whelan <[email protected]> wrote:
> Hi Jim,
>
> > As I said, based on the terms in Luke, I would have expected a web app
> > query on:
> >
> > path:file-1-2
> >
> > to succeed, and a query on:
> >
> > path:file-1-2.dat
> > to fail.
> >
> > But, instead both of those succeed when I do a web query.
>
> This query will also pass through the same (hopefully) Analyzer and
> will be broken into terms. So the query will actually be for
> "file-1-2" and "dat" where "file-1-2" is followed immediately by
> "dat".
>
> In indexing, each term's position is stored, so
> "C:\dir1\dir2\file-1-1.dat" becomes...
> [0] c
> [1] dir1
> [2] dir2
> [3] file-1-1
> [4] dat
>
> "file-1-1" is followed by "dat", so there is a match.
>
> Does that make sense?
>
> Cheers,
> Phil
>
> >
> > Jim
> >
> >
> > ---- [email protected] wrote:
> >> Phil,
> >>
> >> Both my indexer and the webapp are basically from the Lucene demos, the
> >> indexer starting with the IndexFiles.java demo code, so I think they're
> >> both using the StandardAnalyzer.
> >>
> >> What appears in Luke, when I select "path" is just the filename part,
> >> without the extension, i.e., the "xxxx" part.
> >>
> >> That's why I said in my original post that I was kind of surprised that
> >> doing a web query for "path:xxxx.yyy" succeeded, i.e., in the path field in
> >> the index, there is no "xxxx.yyy", just "xxxx".
> >>
> >> Jim
> >>
> >> ---- Phil Whelan <[email protected]> wrote:
> >> > Hi Jim,
> >> >
> >> > Are you using the same Analyzer for indexing and searching? xxxx.yyy
> >> > will be seen as a HOSTNAME by StandardAnalyzer, which will keep it as one
> >> > term, whereas another indexer might split this into 2 terms. This
> >> > should not matter either way as long as you are using the same
> >> > Analyzer for both indexing and searching.
> >> >
> >> > I would expect this to pass unless you are using NOT_ANALYZED, or the
> >> > WhitespaceAnalyzer, or something else that would not split on "/".
> >> > path:xxxx.yyy
> >> >
> >> > In Luke, do you see 2 terms "xxxx" and "yyy", or just "xxxx.yyy", or
> >> > something else?
> >> >
> >> > Thanks,
> >> > Phil
> >> >
> >> > On Thu, Aug 6, 2009 at 1:03 PM, <[email protected]> wrote:
> >> > > Hi,
> >> > >
> >> > > In my indexer app (based on the IndexFiles.java demo), I am adding the
> >> > > "path" field:
> >> > >
> >> > > doc.add(new Field("path", f.getPath(), Field.Store.YES,
> >> > > Field.Index.ANALYZED));
> >> > >
> >> > > Per Luke, the full path (e.g., "c:\....\xxxx.yyy") gets parsed, and
> >> > > one of the terms (again, per Luke) is "xxxx", i.e., the actual file
> >> > > name, but without the extension.
> >> > >
> >> > > Then, when I search with Luke for "path:xxxx", that succeeds, as
> >> > > expected, and when I search with Luke for "path:xxxx.yyy", that fails,
> >> > > as expected.
> >> > >
> >> > > But, if I search using the demo web app, for "path:xxxx.yyy", it
> >> > > succeeds.
> >> > >
> >> > > Since the Luke search for "path:xxxx.yyy" fails, I don't understand
> >> > > why the web app search for "path:xxxx.yyy" would succeed?
> >> > >
> >> > > Thanks,
> >> > > Jim
> >> > >
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
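
For reference, Phil's explanation of the web app result (the query text is
split into terms, and those terms must sit at consecutive positions in the
indexed field) can be sketched in plain Java without Lucene. This is only an
illustration under stated assumptions: the class PhraseMatchSketch and its
tokenize and phraseMatches methods are hypothetical names, and the splitting
rule is a simplified stand-in chosen to reproduce the term list quoted above,
not the actual behaviour of StandardAnalyzer. It also does not model whatever
analyzer Luke's search is configured with.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class PhraseMatchSketch {

    // Hypothetical stand-in for an analyzer: lower-cases the text and splits
    // on backslash, slash, colon, dot and whitespace. NOT StandardAnalyzer;
    // it only reproduces the term list shown in the thread.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[\\\\/:.\\s]+")) {
            if (!t.isEmpty()) {
                terms.add(t); // list index doubles as the term position
            }
        }
        return terms;
    }

    // True if the query's terms occur in the document at consecutive
    // positions, in the same order (a simplified phrase match).
    static boolean phraseMatches(String docText, String queryText) {
        List<String> doc = tokenize(docText);
        List<String> query = tokenize(queryText);
        if (query.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(doc, query) >= 0;
    }

    public static void main(String[] args) {
        String path = "C:\\dir1\\dir2\\file-1-1.dat";
        System.out.println(tokenize(path));
        System.out.println(phraseMatches(path, "file-1-1"));
        System.out.println(phraseMatches(path, "file-1-1.dat"));
        System.out.println(phraseMatches(path, "dat.file-1-1"));
    }
}
```

Under this simplified splitting rule, both "file-1-1" and "file-1-1.dat"
match the path, because "dat" directly follows "file-1-1" in the term list,
while the reversed order "dat.file-1-1" does not.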
