Re: FilteredQuery returning entire index

2012-07-13 Thread James Nolan
Thanks for the heads up. I was using org.apache.lucene.search.FilteredQuery which says it applies a filter to an executed query. I instantiate (sorry if my lexicon is off) a FilteredQuery and pass my filter and the boolean query to the constructor. What I think should happen is that the FilteredQ

Re: Direct memory footprint of NIOFSDirectory

2012-07-13 Thread Vitaly Funstein
Lance, With all due respect - I am aware of the existence of MMapDirectory, but it does not answer my original question. Switching over to a different implementation from a well-tested one to work around a poorly understood issue carries non-zero risks in a production environment. Vitaly On Thu,

RE: can't find queries when they are one per line in target file

2012-07-13 Thread Ilya Zavorin
Ian, Turns out you were very close to the truth. The problem was in how I was ingesting the original file into memory before indexing. Thanks, Mr. Ilya Zavorin Applied Research and Consulting CACI Advanced Knowledge Solutions Division 4831 Walden Lane, Lanham, MD 20706 ph: 1-301-306-2859 fx: 1

RE: FilteredQuery returning entire index

2012-07-13 Thread Uwe Schindler
The Filter and the BooleanQuery are handled separately and only merged later. IndexReaders passed to Filter don't know anything about the executed Query, so you always see a view on the complete index. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thet
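
To picture the point above, here is a minimal plain-Java sketch that uses no Lucene classes. The names `FilterSketch`, `queryBits`, `filterBits`, and `intersect` are hypothetical, and substring matching stands in for real term matching. It only illustrates the idea that the filter is computed over the whole index without knowledge of the query, and that the two results are combined afterwards; Lucene's actual internals may differ.

```java
import java.util.BitSet;

final class FilterSketch {
    // Stand-in for the query: marks every document containing the term.
    static BitSet queryBits(String[] docs, String term) {
        BitSet bits = new BitSet(docs.length);
        for (int i = 0; i < docs.length; i++) {
            if (docs[i].contains(term)) {
                bits.set(i);
            }
        }
        return bits;
    }

    // Stand-in for the filter: it walks the complete index and is never
    // told which documents the query matched.
    static BitSet filterBits(String[] docs, String required) {
        BitSet bits = new BitSet(docs.length);
        for (int i = 0; i < docs.length; i++) {
            if (docs[i].contains(required)) {
                bits.set(i);
            }
        }
        return bits;
    }

    // The merge step: only documents accepted by both sides survive.
    static BitSet intersect(BitSet query, BitSet filter) {
        BitSet result = (BitSet) query.clone();
        result.and(filter);
        return result;
    }

    public static void main(String[] args) {
        String[] docs = {"apple red", "apple green", "pear red"};
        BitSet hits = intersect(queryBits(docs, "apple"), filterBits(docs, "red"));
        System.out.println(hits);
    }
}
```

In this picture, code that inspects the reader inside the filter sees all three documents, which would match the behaviour described in the thread.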

Re: can't find queries when they are one per line in target file

2012-07-13 Thread Ian Lea
It's hard to tell from your description exactly what you are indexing and searching for, but I'd hazard a guess that the problem is related to your "content of entire target file" comment. Maybe you need to read the files line by line. -- Ian. On Fri, Jul 13, 2012 at 6:02 PM, Ilya Zavorin wro
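
The thread does not show the original ingestion code, so the following is only a sketch of the line-by-line reading Ian suggests. `LineIngestSketch` and `readLines` are hypothetical names; trimming each line and skipping blank ones are assumptions made for this illustration, not something stated in the thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class LineIngestSketch {
    // Reads the source one line at a time so that each line can later be
    // indexed as its own value instead of as one large blob.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<String>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim(); // assumption: surrounding whitespace is noise
                if (!trimmed.isEmpty()) {
                    lines.add(trimmed);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a FileReader on the target file.
        System.out.println(readLines(new StringReader("Query1\nQuery2\nQuery3")));
    }
}
```

`BufferedReader.readLine` strips the line terminator, so both `\n` and `\r\n` endings produce clean values.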

RE: can't find queries when they are one per line in target file

2012-07-13 Thread Ilya Zavorin
Here are the details: I ran 2 tests:
1. Index only the first target file (the one where all the queries are in one long line), then loop over all queries and search for each using the code block below.
2. Index only the second target file (the one where all the queries are listed one per lin

RE: can't find queries when they are one per line in target file

2012-07-13 Thread Uwe Schindler
What do you mean by "files"? Without a complete description of what you are doing, we cannot answer your request. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Ilya Zavorin [mailto:izavo...@caci.com] > S

RE: can't find queries when they are one per line in target file

2012-07-13 Thread Ilya Zavorin
But why then does it find all the queries in the 1st file? I use exactly the same code. IZ -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Friday, July 13, 2012 12:32 PM To: java-user@lucene.apache.org Subject: RE: can't find queries when they are one per line

RE: can't find queries when they are one per line in target file

2012-07-13 Thread Uwe Schindler
> String qStr = "Query1"; // or "Query2" or ...
> QueryParser parser = ...;
> IndexSearcher searcher = ...;
> Query query = parser.parse(qStr);
> TopDocs results = searcher.search(query, Integer.MAX_VALUE);
> ScoreDoc[] hits = results.scoreDocs;
>
> returned no hits for the 2nd test.
Maybe becaus

can't find queries when they are one per line in target file

2012-07-13 Thread Ilya Zavorin
Hi, I am using 3.4.0 and just discovered a weird issue. I have a set of simple English one-word queries and two target files that I want to search. One has all these queries in one line, i.e. something like this:
Query1 Query2 Query3 Query4
The other has them one per line, i.e.
Query1
Query2
Q

RE: Pattern Analyzer

2012-07-13 Thread Dave Seltzer
I think you're absolutely right Erick, Thanks for the insight - that's the direction I'll be heading. Cheers, -D -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Friday, July 13, 2012 8:53 AM To: java-user@lucene.apache.org Subject: Re: Pattern Analyzer Su

Re: Pattern Analyzer

2012-07-13 Thread Erick Erickson
Sure, you can do it that way. But first I'd look over the zillion tokenizers and filters that are available and string together the ones that best suit your need. For instance, WhitespaceTokenizer and PatternReplaceFilter might make your regex much easier since the PatternReplaceFilter gets just th
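
The idea of chaining a whitespace split with a per-token pattern replacement can be sketched in plain Java, without touching the Lucene classes named above. `TokenChainSketch` and `analyze` are hypothetical names, and dropping tokens that become empty after replacement is a choice made for this sketch only. The point is that the regex only ever sees one token at a time, so it can stay simple.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class TokenChainSketch {
    // Step 1: split on whitespace. Step 2: apply the pattern to each token.
    static List<String> analyze(String text, Pattern pattern, String replacement) {
        List<String> tokens = new ArrayList<String>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only when the input was blank
            }
            String cleaned = pattern.matcher(token).replaceAll(replacement);
            if (!cleaned.isEmpty()) { // sketch-only choice: skip emptied tokens
                tokens.add(cleaned);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Example pattern (an assumption): strip a trailing "-<digits>" suffix.
        Pattern suffix = Pattern.compile("-\\d+$");
        System.out.println(analyze("foo-1 bar-22  baz", suffix, ""));
    }
}
```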

Offsets in 3.6/4.0

2012-07-13 Thread Carsten Schnober
Dear list, I am working on a search application that depends on retrieving offsets for each match. Currently (in Lucene 3.6), this seems to be overly costly, at least in my solution that looks like this: --- TermPositionVector tfv