RE: SpanXXQuery Usage

2004-03-23 Thread Jochen Frey
Terry, With regular queries (non-Span-queries) you cannot request that results of OR / AND / NOT operations are near to one another (i.e. (A or B) near (C or D)). The span queries solve that problem by allowing any span query to be used in a SpanNearQuery (and vice versa). There are other
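As a rough illustration of what "(A or B) near (C or D)" means at the token-position level, here is a minimal Python sketch. It is not Lucene code and does not use the SpanNearQuery API; the helper names `positions` and `near`, and the `max_gap` parameter, are hypothetical and exist only for this example.

```python
def positions(tokens, terms):
    """Return the indices of every token that is one of `terms` (the OR part)."""
    return [i for i, tok in enumerate(tokens) if tok in terms]


def near(tokens, left_terms, right_terms, max_gap):
    """True if some token from left_terms and some token from right_terms
    occur with at most `max_gap` tokens between them, in either order.
    Assumption: 'near' is measured in whole tokens between the two matches."""
    left = positions(tokens, left_terms)
    right = positions(tokens, right_terms)
    return any(
        abs(l - r) - 1 <= max_gap
        for l in left
        for r in right
        if l != r
    )


doc = "a x c".split()
print(near(doc, {"a", "b"}, {"c", "d"}, max_gap=1))
```

The point of the sketch is that the OR is resolved into a set of positions first, and the proximity test is then applied to those positions, which is the combination the message describes as impossible with regular queries.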

Query: A ? B

2004-03-04 Thread Jochen Frey
Hi Everyone. I am trying to figure out how to create a query that matches 'A ? B', where '?' is exactly one token. Can anyone tell me how to do that? Obviously it's easy to match 'A * B' where '*' is 0 or 1 tokens (just use a PhraseQuery and set slop to 1). However, if I require exactly one
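The difference between "exactly one token" and "0 or 1 tokens" can be sketched on a plain token list. This is an illustrative Python sketch, not Lucene code; `match_exact_gap` and `match_within_slop` are hypothetical helpers, and the second one is only a simplified, ordered stand-in for a sloppy phrase match.

```python
def match_exact_gap(tokens, first, second, gap=1):
    """Start indices where `first` is followed by `second` with exactly
    `gap` tokens in between."""
    hits = []
    for i, tok in enumerate(tokens):
        j = i + gap + 1  # position where `second` must appear
        if tok == first and j < len(tokens) and tokens[j] == second:
            hits.append(i)
    return hits


def match_within_slop(tokens, first, second, slop=1):
    """Start indices where `second` follows `first` with 0..slop tokens
    in between (simplified: ordered matches only)."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        for g in range(slop + 1):
            j = i + g + 1
            if j < len(tokens) and tokens[j] == second:
                hits.append(i)
                break
    return hits


print(match_exact_gap("my red house".split(), "my", "house"))
print(match_within_slop("my house".split(), "my", "house"))
```

With the exact-gap check, "my house" is rejected while "my red house" is accepted; the slop-style check accepts both, which is the behaviour the question wants to avoid.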

RE: Query: A ? B

2004-03-04 Thread Jochen Frey
Term(field,six hundred * five)); Thanks! Jochen -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Thursday, March 04, 2004 12:00 PM To: Lucene Users List Subject: Re: Query: A ? B Use WildcardQuery: A?B Otis --- Jochen Frey [EMAIL PROTECTED] wrote: Hi Everyone

RE: Query: A ? B

2004-03-04 Thread Jochen Frey
, at 4:29 PM, Jochen Frey wrote: Otis: Maybe I don't understand this right, but I *think* I am looking for something different: I am trying to write a query like this: 'my * house', which should match 'my own house', 'my red house', 'my small house', but should not match 'my house' ... you get

RE: Lucene scalability/clustering

2004-02-26 Thread Jochen Frey
Anson, One way of doing it is having subsets of your indexes / data on different machines. Each machine indexes its own data. You implement a system that distributes queries to the various machines and merges the results back. How well this works depends entirely on your
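The distribute-and-merge idea described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: each "machine" is modelled as an in-memory dict, the score is a plain term count, and `search_shard` and `distributed_search` are hypothetical names, not part of Lucene.

```python
import heapq


def search_shard(shard, term):
    """Search one shard (doc_id -> text) and return (score, doc_id) pairs.
    The score is a simple term count, chosen only for the sketch."""
    hits = []
    for doc_id, text in shard.items():
        count = text.split().count(term)
        if count > 0:
            hits.append((count, doc_id))
    return hits


def distributed_search(shards, term, k=3):
    """Send the query to every shard, then merge all hits into a global top k."""
    all_hits = []
    for shard in shards:  # in a real system these calls would go over the network
        all_hits.extend(search_shard(shard, term))
    return heapq.nlargest(k, all_hits)


shards = [
    {"a": "lucene lucene index"},
    {"b": "lucene"},
    {"c": "other"},
]
print(distributed_search(shards, "lucene", k=2))
```

One design caveat: merging by score is only meaningful if every shard scores documents in a comparable way, which a real deployment would need to check.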

Benchmark (WAS: Indexing Speed: Documents vs. Sentences)

2003-12-19 Thread Jochen Frey
Hello, Here is a benchmark. I am not sure if that is proper etiquette, but I will just paste it into this mail and hope that it gets funneled into the right channels. Cheers! Jochen Hardware Environment: Dedicated machine for

FW: Indexing Speed: Documents vs. Sentences

2003-12-19 Thread Jochen Frey
dataset greater than a few M docs to experiment with. cheers, sv On Thu, 18 Dec 2003, Jochen Frey wrote: Hi, Yes, this is correct, I am dealing with a few 100GB (close to 1TB). I am, however, distributing the data across several machines and then merge the results from all

Sentence Endings: IndexWriter.maxFieldLength and Token.setPositionIncrement()

2003-12-19 Thread Jochen Frey
Hi! I hope this is the right forum for this post. I was wondering if other people would consider this a bug (it might be a feature and I am missing the point of it): .The default IndexWriter.maxFieldLength is 10,000. .The point of maxFieldLength is to limit memory usage. .The current position

RE: Indexing Speed: Documents vs. Sentences

2003-12-18 Thread Jochen Frey
Hi, Yes, this is correct, I am dealing with a few 100GB (close to 1TB). I am, however, distributing the data across several machines and then merge the results from all the machines together (until I find a better faster solution). Cheers! -Original Message- From:

Indexing Speed: Documents vs. Sentences

2003-12-17 Thread Jochen Frey
Hi, I am using Lucene to index a large number of web pages (a few 100GB) and the indexing speed is great. Lately I have been trying to index on a sentence level, not the document level. My problem is that the indexing speed has gone down dramatically and I am wondering if there is any way for me

RE: Indexing Speed: Documents vs. Sentences

2003-12-17 Thread Jochen Frey
? -Original Message- From: Jochen Frey [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 17, 2003 4:17 PM To: 'Lucene Users List' Subject: Indexing Speed: Documents vs. Sentences Hi, I am using Lucene to index a large number of web pages (a few 100GB) and the indexing speed is great

RE: Indexing Speed: Documents vs. Sentences

2003-12-17 Thread Jochen Frey
, December 17, 2003 1:36 PM To: 'Lucene Users List' Subject: RE: Indexing Speed: Documents vs. Sentences When you parse the page you can prevent sentence-boundary hits from matching your criteria -Original Message- From: Jochen Frey [mailto:[EMAIL PROTECTED] Sent: Wednesday, December