Re: Very large queries?

2003-03-28 Thread Ype Kingma
Gary, On Thursday 27 March 2003 15:34, [EMAIL PROTECTED] wrote: > Let me describe what the goal is and how I could use Verity to accomplish > this -- provided that Verity did not impose the limits it does. > > The documents being indexed are small, completely unstructured, textual > descriptions of

Re: Very large queries?

2003-03-28 Thread gary.h.merrill
Thanks for these suggestions. The idea of adding taxonomy-related terms to the documents is an interesting one and bears some thought. However, if I have to pre-process the corpus to determine which terms to add, and then to add them, it would seem that I've already accomplished my primary goal

RE: Very large queries?

2003-03-28 Thread Alex Murzaku
how about this: assuming that your taxonomies are tree-like structures, you could expand every term in the documents to be indexed with the path where they belong in the tree (i.e. all hypernyms and hyponyms) - for this you use the same technique as when using thesauri. This will allow you to enter
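
The expansion Alex describes can be illustrated with a minimal sketch. It assumes a toy taxonomy stored as a child-to-parent map (the `PARENT` table and both function names are hypothetical, not part of Lucene), and it expands hypernyms only; adding hyponyms would need the reverse mapping.

```python
# Hypothetical toy taxonomy: child term -> parent term (None marks the root).
PARENT = {
    "poodle": "dog",
    "dog": "mammal",
    "mammal": "animal",
    "animal": None,
}


def hypernym_path(term):
    """Return the term followed by all of its ancestors, nearest first."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT.get(term)
    return path


def expand_tokens(tokens):
    """Expand each token with its ancestors; unknown tokens pass through."""
    expanded = []
    for tok in tokens:
        expanded.extend(hypernym_path(tok))
    return expanded
```

Indexing the expanded token stream rather than the raw one would let a query for a broad term such as "mammal" match a document that only mentions "poodle".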

I: incremental index

2003-03-28 Thread Rende Francesco, CS
Hi, > I'm a Lucene user and I've found it to be very interesting software. > > My question is about how to manage incremental updates of the Lucene index. > In particular, is adding more documents to a big index (~10 GB) the same as > creating a new segment and then merging the indexes? > Adding document

Re: I: incremental index

2003-03-28 Thread Otis Gospodnetic
Adding a new document does not immediately modify an index, so the time it takes to add a new document to an existing index is not proportional to the index size. It is constant. The execution time of optimize() is proportional to the index size, so you want to do that only if you really need it.
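
The cost argument above can be pictured with a deliberately simplified model. This is a sketch, not Lucene's implementation: the `ToyIndex` class is hypothetical, every added document becomes its own segment, and the periodic merges a real index performs are ignored.

```python
class ToyIndex:
    """Toy segment model: one new segment per added document."""

    def __init__(self):
        self.segments = []  # each segment is a list of documents

    def add_document(self, doc):
        # The work done here does not depend on how many documents exist.
        self.segments.append([doc])

    def optimize(self):
        # Every document is copied once, so the cost grows with index size.
        merged = [doc for seg in self.segments for doc in seg]
        self.segments = [merged]
        return len(merged)  # number of documents copied
```

In this model `add_document` touches only the new segment, while `optimize` rewrites everything, which is why it is worth calling sparingly.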

RE: Tokenize negative number

2003-03-28 Thread Lixin Meng
I browsed through some of the previous discussion, but I have to say that I couldn't find a solution for this. Would you mind providing more clues on this? Regards, Lixin -Original Message- From: Terry Steichen [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 25, 2003 7:14 PM To: Lucene Users List;

RE: Tokenize negative number

2003-03-28 Thread Otis Gospodnetic
You are using an Analyzer that throws out non-alphanumeric characters, StandardAnalyzer most likely. You can create your own Analyzer to do exactly what you want. A sample Analyzer is in the Lucene FAQ at http://jguru.com/ . Otis --- Lixin Meng <[EMAIL PROTECTED]> wrote: > Browsing through som
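
The tokenizing rule a custom Analyzer would need can be sketched outside Lucene. The pattern and the `tokenize` function below are assumptions for illustration, not the FAQ's sample Analyzer: a minus sign is kept only when it directly precedes digits and does not follow a word character.

```python
import re

# First alternative: a negative number such as -5 or -12.5, but only when the
# minus sign is not glued to a preceding word character (so "a-5" still splits).
# Second alternative: any ordinary run of word characters.
TOKEN = re.compile(r"(?<!\w)-\d+(?:\.\d+)?|\w+")


def tokenize(text):
    """Split text into lowercase tokens, keeping the sign on negative numbers."""
    return [m.group(0).lower() for m in TOKEN.finditer(text)]
```

The same rule has to be applied at query time as well, otherwise a search for a negative number would be tokenized differently from the indexed text.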

Re: I: incremental index

2003-03-28 Thread Leo Galambos
> Adding a new document does not immediately modify an index, so the time > it takes to add a new document to an existing index is not proportional > to the index size. It is constant. The execution time of optimize() > is proportional to the index size, so you want to do that only if you > reall

Re: I: incremental index

2003-03-28 Thread Otis Gospodnetic
I believe it takes constant time to add a new document to an index because when adding a new document a new segment is created on the disk, 'separate' from the other, existing, index segments. The size of the index may come into play when this new segment has to be merged with the existing segments

Alternate Boolean Query Parser?

2003-03-28 Thread Shah, Vineel
One of my clients is asking for an old-style boolean query search on my keywords fields. A string might look like this: "oracle admin*" and java and oracle and ("8.1.6" or "8.1.7") and ("solaris" or "unix" or "linux") There would probably be a need for nested parentheses, although I can't
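
One possible approach, sketched here under assumptions and not taken from the thread: Lucene's QueryParser recognizes the boolean operators only in upper case (AND, OR, NOT), so a small pre-processing step could upper-case bare operator words while leaving quoted phrases untouched. The `uppercase_operators` helper is hypothetical.

```python
import re

# Match either a double-quoted phrase or a bare operator word.
PIECE = re.compile(r'"[^"]*"|\b(?:and|or|not)\b', re.IGNORECASE)


def uppercase_operators(query):
    """Upper-case and/or/not outside quoted phrases; leave the rest as is."""
    def fix(match):
        text = match.group(0)
        # Quoted phrases are returned unchanged, even if they contain "and".
        return text if text.startswith('"') else text.upper()

    return PIECE.sub(fix, query)
```

Parentheses pass through unchanged, so nested grouping is left for the query parser to handle.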

Re: Wildcard searching - Case sensitiv?

2003-03-28 Thread Tatu Saloranta
On Friday 28 March 2003 08:37, [EMAIL PROTECTED] wrote: > Ok, thanks Otis, > > you have to write the terms lowercase when you're searching with wildcards. Or use the set method in QueryParser to ask it to automatically lower case those terms. Patch for that was added before 1.3RC1 (check javadocs

Re: Alternate Boolean Query Parser?

2003-03-28 Thread Tatu Saloranta
On Friday 28 March 2003 15:48, Shah, Vineel wrote: > One of my clients is asking for an old-style boolean query search on my > keywords fields. A string might look like this: > > "oracle admin*" and java and oracle and ("8.1.6" or "8.1.7") and > ("solaris" or "unix" or "linux") > > There woul

Re: Wildcard searching - Case sensitiv?

2003-03-28 Thread Test2 . Schwab
Ok, thanks Otis, you have to write the terms lowercase when you're searching with wildcards. Otis Gospodnetic
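
Why lowercasing matters can be shown with a small model. It assumes the index holds terms already lowercased by the analyzer and that wildcard patterns are compared against them case-sensitively; `INDEXED_TERMS` and `wildcard_match` are hypothetical names, and `fnmatchcase` stands in for the wildcard matching.

```python
from fnmatch import fnmatchcase

# Terms as an index might store them after a lowercasing analyzer has run.
INDEXED_TERMS = ["lucene", "lucid", "search"]


def wildcard_match(pattern, lowercase_pattern=False):
    """Return indexed terms matching the pattern, compared case-sensitively."""
    if lowercase_pattern:
        pattern = pattern.lower()
    return [t for t in INDEXED_TERMS if fnmatchcase(t, pattern)]
```

With this model, `wildcard_match("Luc*")` finds nothing, while the same pattern lowercased first matches both "lucene" and "lucid".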