Re: Using multiple index searchers.

2008-11-23 Thread Henrik Axelsson
Thanks for the quick reply, time to get to work on a prototype!
On Mon, Nov 24, 2008 at 2:12 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> If the data is unrelated, separate indexes will lead to the best performance.
> Memory usage should be less or equal to one big index.
> File descriptor usage

Re: Using multiple index searchers.

2008-11-23 Thread Yonik Seeley
If the data is unrelated, separate indexes will lead to the best performance. Memory usage should be less than or equal to that of one big index. File descriptor usage can be minimized by either calling optimize before opening a new IndexSearcher (depends on how often you want to see updates), lowering the merg

Re: Using multiple index searchers.

2008-11-23 Thread Henrik Axelsson
Hi all,
After reading the FAQ I have a question regarding the use of multiple indexes, and thus multiple IndexSearchers, on one server. I work on ecommerce websites and am looking at replacing our current method of full-text searching product descriptions and names with a Lucene implementation. I envisag

Re: Doing the lucene remove character \n (break line)

2008-11-23 Thread farnetani
Got it. I had already finished before you sent your message, but thanks for your comments! :-D Have a nice day! Jr.
Erick Erickson wrote:
> What I'd do is make my own filter, probably one based upon one of
> the pre-existing ones and modify the call to nextToken, examine that
> token, and if it ends in a

Using multiple index searchers.

2008-11-23 Thread Henrik Axelsson
Hi all,
After reading the FAQ I have a question regarding the use of multiple indexes, and thus multiple IndexSearchers, on one server. I work on ecommerce websites and am looking at replacing our current method of full-text searching product descriptions and names with a Lucene implementation. I envisag

Re: Doing the lucene remove character \n (break line)

2008-11-23 Thread Erick Erickson
What I'd do is make my own filter, probably one based upon one of the pre-existing ones and modify the call to nextToken, examine that token, and if it ends in a hyphen get the next token and return the concatenation of the two. I don't believe that there's a pre-existing filter that does this, but
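The joining step described above can be sketched outside Lucene. This is a minimal illustration in plain Java, not a Lucene TokenFilter: it works on a list of already-split token strings, and the names `HyphenJoiner` and `joinHyphenated` are hypothetical. A real filter would presumably apply the same logic while pulling tokens from the wrapped stream.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Lucene): merges a token that ends in '-'
// with the token that follows it, e.g. "FAR-" + "NETANI" -> "FARNETANI".
class HyphenJoiner {
    static List<String> joinHyphenated(List<String> tokens) {
        List<String> out = new ArrayList<>();
        String pending = null; // prefix of a hyphenated word, hyphen already stripped
        for (String token : tokens) {
            String current = (pending == null) ? token : pending + token;
            pending = null;
            if (current.length() > 1 && current.endsWith("-")) {
                // Hold the prefix back and wait for the continuation token.
                pending = current.substring(0, current.length() - 1);
            } else {
                out.add(current);
            }
        }
        if (pending != null) {
            // No continuation arrived: emit the last token unchanged.
            out.add(pending + "-");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(joinHyphenated(
                List.of("ARLEI", "FERREIRA", "FAR-", "NETANI", "JUNIOR")));
    }
}
```

A lone "-" token is passed through as-is, and consecutive hyphenated fragments are chained into one token. Whether that is the desired behaviour depends on the data being indexed.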

Re: [ot] a reverse lucene

2008-11-23 Thread markharw00d
If you index the queries consider also that they can potentially be indexed in an optimised form. For example, take a phrase query for "Alonso Smith". You need only index one of these terms - an incoming document must contain both terms to be considered a match. If you chose to index this quer
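The idea of indexing a query under only one of its required terms can be sketched as follows. This is an illustration in plain Java under stated assumptions, not Lucene code: `QueryIndex`, `add` and `match` are hypothetical names, a query is simplified to a set of terms that must all occur, and the phrase-adjacency check a real phrase query needs is left out. The choice of which term to index under is arbitrary here (the alphabetically smallest).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: each stored query is indexed under ONE of its required
// terms. An incoming document looks up candidates by its own terms, then each
// candidate is verified against the full set of required terms.
class QueryIndex {
    private final List<Set<String>> queries = new ArrayList<>();
    private final Map<String, List<Integer>> byTerm = new HashMap<>();

    // Registers a query and returns its id.
    int add(Set<String> requiredTerms) {
        if (requiredTerms.isEmpty()) {
            throw new IllegalArgumentException("query needs at least one term");
        }
        int id = queries.size();
        queries.add(new TreeSet<>(requiredTerms)); // defensive copy
        // Assumption: any single required term works as the lookup key.
        String key = Collections.min(requiredTerms);
        byTerm.computeIfAbsent(key, k -> new ArrayList<>()).add(id);
        return id;
    }

    // Returns the ids of all queries whose required terms all occur in the document.
    List<Integer> match(Set<String> docTerms) {
        List<Integer> hits = new ArrayList<>();
        for (String term : docTerms) {
            for (int id : byTerm.getOrDefault(term, Collections.emptyList())) {
                if (docTerms.containsAll(queries.get(id))) {
                    hits.add(id);
                }
            }
        }
        Collections.sort(hits);
        return hits;
    }
}
```

Because each query sits under exactly one key, a document that lacks that key term never touches the query at all, which is where the saving comes from.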

Doing the lucene remove character \n (break line)

2008-11-23 Thread farnetani
I need Lucene to find the sentence:
ARLEI FERREIRA FARNETANI JUNIOR
[arlei] [ferreira] [farnetani] [junior](1)
and also:
ARLEI FERREIRA FAR- NETANI JUNIOR
I'm using the Brazilian Analyzer, but the result is:
[ARLEI] [FERREIRA] [FAR] [NETANI] [JUNIOR]
I have to make Lucene re

Re: [ot] a reverse lucene

2008-11-23 Thread Ian Holsman
Thanks for all the suggestions, guys. This is great!
Andrzej Bialecki wrote:
> Ian Holsman wrote:
> > Hi. Apologies for the off-topic question. I was wondering if anyone knew of an open source solution (or a pointer to the algorithms) that does the reverse of Lucene. By that I mean store a whole lot

[ANN] Luke 0.9.1 - bugfix release

2008-11-23 Thread Andrzej Bialecki
Hi all,
A bugfix release of Luke is now available at the usual place: http://www.getopt.org/luke
* New features and improvements:
  o Added ability to set the maximum count of boolean clauses in BooleanQuery.
* Bug fixes:
  o Unbalanced tags breaking the XML export. Reported by T

Re: [ot] a reverse lucene

2008-11-23 Thread Andrzej Bialecki
Ian Holsman wrote:
> Hi. Apologies for the off-topic question. I was wondering if anyone knew of an open source solution (or a pointer to the algorithms) that does the reverse of Lucene. By that I mean store a whole lot of queries, and run them against a document to see which queries match it. (wi

Re: AW: Transforming german umlaute like ö,ä ,ü,ß into oe, ae, ue, ss

2008-11-23 Thread Koji Sekiguchi
> > Where do I get the CharFilter library? I'm using Lucene, not Solr.
> >
> > Thanks,
> > Sascha
> CharFilter is included in recent Solr nightly builds.
> It is not an out-of-the-box solution for Lucene yet, sorry.
> If I have time, I will make it for Lucene this weekend.
Now the patch is available for Lucene

Re: [ot] a reverse lucene

2008-11-23 Thread David Sheldon
On Sun, Nov 23, 2008 at 02:57:28PM +1100, Ian Holsman wrote:
> I can see the case for this would be a news-article and several people
> writing queries to get alerted if it matched a certain condition.
I haven't tried this, but if you have lots of queries and few documents then consider using luc

Re: [ot] a reverse lucene

2008-11-23 Thread Grant Ingersoll
The "formal" name for this stuff is "document filtering" or just "filtering". You can start by looking at TREC, which had a filtering task for a number of years:
http://trec.nist.gov/tracks.html
At any rate, one approach is to store your queries as Lucene documents, albeit short one

Re: [ot] a reverse lucene

2008-11-23 Thread jm
I am using MemoryIndex in a similar scenario. I don't have as many queries, though (fewer than 100), but several 'articles' arrive per second. It works nicely.
On Sun, Nov 23, 2008 at 10:00 AM, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
> On Nov 22, 2008, at 10:57 PM, Ian Holsman wrote:
>>
>> Hi. apologie

Re: [ot] a reverse lucene

2008-11-23 Thread Ian Holsman
Thanks Erik. I'll start looking at that.
regards
Ian
Erik Hatcher wrote:
> On Nov 22, 2008, at 10:57 PM, Ian Holsman wrote:
> > Hi. apologies for the off-topic question.
> Not off-topic at all!
> > I was wondering if anyone knew of a open source solution (or a pointer to the algorithms) that do the r

Re: [ot] a reverse lucene

2008-11-23 Thread Erik Hatcher
On Nov 22, 2008, at 10:57 PM, Ian Holsman wrote:
> Hi. Apologies for the off-topic question.
Not off-topic at all!
> I was wondering if anyone knew of an open source solution (or a pointer to the algorithms) that does the reverse of Lucene. By that I mean store a whole lot of queries, and run the

Re: [ot] a reverse lucene

2008-11-23 Thread Cool The Breezer
Maybe an RSS feed is a solution. Just provide an RSS feed as the search result for each query, and people subscribing to these feeds would get notifications at regular intervals. They need to install RSS clients, which can run the queries at regular intervals.
--- On Sun, 11/23/08, Ian Holsman <[EMAIL PROTE