Thanks for the quick reply, time to get to work on a prototype!
On Mon, Nov 24, 2008 at 2:12 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> If the data is unrelated, separate indexes will lead to the best performance.
> Memory usage should be less than or equal to that of one big index.
> File descriptor usage
If the data is unrelated, separate indexes will lead to the best performance.
Memory usage should be less than or equal to that of one big index.
File descriptor usage can be minimized by either calling optimize
before opening a new IndexSearcher (depends on how often you want to
see updates), lowering the merg
Hi all,
After reading the FAQ I have a question regarding the use of multiple
indexes and thus IndexSearchers on the one server.
I work on ecommerce websites and am looking at replacing our current
method of full text searching product descriptions and names with a
Lucene implementation. I envisag
I got it. I finished just before you sent your message, but thanks for your
comments! :-D
Have a nice day!
Jr.
Erick Erickson wrote:
>
> What I'd do is make my own filter, probably one based upon one of
> the pre-existing ones and modify the call to nextToken, examine that
> token, and if it ends in a
Hi all,
After reading the FAQ I have a question regarding the use of multiple
indexes and thus IndexSearchers on the one server.
I work on ecommerce websites and am looking at replacing our current method
of full text searching product descriptions and names with a Lucene
implementation. I envisag
What I'd do is make my own filter, probably one based upon one of
the pre-existing ones and modify the call to nextToken, examine that
token, and if it ends in a hyphen get the next token and return the
concatenation of the two. I don't believe that there's a pre-existing
filter that does this, but
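
The token-joining idea described above can be sketched outside of Lucene. This is a minimal illustration in plain Python, not Lucene's TokenFilter API: the function name join_hyphenated is hypothetical, and it assumes the upstream tokenizer keeps the trailing hyphen on a token (for example, a whitespace-based tokenizer), which may not hold for every analyzer.

```python
def join_hyphenated(tokens):
    """Merge a token ending in '-' with the token that follows it.

    Hypothetical helper for illustration only; it assumes the tokenizer
    preserved the trailing hyphen on line-broken words.
    """
    merged = []
    pending = ""  # prefix accumulated from tokens that ended in a hyphen
    for tok in tokens:
        if tok.endswith("-") and len(tok) > 1:
            # Hold the token (minus its hyphen) until the next one arrives.
            pending += tok[:-1]
        else:
            merged.append(pending + tok)
            pending = ""
    if pending:
        # A hyphenated token at the very end has nothing to join with.
        merged.append(pending)
    return merged


tokens = "ARLEI FERREIRA FAR- NETANI JUNIOR".split()
print(join_hyphenated(tokens))  # ['ARLEI', 'FERREIRA', 'FARNETANI', 'JUNIOR']
```

In a real filter the same logic would sit inside the token stream, so that lowercasing and stemming still run on the joined token afterwards.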
If you index the queries consider also that they can potentially be
indexed in an optimised form.
For example, take a phrase query for "Alonso Smith". You need only index
one of these terms - an incoming document must contain both terms to be
considered a match. If you chose to index this quer
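
As a rough sketch of the idea above: each stored phrase query is filed under just one of its terms, an incoming document's terms select the candidate queries, and only those candidates are checked in full. This is plain Python rather than Lucene code; the class PhraseQueryIndex, the tokenize helper, and the choice of the first term as the key are all assumptions made for illustration.

```python
from collections import defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for a real analyzer.
    return text.lower().split()


class PhraseQueryIndex:
    """Hypothetical store of phrase queries, each indexed under a single term."""

    def __init__(self):
        # key term -> list of (query_id, phrase_tokens)
        self.by_term = defaultdict(list)

    def add(self, query_id, phrase):
        terms = tokenize(phrase)
        if not terms:
            raise ValueError("empty phrase")
        # Any one term is a necessary condition for a match; the first
        # is used here, though a rarer term would prune more candidates.
        self.by_term[terms[0]].append((query_id, terms))

    def match(self, document):
        doc = tokenize(document)
        matches = set()
        for term in set(doc):
            for query_id, terms in self.by_term.get(term, []):
                n = len(terms)
                # Full verification: the phrase must occur as consecutive tokens.
                if any(doc[i:i + n] == terms for i in range(len(doc) - n + 1)):
                    matches.add(query_id)
        return sorted(matches)


index = PhraseQueryIndex()
index.add("q1", "Alonso Smith")
index.add("q2", "Lucene index")
print(index.match("Yesterday Alonso Smith won the race"))  # ['q1']
```

Only queries whose key term appears in the document are ever verified, which is what keeps the per-document cost low when the number of stored queries is large.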
I need Lucene to find the sentence:
ARLEI FERREIRA FARNETANI JUNIOR
[arlei] [ferreira] [farnetani] [junior](1)
and also:
ARLEI FERREIRA FAR-
NETANI JUNIOR
I'm using the Brazilian Analyzer, but the result is:
[ARLEI] [FERREIRA] [FAR] [NETANI] [JUNIOR]
I need to make Lucene re
Thanks for all the suggestions guys..
This is great!
Andrzej Bialecki wrote:
Ian Holsman wrote:
Hi. Apologies for the off-topic question.
I was wondering if anyone knew of an open source solution (or a
pointer to the algorithms)
that does the reverse of Lucene.
By that I mean store a whole lot
Hi all,
A bugfix release of Luke is now available at the usual place:
http://www.getopt.org/luke
* New features and improvements:
o Added ability to set the maximum count of boolean clauses in
BooleanQuery.
* Bug fixes:
o Unbalanced tags breaking the XML export. Reported by
T
Ian Holsman wrote:
Hi. Apologies for the off-topic question.
I was wondering if anyone knew of an open source solution (or a pointer
to the algorithms)
that does the reverse of Lucene.
By that I mean store a whole lot of queries, and run them against a
document to see which queries match it. (wi
> > Where do I get the CharFilter library? I'm using Lucene, not Solr.
> >
> > Thanks,
> > Sascha
> CharFilter is included in recent Solr nightly builds.
> It is not an OOTB solution for Lucene right now, sorry.
> If I have time, I will make it available for Lucene this weekend.
Now the patch is available for Lucene
On Sun, Nov 23, 2008 at 02:57:28PM +1100, Ian Holsman wrote:
> I can see the case for this would be a news-article and several people
> writing queries to get alerted if it matched a certain condition.
I haven't tried this, but if you have lots of queries and few documents
then consider using luc
The "formal" name for this stuff is "document filtering" or just
"filtering". You can start by looking at TREC, which had a
filtering task for a number of years: http://trec.nist.gov/tracks.html
At any rate, one approach is to store your queries as Lucene
documents, albeit short one
I am using MemoryIndex in a similar scenario. I don't have as many
queries, though (fewer than 100), but several 'articles' arrive per
second.
Works nicely.
On Sun, Nov 23, 2008 at 10:00 AM, Erik Hatcher
<[EMAIL PROTECTED]> wrote:
>
> On Nov 22, 2008, at 10:57 PM, Ian Holsman wrote:
>>
>> Hi. apologie
Thanks Erik.
I'll start looking at that.
regards
Ian
Erik Hatcher wrote:
On Nov 22, 2008, at 10:57 PM, Ian Holsman wrote:
Hi. Apologies for the off-topic question.
Not off-topic at all!
I was wondering if anyone knew of an open source solution (or a
pointer to the algorithms)
that do the r
On Nov 22, 2008, at 10:57 PM, Ian Holsman wrote:
Hi. Apologies for the off-topic question.
Not off-topic at all!
I was wondering if anyone knew of an open source solution (or a
pointer to the algorithms)
that does the reverse of Lucene.
By that I mean store a whole lot of queries, and run the
Maybe an RSS feed is a solution. Just provide an RSS feed as the search result
for each query, and people subscribing to these feeds would get notifications
at regular intervals. They need to install RSS clients, which can run the
queries at regular intervals.
--- On Sun, 11/23/08, Ian Holsman <[EMAIL PROTE
18 matches