querying multi-value fields

2009-10-12 Thread Angel, Eric
I have documents that store multiple values in some fields (using the document.add(new Field()) with the same field name). Here's what a typical document looks like: doc.option=value1 aaa doc.option=value2 bbb doc.option=value3 ccc I want my queries to only match individual values, for

RE: querying multi-value fields

2009-10-12 Thread Angel, Eric
the values you query/add to this field. On Mon, Oct 12, 2009 at 4:05 PM, Angel, Eric ean...@business.com wrote: I have documents that store multiple values in some fields (using the document.add(new Field()) with the same field name). Here's what a typical document looks like

RE: querying multi-value fields

2009-10-12 Thread Angel, Eric
or 20. But you still have to do the trick with getPositionIncrementGap in order to fail to match on something like bbb value3, where the last term is next to the first term of the next token. HTH Erick On Mon, Oct 12, 2009 at 4:31 PM, Angel, Eric ean...@business.com wrote: I need to analyze
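
A minimal sketch of why the position increment gap matters for multi-valued fields, written as a plain-Java simulation rather than real Lucene code. The class PositionGapDemo and its methods index and phraseMatches are hypothetical names invented for this illustration; the sample values are the ones from the original question. It assumes whitespace tokenization and that a phrase matches only when its terms sit at consecutive positions. In Lucene itself the gap would come from overriding Analyzer.getPositionIncrementGap(String fieldName).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PositionGapDemo {

    // Hypothetical helper: assigns a position to every whitespace-separated
    // token of every value, skipping `gap` extra positions between values.
    static Map<String, List<Integer>> index(List<String> values, int gap) {
        Map<String, List<Integer>> positions = new HashMap<>();
        int pos = -1;
        boolean first = true;
        for (String value : values) {
            if (!first) {
                pos += gap; // simulated position increment gap between values
            }
            first = false;
            for (String token : value.trim().split("\\s+")) {
                pos++;
                positions.computeIfAbsent(token, k -> new ArrayList<>()).add(pos);
            }
        }
        return positions;
    }

    // Hypothetical helper: an exact phrase matches when the terms occur
    // at consecutive positions, in order.
    static boolean phraseMatches(Map<String, List<Integer>> positions, String... terms) {
        List<Integer> starts = positions.get(terms[0]);
        if (starts == null) {
            return false;
        }
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                List<Integer> p = positions.get(terms[i]);
                ok = p != null && p.contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("value1 aaa", "value2 bbb", "value3 ccc");
        // Gap 0: "bbb" and "value3" are adjacent, so the phrase crosses values.
        System.out.println(phraseMatches(index(values, 0), "bbb", "value3"));    // true
        // Gap 100: the two values are far apart, so the phrase no longer matches.
        System.out.println(phraseMatches(index(values, 100), "bbb", "value3"));  // false
        // A phrase inside a single value still matches with the gap in place.
        System.out.println(phraseMatches(index(values, 100), "value2", "bbb"));  // true
    }
}
```

The gap only needs to exceed any slop used in phrase queries against the field; the value 100 here is an arbitrary choice for the illustration.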

RE: Realtime distributed

2009-10-11 Thread Angel, Eric
requirements. -J On Thu, Oct 8, 2009 at 7:00 PM, Angel, Eric ean...@business.com wrote: Does anyone have any recommendations? I've looked at Katta, but it doesn't seem to support realtime searching. It also uses HDFS, which I've heard can be slow. I'm looking to serve 40 GB

RE: 2.9: TopScoreDocCollector

2009-10-08 Thread Angel, Eric
(); Weight weight = query.weight(searcher); boolean allowOutOfOrder = weight.scoresDocsOutOfOrder(); TopScoreDocCollector coll = TopScoreDocCollector.create(numHits, allowOutOfOrder); searcher.search(weight, (Filter) null, coll); -jake On Wed, Oct 7, 2009 at 7:26 PM, Angel, Eric ean

Realtime distributed

2009-10-08 Thread Angel, Eric
Does anyone have any recommendations? I've looked at Katta, but it doesn't seem to support realtime searching. It also uses HDFS, which I've heard can be slow. I'm looking to serve 40 GB of indexes and support about 1 million updates per day. Thx

2.9: TopScoreDocCollector

2009-10-07 Thread Angel, Eric
According to the documentation for 2.9, TopScoreDocCollector.create(numHits, boolean), the second parameter is whether documents are scored in order by the input - How do I choose? In other words, how would I know if the documents are scored in order or not? Eric

RE: Distributed Lucene Questions

2009-06-01 Thread Angel, Eric
Has anyone used Katta in production? It looks very interesting and feature-rich, but I'm wondering how stable it is and whether or not it can support fine-grained queries - for example, constant score queries, MultiSearcher, etc. -Original Message- From: Ken Krugler

RE: Indexing and Searching Web Application

2009-01-20 Thread Angel, Eric
There's a reopen() method in the IndexReader class. You can use that. -Original Message- From: Amin Mohammed-Coleman [mailto:ami...@gmail.com] Sent: Tuesday, January 20, 2009 5:02 AM To: java-user@lucene.apache.org Subject: Re: Indexing and Searching Web Application Am I supposed to

RE: Lucene index updation and performance

2009-01-16 Thread Angel, Eric
You can simply call IndexWriter.addDocument() for new jobs and IndexWriter.updateDocument http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/IndexWriter.html Also, don't forget to optimize your index. Depending on your volume, you might want to optimize during slow traffic. Eric

RE: clustering with compass terracotta

2009-01-16 Thread Angel, Eric
-lusql-database-to.html http://zzzoot.blogspot.com/2008/09/katta-released-lucene-on-grid.html http://zzzoot.blogspot.com/2008/06/lucene-concurrent-search-performance.html http://zzzoot.blogspot.com/2008/06/simultaneous-threaded-query-lucene.html 2009/1/15 Angel, Eric ean...@business.com: I just

clustering with compass terracotta

2009-01-15 Thread Angel, Eric
I just ran into this http://www.compass-project.org/docs/2.0.0/reference/html/needle-terracotta.html and was wondering if any of you had tried anything like this and if so, what your experience was like. Eric

RE: Google finance-like suggestible search field

2009-01-14 Thread Angel, Eric
Peter, Why don't you put all your autocompletable values into a single document field and just query a single field? Google seems to only use two fields for autocomplete - symbol and company name. Eric -Original Message- From: Hayes, Peter [mailto:peter.ha...@fmr.com] Sent: Wednesday,

RE: ShingleMatrixFilter for synonyms

2009-01-13 Thread Angel, Eric
/lucene/analysis/shingle/ShingleFilterTest.java As for multi-word tokens, you just have to make sure they don't get injected before something that would remove any portion of them. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Angel, Eric ean