Re: Problem fetching number of occurrences

2010-06-01 Thread Rebecca Watson
Hi, i was looking at another post which had this presentation in - it has a nice section on termfreqvectors: http://www.cnlp.org/presentations/slides/advancedluceneeu.pdf bec :) On 2 June 2010 13:56, Rebecca Watson wrote: > hi > > when you are indexing, use termvectors > org.apache.lucene.docu

Re: how to extend Similarity in this situation?

2010-06-01 Thread Rebecca Watson
Hi Li Li If you want to support some query types and not others you should override/extend the queryparser so that it throws an exception / makes a different query type instead. Similarity doesn't do the actual scoring, it's used by the Query classes (actually the Scorer implementation used by the

Re: Problem fetching number of occurrences

2010-06-01 Thread Rebecca Watson
hi when you are indexing, use termvectors org.apache.lucene.document.Field.TermVector set this in the Field object constructor when you create your Field objects at index time. i've never done it but i'm pretty sure these can be retrieved at search time using one of the IndexReader.getTermFreqVec

Re: vector model usage

2010-06-01 Thread Rebecca Watson
Hi, if you want to store word+value pairs then use lucene scoring to weight the words with higher values against them, you should look at using payloads and the DelimitedPayloadTokenFilter which lets you specify e.g. word1|value1 word2|value2 ... and the values are stored as payloads against the w
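
To make the word1|value1 word2|value2 input format concrete, here is a minimal plain-Java sketch that splits such a string into word/weight pairs. It does not use Lucene and is not how DelimitedPayloadTokenFilter is implemented; the class name DelimitedWeights and the default weight of 1.0 for a token without a delimiter are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DelimitedWeights {
    /** Parses "word|value" tokens separated by whitespace into a word -> weight map. */
    static Map<String, Float> parse(String input) {
        Map<String, Float> weights = new LinkedHashMap<>();
        for (String token : input.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            int bar = token.lastIndexOf('|');
            if (bar < 0) {
                // Assumption: a token without a delimiter gets a neutral weight.
                weights.put(token, 1.0f);
            } else {
                weights.put(token.substring(0, bar),
                            Float.parseFloat(token.substring(bar + 1)));
            }
        }
        return weights;
    }
}
```

In Lucene itself the value would be encoded as a payload on the token and read back at scoring time; the sketch only shows what the delimited text carries.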

Re: Wich way would you recommend for successive-words similarity and scoring ?

2010-06-01 Thread Otis Gospodnetic
Hi Pablo, This question comes up every once in a while. You'll find some previous discussions and answers here: http://search-lucene.com/?q=terms+closer+together+score Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ -

Re: Solr tutorial

2010-06-01 Thread Lance Norskog
Use solr-user@ instead of java-user@ . You'll find more knowledgeable people. On Mon, May 31, 2010 at 6:36 PM, N Hira wrote: > I don't know of a single tutorial that puts it all together, but the "rich > documents" feature implemented in Solr-284 would be where I would start: > https://issues.ap

Re: CFP for Lucene Revolution Conference, Boston, MA October 7 & 8 2010

2010-06-01 Thread Grant Ingersoll
Sorry for the noise, but thought I would send out a reminder to get your talks in... On May 17, 2010, at 8:43 AM, Grant Ingersoll wrote: > Lucene Revolution Call For Participation - Boston, Massachusetts October 7 & > 8, 2010 > > The first US conference dedicated to Apache Lucene and Solr is

Problem fetching number of occurrences

2010-06-01 Thread Sirish Vadala
Hello All: Can anyone suggest the best way to get the number of occurrences of each word per document in Lucene? E.g., let the indexed text be: If you are posting a question, please try search first. Your question may have already been answered. Now if I search for the word 'question', then I w
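
As an illustration of the per-document counts being asked for, the following plain-Java sketch counts how often each word occurs in one text. It is not the Lucene API: the class name TermCounts and the simple lower-casing and splitting on non-alphanumeric characters are assumptions, and a real analyzer may tokenize differently.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class TermCounts {
    /** Counts occurrences of each lower-cased word in a single document's text. */
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // Assumption: words are runs of letters or digits; everything else separates them.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // a leading separator produces an empty first token
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }
}
```

For the example text in the question, the entry for "question" would hold the number of times that word appears in the document, which is the kind of information a term frequency vector stores per document.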

Re: vector model usage

2010-06-01 Thread Dionisis Koumouras
Thanks for your reply Grant. I checked out the TokenStream class and you are right but I'm afraid I didn't really make myself understood. What I want is to be able to create a Document out of key-value pairs of terms and float numbers representing word weights, insert the Document in the index and

Re: vector model usage

2010-06-01 Thread Grant Ingersoll
On May 31, 2010, at 6:25 AM, Dionisis Koumouras wrote: > Hi all, > I'm new to lucene but have used it successfully for a few simple tasks. > > I am experimenting with the vector space representation of documents and > have managed to store and retrieve TermFreqVector objects. > > The question is

how to extend Similarity in this situation?

2010-06-01 Thread Li Li
I want to support only boolean OR queries (as many search engines do), but I want to boost documents whose terms are closer together. E.g. the query terms are 'apache lucene' doc1 apache has many projects such as lucene doc2 The Apache HTTP Server Project is an effort to develop and maintain an ... Lucene is a
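
The notion of terms being "closer" can be made concrete by measuring the smallest gap, in token positions, between two query terms in a document. The plain-Java sketch below does only that; it is not a Lucene Similarity or Scorer, and the class name TermProximity, the tokenization, and the -1 result for a missing term are assumptions for illustration.

```java
import java.util.Locale;

final class TermProximity {
    /**
     * Returns the smallest position gap between any occurrence of term a and
     * any occurrence of term b, or -1 if either term is absent.
     * Both terms are expected in lower case.
     */
    static int minDistance(String text, String a, String b) {
        int lastA = -1;
        int lastB = -1;
        int best = -1;
        int pos = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (token.equals(a)) lastA = pos;
            if (token.equals(b)) lastB = pos;
            if (lastA >= 0 && lastB >= 0) {
                int gap = Math.abs(lastA - lastB);
                if (best < 0 || gap < best) best = gap;
            }
            pos++;
        }
        return best;
    }
}
```

For doc1 above, "apache" is at position 0 and "lucene" at position 6, so the gap is 6; a document containing the phrase "apache lucene" would have a gap of 1 and could be boosted accordingly.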

RE: NumericField API

2010-06-01 Thread Uwe Schindler
Hi, > >> 3) NumericField API is marked as experimental and volatile > >> (http://lucene.apache.org/java/3_0_1/api/core/index.html). Is there > >> any other "stable" API I can rely on in Lucene 3.0? If not, what > >> would be > > possible > >> NumericField replacement I could use now? > > > > "Expe

Re: NumericField API

2010-06-01 Thread Mark Miller
On 6/1/10 9:34 AM, Mindaugas Žakšauskas wrote: It's just an early observation as historically Lucene has been doing an amazing job in terms of API stability. Yes it has :) Get ready for even more change in that area though :) -- - Mark http://www.lucidimagination.com ---

Re: NumericField API

2010-06-01 Thread Mindaugas Žakšauskas
Hi, Thanks for your reply Uwe. Just a couple of notes: >> In order to get rid of this exception, I had to change one of the > following: >> - SortField must be changed from SortField.STRING to SortField.LONG > > This does the trick and is *not* weird. You are using *numeric* fields, so > you cann

RE: What's DisjunctionMaxQuery ?

2010-06-01 Thread Itamar Syn-Hershko
See slide 18 in http://www.cnlp.org/presentations/slides/advancedluceneeu.pdf, and http://lucene.apache.org/java/2_0_0/api/org/apache/lucene/search/DisjunctionMaxQuery.html. Itamar. -Original Message- From: Li Li [mailto:fancye...@gmail.com] Sent: Tuesday, June 01, 2010 11:42 AM To: jav
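
As a rough summary of what the linked Javadoc describes: a DisjunctionMaxQuery scores a document with the maximum score of its matching subqueries, plus a tie-breaker multiplier times the scores of the other matching subqueries. The plain-Java sketch below shows only that arithmetic; the class name DisMaxScore is hypothetical, no Lucene classes are used, and non-negative subquery scores are assumed.

```java
final class DisMaxScore {
    /**
     * Combines subquery scores as: max + tieBreaker * (sum of the other scores).
     * Assumes all scores are non-negative.
     */
    static float combine(float tieBreaker, float... subScores) {
        float max = 0f;
        float sum = 0f;
        for (float s : subScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tieBreaker * (sum - max);
    }
}
```

With a tie-breaker of 0 only the best-matching field counts; a small positive value lets documents that match in several fields rank slightly above those matching in one.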

RE: NumericField API

2010-06-01 Thread Uwe Schindler
Hi, > I have recently been in charge of converting code that was using > pre-3.0 API to be compatible with 3.0 API. > > There was a piece of code which was storing a date field: > > String date = "20091231131415"; // yyyyMMddHHmmss new > Field("creationDate", date, Field.Store.YES, Field.Index.U

NumericField API

2010-06-01 Thread Mindaugas Žakšauskas
Hi, I have recently been in charge of converting code that was using pre-3.0 API to be compatible with 3.0 API. There was a piece of code which was storing a date field: String date = "20091231131415"; // yyyyMMddHHmmss new Field("creationDate", date, Field.Store.YES, Field.Index.UN_TOKENIZED);
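
Moving such a date from a String field to a numeric field needs a long value instead of the 14-digit string. A minimal sketch of that conversion using only java.time follows; the class name DateToLong is hypothetical, the string is assumed to be in the form yyyyMMddHHmmss, and interpreting it as UTC is an assumption the original code does not state.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class DateToLong {
    // Assumption: the stored string is year, month, day, hour, minute, second.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    /** Parses the date string and returns epoch milliseconds, treating it as UTC. */
    static long toEpochMillis(String date) {
        return LocalDateTime.parse(date, FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }
}
```

The resulting long could then be the value given to a numeric field, and sorting on it would use a long sort type rather than a string one, as discussed in the replies.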

Re: Is Lucene a "document oriented database"?

2010-06-01 Thread Shashi Kant
Great, thanks! I am curious to learn if anyone has used Lucene/Solr as a document-oriented db - any experiences to share? Lessons learned? I am considering a similar application using Solr and want to ensure we have a handle on potential issues. Thanks, Shashi On Tue, Jun 1, 2010 at 1:28 AM, L

What's DisjunctionMaxQuery ?

2010-06-01 Thread Li Li
Could anyone show me some detailed information about it? Thanks - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org