Lucene 2 in Maven 2 Global Repository

2006-09-23 Thread Iravanchi
Hi, The Lucene version 2.0 jar file is not present in the Maven 2 repository at ibiblio.org. Will it become available? Hamed

Re: Can I get the most hot term ?

2006-09-23 Thread Otis Gospodnetic
Doesn't this sound like the HighFreqTerms class in contrib/misc/? Otis - Original Message From: Chris Hostetter <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Saturday, September 23, 2006 3:54:49 PM Subject: Re: Can I get the most hot term ? : I'm pretty sure you have t

Re: Can I get the most hot term ?

2006-09-23 Thread Chris Hostetter
: I'm pretty sure you have to count them yourself, but that's made pretty easy : by the TermEnum, TermFreqVector etc. classes. I have only used a few of : these, so I can't be much help. But these sure seem like what you're looking : for. TermEnum has a docFreq member .. so you can iterate over it

Re: Analysis/tokenization of compound words

2006-09-23 Thread Otis Gospodnetic
Thanks for the pointers, Pasquale! Otis - Original Message From: Pasquale Imbemba <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Saturday, September 23, 2006 4:24:16 AM Subject: Re: Analysis/tokenization of compound words Otis, I forgot to mention that I make use of Lucene f

Re: Analysis/tokenization of compound words

2006-09-23 Thread Otis Gospodnetic
Yes, I think it's the same thing - word segmentation - http://www.google.com/search?q=word+segmentation You may get the same ad(word) as I did - Basistech folks from Cambridge, MA have various interesting products, some stuff that deals with CJK (not sure if they actually do word segmentation o

Re: Analysis/tokenization of compound words

2006-09-23 Thread Marvin Humphrey
On Sep 20, 2006, at 12:07 AM, Daniel Naber wrote: Writing a decomposer is difficult as you need both a large dictionary *without* compounds and a set of rules to avoid splitting at too many positions. Conceptually, how different is the problem of decompounding German from tokenizing languag
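The two ingredients Daniel Naber names, a dictionary without compounds and rules that limit where splits may occur, can be illustrated with a minimal sketch. This is not the approach of anyone in the thread: the `Decompounder` class, its tiny lexicon, and the two rules (a minimum part length, and preferring the split with the fewest parts) are assumptions made for illustration, and linking elements such as the German "s" are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical dictionary-based splitter; not part of Lucene.
class Decompounder {
    private final Set<String> lexicon;   // simple (non-compound) words, lower case
    private final int minPartLength;     // rule: never produce parts shorter than this

    Decompounder(Set<String> lexicon, int minPartLength) {
        this.lexicon = lexicon;
        this.minPartLength = minPartLength;
    }

    // Returns the split with the fewest parts, or null if the word
    // cannot be fully covered by lexicon entries.
    List<String> split(String word) {
        return splitFrom(word.toLowerCase(Locale.ROOT), 0);
    }

    private List<String> splitFrom(String w, int start) {
        if (start == w.length()) {
            return new ArrayList<>();    // nothing left to cover
        }
        List<String> best = null;
        for (int end = start + minPartLength; end <= w.length(); end++) {
            String part = w.substring(start, end);
            if (!lexicon.contains(part)) {
                continue;
            }
            List<String> rest = splitFrom(w, end);
            if (rest == null) {
                continue;                // remainder cannot be covered
            }
            // Rule: prefer fewer parts to avoid splitting at too many positions.
            if (best == null || rest.size() + 1 < best.size()) {
                best = new ArrayList<>();
                best.add(part);
                best.addAll(rest);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Set<String> lexicon = new HashSet<>(
                Arrays.asList("donau", "dampf", "schiff", "fahrt"));
        Decompounder d = new Decompounder(lexicon, 3);
        System.out.println(d.split("Donaudampfschifffahrt"));
    }
}
```

The recursion tries every lexicon match at each position, so it is exponential in the worst case; a real decomposer would need memoization and far richer rules.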

Re: Can I get the most hot term ?

2006-09-23 Thread Erick Erickson
I'm pretty sure you have to count them yourself, but that's made pretty easy by the TermEnum, TermFreqVector etc. classes. I have only used a few of these, so I can't be much help. But these sure seem like what you're looking for. Best Erick On 9/23/06, Weiming Yin <[EMAIL PROTECTED]> wrote: H

Can I get the most hot term ?

2006-09-23 Thread Weiming Yin
Hi all, I am working with the following situation. 1: a, b 2: b, c When I build an index with 1 and 2, it gives me three terms, a, b and c, and 'b' is the hottest because it appears twice. Is there a method that gets all (or part of) the terms sorted by number of appearances? For the example above,
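The ranking asked for here can be sketched in plain Java, independent of Lucene, using the two example documents from the question. The `HotTerms` class and its `rank` method are hypothetical names for illustration; the sketch counts in how many documents each term appears and sorts by that count, breaking ties alphabetically (an assumption, since the question does not say how ties should be ordered).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: ranks terms by document frequency.
class HotTerms {
    static List<Map.Entry<String, Integer>> rank(List<List<String>> docs) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (List<String> doc : docs) {
            // Count each term at most once per document.
            for (String term : new HashSet<>(doc)) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(docFreq.entrySet());
        // Highest document frequency first; ties broken alphabetically.
        ranked.sort((x, y) -> {
            int byFreq = Integer.compare(y.getValue(), x.getValue());
            return byFreq != 0 ? byFreq : x.getKey().compareTo(y.getKey());
        });
        return ranked;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("a", "b"),    // document 1
                Arrays.asList("b", "c"));   // document 2
        for (Map.Entry<String, Integer> e : rank(docs)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Against a real index the counts would come from the index itself rather than from re-reading the documents, as the replies in this thread suggest.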

Re: Help : get document lists from an index

2006-09-23 Thread Nicolas Lalevée
On Saturday 23 September 2006 10:32, Supheakmungkol SARIN wrote: > Hello all, > > I'd like to know whether there is any built-in method that allow us to get > all the documents that a term belongs to from its index? you can, but the term should be indexed, and you have to retrieve it from it's t

highlighting

2006-09-23 Thread Stelios Eliakis
Hi, I'm new to Lucene and I'm interested in highlighting. I want to extract the Best Fragment (passage) from a text file. When I use the following code I get the first fragment that contains my query. Nevertheless, the JavaDoc says that the function getBestFragment returns the best fragment. Do

searching for the part of a term.

2006-09-23 Thread heritrix . lucene
Hi All, How can I make my search so that if I am looking for the term "counting", the documents containing "accounting" also come up? Similarly, if I am looking for the term "workload", documents containing "work" should also come up as a search result. Wildcard query seems to work in the first case, bu
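The two cases in this question are mirror images: in the first the query is a substring of the indexed term, in the second the indexed term is a substring of the query. A minimal plain-Java sketch of that symmetric check follows; the `PartialTermMatch` class is a hypothetical illustration that scans a term list directly and does not use any Lucene query type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: symmetric substring matching.
class PartialTermMatch {
    // True if either string contains the other.
    static boolean matches(String query, String term) {
        return term.contains(query) || query.contains(term);
    }

    // Returns the terms that partially match the query, in input order.
    static List<String> filter(String query, List<String> terms) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (matches(query, term)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("accounting", "work", "index");
        System.out.println(filter("counting", terms));
        System.out.println(filter("workload", terms));
    }
}
```

A check this loose will overmatch on very short terms, so a minimum length for the contained string would probably be needed in practice.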

Re: ChainedFilter

2006-09-23 Thread Bhavin Pandya
Hi guys, it is solved now. I came to know that "if you are ANDing/ORing a FilteredQuery with, say, a BooleanQuery, then it does not give the proper result, so I added that BooleanQuery before creating the FilteredQuery". Maybe I am wrong, but I changed the sequence of my queries and now it's working.

Help : get document lists from an index

2006-09-23 Thread Supheakmungkol SARIN
Hello all, I'd like to know whether there is any built-in method that allow us to get all the documents that a term belongs to from its index? Thanks a lot & regards, SS

Re: Analysis/tokenization of compound words

2006-09-23 Thread Pasquale Imbemba
Otis, I forgot to mention that I make use of Lucene for noun retrieval from the lexicon. Pasquale Pasquale Imbemba wrote: Hi Otis, I am completing my bachelor thesis at the Free University of Bolzano (www.unibz.it). My project is exactly about what you need: a word splitter for Germa

Re: Analysis/tokenization of compound words

2006-09-23 Thread Pasquale Imbemba
Hi Otis, I am completing my bachelor thesis at the Free University of Bolzano (www.unibz.it). My project is exactly about what you need: a word splitter for German compound words. Raffaella Bernardi who is reading in CC is my supervisor. As some from the lucene mailing list has already suggest