Getting top n most frequent words?

2004-01-12 Thread Ralph
Hi, does Lucene have functionality to get the top n most frequent words from a given text, stream, token stream, etc., together with their frequencies? Ralf
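For illustration only, here is a minimal sketch of the kind of result the question asks for, written in plain Java with no Lucene classes, so it makes no claim about what Lucene itself offers. The class name TopWords, the method topN, and the naive split-on-non-letters tokenization are all assumptions made for this example; a real setup would presumably feed in tokens from whatever analyzer is already in use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
public class TopWords {

    // Returns up to n (word, frequency) pairs, most frequent first.
    // Ties are broken alphabetically so the order is deterministic.
    public static List<Map.Entry<String, Integer>> topN(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();

        // Naive tokenization (an assumption): lowercase the text and split
        // on runs of characters that are neither letters nor digits.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }

        // Sort by descending frequency, then by word.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        // Clamp n so that negative or oversized values are harmless.
        int limit = Math.min(Math.max(n, 0), entries.size());
        return new ArrayList<>(entries.subList(0, limit));
    }
}
```

Calling TopWords.topN(text, 10) on a document's text would give its ten most frequent words with their counts. Sorting every distinct word is the simplest approach; for very large vocabularies a bounded priority queue of size n would avoid the full sort.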

Re: Retrieving the content from hits...

2004-01-05 Thread Ralph
My problem is before that. I only get one field, the filename field... the contents field is totally missing and I have no idea why... Ralf
> I believe since you created the field using a Reader, you have to use the Field.readerValue() method instead of the stringValue() method and then

Re: Query expansion

2003-12-10 Thread Ralph
How do you model and store your taxonomies/ontologies in terms of data structures? Do you use Java data structures or RDF? Cheers, Ralf
> Hi Everybody, I wish to use a hierarchy of concepts provided by an ontology to refine or expand my query answer with Lucene. May I know if someone has

Hits - how many documents?

2003-12-03 Thread Ralph
Hi, is there a maximum number of documents that Hits provides, or is it unlimited (that is, limited only by the heap size of the VM)? If there is a maximum, what is the number? Ralf

Re: Hits - how many documents?

2003-12-03 Thread Ralph
Does this mean Hits points to ALL documents and the last one might have a score of 0.0? If it does not contain all documents, where is the threshold then? Or based on which condition does it stop pointing to certain documents? Ralf
On Wednesday, December 3, 2003, at 09:36 AM, Ralph wrote

How to change similarity measure...

2003-12-01 Thread Ralph
as possible :-) ? Kind Regards, Ralph