Return all distinct values

2006-03-30 Thread Java Programmer
Hello, I created a small Lucene application which stores a lot of my users' information; one item is the zip code in numeric format, e.g. 50501, 63601. Zip codes are stored in Text fields so they are fully searchable. What I want to do now is get all unique zip codes which have been stored so far. Something like

RE: Return all distinct values

2006-03-30 Thread Ramana Jelda
Hi, Actually Lucene does not provide a straightforward Query to get UNIQUE results. But as far as I know, you can use a HitCollector and BitSet combination to count/get unique results. Regards, Jelda

RE: Return all distinct values

2006-03-30 Thread mark harwood
This example code gets the unique terms for a field and a total num docs for each... String fieldName="myfield"; valueCounts=new ArrayList(); TermEnum termEnum; termEnum = indexReader.terms(new Term(fieldName,"")); Term term = termEnum.term(); while (term!=null) { if (!fieldName.equals(ter
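The enumeration pattern in the snippet above relies on the term dictionary being ordered by field name and then by term text, so that seeking to (field, "") lands on the first term of that field and the loop can stop as soon as the field name changes. Below is a minimal, self-contained sketch of that idea. It uses a `TreeMap` as a stand-in for the term dictionary and is not the real `IndexReader`/`TermEnum` API; the `TermKey` class, the `distinctValues` helper, and the sample data are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class DistinctFieldValues {

    // Hypothetical stand-in for a Lucene term: ordered by field, then by text.
    static final class TermKey implements Comparable<TermKey> {
        final String field;
        final String text;

        TermKey(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public int compareTo(TermKey other) {
            int byField = field.compareTo(other.field);
            return byField != 0 ? byField : text.compareTo(other.text);
        }
    }

    // Walks the sorted dictionary starting at (fieldName, "") and stops at the
    // first term that belongs to a different field.
    // Returns a map of distinct value -> document count, in term order.
    static Map<String, Integer> distinctValues(TreeMap<TermKey, Integer> dictionary,
                                               String fieldName) {
        Map<String, Integer> valueCounts = new LinkedHashMap<>();
        TermKey start = new TermKey(fieldName, "");
        for (Map.Entry<TermKey, Integer> entry : dictionary.tailMap(start, true).entrySet()) {
            if (!fieldName.equals(entry.getKey().field)) {
                break; // past the last term of the requested field
            }
            valueCounts.put(entry.getKey().text, entry.getValue());
        }
        return valueCounts;
    }

    // Made-up sample data: (field, text) -> number of documents containing the term.
    static TreeMap<TermKey, Integer> sampleDictionary() {
        TreeMap<TermKey, Integer> dictionary = new TreeMap<>();
        dictionary.put(new TermKey("city", "ames"), 2);
        dictionary.put(new TermKey("name", "alice"), 1);
        dictionary.put(new TermKey("zipcode", "50501"), 3);
        dictionary.put(new TermKey("zipcode", "63601"), 1);
        dictionary.put(new TermKey("zone", "north"), 4);
        return dictionary;
    }

    public static void main(String[] args) {
        // prints {50501=3, 63601=1}
        System.out.println(distinctValues(sampleDictionary(), "zipcode"));
    }
}
```

The early `break` is what keeps the walk limited to one field. As a later reply in this thread notes, per-term counts obtained this way can be off when the index has deletions.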

RE: Return all distinct values

2006-03-30 Thread Paul . Illingworth
The IndexReader.terms() method gets a list of all the terms in an index. You need to somehow limit this to the terms for your ZipCode field which I don't know how to do. Luke has the ability to do this though so it is certainly possible. Regards Paul I.

Re: Return all distinct values

2006-03-30 Thread Volodymyr Bychkoviak
One small note: if the index has deletions, the counters could have wrong values... mark harwood wrote: This example code gets the unique terms for a field and a total num docs for each...

Compound Indexes Problem

2006-03-30 Thread depsi programmer
Hello, I am using Lucene for storing details of my students. I have used SetUseCompoundFile(True) and optimised the indexes. Now I am not able to convert them back to their original form. Thanks in advance, Depsi

Merging partially sorted indices to form a new fully sorted single index?

2006-03-30 Thread chan kang
Hi, I've been trying to show the query results in a reverse-chronological order, and found out that the best way to do so is to pre-sort them if possible, so that, when searching, the relevant documents are shown in the reverse-chronological order (the most recent document at the top) even without r
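One general way to combine several runs that are each already sorted into a single fully sorted sequence is a k-way merge driven by a priority queue. The sketch below illustrates only that general technique. It is not Lucene API and makes no claim about how Lucene merges segments. It merges lists of hypothetical document timestamps, each sorted newest-first, into one newest-first list; the `SortedRunMerger` and `Cursor` names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class SortedRunMerger {

    // Cursor into one sorted run: the run itself and the position of its next element.
    private static final class Cursor {
        final List<Long> run;
        int pos;

        Cursor(List<Long> run) {
            this.run = run;
        }

        long current() {
            return run.get(pos);
        }
    }

    // Merges runs that are each sorted newest-first (descending) into one
    // newest-first list. The heap always exposes the run whose next element
    // is the largest remaining timestamp.
    static List<Long> mergeNewestFirst(List<List<Long>> runs) {
        PriorityQueue<Cursor> heap =
                new PriorityQueue<>((a, b) -> Long.compare(b.current(), a.current()));
        for (List<Long> run : runs) {
            if (!run.isEmpty()) {
                heap.add(new Cursor(run));
            }
        }
        List<Long> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor cursor = heap.poll();
            merged.add(cursor.current());
            cursor.pos++;
            if (cursor.pos < cursor.run.size()) {
                heap.add(cursor); // re-insert with its new head element
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Long>> runs = Arrays.asList(
                Arrays.asList(9L, 5L, 1L),
                Arrays.asList(8L, 7L, 2L));
        // prints [9, 8, 7, 5, 2, 1]
        System.out.println(mergeNewestFirst(runs));
    }
}
```

With k runs and n elements in total, each element passes through the heap once, so the merge costs O(n log k) comparisons.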

writeChars method in IndexOutput

2006-03-30 Thread Dennis Kubes
I was reading up on conversion of characters to UTF-8 and I now understand why it is writing out UTF-8 (to be able to support most of the world's languages with minimal space?). But after reading up on the algorithms for conversion as given below, does the writeChars method not support the U+1→U

Re: writeChars method in IndexOutput

2006-03-30 Thread Yonik Seeley
Lucene doesn't currently output totally valid UTF-8. Patches to make it do so are here: http://www.mail-archive.com/java-dev@lucene.apache.org/msg01987.html Should this be tackled pre or post 2.0? -Yonik http://incubator.apache.org/solr Solr, The Open Source Lucene Search Server

RE: writeChars method in IndexOutput

2006-03-30 Thread Dennis Kubes
Is this modified UTF-8, such as is found in the DataInput interface? Dennis

Re: writeChars method in IndexOutput

2006-03-30 Thread Yonik Seeley
On 3/30/06, Dennis Kubes <[EMAIL PROTECTED]> wrote: > Is this modified UTF-8 such as is found in DataInput interface? Yes, I believe so. -Yonik http://incubator.apache.org/solr Solr, The Open Source Lucene Search Server
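To make the difference between standard UTF-8 and the "modified UTF-8" of the JDK's `DataInput`/`DataOutput` concrete, here is a small sketch using only JDK classes. It shows the JDK's behavior, not Lucene's `writeChars`; whether Lucene's output matches it byte for byte is only what the replies above suggest. The class and helper names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ModifiedUtf8Demo {

    // Bytes of the string body as written by DataOutputStream.writeUTF,
    // with the 2-byte length prefix that writeUTF emits stripped off.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buffer.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Standard UTF-8 encoding of the same string.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+10400 lies outside the Basic Multilingual Plane, so Java stores it
        // as a surrogate pair (two chars).
        String supplementary = new String(Character.toChars(0x10400));
        String nul = String.valueOf((char) 0);

        System.out.println(standardUtf8(supplementary).length); // 4: one 4-byte sequence
        System.out.println(modifiedUtf8(supplementary).length); // 6: each surrogate as 3 bytes
        System.out.println(standardUtf8(nul).length);           // 1: the byte 0x00
        System.out.println(modifiedUtf8(nul).length);           // 2: the bytes 0xC0 0x80
    }
}
```

The two encodings agree for characters from U+0001 to U+FFFF outside the surrogate range; they differ for U+0000 and for supplementary characters, which is why a strict UTF-8 decoder can reject modified UTF-8 output.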

Folksonomy

2006-03-30 Thread msftblows
I saw an implementation of folksonomy with Lucene...does anyone know about that or seen it?

RE: Compound Indexes Problem

2006-03-30 Thread Dennis Kubes
According to the Lucene in Action book, you can convert an index from compound to multi-file format and vice versa by calling the setUseCompoundFile method with true or false. But in running this myself I found that while I can convert from multi-file to compound, it doesn't convert back. Here is the code that I u

Re: Folksonomy

2006-03-30 Thread Erik Hatcher
On Mar 30, 2006, at 2:49 PM, [EMAIL PROTECTED] wrote: I saw an implementation of folksonomy with Lucene...does anyone know about that or seen it? It's no secret that Otis' wonderful Simpy is powered by Lucene. I'm in the process of building a folksonomy-based application that allows collect

Re: Folksonomy

2006-03-30 Thread msftblows
Erik- I would be interested in seeing that...and then converting it to DotLucene :) Keep me updated! -Original Message- From: Erik Hatcher <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thu, 30 Mar 2006 15:24:08 -0500 Subject: Re: Folksonomy On Mar 30, 2006, at 2:49 PM,

Re: Folksonomy

2006-03-30 Thread Otis Gospodnetic
Which one, btw? I'd be curious to know... Otis -- Simpy -- http://www.simpy.com/ -- Tag. Search. Share. - Original Message From: [EMAIL PROTECTED] I saw an implementation of folksonomy with Lucene...does anyone know about that or seen it? --

Re: Data structure of a Lucene Index

2006-03-30 Thread Doug Cutting
I talked about this a bit in a presentation at Haifa last year: http://www.haifa.ibm.com/Workshops/ir2005/papers/DougCutting-Haifa05.pdf See the section on "Seek versus Transfer". Doug Prasenjit Mukherjee wrote: It seems to me that lucene doesn't use B-tree for its indexing storage. Any paper

Implemented subclasses of Similarity class in Lucene

2006-03-30 Thread Ganesh Ramakrishnan
Hi, Is anyone aware of subclasses of the Similarity class in Lucene? Two subclasses are DefaultSimilarity and SimilarityDelegator. Are any other implemented subclasses of Similarity, developed by anyone else, available on the web? For example, Language Model based similarity, or Okapi-BM simi

Any plans for a 1.9.2 release? Need timeout setting!

2006-03-30 Thread Bill Janssen
I presume the patch that gives us a way of overriding the default timeout for write locks has made it into the source DB, but I really need a jar file to point people at which contains it. Any chance of a 1.9.2 release? Bill

RE: Compound Indexes Problem

2006-03-30 Thread depsi programmer
Hello, Thanks for your response. Can you please guide me on how to break this single index into multiple pieces? When I try to do so it corrupts the index. I had created an index with max merge docs set to 10,000 and compound indexes set to true. Now I called optimize with max merge docs set

Re: Compound Indexes Problem

2006-03-30 Thread Raghavendra Prabhu
Does changing the merge factor and setting SetUseCompoundfile(false) split a single index into multiple pieces? I have been doing something similar too and would like to know how it is done. Rgds, Prabhu