Parsnips 1.4

2005-05-05 Thread Bill Tschumy
I have just released Parsnips 1.4, a commercial PIM for storage and retrieval of bookmarks, notes, file URLs and snippets of text. Parsnips is powered by Lucene. Please have a look at: -- Bill Tschumy Otherwise -- Austin, TX http://www.otherwise.com -

Re: what is QueryFilter.bits ???

2005-05-05 Thread Erik Hatcher
On May 5, 2005, at 4:41 PM, Pablo Gomes Ludermir wrote: I would like to know what exactly the bit pattern returned by the method QueryFilter.bits(IndexReader) represents. The BitSet represents the documents in the index, sequentially (in the order that they were indexed). Each bit represents a single docum
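As a rough illustration of the reply above, the following sketch uses plain java.util.BitSet rather than Lucene itself. It assumes that bit i is set when the document numbered i matches the filter's query. The class name, the document numbers and the categoryBits set are made up for the example and are not part of any Lucene API.

```java
import java.util.BitSet;

public class FilterBitsSketch {
    // Counts how many document numbers are set in both bit sets.
    static int countInBoth(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone(); // copy so the inputs stay untouched
        both.and(b);
        return both.cardinality();
    }

    public static void main(String[] args) {
        // Hypothetical index of 8 documents, numbered 0..7 in indexing order.
        BitSet queryBits = new BitSet(8);
        queryBits.set(1);
        queryBits.set(4);
        queryBits.set(6);

        // Hypothetical bits for the documents that belong to one category.
        BitSet categoryBits = new BitSet(8);
        categoryBits.set(4);
        categoryBits.set(5);
        categoryBits.set(6);

        // Walk the set bits: each index is a matching document number.
        for (int doc = queryBits.nextSetBit(0); doc >= 0; doc = queryBits.nextSetBit(doc + 1)) {
            System.out.println("matching doc " + doc);
        }
        System.out.println("matches in category: " + countInBoth(queryBits, categoryBits));
    }
}
```

Intersecting two such bit sets and counting the result is one possible way to get a per-category document count, under the assumption that one bit set per category is available.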

Re: Highlight problem

2005-05-05 Thread markharw00d
Thanks for pointing out this issue. The bug was related to having a doc bigger than the maxNumDocsToAnalyze setting. In this situation, the last fragment created was always sized from the maxNumDocsToAnalyze position to the remainder of the doc (in your case, quite large!). I have fixed this in SVN

Re: Changing default "OR" to "AND" for QueryParser

2005-05-05 Thread Richard Littin
[EMAIL PROTECTED] wrote: Hello all, For us the "OR" default causes confusion among the users because they expect narrower results as they add to the query, but the opposite happens because the terms are OR-ed. Is it possible to request QueryParser to use "AND" as a default instead of "OR"? What API ca

Changing default "OR" to "AND" for QueryParser

2005-05-05 Thread ivj2324234-lucene
Hello all, For us the "OR" default causes confusion among the users because they expect narrower results as they add to the query, but the opposite happens because the terms are OR-ed. Is it possible to request QueryParser to use "AND" as a default instead of "OR"? What API call(s) is responsible for

what is QueryFilter.bits ???

2005-05-05 Thread Pablo Gomes Ludermir
Dear all, I would like to know what exactly the bit pattern returned by the method QueryFilter.bits(IndexReader) represents. Some time ago I posted a question here about a categorized search. Basically I needed to know in how many different categories (category is a Field that I index with the document) hav

Re: ArrayIndexOutOfBoundsException on BooleanScorer.score()

2005-05-05 Thread Matt Magoffin
The exception does come on the heels of an update to the index by a different thread than the one the search runs in. These log statements show the operations going on just prior to the exception: May-05 12:36:05 DEBUG - Indexing Lead 1024 May-05 12:36:06 TRACE - CON Closing IndexWriter [EMAIL PRO

ArrayIndexOutOfBoundsException on BooleanScorer.score()

2005-05-05 Thread Matt Magoffin
Hello, I'm having a tough time trying to get to the root of an exception I see sometimes on my Lucene 1.4.3 index. The exception is: java.lang.ArrayIndexOutOfBoundsException: 4 at org.apache.lucene.search.BooleanScorer.score(BooleanScorer.java:126) at org.apache.lucene.search.Scorer.score(Scorer

Re: highlight problem

2005-05-05 Thread markharw00d
All looks OK with that bit. At the risk of sounding obvious: are you mistaking the results from multiple documents for the highlighted content from just one document? E.g. the end of your "for" loop looks like this: System.out.print(result); } and you assume the printed display is from just

Re: highlight problem

2005-05-05 Thread yinjin
Hi, Mark, Please ignore my previous posting. I sent it by accident. Sorry for the confusion. The complete code is here: === Analyzer analyzer = new StandardAnalyzer(); BufferedReader in = new BufferedReader(new InputStreamReader(System.

Re: categorized search

2005-05-05 Thread Pablo Gomes Ludermir
Chris, That was partially what I needed. You got it right when I said I needed the number of categories in which a particular term appears (and it works). But I would also like to know in how many documents in each category that term appears. For instance: title:lucene appears in the category "searc

Re: highlight problem

2005-05-05 Thread Erik Hatcher
Ying - please properly subscribe to the java-user list - I've moderated in each of your mails thus far. Erik On May 5, 2005, at 2:18 PM, [EMAIL PROTECTED] wrote: Hi, Mark, Sorry for the confusion. The complete code is here: === Analyzer analyzer =

Re: highlight problem

2005-05-05 Thread yinjin
Hi, Mark, Sorry for the confusion. The complete code is here: === Analyzer analyzer = new StandardAnalyzer(); BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); String line = in.readLine(); if (line.length() == -

Re: Distribution Strategies?

2005-05-05 Thread Chris Lu
I have tried to synchronize indexes among servers. My attempt was to mimic what Nutch has done, using Nutch Distributed File System. But I found it's kind of complicated. Then I stopped. If someone has some success there, please share it. Chris Steven J. Owens wrote: Hi guys, A friend just as

Re: highlight problem

2005-05-05 Thread mark harwood
As much as you have shown of the example output is roughly what I would expect - using the default SimpleFragmenter you get fragments of roughly 100 characters, and you have shown 3 fragments of 97, 100 and 105 chars separated by "...". > Of course the result is far more than this. So ar

RE: PerFieldSimilarity

2005-05-05 Thread Robichaud, Jean-Philippe
Thanks for the clarification... While studying the doc about Similarity in more depth, I discovered something that is troubling me a little. The idf is calculated using the following formula: log(numDocInIndex / (numDocWithTerm_t + 1)) + 1. While I agree this is fine for most applications, it is not
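To make the quoted formula concrete, here is a small stand-alone sketch in plain Java that evaluates it for a few document frequencies. It assumes the natural logarithm; the class and method names are hypothetical and this is not the Lucene Similarity class itself.

```java
public class IdfSketch {
    // idf = ln(numDocs / (docFreq + 1)) + 1, following the formula quoted above.
    // Assumption: "Log" in the message means the natural logarithm.
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        int numDocs = 1000; // hypothetical index size
        int[] docFreqs = {0, 9, 99, 999};
        for (int docFreq : docFreqs) {
            System.out.println("docFreq=" + docFreq + " idf=" + idf(docFreq, numDocs));
        }
    }
}
```

The value shrinks as a term appears in more documents, and the +1 in the denominator keeps the expression defined for a term that appears in no document.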

Re: highlight problem

2005-05-05 Thread yinjin
Quoting mark harwood <[EMAIL PROTECTED]>: Hi, Mark, I just used StandardAnalyzer and the code is as follows: = Analyzer analyzer = new StandardAnalyzer(); BufferedReader in = new BufferedReader(new InputStreamReader(System.in));

Re: indexing synonyms / reducing the index size

2005-05-05 Thread Andrew Boyd
I have done the same as Luke but I needed Lucene 1.9rc1 to accomplish it. I tried it with 1.4.3 but the QueryParser could not handle it. Andrew -Original Message- From: Luke Shannon <[EMAIL PROTECTED]> Sent: May 5, 2005 8:54 AM To: java-user@lucene.apache.org, Pablo Gomes Ludermir <[EMAIL

String sort order

2005-05-05 Thread Daniel Massie
Hi, I am looking for a way to provide a custom comparator to Lucene which will allow me to sort Strings by case as well as value, essentially using equalsIgnoreCase. Can anyone suggest a simple method of achieving this? Thanks Daniel --

Re: indexing synonyms / reducing the index size

2005-05-05 Thread Luke Shannon
Hi Pablo; I handle synonyms in the Query rather than the Index. Whenever I build a query I check to see if there is a synonym for each word, or a replacement for the entire string the user is searching on. If there is (either or both cases) I include all the synonyms/replacement strings applicable
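The query-side approach described above might look roughly like the sketch below. It is plain Java with no Lucene dependency and covers only the per-word case, not the whole-string replacement. The class name, the synonym map and the choice of building an OR-ed query string are assumptions made for illustration, not the poster's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SynonymQuerySketch {
    // Rewrites each word that has synonyms into a group like "(word OR syn1 OR syn2)".
    static String expand(String userQuery, Map<String, List<String>> synonyms) {
        List<String> clauses = new ArrayList<>();
        for (String word : userQuery.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for an empty query
            }
            List<String> alternatives = synonyms.get(word.toLowerCase());
            if (alternatives == null || alternatives.isEmpty()) {
                clauses.add(word);
            } else {
                StringBuilder group = new StringBuilder("(").append(word);
                for (String alternative : alternatives) {
                    group.append(" OR ").append(alternative);
                }
                clauses.add(group.append(")").toString());
            }
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        // Hypothetical synonym table, keyed by lower-case word.
        Map<String, List<String>> synonyms = new HashMap<>();
        synonyms.put("car", Arrays.asList("auto", "automobile"));
        System.out.println(expand("fast car", synonyms));
    }
}
```

Keeping synonyms out of the index this way trades a smaller index for longer queries; the resulting string would still have to be handed to whatever query parser the application uses.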

Re: highlight problem

2005-05-05 Thread mark harwood
> One of my search results from our records contains far too much of the text. This is a problem I haven't seen before. I suspect it may have something to do with your choice of analyzer. Your paper will only ever be fragmented on "token gap" boundaries, i.e. points in the token stream where t