Re: luceneweb: 500 Exception with results.jsp "parse(java.lang.String)"

2006-06-15 Thread Erik Hatcher
Ron, This issue was corrected just after the release of 2.0. You can grab a new results.jsp from Subversion or download it from here: Erik On Jun 15, 2006, at 9:48 PM, Ron Parker wrote:

luceneweb: 500 Exception with results.jsp "parse(java.lang.String)"

2006-06-15 Thread Ron Parker
Just installed Lucene 2.0.0, java version "1.4.2_10", Resin-3.0.18. The test index and query from the command line (http://lucene.apache.org/java/docs/demo.html) work successfully. I dropped the lucene.war into my Resin webapps directory and it created the luceneweb directory. When I naviga

Re: CJKAnalyzer - does it work?

2006-06-15 Thread Erik Hatcher
On Jun 15, 2006, at 7:29 PM, Ray Tsang wrote: Where did you get that Chinese sentence from? That's funny! haha. 我不知道! ;) Erik On 6/15/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: Rob, Your example is, hopefully, not exact since you used "C1..." which I presume was not what you or

Re: CJKAnalyzer - does it work?

2006-06-15 Thread Ray Tsang
Hi Erik, Where did you get that Chinese sentence from? That's funny! haha. ray, On 6/15/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: Rob, Your example is, hopefully, not exact since you used "C1..." which I presume was not what you originally tested with. CJKAnalyzer is working fine for me i

Re: CJKAnalyzer - does it work?

2006-06-15 Thread Erik Hatcher
Rob, Your example is, hopefully, not exact since you used "C1..." which I presume was not what you originally tested with. CJKAnalyzer is working fine for me in this example adapted from your code: public void testCJKAnalyzer() throws Exception { RAMDirectory directory = new RAMDire

RE: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Rob Staveley (Tom)
The penny drops. Thank you so much for your time, Chris :-) -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: 15 June 2006 18:43 To: java-user@lucene.apache.org Subject: RE: BooleanQuery.TooManyClauses on MultiSearcher : Incidentally, I'm getting BooleanQuery.Too

RE: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Chris Hostetter
: Incidentally, I'm getting BooleanQuery.TooManyClauses when I search on : "james", but I don't when I search on "James". Surely the number of clauses : isn't dependent on the number of hits?! not the number of hits -- just the number of terms in your index that start with the prefix. : However
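
The point in the reply above can be modelled without Lucene. The following is a plain-Java sketch, not Lucene code: it assumes a prefix query is expanded into one clause per indexed term that starts with the prefix, and that expansion fails once a clause limit is exceeded (the default limit in Lucene's BooleanQuery is 1024). The class name PrefixExpansionSketch, its method, and the sample terms are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Plain-Java model of prefix expansion; hypothetical names, not the Lucene API.
class PrefixExpansionSketch {

    // Returns one "clause" per indexed term that starts with the prefix.
    // Throws if more than maxClauses terms match, mimicking a clause limit.
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // tailSet starts at the first term >= prefix in sorted order.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // terms are sorted, so no later term can match
            }
            if (clauses.size() >= maxClauses) {
                throw new IllegalStateException("too many clauses for prefix: " + prefix);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Sample terms are made up for illustration.
        SortedSet<String> terms = new TreeSet<>(
                Arrays.asList("James", "james", "james.smith", "jameson", "jane"));
        // The clause count depends on how many distinct terms share the prefix,
        // not on how many documents contain those terms.
        System.out.println(expand(terms, "james", 1024));
        System.out.println(expand(terms, "James", 1024));
    }
}
```

In this model a lowercase prefix and a capitalised prefix can expand to very different numbers of clauses simply because different sets of terms share each prefix, which is one way the "james" versus "James" difference could arise.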

RE: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Rob Staveley (Tom)
It is a good point that you raise, Chris. I'm already treating To, Cc, From, MAIL-FROM, and RCPT-TO as separate fields (the latter fields being from SMTP). I'd like a "fast and loose" query on james, to find anything relevant to James. I guess to avoid getting too many Boolean terms, I should have

RE: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Chris Hostetter
: I'm still trying to get my head around ConstantScorePrefixQuery. Could I : simply use this as a drop-in replacement for PrefixQuery? that's what it was designed to do .. you just need to grab a copy of ConstantScorePrefixQuery and PrefixFilter from the same package (ConstantScorePrefixQuery is

RE: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Rob Staveley (Tom)
Incidentally, I'm getting BooleanQuery.TooManyClauses when I search on "james", but I don't when I search on "James". Surely the number of clauses isn't dependent on the number of hits?! However, I know that "fred" is relatively uncommon in my index and "neil" is relatively common and yet "fred"

RE: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Chris Hostetter
: I'd quite like to avoid tokenising james from [EMAIL PROTECTED], because I : like the way PrefixQuery (when it works) matches [EMAIL PROTECTED] well sure ... but if you say that because you want "." and "-" to be treated specially you could write an Email EmailAnalyzer that produces the token

RE: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Rob Staveley (Tom)
I'm still trying to get my head around ConstantScorePrefixQuery. Could I simply use this as a drop-in replacement for PrefixQuery? -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: 15 June 2006 18:22 To: java-user@lucene.apache.org; eks dev Subject: Re: BooleanQuery

Re: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Chris Hostetter
: Did not check it, but solr is using SkippingFilter which is not yet : committed in Lucene... so this will maybe not work? Solr does not use SkippingFilter. -Hoss

RE: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Rob Staveley (Tom)
I'd quite like to avoid tokenising james from [EMAIL PROTECTED], because I like the way PrefixQuery (when it works) matches [EMAIL PROTECTED] too. I'll take a look at ConstantScorePrefixQuery -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: 15 June 2006 16:50 To:

Re: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread eks dev
Did not check it, but solr is using SkippingFilter which is not yet committed in Lucene... so this will maybe not work? By the way, any reason today not to commit SkippingFilter to Lucene? I actually see nothing to do for this, but to commit existing SkippingFilter. If there is something I do

Re: Query For Top Values

2006-06-15 Thread Chris Hostetter
google the mailing list archives for "category counts" and "facet counts" and i think you'll find several past discussions on this topic. : Date: Thu, 15 Jun 2006 10:45:14 -0400 : From: Mike Richmond <[EMAIL PROTECTED]> : Reply-To: java-user@lucene.apache.org : To: java-user@lucene.apache.org : S

RE: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Chris Hostetter
: I guess the most expensive thing I'm doing from the perspective of Boolean : clauses is heavily using PrefixQuery. : : I want my user to be able to find e-mail to, cc or from [EMAIL PROTECTED], so : I opted for PrefixQuery on James. Bearing in mind that this is causing me : grief with BooleanQue

Query For Top Values

2006-06-15 Thread Mike Richmond
Hello All, Can anyone recommend a solution to get the top values of a field from the results of a query? For example, if I have a field named "from" is there a way to get the top occurring values in that field after I filter the results based on a query? Note: Luke has a great tool that enumerat

Re: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Michael D. Curtin
Rob Staveley (Tom) wrote: I guess the most expensive thing I'm doing from the perspective of Boolean clauses is heavily using PrefixQuery. I want my user to be able to find e-mail to, cc or from [EMAIL PROTECTED], so I opted for PrefixQuery on James. Bearing in mind that this is causing me grie

Re: Searching UN_TOKENIZED fields

2006-06-15 Thread deshmol-lists
Thanks Michael, In general my queries could be more complex than the example I outlined earlier, so I do need to use the Query Parser. Hence, PerFieldAnalyzerWrapper and KeywordAnalyzer seemed to do the trick for me. PerFieldAnalyzerWrapper analyzer = new PerFieldAnalyzerWrapper(new StandardAnalyz
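
The idea behind the per-field approach in the message above can be modelled in plain Java. This is a sketch, not Lucene code: it assumes the default analysis lowercases and splits on whitespace (a rough stand-in for a tokenizing analyzer), while a keyword-style analysis keeps the whole value as a single term. The class PerFieldAnalysisSketch and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of per-field analysis; hypothetical names, not the Lucene API.
class PerFieldAnalysisSketch {

    // Rough stand-in for a tokenizing analyzer: lowercase, split on whitespace.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // Stand-in for a keyword-style analyzer: the whole value is one term.
    static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldAnalysisSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Registers a field-specific analysis that overrides the default.
    void addAnalyzer(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Uses the field-specific analysis if one is registered, else the default.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    public static void main(String[] args) {
        PerFieldAnalysisSketch analyzer =
                new PerFieldAnalysisSketch(PerFieldAnalysisSketch::tokenize);
        analyzer.addAnalyzer("PROJECT", PerFieldAnalysisSketch::keyword);
        // "body" is a made-up field name that falls back to the default analysis.
        System.out.println(analyzer.analyze("PROJECT", "Apache Lucene"));
        System.out.println(analyzer.analyze("body", "Apache Lucene"));
    }
}
```

The untokenized field holds the single term "Apache Lucene", so only the keyword-style analysis yields a query term that can equal it; the default analysis yields two separate lowercase terms, neither of which matches.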

RE: BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Rob Staveley (Tom)
I guess the most expensive thing I'm doing from the perspective of Boolean clauses is heavily using PrefixQuery. I want my user to be able to find e-mail to, cc or from [EMAIL PROTECTED], so I opted for PrefixQuery on James. Bearing in mind that this is causing me grief with BooleanQuery.TooManyCl

Re: Searching UN_TOKENIZED fields

2006-06-15 Thread Michael D. Curtin
[EMAIL PROTECTED] wrote: Hi, I have a field indexed as follows: new Field(name, value, Store.YES, Index.UN_TOKENIZED) I would like to search this field for exact match of the query term. Thus if, for instance in the above code snippet: String name="PROJECT"; String value="Apache Lucene";

Searching UN_TOKENIZED fields

2006-06-15 Thread deshmol-lists
Hi, I have a field indexed as follows: new Field(name, value, Store.YES, Index.UN_TOKENIZED) I would like to search this field for exact match of the query term. Thus if, for instance in the above code snippet: String name="PROJECT"; String value="Apache Lucene"; I would like to get a hit

BooleanQuery.TooManyClauses on MultiSearcher

2006-06-15 Thread Rob Staveley (Tom)
I've just added a 3rd index directory (i.e. 3rd IndexSearcher) to my MultiSearcher and I'm getting BooleanQuery.TooManyClauses errors on queries which were working happily on 2 indexes. Here's an example query, which hopefully you'll find self-explanatory from the XML structure. 8<

FieldCache problems

2006-06-15 Thread Marcus Falck
Hello, I'm going to index a lot of data in Lucene. I have just written a functional prototype. One requirement for the application I'm designing the prototype for is the ability to present the search results ordered by date and since the data is very frequently changed I can't have the Ind

RE: Document clustering using lucene

2006-06-15 Thread John Hamilton
I've been thinking about a similar problem. However, it seems that the similarity score returned by a search is only relevant within those search results. You can't compare the similarity scores from two different searches. I think you will have to compute the similarities yourself using the t
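
The message above is cut off before it says what data the similarities would be computed from. Purely as an assumption, the sketch below represents each document as a map from term to frequency and computes the cosine similarity between two such maps, which gives a score that is comparable across any pair of documents. The class CosineSimilaritySketch and its methods are hypothetical and are not part of Lucene.

```java
import java.util.HashMap;
import java.util.Map;

// Cosine similarity over term-frequency maps; an assumed document
// representation, not something stated in the thread.
class CosineSimilaritySketch {

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int frequency : vector.values()) {
            sumOfSquares += (double) frequency * frequency;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Returns a value in [0, 1] for non-negative frequencies;
    // returns 0 if either document has no terms.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += (double) entry.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        // Sample frequencies are made up for illustration.
        Map<String, Integer> doc1 = new HashMap<>();
        doc1.put("lucene", 2);
        doc1.put("index", 1);
        Map<String, Integer> doc2 = new HashMap<>();
        doc2.put("lucene", 1);
        doc2.put("cluster", 3);
        System.out.println(cosine(doc1, doc2));
    }
}
```

Raw frequencies are used here for brevity; a weighting scheme could be substituted without changing the structure.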

Re: Document clustering using lucene

2006-06-15 Thread Paul Elschot
On Thursday 15 June 2006 13:50, Prasenjit Mukherjee wrote: > I want to do some document clustering on a corpus of ~ 100,000 > documents, with average doc size being ~ 7k. I have looked into carrot2 > but it seems to work only for relatively short documents and has some > scaling issues for lar

Document clustering using lucene

2006-06-15 Thread Prasenjit Mukherjee
I want to do some document clustering on a corpus of ~ 100,000 documents, with average doc size being ~ 7k. I have looked into carrot2 but it seems to work only for relatively short documents and has some scaling issues for large corpus. Certainly for this kind of corpus size, one cannot us

CJKAnalyzer - does it work?

2006-06-15 Thread Robert Haycock
Hi, I have a very simple example. An IndexWriter (Lucene 1.9.0) with CJKAnalyzer (latest version as of today). A Chinese friend of mine has given me a sentence and a word that appears in that sentence, eg: "C1C2C3C4C5C6C7C8" where the word is "C3C4". Here's the code segment: IndexWriter writer = n

RE: Questions on Query Scorer

2006-06-15 Thread Ferdinand Chan
Thanks Mile, But in my code, I haven't queried the term prohibited. Also, in my index, there isn't a field called prohibited. -Original Message- From: Mile Rosu [mailto:[EMAIL PROTECTED] Sent: Thursday, June 15, 2006 5:44 PM To: java-user@lucene.apache.org Subject: RE: Questions on Query Sc

RE: Questions on Query Scorer

2006-06-15 Thread Mile Rosu
Hello, The problem may be rather in the name of the field you are querying - "prohibited" in your case. You can check with Luke (http://www.getopt.org/luke/) the structure of the index on which you are performing your query. Mile -Original Message- From: Ferdinand Chan [mailto:[EMAIL PRO

Question on MultiFieldQueryParser

2006-06-15 Thread Ferdinand Chan
My webapp is developed with Lucene 1.4.3 and I want to upgrade the Lucene library to version 2.0. But in Lucene 2.0, the MultiFieldQueryParser class was deprecated. I tried to rewrite the code using a BooleanQuery as follows. Original code: String[] fields = {"TERM_A","TERM_B"}; quer

Questions on Query Scorer

2006-06-15 Thread Ferdinand Chan
How can I create a QueryScorer in Lucene 2.0? When I create a QueryScorer using the following code, BooleanQuery booleanQuery = new BooleanQuery(); booleanQuery.add(q1,BooleanClause.Occur.SHOULD); booleanQuery.add(q2,BooleanClause.Occur.SHOULD); QueryScorer scorer = new QueryScorer