Re: How to integrate snowball in lucene

2006-03-06 Thread Patrick Kimber
Hi, You should download the snowball contribution, which is in the Subversion repository: http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/snowball This can be built using Ant. Patrick On 06/03/06, Haritha_Parvatham [EMAIL PROTECTED] wrote: Hi, Can anyone guide me to integrate

Exact Search

2006-03-06 Thread Waleed Tayea
Dear All, How can I perform an exact search on an index constructed with a morphological analyzer? Thanks in advance, Waleed

Re: carrot2 vs. vivisimo

2006-03-06 Thread Dawid Weiss
Hello, my team has been working for the last couple of days on integrating carrot2 into our project as a sort of SRC (search result clustering) solution. Great to hear this; is there a public URL or something? I was rather impressed with the results, until I checked out vivisimo's demo and

Re: Exact Search

2006-03-06 Thread Erik Hatcher
Could you please provide an example of some sample text, the terms that are emitted by the analyzer, and a query you'd like to work? Erik On Mar 6, 2006, at 5:50 AM, Waleed Tayea wrote: Dear All. How can I perform an exact search on an index constructed with a morphological

RE: How to integrate snowball in lucene

2006-03-06 Thread Haritha_Parvatham
Hi Kimber, Thanks for replying to my query. I have downloaded Snowball. After building it, what is the next step? How do I implement Snowball in Lucene? Please reply. Thanks, Haritha -Original Message- From: Patrick Kimber [mailto:[EMAIL PROTECTED] Sent: Monday, March 06, 2006 3:52 PM To:

Re: How to integrate snowball in lucene

2006-03-06 Thread Erik Hatcher
On Mar 6, 2006, at 6:30 AM, Haritha_Parvatham wrote: Hi Kimber, Thanks for replying to my query. I have downloaded Snowball. After building it, what is the next step? How do I implement Snowball in Lucene? Please reply. Simply use the SnowballAnalyzer that is part of the JAR file that got built.

File Name Search

2006-03-06 Thread Brian
Quick Question, Is it possible to create an index search based on file names? Thanks, B

Search for synonyms - implementation for review

2006-03-06 Thread Andrew Schetinin
Dear all, My colleague, Mr. Ziv Gome, and I would like to present here an implementation of synonym search that we use in our server. It will probably be of interest to those who have worked on synonyms or are going to implement synonym search. We hope that this mail will raise interesting ideas and

RE: Distributed Lucene..

2006-03-06 Thread Andrew Schetinin
Hello, We are implementing a distributed searcher and indexer based on Lucene. I cannot share its code, but I can provide hints based on our experience. What we did, basically, is have several machines index documents and create small Lucene indexes. We hacked :-) IndexWriter of Lucene to

Re: File Name Search

2006-03-06 Thread Volodymyr Bychkoviak
Yes, it is possible. I've developed such a search for our LAN shared files. I'm using the technique of rotating filenames to improve the performance of wildcard queries. Details: http://mail-archives.apache.org/mod_mbox/lucene-java-user/200506.mbox/[EMAIL PROTECTED] Other improvements: WildcardQuery is
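One possible reading of the "rotating filenames" idea, offered as a minimal sketch and not necessarily what the linked post implements: store every rotation of the filename plus an end marker, so that an infix wildcard such as *2006* turns into a cheap prefix lookup. The class name RotationSketch, the '$' end marker, and the sorted set standing in for an index's term dictionary are all assumptions made for illustration; no Lucene API is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class RotationSketch {
    // Hypothetical end marker; assumed never to occur inside a filename.
    static final char END = '$';

    // All rotations of name + END, e.g. "ab" -> "ab$", "b$a", "$ab".
    static List<String> rotations(String name) {
        String s = name + END;
        List<String> out = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            out.add(s.substring(i) + s.substring(0, i));
        }
        return out;
    }

    // True if some stored rotation starts with the infix.
    // On a sorted set this is a single ceiling() lookup instead of a full scan.
    static boolean containsInfix(TreeSet<String> terms, String infix) {
        String hit = terms.ceiling(infix);
        return hit != null && hit.startsWith(infix);
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(rotations("report_2006.txt"));
        System.out.println(containsInfix(terms, "2006")); // true
        System.out.println(containsInfix(terms, "2007")); // false
    }
}
```

The trade-off in this sketch is index size: a name of n characters produces n + 1 terms, in exchange for avoiding a scan over all terms at query time.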

Re: File Name Search

2006-03-06 Thread Erik Hatcher
On Mar 6, 2006, at 8:07 AM, Brian wrote: Quick Question, Is it possible to create an index search based on file names? Of course. One option is to simply make filename another field. The question is, should it be an exact match on filename for querying? Or should the filename get

Re: File Name Search

2006-03-06 Thread Brian
Cool, Basically I have something similar to: name_division.date_order_code So I'm guessing I need to tokenize. Thanks, B --- Erik Hatcher [EMAIL PROTECTED] wrote: On Mar 6, 2006, at 8:07 AM, Brian wrote: Quick Question, Is it possible to create an index search based on file
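As a rough illustration of what tokenizing a name of the form name_division.date_order_code could look like, here is a minimal sketch in plain Java. It assumes that '_' and '.' are the only separators and that matching should be case-insensitive; both are assumptions, and the class name FilenameTokens is hypothetical. Whether this split is the right one depends on the queries that need to work.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class FilenameTokens {
    // Split on runs of '_' and '.', after lowercasing for case-insensitive matching.
    // A name that starts with a separator would yield an empty first token.
    static List<String> tokenize(String filename) {
        return Arrays.asList(filename.toLowerCase(Locale.ROOT).split("[_.]+"));
    }

    public static void main(String[] args) {
        // Prints [name, division, date, order, code]
        System.out.println(tokenize("name_division.date_order_code"));
    }
}
```

Each resulting token could then be searched on its own, while the untouched filename could be kept in a separate, untokenized field for exact matches.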

Re: File Name Search

2006-03-06 Thread Brian
GREAT!! I don't have any questions today, I just wanted to make sure it was possible first. I'll be starting this in a few days (when I get an Okie Dokie...) Then I'm sure I'll have some questions. Thanks for the link and the reply. V/R B --- Volodymyr Bychkoviak [EMAIL PROTECTED] wrote: Yes.

Multisearch

2006-03-06 Thread WATHELET Thomas
I made a multi search into my Lucene index. It works properly, but I would like to know if it's possible to find out which index a document belongs to.

Re: Distributed Lucene..

2006-03-06 Thread Samuru Jackson
Do you plan to release some kind of commercial product including an API? I ask because I'm evaluating different technologies for a prototype which is part of my diploma thesis. The problem is that I have to deal with really huge amounts of data and one machine is simply not enough to handle those

Re: Search for synonyms - implementation for review

2006-03-06 Thread mark harwood
Sounds like you've been tackling a number of the issues I was concerned with fuzzy searching. It's essentially the same problem - the user types one word and the engine searches for several variants. The FuzzyLikeThisQuery class in the queries module of the contrib area in SVN contains similar

Re: Multisearch

2006-03-06 Thread Erik Hatcher
On Mar 6, 2006, at 10:05 AM, WATHELET Thomas wrote: I made a multi search into my Lucene index. It works properly, but I would like to know if it's possible to find out which index a document belongs to. This just came up the other day as well, and was covered in the past. Here's the thread

Re: File Name Search

2006-03-06 Thread Erik Hatcher
If and how you tokenize is entirely dependent on how the queries need to work. Lucene index design really is driven backwards from querying needs. Erik On Mar 6, 2006, at 10:00 AM, Brian wrote: Cool, Basically I have something similar to: name_division.date_order_code So I'm

Help on Similarity

2006-03-06 Thread Eugene
Hi, I tried implementing my own Similarity and setting it in IndexWriter.setSimilarity(new CosSimilarity()). But, there's something weird, it doesn't seem to call the methods in my Similarity. For example, when I set the idf to return 0.0f the Similarity still gives me a score 0.0f. How

RE: Multisearch

2006-03-06 Thread WATHELET Thomas
Thanks a lot. -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: 06 March 2006 16:25 To: java-user@lucene.apache.org Subject: Re: Multisearch On Mar 6, 2006, at 10:05 AM, WATHELET Thomas wrote: I made a multi search into my Lucene index. It works properly but I

Re: Help on Similarity

2006-03-06 Thread Eugene
With respect to the earlier post there seems to be a bug in lucene 1.9.1 I tried using the similarity below and changed idf to: public float idf(int docFreq, int numDocs) { float f = (float)(Math.log((double)numDocs/(double)(docFreq+1) + 1.0)); return f; } Now, when I print the
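To make the effect of the quoted idf change concrete, here is a minimal stand-alone sketch that evaluates the formula from the message next to a variant with the 1.0 added outside the logarithm. The second method is included only for comparison and is assumed, not confirmed by this thread, to correspond to the stock formula. The class and method names are hypothetical, and nothing here calls into Lucene.

```java
final class IdfSketch {
    // Variant quoted in the message: the 1.0 is added inside the logarithm.
    static float idfQuoted(int docFreq, int numDocs) {
        return (float) Math.log((double) numDocs / (double) (docFreq + 1) + 1.0);
    }

    // Comparison variant (an assumption): the 1.0 is added outside the logarithm.
    static float idfOutside(int docFreq, int numDocs) {
        return (float) (Math.log((double) numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        int docFreq = 9;
        int numDocs = 1000;
        // ln(1000/10 + 1) = ln(101), roughly 4.615
        System.out.println(idfQuoted(docFreq, numDocs));
        // ln(1000/10) + 1 = ln(100) + 1, roughly 5.605
        System.out.println(idfOutside(docFreq, numDocs));
    }
}
```

With the 1.0 inside the logarithm the argument is always greater than 1 for a non-empty index, so the quoted variant is strictly positive for any term.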

Using NOT queries inside parentheses

2006-03-06 Thread Satuluri, Venu_Madhav
Hi, The following query does not work as expected for me: alwaysTrueField:true (-name:john) neither does this: alwaysTrueField:true +(-name:john) It returns zero results, despite there being many documents without name john. (alwaysTrueField is, needless to say, true for all documents). This

RE: Search for synonyms - implementation for review

2006-03-06 Thread Rami Hansenne
Hi, I've been working on a project where Lucene queries were expanded with synonyms/related concepts and used a DisjunctionMaxQuery with lower boost factors for the synonym subqueries. This solved part of the problem, but still a number of annoying side effects remained. I've experimented a

Re: Help on Similarity

2006-03-06 Thread Chris Hostetter
: I tried implementing my own Similarity and setting it in : IndexWriter.setSimilarity(new CosSimilarity()). That only changes the Similarity used by the IndexWriter when writing out the index files (which is really only used to get the lengthNorm). If you want to change the Similarity used at

Re: Using NOT queries inside parentheses

2006-03-06 Thread Chris Hostetter
: The following query does not work as expected for me: : alwaysTrueField:true (-name:john) : neither does this: : alwaysTrueField:true +(-name:john) : Does lucene run a sub-query for each part of the query inside : parentheses, which is why the NOT query that is alone doesn't work? I am
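A simplified set-based model may help show why a group containing only a NOT clause can come back empty. The reply above is truncated, so this is an illustration under stated assumptions and not a summary of it: the model assumes that a boolean group starts from the documents matched by its positive clauses and then removes the prohibited ones, so a group with no positive clause has nothing to start from. The class and method names are hypothetical and no Lucene API is used.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class NegativeClauseSketch {
    // Candidates come from the positive clause (none if it is absent),
    // then documents matching the prohibited clause are removed.
    static Set<Integer> evaluate(Set<Integer> positive, Set<Integer> prohibited) {
        Set<Integer> result = new TreeSet<>();
        if (positive != null) {
            result.addAll(positive);
        }
        result.removeAll(prohibited);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> alwaysTrue = new TreeSet<>(Arrays.asList(0, 1, 2, 3));
        Set<Integer> john = new TreeSet<>(Arrays.asList(1));

        // Models the group (-name:john) on its own: no positive clause, so nothing matches.
        System.out.println(evaluate(null, john)); // []

        // Models +alwaysTrueField:true -name:john in a single group.
        System.out.println(evaluate(alwaysTrue, john)); // [0, 2, 3]
    }
}
```

Under this model, placing the negative clause in the same group as a clause that matches every document gives the intended result.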

sumOfSquaredWeights for lengthNorm

2006-03-06 Thread Eugene
Hi, I would like to override the Similarity class lengthNorm(String fieldName, int numTerms) so that it behaves similar to queryNorm(float sumOfSquaredWeights). So the method signature becomes lengthNorm(String fieldName, float sumOfSquaredWeights) where sumOfSquaredWeights = sum of the

Re: MultiPhraseQuery

2006-03-06 Thread Daniel Naber
On Sunday 05 March 2006 19:03, Eric Jain wrote: I need to write a function that copies a MultiPhraseQuery and changes the field the query applies to. Unfortunately the API allows access to neither the contained terms nor the field! The other query classes I have so far dealt with all seem to

Re: MultiPhraseQuery

2006-03-06 Thread Erik Hatcher
On Mar 6, 2006, at 4:43 PM, Daniel Naber wrote: On Sunday 05 March 2006 19:03, Eric Jain wrote: I need to write a function that copies a MultiPhraseQuery and changes the field the query applies to. Unfortunately the API allows access to neither the contained terms nor the field! The other

Re: Multisearch

2006-03-06 Thread Raul Raja Martinez
Wouldn't it make sense to have a Hit know where it came from, such as hit.getIndex(), instead of having to invoke subSearcher or subDoc? Just a thought. Erik Hatcher wrote: On Mar 6, 2006, at 10:05 AM, WATHELET Thomas wrote: I made a multi search into my Lucene index. It works properly but I
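The subSearcher and subDoc calls mentioned above map a document number from the combined search back to one sub-index. A minimal model of that mapping, assuming the sub-indexes are simply concatenated in order, is sketched below. The class SubIndexSketch and its methods are hypothetical stand-ins written in plain Java, not the MultiSearcher implementation.

```java
final class SubIndexSketch {
    // starts[i] is the first global document number of sub-index i;
    // the last entry is the total number of documents.
    private final int[] starts;

    SubIndexSketch(int[] sizes) {
        starts = new int[sizes.length + 1];
        for (int i = 0; i < sizes.length; i++) {
            starts[i + 1] = starts[i] + sizes[i];
        }
    }

    // Which sub-index a global document number falls into.
    int subIndex(int globalDoc) {
        if (globalDoc < 0 || globalDoc >= starts[starts.length - 1]) {
            throw new IllegalArgumentException("document number out of range: " + globalDoc);
        }
        int i = 0;
        while (globalDoc >= starts[i + 1]) {
            i++;
        }
        return i;
    }

    // The document number relative to its own sub-index.
    int subDoc(int globalDoc) {
        return globalDoc - starts[subIndex(globalDoc)];
    }

    public static void main(String[] args) {
        SubIndexSketch m = new SubIndexSketch(new int[] {10, 5, 20});
        System.out.println(m.subIndex(12) + " / " + m.subDoc(12)); // 1 / 2
        System.out.println(m.subIndex(15) + " / " + m.subDoc(15)); // 2 / 0
    }
}
```

A linear scan keeps the sketch short; with many sub-indexes a binary search over the start offsets would be the more usual choice.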

BooleanQuery$TooManyClauses with 1.9.1 when Number RangeQuery

2006-03-06 Thread Youngho Cho
Hello, I upgraded to 1.9.1 and reindexed. I used NumberTools when indexing the numbers. After the upgrade I got the following error on a number range query, with query 2006-03-07 09:08:03,216 [TP-Processor3] DEBUG com.nannet.jettiger.util.word.JetTigerAdapter - Queryafter=+prntid:56 +(+(+attid:113

Re: Help interpreting explanation

2006-03-06 Thread Eugene
Thanks, Chris, for your clear explanations. It seems there is a lot of info on using Lucene, but info on the internal workings of Lucene is hard to come by. I have some more questions, which I'll ask in-line. Chris Hostetter wrote: : Since i'm using a boolean OR query i figured it must be

Re: sumOfSquaredWeights for lengthNorm

2006-03-06 Thread Eugene
Hi, My comments in-line. Chris Hostetter wrote: : I would like to override the Similarity class lengthNorm(String : fieldName, int numTerms) so that it behaves similar to queryNorm(float : sumOfSquaredWeights). So the method signature becomes lengthNorm(String : fieldName, float

Re: BooleanQuery$TooManyClauses with 1.9.1 when Number RangeQuery

2006-03-06 Thread Chris Hostetter
: I upgraded to 1.9.1 and reindexed. : I used NumberTools when indexing the numbers. : : After the upgrade I got the following error on a number range query, : with query The possibility of a TooManyClauses exception has always existed with RangeQuery and numbers, even when using NumberTools. Even if you
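The point that padded number encoding does not protect against TooManyClauses can be illustrated with a small model: if a range query is expanded into one clause per distinct indexed term inside the range, the clause count grows with the number of distinct values, however they are encoded. The sketch below is plain Java under stated assumptions: the limit of 1024 is assumed to be the default clause limit, zero-padded decimal strings stand in for the real number encoding, and the class name is hypothetical.

```java
import java.util.TreeSet;

final class RangeExpansionSketch {
    // Assumed default clause limit.
    static final int MAX_CLAUSES = 1024;

    // Stand-in encoding: zero-padded decimals sort lexicographically in numeric order.
    static String pad(long n) {
        return String.format("%010d", n);
    }

    // One clause per distinct indexed term inside [lower, upper].
    static int clausesFor(TreeSet<String> terms, String lower, String upper) {
        return terms.subSet(lower, true, upper, true).size();
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>();
        for (long n = 0; n < 5000; n++) {
            terms.add(pad(n));
        }
        int clauses = clausesFor(terms, pad(0), pad(1999));
        System.out.println(clauses);               // 2000
        System.out.println(clauses > MAX_CLAUSES); // true
    }
}
```

In this model the only ways to stay under the limit are to index fewer distinct values in the field, for example by storing coarser values, or to avoid expanding the range into individual clauses at all.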

Re: Help interpreting explanation

2006-03-06 Thread Chris Hostetter
: on using Lucene but info for the internal workings of Lucene is hard to : come by. As with many OS code bases: the code is the documentation. : 1) I'm using the default QueryParser to parse and return a query so it's : a Boolean-OR query. So does this mean it uses the DisjunctionSumScorer :

Re: sumOfSquaredWeights for lengthNorm

2006-03-06 Thread Chris Hostetter
: 1) the boosts associated with Fields and Documents at indexing time, : which are combined with the lengthNorm at index time to determine a single : norm value for the doc/field pair. : : I don't think this is what I want because the lengthNorm is still using : the # of terms. You can

Re: MultiPhraseQuery

2006-03-06 Thread Eric Jain
Daniel Naber wrote: Please try to add this to MultiPhraseQuery and let us know if it helps: public List getTerms() { return termArrays; } That is indeed all I need (the list wouldn't have to be mutable though). Any chance this could be committed? Incidentally, would be helpful if