Re: Encryption

2006-05-06 Thread Sebastian Marius Kirsch
On Sat, May 06, 2006 at 01:16:43AM +, George Washington wrote: > I am using Lucene to index as well as to store complete source documents > (typically few tens of thousands of documents, not millions). I would like > to protect the source documents with encryption but have the following > qu

Re: for the similarity measure

2006-04-28 Thread Sebastian Marius Kirsch
On Fri, Apr 28, 2006 at 01:54:51PM +0800, jason wrote: > After reading the code, I found the similarity measure in Lucene is not the > same as the cosine coefficient measure commonly used. I dont know it is > correct. And I wonder whether i can use the cosine coefficient measure in > lucene or mayb

Re: Scoring with FunctionQueries?

2006-03-08 Thread Sebastian Marius Kirsch
On Tue, Mar 07, 2006 at 06:19:53PM -0800, Chris Hostetter wrote: > once you've tried the suggestions above, can you make send out a > selfcontained JUnit test showing the problems? Thanks, Chris, glad you agree that it doesn't work as you expect it to work. I will try your suggestions and send in

Re: Scoring with FunctionQueries?

2006-03-07 Thread Sebastian Marius Kirsch
Dear Chris, thanks very much for your quick answer. I tried both approaches, and both don't seem to do what I want. Perhaps I did not understand you properly. I generated a small in-memory index (six documents) for testing your suggestions, with some text in field "content" and a numeric score i

Scoring with FunctionQueries?

2006-03-07 Thread Sebastian Marius Kirsch
Hello, I have been trying out Yonik's excellent FunctionQuery (from Solr), but am having some problems regarding the scoring of FunctionQueries in conjunction with other queries. I am currently researching a data fusion approach, where you have several separate scores for a document and combine t

Re: Lucene + LSI

2005-12-13 Thread Sebastian Marius Kirsch
On Tue, Dec 13, 2005 at 10:53:42AM +, adasal wrote: > There seem to be quite a few alternatives around. I would be interested in > comments on the following:- > The work at NITLE > using Contextual > Network Search (CNS) a graph-based alternative

Re: About Combining Scores

2005-11-13 Thread Sebastian Marius Kirsch
On Sun, Nov 13, 2005 at 12:04:41AM +0100, Karl Koch wrote: > My aim is to combine this two scores. The Lucenes score is normalisied > between 0.0 and 1.0 (if the score exceeded 1.0 at some point) or less then > 1.0 (if it did not). The user model looks the same in this perspective - > although base

Re: n-gram indexing

2005-07-24 Thread Sebastian Marius Kirsch
Hi Rajeev, I wrote a filter for generating n-grams a while back; I intended to use it for statistics, but I guess you can also use it for search. I also thought of the "boosting effect" you describe when I implemented it, though I never actually tried whether it works that way. It's in the Lucene

Re: index phrases

2005-06-21 Thread Sebastian Marius Kirsch
On Tue, Jun 21, 2005 at 02:06:31PM -0400, Erik Hatcher wrote: > A contribution with dependencies is fine, especially Apache ones. We > can put this code in the contrib area if you'd like to contribute > it. If so, please create a Bugzilla issue and attach the sources. Hi Erik, thanks for th

Re: index phrases

2005-06-21 Thread Sebastian Marius Kirsch
On Tue, Jun 21, 2005 at 04:01:41PM +0200, Roxana Angheluta wrote: > I would like to include phrases (of a certain maximum length given as a > parameter) in the index. I know this is non-standard for e.g. searching, > where a PhraseQuery can be built which makes use of the terms positions. > Howe

Re: managing docids for ParallelReader

2005-06-04 Thread Sebastian Marius Kirsch
Dear Doug, thanks for your message. On Fri, Jun 03, 2005 at 09:37:01AM -0700, Doug Cutting wrote: > Sebastian Marius Kirsch wrote: > >I took up your suggestion to use a ParallelReader for adding more > >fields to existing documents. I now have two indexes with the same > >

Re: managing docids for ParallelReader

2005-06-03 Thread Sebastian Marius Kirsch
Hi Doug, I took up your suggestion to use a ParallelReader for adding more fields to existing documents. I now have two indexes with the same number of documents, but different fields. One field is duplicated (the id field.) I wrote a small class to merge those two indexes into one index; it is a

Augmenting an existing index (was: ACLs and Lucene)

2005-05-30 Thread Sebastian Marius Kirsch
Hello, I have a similar problem, for which ParallelReader looks like a good solution -- except for the problem of creating a set of indices with matching document numbers. I want to augment the documents in an existing index with information that can be extracted from the same index. (Basically,