Re: Lucene as primary object storage

2007-06-29 Thread karl wettin
On 28 Jun 2007, at 15:37, Emmanuel Bernard wrote: I don't really like the idea actually: I'm much more comfortable with having my data in a relational DB :) If you don't mind, please develop that a bit further. I think Lucene is suited pretty well for object storage if you also need it as an

Re: Can I delete without shuffling document IDs?

2007-06-29 Thread karl wettin
On 29 Jun 2007, at 05:08, Daniel Noll wrote: I just wanted to put the question out in case someone has solved the exact same problem already. I've posted some experiments in LUCENE-879. The patch replaces deleted documents with a new dummy document. The second patch contains some merge
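To see why the IDs move in the first place, here is a minimal sketch (plain Lucene 2.x API, not the LUCENE-879 patch itself) of a deletion being compacted away by a merge, which renumbers every document after it:

    import java.io.IOException;
    import org.apache.lucene.analysis.SimpleAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.RAMDirectory;

    public class DocIdShuffleDemo {
        public static void main(String[] args) throws IOException {
            RAMDirectory dir = new RAMDirectory();
            IndexWriter writer = new IndexWriter(dir, new SimpleAnalyzer(), true);
            for (int i = 0; i < 3; i++) {
                Document doc = new Document();
                doc.add(new Field("name", "doc" + i, Field.Store.YES, Field.Index.UN_TOKENIZED));
                writer.addDocument(doc);
            }
            writer.close();

            IndexReader reader = IndexReader.open(dir);
            reader.deleteDocument(1);        // mark the middle document deleted
            reader.close();

            writer = new IndexWriter(dir, new SimpleAnalyzer(), false);
            writer.optimize();               // merging squeezes the deletion out...
            writer.close();

            reader = IndexReader.open(dir);
            System.out.println(reader.document(1).get("name"));   // ...so "doc2" is now at ID 1
            reader.close();
        }
    }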

Re: queryparser

2007-06-29 Thread pratik shinghal
On 6/29/07, Erick Erickson [EMAIL PROTECTED] wrote: What do you get if you do a System.out.println(que.toString())? And what analyzer are you using? Erick On 6/28/07, pratik shinghal [EMAIL PROTECTED] wrote: I'm using Lucene (org.apache.lucene) and I want the Java code for parsing single

Pagination

2007-06-29 Thread Lee Li Bin
Hi, does anyone know how to do pagination on a JSP page using the number of hits returned? Or any other solutions? Please provide me with some sample code if possible, or a step-by-step guide. Sorry if I'm asking too much, I'm new to Lucene. Thanks

Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Patrick Kimber
Hi, We are sharing a Lucene index in a Linux cluster over an NFS share. We have multiple servers reading and writing to the index. I am getting regular lock exceptions e.g. Lock obtain timed out:

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Doron Cohen
hi Patrick, Mike is the expert in this, but until he gets in, can you add details on the update pattern - note that the DeletionPolicy you describe below is not (afaik) related to the write lock time-out issues you are facing. The DeletionPolicy better manages the interaction between an

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Patrick Kimber
Hi Doron Thanks for your reply. I am working on the details of the update pattern. It will take me some time as I cannot reproduce the issue on demand. To answer your other questions, yes, we do have multiple writers. One writer per node in the cluster. I will post the results of my

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Patrick Kimber
Hi As requested, I have been trying to improve the logging in the application so I can give you more details of the update pattern. I am using the Lucene Index Accessor contribution to co-ordinate the readers and writers:

Re: Luke faster + Index Searcher is slow

2007-06-29 Thread Nott
Hi, I even tried it like this, but I'm not getting any benefits. How do I use Expert Search? Can you assist?
    File indexFile = new File(fileName);
    FSDirectory dir = FSDirectory.getDirectory(indexFile);
    indexSearcher = new IndexSearcher(dir);
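A likely reason Luke feels faster is that it keeps a single IndexSearcher open, while code like the above often opens a new searcher for every query. A minimal sketch of reusing one instance (the class name and the read-only assumption are mine, not from the thread):

    import java.io.File;
    import java.io.IOException;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.FSDirectory;

    // Hypothetical holder: open the searcher once at start-up and share it for
    // every query; re-opening an IndexSearcher per search throws away Lucene's
    // internal caches and file handles and is much slower.
    public class SharedSearcher {
        private static IndexSearcher searcher;

        public static synchronized IndexSearcher get(String indexPath) throws IOException {
            if (searcher == null) {
                FSDirectory dir = FSDirectory.getDirectory(new File(indexPath));
                searcher = new IndexSearcher(dir);
            }
            return searcher;
        }
    }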

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Mark Miller
This is an interesting choice. Perhaps you have modified LuceneIndexAccessor, but it seems to me (without knowing much about your setup) that you would have odd reader behavior. On a 3 node system, if you add docs with node 1 and 2 but not 3 and you're doing searches against all 3 nodes, node 3

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Patrick Kimber
Hi Mark Yes, thank you. I can see your point and I think we might have to pay some attention to this issue. But, we sometimes see this error on an NFS share within 2 minutes of starting the test so I don't think this is the only problem. Once again, thanks for the idea. I will certainly be

Genealogy, nicknames, Levenshtein, soundex/metaphone, etc

2007-06-29 Thread Darren Hartford
Hey all, As you can tell by the subject, I'm interested in 'name searching' and 'nearby name' searching. Scenarios include genealogy and Similar-Person-from-Different-Datasources matching. Assuming Java-based Lucene, and more than likely the Solr project. *nickname: would it be feasible to create
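Not an answer to the nickname part, but the standard building block for 'nearby name' matching in plain Lucene is FuzzyQuery, which scores terms by Levenshtein edit distance. A minimal sketch, assuming a hypothetical indexed "name" field:

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.FuzzyQuery;
    import org.apache.lucene.search.Query;

    // Edit-distance ("nearby name") matching: 0.7f is the minimum similarity,
    // so close spellings such as "jonathan" or "johnathon" can still match.
    Query nearbyName = new FuzzyQuery(new Term("name", "jonathon"), 0.7f);

Nickname pairs like Bob/Robert are not close in edit distance, so those would still need a synonym list applied at index or query time, and soundex/metaphone matching would come from a phonetic analyzer rather than from FuzzyQuery.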

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Mark Miller
If you're getting java.io.FileNotFoundException: /mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_h75 within 2 minutes, this is very odd indeed. That would seem to imply your deletion policy is not working. You might try just using one of the nodes as the writer. In Michael's

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Patrick Kimber
Hi Mark I just ran my test again... and the error occurred after 10 minutes - which is the time when my deletion policy is triggered. So... I think you might have found the answer to my problem. I will spend more time looking at it on Monday. Thank you very much for your help and enjoy your

Re: Genealogy, nicknames, Levenshtein, soundex/metaphone, etc

2007-06-29 Thread Grant Ingersoll
You may find this thread useful: http://www.gossamer-threads.com/lists/lucene/java-user/47824?search_string=record%20linkage;#47824 although it doesn't answer all your questions. I think in the end you will need to do post-processing on the results, but maybe not. On Jun 29, 2007, at 11:41 AM,

Re: Pagination

2007-06-29 Thread Chris Lu
After a search you just get a Hits object, and you go through the documents with hits.doc(i). The pagination is controlled by you. Lucene pre-caches the first 200 documents and lazily loads the rest in batches of 200. -- Chris Lu - Instant Scalable Full-Text Search
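A rough sketch of rendering one page, assuming a zero-based page number from the JSP and a stored "title" field (both assumptions, not from the thread):

    import java.io.IOException;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.search.Hits;

    // Render one page of results; 'hits' is whatever searcher.search(query) returned.
    static void printPage(Hits hits, int pageNumber, int pageSize) throws IOException {
        int start = pageNumber * pageSize;
        int end = Math.min(start + pageSize, hits.length());  // don't run past the last hit
        for (int i = start; i < end; i++) {
            Document doc = hits.doc(i);                        // lazily loaded by Hits
            System.out.println(doc.get("title"));
        }
    }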

Re: Payloads and PhraseQuery

2007-06-29 Thread Peter Keegan
I tried to subclass PhraseScorer, but discovered that it's an abstract class and its subclasses (ExactPhraseScorer and SloppyPhraseScorer) are final classes. So instead, I extended Scorer with my custom scorer and extended PhraseWeight (after making it public). My scorer's constructor is passed

Re: queryparser

2007-06-29 Thread Erick Erickson
Well, I'd suggest the first thing you do is remove your custom tokenizer and see what results you get with one of the normal parsers. Then creep up on your custom analyzer bit by bit. Otherwise, it's almost impossible to figure out what's going on except by setting breakpoints in your analyzer
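For reference, the debugging step being described looks roughly like this; the field name, analyzer and query text are placeholders:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Query;

    public class ShowParsedQuery {
        public static void main(String[] args) throws Exception {
            // Parse the user input with the same analyzer used at index time,
            // then print the parsed query to see exactly what will be searched.
            QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
            Query query = parser.parse("track 9");
            System.out.println(query.toString("contents"));
        }
    }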

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Chris Hostetter
: We are sharing a Lucene index in a Linux cluster over an NFS share. We have : multiple servers reading and writing to the index. : : I am getting regular lock exceptions e.g. : Lock obtain timed out: :

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Chris Hostetter
: Perhaps I'm missing something, but I thought NativeFSLock was not suitable : for NFS? ... or is this what lockd provides? (my NFS knowledge is : very out of date) D'oh! I just read the docs for NativeFSLockFactory and noticed the For example, for NFS servers there sometimes must be a

Re: queryparser

2007-06-29 Thread pratik shinghal
When I'm using normal tokenizers I'm getting 'track' as a result and not '9'. When I'm using this custom analyser and checking the output I'm getting the right output, 'track 9'. But as soon as I use QueryParser with the same custom analyser I get only 'track' and not '9'. So

Limiting search results by a collection of terms

2007-06-29 Thread rengelhardt
I’m currently a bit confused on how to accomplish limiting my search results in Lucene (v1.4.3 can’t easily upgrade for this project). Hopefully someone can help point me in the correct direction. Essentially my application is comprised of several objects, namely User, Group, and Document

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Doron Cohen
Patrick Kimber wrote: As requested, I have been trying to improve the logging in the application so I can give you more details of the update pattern. I am using the Lucene Index Accessor contribution to co-ordinate the readers and writers: http://www.nabble.com/Fwd%3A-Contribution%3A-

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Doron Cohen
Yonik wrote: Note that some Solr users have reported a similar issue. https://issues.apache.org/jira/browse/SOLR-240 It seems the scenario there is without using native locks? - 'i get the stacktrace below ... with useNativeLocks turned off'

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Doron Cohen
Mark Miller wrote: You might try just using one of the nodes as the writer. In Michael's comments, he always seems to mention the pattern of one writer, many readers on NFS. In this case you could use no LockFactory and perhaps gain a little speed there. One thing I would worry about if
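For what it's worth, a sketch of what 'use no LockFactory' looks like with the Lucene 2.1+ store API, assuming (critically) that exactly one node in the whole cluster ever opens an IndexWriter on the share - with more than one writer this is unsafe:

    import java.io.File;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.store.NoLockFactory;

    // With a single writer there is nothing to lock against, so locking can be
    // disabled entirely and no lock files are ever created on the NFS share.
    FSDirectory dir = FSDirectory.getDirectory(
        new File("/mnt/nfstest/repository/lucene/lucene-icm-test-1-0"),
        NoLockFactory.getNoLockFactory());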

spannearquery help

2007-06-29 Thread Akanksha Baid
I have two strings - String1 contains multiple words, String2 contains just one word. I need to search my index to find hits where String1 and String2 occur within a distance slop = d of each other. Order is important. Also, ideally I would like to do a fuzzy search on String1. Is there some way

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Mark Miller
Never used the IndexAccessor patch, so I may be wrong in the following. No, let's fix it... /;- Don't mean to wade in over my head here, but just to help out those that have not used LuceneIndexAccessor. I am fairly certain that using the LuceneIndexAccessor could easily create the

Re: spannearquery help

2007-06-29 Thread Mark Miller
I would look at Query getFieldQuery(String field, String queryText) in QueryParser for inspiration. Feed the two strings, one at a time, to the analyzer. With the results from String1 do something like: List<SpanQuery> clauses = new ArrayList<SpanQuery>(v.size()); for (int
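Put together, a sketch of the final query, with placeholder terms standing in for the analyzed String1 and String2 and the field name assumed to be "body" (the fuzzy-matching part of the question is not covered here):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.spans.SpanNearQuery;
    import org.apache.lucene.search.spans.SpanQuery;
    import org.apache.lucene.search.spans.SpanTermQuery;

    // String1 = "quick brown fox" (already run through the analyzer), String2 = "jumps"
    SpanQuery phrase = new SpanNearQuery(new SpanQuery[] {
        new SpanTermQuery(new Term("body", "quick")),
        new SpanTermQuery(new Term("body", "brown")),
        new SpanTermQuery(new Term("body", "fox"))
    }, 0, true);                                  // String1's words adjacent and in order

    SpanQuery word = new SpanTermQuery(new Term("body", "jumps"));

    int d = 5;                                    // the required slop between String1 and String2
    SpanNearQuery query = new SpanNearQuery(new SpanQuery[] { phrase, word }, d, true);  // in order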

Re: Lucene 2.2, NFS, Lock obtain timed out

2007-06-29 Thread Yonik Seeley
On 6/29/07, Doron Cohen [EMAIL PROTECTED] wrote: Note that some Solr users have reported a similar issue. https://issues.apache.org/jira/browse/SOLR-240 Seems the scenario there is without using native locks? - i get the stacktrace below ... with useNativeLocks turned off Yes... but that

Re: Limiting search results by a collection of terms

2007-06-29 Thread Mark Miller
You do want to use a QueryFilter. The method you suggest sounds good. Make a field called group with a term for each group it belongs to, a field called user with the users it belongs to, etc. QueryFilter will take a query, e.g. group:managers. Pass the Filter to the search method on your
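A sketch of that, assuming an indexed "group" field and an already-open IndexSearcher; the API below (search() returning Hits) matches Lucene 1.4.3 as well as 2.x:

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.QueryFilter;
    import org.apache.lucene.search.TermQuery;

    // Whatever the user searched for, restricted to documents tagged with group:managers.
    Query userQuery = new TermQuery(new Term("contents", "budget"));   // stand-in for the parsed user query
    QueryFilter groupFilter = new QueryFilter(new TermQuery(new Term("group", "managers")));
    Hits hits = searcher.search(userQuery, groupFilter);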