Best Practices for Distributing Lucene Indexing and Searching

2005-03-01 Thread Luke Francl
uld bring over the traditional FSDirectory, but maybe someone else has some ideas. Please let me know if you've got any other ideas or a best practice to follow. Thanks, Luke Francl - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

RE: Concurrent searching & re-indexing

2005-02-17 Thread Luke Francl
sage: http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]&msgNo=11986 Any advice would still be appreciated. Currently, I'm catching the error and doing a retry in the finally block, but I am not confident in this solution due to the difficulty of reproducing the problem. Regards, Luke Fr

Re: RangeQuery With Date

2005-02-07 Thread Luke Francl
Your dates need to be indexed in a form that sorts lexicographically for the RangeQuery to work. Index them using this date format: YYYYMMDD. Also, I'm not sure if the QueryParser can handle range queries with only one end point. You may need to create this query programmatically. Regards, Luke Francl
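
To illustrate why a fixed-width, year-first date string works for range matching, here is a minimal sketch using only the Java standard library. The pattern "yyyyMMdd" and the names SortableDates, toIndexTerm, and inRange are assumptions made for illustration; none of this is Lucene API. It only shows that plain string comparison agrees with chronological order once the year comes first and every part is zero-padded.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Lucene.
final class SortableDates {
    // Year, then month, then day, each zero-padded to a fixed width.
    private static final DateTimeFormatter SORTABLE =
            DateTimeFormatter.ofPattern("yyyyMMdd");

    // Turn a date into the string that would be stored as an index term.
    static String toIndexTerm(LocalDate date) {
        return date.format(SORTABLE);
    }

    // Inclusive range check done purely on strings, which is how a
    // term-based range is evaluated.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String lower = toIndexTerm(LocalDate.of(2004, 12, 1));
        String upper = toIndexTerm(LocalDate.of(2005, 2, 7));
        String inside = toIndexTerm(LocalDate.of(2005, 1, 13));
        String outside = toIndexTerm(LocalDate.of(2005, 3, 1));
        System.out.println(inside + " in range: " + inRange(inside, lower, upper));
        System.out.println(outside + " in range: " + inRange(outside, lower, upper));
    }
}
```

A month-first or unpadded format would break this property, because the string order would no longer follow the calendar across year boundaries.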

RE: google mini? who needs it when Lucene is there

2005-01-27 Thread Luke Francl
money. Getting Lucene/Nutch to the point where it is possible to easily install it on a computer and administrate its settings in a user-friendly way is a great goal, though. Regards, Luke Francl From: jian chen [mailto:[EMAIL PROTECTED] Sent: Thu 1/27/2005 5:

Re: Why IndexReader.lastModified(index) is depricated?

2005-01-20 Thread Luke Francl
pdated. I think it would be nice to have it back, but it should be clearly noted that it is for informational purposes _only_. If you want to see if the index has changed, use the version number. Luke Francl LIMO co-developer http://limo.sourcefor

Re: How to add a Lucene index to a jar file?

2005-01-17 Thread Luke Francl
think folks have posted code for this to the list > previously. I think it would be cool if Lucene included a class that facilitated creating a RAMDirectory from an index in a JAR file, for the kind of point-and-click startup that Bill is talking about. Regar

RE: Multi-threading problem: couldn't delete segments

2005-01-13 Thread Luke Francl
On Thu, 2005-01-13 at 12:33, David Townsend wrote: > Just read your old post. I'm not quite sure whether I've read this correctly. > Is the search worker thread also doing deletes from the index > > "a test script is going that is hitting the search > part of our application (I think the script

RE: Multi-threading problem: couldn't delete segments

2005-01-13 Thread Luke Francl
On Thu, 2005-01-13 at 12:25, David Townsend wrote: > The problem could be you're writing to an index with multiple processes. This > can happen if you're using a shared file system (NFS?). We saw this problem > when we had two IndexWriters getting access to a single index at the same > time. U

Re: Multi-threading problem: couldn't delete segments

2005-01-13 Thread Luke Francl
n Windows, you can't rename a file while another thread has it open. If I am performing a search, is it possible that the IndexReader is holding open the "segments" file when there is an attempt by my indexing code to overwrite it with File.renameTo()? Thanks, Luke Francl On T
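
The rename concern above can be sketched with the standard library alone. This is a hedged illustration of retrying File.renameTo(), not the code Lucene uses and not a confirmed fix for the problem; the class name RenameWithRetry and the attempt and pause parameters are hypothetical.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of Lucene.
final class RenameWithRetry {
    // Try the rename up to maxAttempts times, pausing between attempts.
    static void rename(File from, File to, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // File.renameTo signals failure by returning false, not by throwing,
            // so the result has to be checked explicitly.
            if (from.renameTo(to)) {
                return;
            }
            if (attempt < maxAttempts) {
                // Give another thread a chance to close the file.
                Thread.sleep(pauseMillis);
            }
        }
        throw new IOException("could not rename " + from + " to " + to
                + " after " + maxAttempts + " attempts");
    }
}
```

A retry only hides the symptom if some reader keeps the file open indefinitely, so it is worth logging how often the later attempts are needed.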

RE: How do you handle dynamic html pages?

2005-01-10 Thread Luke Francl
On Mon, 2005-01-10 at 10:26, Kevin L. Cobb wrote: > I don't like to periodically re-index everything because 1) you can't be > confident that your searches are as up to date as they could be, and 2) > you are wasting cycles either checking for documents that may or may not > need to be updated, or

Re: How do you handle dynamic html pages?

2005-01-10 Thread Luke Francl
whenever) every day and that would probably handle your needs. Regards, Luke Francl

Re: Check to see if index is optimized

2005-01-07 Thread Luke Francl
deletions, it does not need to be optimized. You can find out if it has deletions with IndexReader.hasDeletions. I am not sure what the cost of optimization is if the index doesn't need it. Perhaps someone else on this list knows. Regard

Re: Use search engine technology for object persistence

2005-01-07 Thread Luke Francl
On Fri, 2005-01-07 at 08:05, Erik Hatcher wrote: > Interesting article: > > http://www.javaworld.com/javaworld/jw-01-2005/jw-0103-search_p.html Sort of off-topic, but does this mean JavaWorld is publishing again? I had read Bill Venners's post from back in January '04 that they shut down.

Multi-threading problem: couldn't delete segments

2005-01-06 Thread Luke Francl
calls? Or is it more likely that I'm missing something in my class that modifies the index? The code is attached. Thank you, Luke Francl // $Id: LuceneIndexer.java 20473 2004-10-19 17:20:10Z lfrancl $ package com.ancept.ams.search.lucene; import com.ancept.ams.asset.AssetUtils; import c

Re: partial updating of lucene

2004-12-09 Thread Luke Francl
ource intensive so I wouldn't recommend it. The better solution is to add a stored keyword field that stores the location of your document, and then re-index it from the source. Regards, Luke Francl

Re: LIMO problems

2004-12-09 Thread Luke Francl
nning Tomcat has permission to write files to your webapps/limo.war directory (or whatever it's called, I don't actually use Tomcat), it should work. If you don't want to do that for security reasons, simply create the file and put it there yourself. It should be at the same level

Re: LIMO problems

2004-12-09 Thread Luke Francl
application. If you have any other questions, please don't hesitate to ask. Regards, Luke Francl LIMO developer P.S.: LIMO 0.5.2 adds a new index file browser that shows you some interesting details about your index files. Check it out!

Re: GETVALUES +SEARCH

2004-12-01 Thread Luke Francl
On Wed, 2004-12-01 at 11:12, petite_abeille wrote: > Not really, except perhaps that a Lucene Document could theoretically > have multiple identical keys... not something that anyone would want to > do though :o) And why not? I use this to store closed captioned text. Each entry must be stored s

Re: Document-Map, Hits-List

2004-12-01 Thread Luke Francl
On Wed, 2004-12-01 at 10:39, petite_abeille wrote: > You don't need to iterate through anything upfront... you simply do it > on-demand... eg when invoking List.get() you would invoke the > underlying Hits.doc()... > > In other words, there is _no_ new data structure... simply an > additional
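
The on-demand idea in the quoted message can be sketched as a read-only List view that fetches each element only when get() is called. This is a minimal sketch under stated assumptions: LazyResultList is a hypothetical name, and the loader function stands in for a call such as Hits.doc(i), so no Lucene classes are used here.

```java
import java.util.AbstractList;
import java.util.function.IntFunction;

// Hypothetical adapter, not part of Lucene.
final class LazyResultList<T> extends AbstractList<T> {
    private final int size;
    // Stand-in for the underlying lookup, for example a call like Hits.doc(i).
    private final IntFunction<T> loader;

    LazyResultList(int size, IntFunction<T> loader) {
        this.size = size;
        this.loader = loader;
    }

    @Override
    public T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        // Nothing is loaded until a caller asks for this position.
        return loader.apply(index);
    }

    @Override
    public int size() {
        return size;
    }
}
```

Because AbstractList supplies iteration on top of get() and size(), callers can treat the results as an ordinary List while paying only for the elements they actually touch. Operations that walk the whole list, such as toString() or contains(), would still load every element.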

Re: Document-Map, Hits-List

2004-12-01 Thread Luke Francl
On Wed, 2004-12-01 at 10:27, Otis Gospodnetic wrote: > This is very similar to what I do - I create a List of Maps from Hits > and its Documents. So I think this change may be handy, if doable (I > didn't look into changing the two Lucene classes, actually). How do you avoid the problem Eric ju

Re: modifying existing index

2004-11-23 Thread Luke Francl
to recreate the whole index. Just mark the document as deleted using the IndexReader and then add it again with the IndexWriter. Remember to close your IndexReader and IndexWriter after doing this. The deleted document will be removed the next time you optimize you

Re: Limo 0.5

2004-11-22 Thread Luke Francl
eturns. I'll add enumerating the terms in an index to my list of things to add. Regards, Luke Francl

RE: disadvantages

2004-11-21 Thread Luke Francl
Well that really depends on how big your index is and what they search for, now doesn't it? ;) -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Sun 11/21/2004 2:52 PM To: Lucene Users List Subject: Re: disadvantages On Nov 21, 2004, at 12:00 PM, Miguel Angel wrote:

LIMO 0.5 released

2004-11-21 Thread Luke Francl
etting me join it; and to Andrzej Bialecki for Luke from which I appropriated several ideas and his GrowableStringArray class. If you are interested in getting involved, LIMO is now available in SourceForge CVS. Regards, Luke Francl

Re: Too many files exception

2004-11-18 Thread Luke Francl
ttp://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#getUseCompoundFile() Combined with closing your IndexReaders, this should fix the problem. Regards, Luke Francl

Re: BooleanQuery - TooManyClauses Issue

2004-11-16 Thread Luke Francl
On Tue, 2004-11-16 at 16:32, Paul Elschot wrote: > Once you approach 1000 days, you'll get the same problem again, > so you might want to use a filter for the dates. > See DateFilter and the archives on MMDD. Can anyone point to a good example of how to use the DateFilter? Thanks, Luke

Re: _4c.fnm missing

2004-11-16 Thread Luke Francl
On Tue, 2004-11-16 at 14:57, Luke Shannon wrote: > This is the latest error I have received: > > IndexReader out of date and no longer valid for delete, undelete, or setNorm > operations What you need to do is check the version number of the index to determine if you need to open a new IndexReade
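
The version check suggested above can be sketched generically: keep the reader you opened together with the version it was opened at, and open a new one only when the version has moved. The names below (VersionedHolder, currentVersion, opener) are hypothetical stand-ins rather than Lucene API; in a real application the two suppliers would wrap whatever calls your Lucene release provides for reading the index version and opening a reader.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical holder, not part of Lucene.
final class VersionedHolder<R> {
    private final LongSupplier currentVersion; // stand-in: ask the index for its version
    private final Supplier<R> opener;          // stand-in: open a fresh reader
    private R cached;
    private long cachedVersion;

    VersionedHolder(LongSupplier currentVersion, Supplier<R> opener) {
        this.currentVersion = currentVersion;
        this.opener = opener;
    }

    // Return the cached reader, reopening it only if the index version changed.
    synchronized R get() {
        long latest = currentVersion.getAsLong();
        if (cached == null || latest != cachedVersion) {
            cached = opener.get();
            cachedVersion = latest;
        }
        return cached;
    }
}
```

This sketch does not close the stale reader; real code would need to do that once no search is still using it.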

Re: Lucene : avoiding locking (incremental indexing)

2004-11-15 Thread Luke Francl
On Mon, 2004-11-15 at 16:50, [EMAIL PROTECTED] wrote: > So far I am seeing 2 solutions and honestly I don't love either totally. I > am thinking that without changes to Lucene itself, the best "general" way to > implement this might be to have a queue of changes and have Lucene work off > this
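
The queue-of-changes idea in the quoted message can be sketched with java.util.concurrent. This is an illustration under stated assumptions, not the approach any poster actually shipped: ChangeQueue is a hypothetical class, and the applyChange callback stands in for whatever code opens the index and applies one add, update, or delete. The point is that only one thread ever touches the index, so writers never compete for the lock.

```java
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical single-writer queue, not part of Lucene.
final class ChangeQueue<T> {
    // An empty Optional is used as the shutdown signal.
    private final BlockingQueue<Optional<T>> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    ChangeQueue(Consumer<T> applyChange) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    Optional<T> next = queue.take(); // blocks until work arrives
                    if (!next.isPresent()) {
                        return; // shutdown requested
                    }
                    // Changes are applied one at a time, in submission order.
                    applyChange.accept(next.get());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "index-writer");
        worker.start();
    }

    // Called from any thread; returns immediately.
    void submit(T change) {
        queue.add(Optional.of(change));
    }

    // Process everything already queued, then stop the worker.
    void shutdown() throws InterruptedException {
        queue.add(Optional.empty());
        worker.join();
    }
}
```

The trade-off is latency: a change becomes searchable only after the worker reaches it, and an exception thrown by applyChange would end the worker thread in this sketch.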

Understanding TooManyClauses-Exception and Query-RAM-size

2004-11-15 Thread Luke Francl
query "foo +(bar baz)" do you include number_of_fields * number_of_documents part for each term in the query? Or just for the entire thing? Thanks, Luke Francl

Re: Lucene : avoiding locking (incremental indexing)

2004-11-15 Thread Luke Francl
This is how I implemented incremental indexing. If anyone sees anything wrong, please let me know. Our motivation is similar to John Eichel's. We have a digital asset management system and when users update, delete or create a new asset, they need to see their results immediately. The most import

Re: Index File

2004-11-15 Thread Luke Francl
As long as you are closing your IndexSearchers when you are done with them you should not have problems with file handles. When using Lucene 1.2 (pre-compound file format) on Windows, I ran into this problem because Windows only lets an application open something like 1000 file handles. On Unix the

Re: Index File

2004-11-15 Thread Luke Francl
n the index with every search. Luke Francl

RE: Index File

2004-11-15 Thread Luke Francl
On Fri, 2004-11-12 at 19:07, Richard Greenane wrote: > You might wat to look at LUKE @ http://www.getopt.org/luke/ > A great tool for checking the index to make sure that everything is > there There is also a web-based tool that you can run in your servlet container called LIMO. I've added some qu

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Luke Francl
On Fri, 2004-11-12 at 14:52, Daniel Naber wrote: > There are two different issues: first, reorder the query so that those > terms with less matches appear first, because as soon as the first term > with 0 matches occurs, search stops. There will probably be a > non-so-difficult implementation f

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Luke Francl
On Thu, 2004-11-11 at 14:48, Daniel Naber wrote: > On Thursday 11 November 2004 20:57, Sanyi wrote: > > > What I'm saying is that there is no reason for the optimizer to expand > > wild* to more than 1024 variations > > That's the point: there is no query optimizer in Lucene. Would it be possibl

Re: Lucene : avoiding locking

2004-11-12 Thread Luke Francl
On Fri, 2004-11-12 at 09:51, Luke Shannon wrote: > Hi Luke; > > Currently I am experimenting with checking if the index is lock using > IndexReader.locked before creating a writer. If this turns out to be the > case I was thinking of just unlocking the file. > > Do you think this is a good strate

Re: Lucene : avoiding locking

2004-11-12 Thread Luke Francl
batch indexing. Regards, Luke Francl On Thu, 2004-11-11 at 18:33, Luke Shannon wrote: > Syncronizing the method didn't seem to help. The lock is being detected > right here in the code: > > while (uidIter.term() != null > && uidIter.term().field() == "uid"

Re: What is the difference between these searches?

2004-11-09 Thread Luke Francl
On Tue, 2004-11-09 at 16:00, Paul Elschot wrote: > Lucene has no provision for matching by being prohibited only. This can > be achieved by indexing something for each document that can be > used in queries to match always, combined with something prohibited > in a query. > But doing this is bad f

Re: What is the difference between these searches?

2004-11-09 Thread Luke Francl
On Tue, 2004-11-09 at 15:48, Erik Hatcher wrote: > This last query has a required clause, which is what BooleanQuery > requires when there is a NOT clause. You're getting what you want here > because you've got an item_type:xyz clause as required. In your first > example, you're requiring fie

What is the difference between these searches?

2004-11-09 Thread Luke Francl
in bar. Fiddling around with the Lucene Index Toolbox, I think that this query does what I want: +item_type:xyz field_name:foo -field_name:bar Can someone explain to me why these queries return different results? Thanks, Luke Francl

Re: Thread safety of QueryParser

2003-08-26 Thread Luke Francl
Thank you for the update, Doug. On Tue, 2003-08-26 at 11:57, Doug Cutting wrote: > This method constructs a new query parser each time it is called, so it > is thread safe. Perhaps the JGuru FAQ should be updated... Luke

Thread safety of QueryParser

2003-08-25 Thread Luke Francl
f the QueryParser. Is that what the "f" parameter of the QueryParser(String f, Analyzer a) constructor is for? Thanks for your advice, Luke Francl