Re: Index Replication / Clustering

2005-06-27 Thread Chris Lu
DBSight, a J2EE search engine on database, meets most of your requirements. It has clustering support. Basically, you can configure one DBSight server specifically for indexing database content, and one or more other DBSight servers devoted to search, which subscribe to the indexing server

Bad File Descriptor

2005-06-27 Thread Yousef Ourabi
Hello: I have yet another file-system related error. If I set the boolean create to true on the index writer, I have no problems (other than never getting more than one hit, because I create and close the index writer in one method to ensure the class is the only instance). However, if I set it to false,

Fwd: when is the commit.lock released?

2005-06-27 Thread jian chen
Hi, I haven't heard anything back. Probably this email got lost on the way or whatsoever. Anyway, could anyone enlighten me on this? Thanks, Jian -- Forwarded message -- From: jian chen <[EMAIL PROTECTED]> Date: Jun 26, 2005 12:59 PM Subject: when is the commit.lock released?

Re: Indexing Forums (Document & Field Paradigm)

2005-06-27 Thread Erik Hatcher
Simply storing a parentId doesn't help query the hierarchy though - for example, search for all "lucene" containing documents in a specific forum thread, children and all. One technique I've used to index hierarchy is to come up with a path string such as "/parent/child/grandchild" and inde

RE: How to filter search based on file path

2005-06-27 Thread Tony Schwartz
I have two ideas: 1. wildcard query on the path field, but this only works if you have a small number of elements in any path hierarchy. 2. store each path component as a separate "term" a. for instance: /people/tony/property yields the following terms: path:/people and path:/people/ton
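[Editor's sketch] The second idea above, one term per ancestor path, can be illustrated without any Lucene calls. The excerpt is cut off, so the full term set shown here is an assumption about where the example was heading, and the class and method names (PathPrefixTerms, prefixes) are hypothetical. Each returned string would be added to the document as an untokenized value of the path field, so a search limited to a folder becomes a single exact term match.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: expands a path into one term per ancestor prefix. */
public class PathPrefixTerms {

    /**
     * Returns every cumulative prefix of the given slash-separated path,
     * shortest first. Empty components (leading, trailing, or doubled
     * slashes) are skipped.
     */
    public static List<String> prefixes(String path) {
        List<String> terms = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the empty string before a leading slash
            }
            current.append('/').append(part);
            terms.add(current.toString());
        }
        return terms;
    }
}
```

Calling PathPrefixTerms.prefixes("/people/tony/property") returns /people, /people/tony and /people/tony/property. The trade-off against the wildcard idea is a larger index (one extra term per path level) in exchange for cheap exact-match filtering.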

Re: proximity search not working when extending the QueryParser

2005-06-27 Thread Erik Hatcher
Ross - could you please show us a bit of your code so we can see explicitly what you're doing and how it's not working as expected? set/getPhraseSlop are quite straightforward, so unless you're mixing up the static versus instance parse methods with QueryParser then I don't know what could

Re: Lock File exceptions

2005-06-27 Thread jian chen
Hi, Recently I looked at the locking mechanism of Lucene. If I am correct, I think the process for grabbing the lock file will time out by default in 10 seconds. When the process times out, it will print out the IOException. The Lucene locking mechanism is not within threads in the same JVM. It u
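[Editor's sketch] The behaviour described above (poll for a lock file, give up with an IOException after a timeout) can be sketched in plain Java. This is not Lucene's code: the class name LockFileSketch, its methods, and the polling interval are hypothetical, and it relies on java.io.File.createNewFile() only creating the file if it does not already exist. The exception text mirrors the "Lock obtain timed out" message quoted elsewhere in this thread.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical illustration of a lock file with an acquisition timeout. */
public class LockFileSketch {

    /**
     * Polls until the lock file can be created, or throws IOException
     * once timeoutMillis has elapsed without success.
     */
    public static void obtain(File lockFile, long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        // createNewFile() returns false when the file already exists,
        // i.e. when another holder still owns the lock.
        while (!lockFile.createNewFile()) {
            if (System.currentTimeMillis() >= deadline) {
                throw new IOException("Lock obtain timed out: " + lockFile);
            }
            Thread.sleep(pollMillis);
        }
    }

    /** Releases the lock by deleting the file; returns false if nothing was deleted. */
    public static boolean release(File lockFile) {
        return lockFile.delete();
    }
}
```

With a scheme like this, a lock file left behind by a process that exited without calling release makes every later obtain call time out, which would be consistent with the symptom reported in this thread that deleting the lock file by hand makes the error go away.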

Re: Lock File exceptions

2005-06-27 Thread Yousef Ourabi
Mini-Follow UP: Wouldn't setting the boolean create parameter to false create the segments file if it is not found? My understanding is that the create variable controls whether or not the actual directory is created on startup...does this affect key files as well? IOException caught here: /var/jeteye/index/segments (No s

Lock File exceptions

2005-06-27 Thread Yousef Ourabi
Hello: I get this lock-file exception on both Windows and Linux, my app is running inside tomcat 5.5.9, jvm 1.5.03...has anyone seen this before? If I delete the LOCK file it works, but obviously I shouldn't do that...Just wondering what's up? IOException caught here: Lock obtain timed out: Lock@

How to filter search based on file path

2005-06-27 Thread kambiz Afkhamian
Hi, I've indexed my website from the application root. When I run a query, it basically searches all content below the application root folder. I would like to create a feature that would allow users to search specific folders/paths of this website. (i.e., I would like to limit my query search to

proximity search not working when extending the QueryParser

2005-06-27 Thread Angelov, Rossen
When I'm using the QueryParser directly, the proximity search works fine and getPhraseSlop() returns the correct slop int. The problem is when I extend QueryParser. When extending it, getPhraseSlop always returns the default value - 0. It's like setPhraseSlop is never called. Does anybody know if

Re: Indexing Forums (Document & Field Paradigm)

2005-06-27 Thread Dan Funk
You could have a parentId field in each document - which will give you a nice hierarchy. You could also create a topicId (Linux, Microsoft, etc...) and a storyId. At that point you can quickly identify the topic and story for the message - and you can also search within a specific thread (AND

Indexing Forums (Document & Field Paradigm)

2005-06-27 Thread Yousef Ourabi
Hello: Thanks for all the help so far, it has been fantastic. I have a question on the document and field paradigm. This works great for flat files, like a Word document or web page, but what about nested forums (à la Slashdot) where in theory a specific chat thread is nested or is nested inside anoth

RE: Alternate Lucene Query Highlighter

2005-06-27 Thread Bohl, David
I uploaded the class. See bug#35518. -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Friday, June 24, 2005 3:59 PM To: java-user@lucene.apache.org Subject: Re: Alternate Lucene Query Highlighter David - please create a new issue in Lucene's Bugzilla system (see t

Re: issues building a large index

2005-06-27 Thread Otis Gospodnetic
Hi, Perhaps using hprof with cpu=samples may reveal more information about what CPU is doing. I think this is a valid use case. Otis --- Lokesh Bajaj <[EMAIL PROTECTED]> wrote: > Thanks for the idea. I have tried with both 512m and 1024m with the > same results. I also turned on verbose gc lo

Re: Index Replication / Clustering

2005-06-27 Thread Nader Henein
Considerations that you may want to think about when sanitizing your clustered indices: 1) Number of documents available vs. number of documents in the persistent store. 2) Are all the documents up to date (involves comparing the existence and the last date updated of Lucene documents to persi

Re: Index Replication / Clustering

2005-06-27 Thread Paul Smith
On 27/06/2005, at 7:14 PM, Nader Henein wrote: I implemented a JMS based solution about a year ago because I thought it would solve my atomicity problem and give me a centralized way of indexing, you'll have to use the pluggable persistence (if you use ActiveMQ) to be able to recover from

Re: Index Replication / Clustering

2005-06-27 Thread Paul Smith
If you use ActiveMQ for JMS, you can take advantage of its Composite Destination feature and have a virtual Queue/Topic that is actually several Queues/Topics. This is what we use to keep a mirror index server completely in sync. The application sends an update message to a queue

Re: Index Replication / Clustering

2005-06-27 Thread Nader Henein
I implemented a JMS based solution about a year ago because I thought it would solve my atomicity problem and give me a centralized way of indexing, you'll have to use the pluggable persistence (if you use ActiveMQ) to be able to recover from a failure and you'll also need some way of maintaini

Re: Index Replication / Clustering

2005-06-27 Thread Stephane Bailliez
Hi Paul, Thanks for the reply. Many interesting points. Paul Smith wrote: Why not try using JMS messaging to send messages to the indexing server that Document X needs to be updated via a JMS queue? This gives you the flexibility to have the indexing system down but not lose the message t