Re: GData Server - Lucene storage

2006-06-02 Thread Ian Holsman
On 02/06/2006, at 9:37 AM, Simon Willnauer wrote: The biggest problem with the lucene storage is to achieve a transactional state. Imagine the following scenario: An update request comes in. - the entry to update will be added to the lucene writer who writes the update. But another delete

Re: GData Server - Lucene storage

2006-06-02 Thread Simon Willnauer
On 6/2/06, Ian Holsman [EMAIL PROTECTED] wrote: On 02/06/2006, at 9:37 AM, Simon Willnauer wrote: The biggest problem with the lucene storage is to achieve a transactional state. Imagine the following scenario: An update request comes in. - the entry to update will be added to the lucene

Re: GData Server - Lucene storage

2006-06-02 Thread Yonik Seeley
On 6/1/06, Simon Willnauer [EMAIL PROTECTED] wrote: So the results of the search are entry ids and a corresponding feed. These entries will be retrieved from the storage and sent back to the client. In the simplest case of using a lucene stored field to store the original entry, it's a single

Re: GData Server - Lucene storage

2006-06-02 Thread Simon Willnauer
On 6/2/06, Yonik Seeley [EMAIL PROTECTED] wrote: On 6/1/06, Simon Willnauer [EMAIL PROTECTED] wrote: So the results of the search are entry ids and a corresponding feed. These entries will be retrieved from the storage and sent back to the client. In the simplest case of using a lucene

Re: Flexible Indexing (was Re: Lucene Planning)

2006-06-02 Thread Grant Ingersoll
I thought it was you, but wasn't sure. I would also like a way to store the frequency of the term in the overall collection (probably should go in the Term dictionary, but not sure, at the cost of an additional VInt per term, but I am open to other places to store it). Right now, in order to

[jira] Commented: (LUCENE-504) FuzzyQuery produces a java.lang.NegativeArraySizeException in PriorityQueue.initialize if I use Integer.MAX_VALUE as BooleanQuery.MaxClauseCount

2006-06-02 Thread Paul Borgermans (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-504?page=comments#action_12414438 ] Paul Borgermans commented on LUCENE-504: Here the same problem, used a MultiFieldQueryParser in combination with fuzzy search which triggered the exception
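A minimal sketch of the arithmetic that could produce this exception. It assumes the queue allocates its backing array with one slot more than the requested maximum size; that allocation detail is an assumption here, not something stated in the comment above. `HeapSizeOverflow`, `heapSize` and `allocationFails` are hypothetical names used only for illustration and are not Lucene APIs.

```java
public class HeapSizeOverflow {

    // Hypothetical stand-in for the array size a heap-backed queue might
    // compute: one extra slot on top of the requested maximum (assumption).
    static int heapSize(int maxSize) {
        return maxSize + 1; // int arithmetic wraps around for Integer.MAX_VALUE
    }

    // Returns true if allocating the backing array throws
    // NegativeArraySizeException for the given maximum size.
    static boolean allocationFails(int maxSize) {
        try {
            Object[] heap = new Object[heapSize(maxSize)];
            return heap.length < 0; // always false when allocation succeeds
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(heapSize(Integer.MAX_VALUE));        // wrapped, negative size
        System.out.println(allocationFails(Integer.MAX_VALUE)); // allocation is rejected
        System.out.println(allocationFails(1024));              // small sizes are fine
    }
}
```

If this is indeed the cause, any clause limit below `Integer.MAX_VALUE` would avoid the wrap-around, although a very large limit could still exhaust memory.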

Re: Flexible Indexing (was Re: Lucene Planning)

2006-06-02 Thread Marvin Humphrey
On Jun 2, 2006, at 6:48 AM, Grant Ingersoll wrote: I thought it was you, but wasn't sure. I'm always looking for ways to minimize Term Vectors, because I consider excerpting/highlighting a core feature rather than an add-on, and they seem like such overkill. It bothers me that they

Re: GData Server - Lucene storage

2006-06-02 Thread Yonik Seeley
On 6/2/06, Simon Willnauer [EMAIL PROTECTED] wrote: This is also true. This problem is still the server response, if I queue some updates / inserts or index them into a RamDir I still have the problem of concurrent indexing. The client should wait for the writing process to finish correctly

RE: GData Server - Lucene storage

2006-06-02 Thread Robert Engels
What we've done is that if the number of incoming documents is less than some threshold, we serialize the documents to a pending file instead of using the IndexWriter. If it is greater than the threshold, it is assumed an index rebuild is occurring and so the updates are passed directly to the
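The routing decision described in this message might be sketched as follows. This is an illustration under assumptions, not the poster's code: `UpdateRouter`, `Route` and `route` are hypothetical names, the target of the direct path is cut off in the excerpt and is represented only by the placeholder `DIRECT`, and treating a batch exactly at the threshold as `DIRECT` is a guess because the message does not cover that case.

```java
public class UpdateRouter {

    // PENDING_FILE: serialize the documents to a pending file.
    // DIRECT: placeholder for the direct path whose target is truncated in the excerpt.
    public enum Route { PENDING_FILE, DIRECT }

    // Small batches go to the pending file; larger ones are assumed to be
    // part of an index rebuild and take the direct path.
    // Assumption: a batch of exactly `threshold` documents counts as large.
    public static Route route(int incomingDocuments, int threshold) {
        return incomingDocuments < threshold ? Route.PENDING_FILE : Route.DIRECT;
    }

    public static void main(String[] args) {
        int threshold = 100; // hypothetical value
        System.out.println(route(3, threshold));
        System.out.println(route(5000, threshold));
    }
}
```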

Re: GData Server - Lucene storage

2006-06-02 Thread Simon Willnauer
On 6/2/06, Yonik Seeley [EMAIL PROTECTED] wrote: On 6/2/06, Simon Willnauer [EMAIL PROTECTED] wrote: This is also true. This problem is still the server response, if I queue some updates / inserts or index them into a RamDir I still have the problem of concurrent indexing. The client should

Re: GData Server - Lucene storage

2006-06-02 Thread Tatu Saloranta
--- Simon Willnauer [EMAIL PROTECTED] wrote: ... Using the client thread as the indexing thread might just cause some performance drawback but that's considerable for Actually, I would not even assume that: handing tasks over between threads causes context switch, and more cache misses. In

Re: GData Server - Lucene storage

2006-06-02 Thread Simon Willnauer
On 6/2/06, Yonik Seeley [EMAIL PROTECTED] wrote: On 6/2/06, Simon Willnauer [EMAIL PROTECTED] wrote: So this would happen quite often due to updates and inserts. Hmm it is more and more a bad idea to use a lucene index as a storage. Rather go straight to a Database. Yes, if you need to be

Re: GData Server - Lucene storage

2006-06-02 Thread Erik Hatcher
On Jun 2, 2006, at 12:56 PM, Simon Willnauer wrote: I had a look at the licence of the Sleepycat BerkeleyDB for Java dist. and in my opinion it is alright to use and distribute it with the gdata server. Are there any experts on licencing? Is it ok for the ASF to use that? It's ok to use it,

Re: svn commit: r410680 - in /lucene/java/branches/lucene_2_0: CHANGES.txt src/jsp/results.jsp

2006-06-02 Thread Doug Cutting
Otis Gospodnetic wrote: Developing mainly in trunk makes sense. However, I don't get trunk - branch to make a point release. What if there are other changes already in the trunk (e.g. new features), which we don't want in a point release? That makes it harder! But the common case, in my

Re: GData Server - Lucene storage

2006-06-02 Thread jason rutherglen
Yonik, It might be interesting to merge using BDB into Solr, as an option to provide better realtime updates. Perhaps the replication could be used as well in place of rsync? I don't have any experience with BDB replication, anyone have thoughts on the matter? Jason - Original Message

Re: GData Server - Lucene storage

2006-06-02 Thread Yonik Seeley
On 6/2/06, jason rutherglen [EMAIL PROTECTED] wrote: It might be interesting to merge using BDB into Solr, as an option to provide better realtime updates. Perhaps the replication could be used as well in place of rsync? I don't have any experience with BDB replication, anyone have thoughts

Re: GData Server - Lucene storage

2006-06-02 Thread Andi Vajda
On Fri, 2 Jun 2006, jason rutherglen wrote: It might be interesting to merge using BDB into Solr, as an option to provide better realtime updates. Perhaps the replication could be used as well in place of rsync? I don't have any experience with BDB replication, anyone have thoughts on the

Re: GData Server - Lucene storage

2006-06-02 Thread Otis Gospodnetic
Simon, I took a quick look at the UML PDF. It seems to me that various *Services are overly complicated. Since you can have only 1 thread modifying the Lucene index, perhaps you should go the same route as IndexModifier (I never used it, but it looks like people are using it to manage
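One generic way to respect the one-modifying-thread constraint mentioned here is to funnel every modification through a single worker thread and let each caller block until its own change has been applied. The sketch below uses only `java.util.concurrent` and a plain list as a stand-in for the index. It is not a description of how IndexModifier works internally, and `SingleWriterQueue` and `applyAndWait` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SingleWriterQueue {

    // One worker thread, so queued modifications run strictly one at a time.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    // Stand-in for the index; only the writer thread ever touches it.
    private final List<String> applied = new ArrayList<>();

    // Queue a modification and block until the writer thread has applied it.
    // Returns the number of modifications applied so far.
    public int applyAndWait(String operation) {
        Future<Integer> result = writer.submit(() -> {
            applied.add(operation);
            return applied.size();
        });
        try {
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the writer", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("modification failed", e.getCause());
        }
    }

    // Stop accepting new work; already queued modifications still run.
    public void close() {
        writer.shutdown();
    }

    public static void main(String[] args) {
        SingleWriterQueue queue = new SingleWriterQueue();
        System.out.println(queue.applyAndWait("insert entry-1"));
        System.out.println(queue.applyAndWait("delete entry-1"));
        queue.close();
    }
}
```

Blocking the caller keeps the server response tied to the outcome of the write, at the cost of the thread hand-over overhead raised elsewhere in this thread.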

Re: GData Server - Lucene storage

2006-06-02 Thread Simon Willnauer
On 6/2/06, Otis Gospodnetic [EMAIL PROTECTED] wrote: Simon, I took a quick look at the UML PDF. It seems to me that various *Services are overly complicated. Since you can have only 1 thread modifying the Lucene index, perhaps you should go the same route as IndexModifier (I never used it,

Piecemeal svn diffs

2006-06-02 Thread Marvin Humphrey
Greets, I'm considering preparing some more patches which build upon the bytecounts-as-string-headers patch. That patch is several hundred lines long, and the stuff I'm thinking about could be thousands. I'd like to break up the patches by theme rather than just take one giant svn

[jira] Commented: (LUCENE-537) Refactor of spell check

2006-06-02 Thread Karl Wettin (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-537?page=comments#action_12414534 ] Karl Wettin commented on LUCENE-537: This code is no good. Please ignore it. Refactor of spell check --- Key: LUCENE-537 URL:

[jira] Resolved: (LUCENE-587) Explanation.toHtml outputs invalid HTML

2006-06-02 Thread Hoss Man (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-587?page=all ] Hoss Man resolved LUCENE-587: - Resolution: Fixed Assign To: Hoss Man Thanks for spotting this. I've committed a fix that basically does what you suggest, but includes a <br /> between the

Re: GData Server - Lucene storage

2006-06-02 Thread jason rutherglen
Is it possible to turn off directory locking with BDB? How is the performance compared to regular FSDirectory for queries? - Original Message From: Andi Vajda [EMAIL PROTECTED] To: java-dev@lucene.apache.org; jason rutherglen [EMAIL PROTECTED] Sent: Friday, June 2, 2006 10:52:27 AM

Re: GData Server - Lucene storage

2006-06-02 Thread Andi Vajda
On Fri, 2 Jun 2006, jason rutherglen wrote: Is it possible to turn off directory locking with BDB? How is the performance compared to regular FSDirectory for queries? The DBLock class in the org.apache.lucene.store.db package (to which DbDirectory belongs) does absolutely nothing. This is

[jira] Assigned: (LUCENE-503) Contrib: ThaiAnalyzer to enable Thai full-text search in Lucene

2006-06-02 Thread Hoss Man (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-503?page=all ] Hoss Man reassigned LUCENE-503: --- Assign To: Hoss Man I don't know anything about the Thai language ... but this code is clean, fairly easy to follow, and has tests that pass. If no one (who

[jira] Created: (LUCENE-588) Escaped wildcard character in wildcard term not handled correctly

2006-06-02 Thread Sunil Kamath (JIRA)
Escaped wildcard character in wildcard term not handled correctly - Key: LUCENE-588 URL: http://issues.apache.org/jira/browse/LUCENE-588 Project: Lucene - Java Type: Bug Components: QueryParser

gjc compile

2006-06-02 Thread Vic Bancroft
The following diff seemed to help build a nice native binary in my fedora. The first modification makes use of the new core archive file name and the second avoids a problematic class . . . [EMAIL PROTECTED] lucene-trunk]$ svn diff Index: src/gcj/Makefile

Re: gjc compile

2006-06-02 Thread Andi Vajda
On Fri, 2 Jun 2006, Vic Bancroft wrote: The following diff seemed to help build a nice native binary in my fedora. The first modification makes use of the new core archive file name and the second avoids a problematic class . . . You can actually compile all of Lucene + a bunch of contribs