[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436442 ] Andrzej Bialecki commented on LUCENE-675: -- Yes, that could be a good additional source. However, IMHO the primary corpus should be widely known and

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Paul Smith (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436443 ] Paul Smith commented on LUCENE-675: --- From a strict performance point of view, a standard set of important, but don't forget other languages. From a

[jira] Updated: (LUCENE-443) ConjunctionScorer tune-up

2006-09-21 Thread Paul Elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-443?page=all ] Paul Elschot updated LUCENE-443: Attachment: Conjunction20060921.patch IIRC the original performance problem was caused by creation of objects in the tight loop doing skipTo() on all the

[jira] Commented: (LUCENE-443) ConjunctionScorer tune-up

2006-09-21 Thread Paul Elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-443?page=comments#action_12436453 ] Paul Elschot commented on LUCENE-443: - I just overlooked the grant by Abdul to the ASF. ConjunctionScorer tune-up -

Re: Clustering IndexWriter?

2006-09-21 Thread adasal
Don't be coy, what's your company? Adam On 21/09/06, Steve Harris [EMAIL PROTECTED] wrote: Warning, I'm a vendor dude but this isn't really a vendor message. My IT guy had mentioned to me that a bunch of the open source products we use (JIRA, JForum etc) have Lucene inside and in the name of

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Karl Wettin (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436502 ] Karl Wettin commented on LUCENE-675: It is also interesting to know how much time is consumed to assemble an instance of Document from the storage. According

Re: Clustering IndexWriter?

2006-09-21 Thread Vic Bancroft
adasal wrote: Don't be coy, what's your company? This URL is derivable from the text, with a little search engine help . . . ** http://www.terracottatech.com/terracotta_spring.shtml more, l8r, v On 21/09/06, Steve Harris [EMAIL PROTECTED] wrote: Warning, I'm a vendor dude but this

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Grant Ingersoll (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436516 ] Grant Ingersoll commented on LUCENE-675: Since this has dependencies, do you think we should put it under contrib? I would be for a Performance directory

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436518 ] Andrzej Bialecki commented on LUCENE-675: -- The dependency on commons-compress could be avoided - I used this just to be able to unpack tar.gz files, we

Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Yep, that's us. No secret, just didn't want to make my question a billboard :-). Just needed a bit of info from the people who know best. Cheers, steve On 9/21/06, Vic Bancroft [EMAIL PROTECTED] wrote: adasal wrote: Don't be coy, what's your company? This URL is derivable from the text,

Re: Clustering IndexWriter?

2006-09-21 Thread Yonik Seeley
On 9/20/06, Steve Harris [EMAIL PROTECTED] wrote: Is clustering the IndexWriter really all I need to do? Hi Steve, Could you explain the details of what clustering really means in this context? -Yonik http://incubator.apache.org/solr Solr, the open-source Lucene search server

Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Sure, I'm fairly new to Lucene but what I was trying to do was make it so that an index could be shared among multiple nodes. If an index is updated in any way it would be updated across the cluster coherently. In my first version I was really only taking advantage of the fact that we detect

Re: Re: Clustering IndexWriter?

2006-09-21 Thread Otis Gospodnetic
I don't fully follow, and I don't even have the "it's late!" excuse. It sounds like you want to have the same index on multiple nodes in the cluster and when a data change occurs, you want to synchronously make the same change to all indices in your cluster. Is that it? Solr has a different

[jira] Updated: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-21 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=all ] Ning Li updated LUCENE-565: --- Attachment: NewIndexModifier.Sept21.patch This is to update the delete-support patch after the commit of the new merge policy. - Very few changes to IndexWriter. -

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Otis Gospodnetic (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436587 ] Otis Gospodnetic commented on LUCENE-675: - I still haven't gotten my employer to sign and fax the CCLA, so I'm stuck and can't contribute my search

Re: Re: Clustering IndexWriter?

2006-09-21 Thread Chris Hostetter
: Questions: : Is this useful in the real world : Would it be possible to get that one small thing changed. I'm not really clear on what the small thing is that you are asking about ... you mentioned SegmentInfos subclassing Vector, are you proposing an alternative? If you've got a patch that

help on Lock.obtain(lockWaitTimeout)

2006-09-21 Thread Michael McCandless
I'm working on a LockFactory that uses java.nio.* (OS native locks) for its locks. This should be a big help for people who keep finding their lock files left on disk due to abnormal shutdown, etc (because the OS will free the locks, no matter what, in theory). I thought I was nearly done but
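The idea of an OS native lock can be sketched roughly as follows. This is a minimal illustration, not the LockFactory under discussion: the class name NativeLockSketch and its obtain()/release() methods are hypothetical, and only the java.nio calls (FileChannel.tryLock, FileLock.release) are real APIs. It assumes that the operating system drops the lock when the process exits, which is the property the message relies on.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

/** Hypothetical sketch of a lock backed by an OS native file lock. */
final class NativeLockSketch {
    private final File path;
    private RandomAccessFile file; // kept open for as long as the lock is held
    private FileLock lock;

    NativeLockSketch(File path) {
        this.path = path;
    }

    /** Tries once to take the lock; returns false if it is held elsewhere. */
    synchronized boolean obtain() {
        if (lock != null) {
            return false; // this object already holds the lock
        }
        try {
            RandomAccessFile f = new RandomAccessFile(path, "rw");
            FileLock l = null;
            try {
                l = f.getChannel().tryLock(); // null if another process holds it
            } catch (OverlappingFileLockException e) {
                // Held through another channel inside this JVM.
            } finally {
                if (l == null) {
                    f.close();
                }
            }
            if (l == null) {
                return false;
            }
            file = f;
            lock = l;
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Releases the lock if held; the lock file itself is left on disk. */
    synchronized void release() {
        if (lock == null) {
            return;
        }
        try {
            lock.release();
            file.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            lock = null;
            file = null;
        }
    }
}
```

With this approach a leftover lock file is harmless, because holding the lock is a property of the open channel and not of the file's existence.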

Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Fair question. All I did/need was take SegmentInfos and instead of subclassing Vector I made it contain a Vector. Went from subclassing to aggregation. As far as I could tell from reading the code it would make no difference to anyone and should have no performance impact (good or bad). It just
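The change from subclassing to aggregation described above can be sketched like this. SegmentList is a hypothetical stand-in, not the real SegmentInfos class (which carries more state and methods); the element type String is also an assumption made only to keep the example short.

```java
import java.util.Vector;

/** Hypothetical stand-in for a class that used to extend Vector. */
final class SegmentList {
    // Aggregation: the Vector is a private field instead of a superclass,
    // so callers only see the methods this class chooses to expose.
    private final Vector<String> segments = new Vector<>();

    void add(String segmentName) {
        segments.addElement(segmentName);
    }

    String get(int index) {
        return segments.elementAt(index);
    }

    int size() {
        return segments.size();
    }
}
```

Each exposed method simply delegates to the contained Vector, so existing callers that only use these methods should see the same behavior.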

Re: [jira] Updated: (LUCENE-665) temporary file access denied on Windows

2006-09-21 Thread Doron Cohen
Thanks for the comments! Indeed the first version I wrote followed the pattern you suggest (let's name it pattern_1 for the discussion). However with pattern_1 I could not cover the case of a method originally not throwing an exception. The problem is that in pattern_1 we have to catch the

Re: help on Lock.obtain(lockWaitTimeout)

2006-09-21 Thread Yonik Seeley
On 9/21/06, Michael McCandless [EMAIL PROTECTED] wrote: Anyway, my first reaction was to change this to use System.currentTimeMillis() to measure elapsed time, but then I remembered that is a dangerous approach because whenever the clock on the machine is updated (eg by a time-sync NTP client) it

Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Good question. May or may not be performant enough. Only time (and testing) will tell. My guess is that it will depend heavily on the rate at which the data changes (or read/write ratio). Believe me, I'm not proposing that everyone go out and cluster lucene with terracotta dso. I'm really just

Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Yonik Seeley
On 9/21/06, Steve Harris [EMAIL PROTECTED] wrote: My guess is that some segment of the world cares a lot about realtime coherent updates and some segment of the world needs blinding speed. Part of my research is to gather the expertise of this group on these issues. I hear ya... There is

Re: Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Interesting. I wonder, I have a notification mechanism at my disposal as well. I wonder if it could be worked out that, much like a mvc, an IndexReader could be notified when the underlying Directory has changed so that the reader can adjust itself? Cheers, Steve On 9/21/06, Yonik Seeley

Re: help on Lock.obtain(lockWaitTimeout)

2006-09-21 Thread Doron Cohen
For obtain(timeout), to prevent waiting too long you could compute the maximum number of times that obtain() can be executed (assuming, as in current code, that obtain() executes in no time). Then break if either it was executed sufficiently many times or if time is up. I don't see how to prevent
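A minimal sketch of that suggestion, assuming a fixed poll interval: the loop stops when the attempt budget is used up or when the measured time is up, whichever comes first. BoundedLockWait, tryObtain and pollMillis are hypothetical names for illustration and are not part of Lucene's Lock class.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper; not Lucene's Lock implementation. */
final class BoundedLockWait {
    private BoundedLockWait() {}

    /**
     * Polls tryObtain until it succeeds, the attempt budget is exhausted,
     * or the measured elapsed time reaches timeoutMillis.
     */
    static boolean obtain(BooleanSupplier tryObtain, long timeoutMillis, long pollMillis) {
        if (pollMillis <= 0) {
            throw new IllegalArgumentException("pollMillis must be positive");
        }
        // Upper bound on attempts, assuming each attempt itself takes no time.
        long maxAttempts = timeoutMillis / pollMillis + 1;
        long start = System.currentTimeMillis();
        for (long attempt = 1; attempt <= maxAttempts; attempt++) {
            if (tryObtain.getAsBoolean()) {
                return true;
            }
            boolean timeUp = System.currentTimeMillis() - start >= timeoutMillis;
            if (timeUp || attempt == maxAttempts) {
                break;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
                return false;
            }
        }
        return false;
    }
}
```

The attempt counter bounds the wait if the clock is set backwards, and the elapsed-time check ends the wait early if the clock is set forwards or the attempts turn out to be slow.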

Re: Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Yonik Seeley
On 9/21/06, Steve Harris [EMAIL PROTECTED] wrote: Interesting. I wonder, I have a notification mechanism at my disposal as well. I wonder if it could be worked out that, much like a mvc, an IndexReader could be notified when the underlying Directory has changed so that the reader can adjust

Re: Re: Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
I don't know the list server's rules but I figured I would just include the text of the file I changed. If that is bad form give me a heads up and I won't do it again. Would this change break anything or bother anyone? package org.apache.lucene.index; /** * Copyright 2004 The Apache Software

Re: Re: Re: Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Oops, I made a change and didn't test it. Doh, This should work better: package org.apache.lucene.index; /** * Copyright 2004 The Apache Software Foundation * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. * You

Re: Distributed Indexes, Searches and HDFS

2006-09-21 Thread Yonik Seeley
On 9/21/06, Chris D [EMAIL PROTECTED] wrote: The cronjob/link solution which is quite clean, doesn't work well in a windows environment. While it's my favorite, no dice... Rats. There may be hope yet for that on Windows. Hard links work on Windows, but the only problem is that you can't