[jira] Updated: (LUCENE-1238) intermittent failures of TestTimeLimitedCollector.testTimeoutMultiThreaded in nightly tests
[ https://issues.apache.org/jira/browse/LUCENE-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen updated LUCENE-1238:

Summary: intermittent failures of TestTimeLimitedCollector.testTimeoutMultiThreaded in nightly tests (was: intermittent faiures of TestTimeLimitedCollector.testTimeoutMultiThreaded in nightly tests)

Fixed typo in summary.

Key: LUCENE-1238
URL: https://issues.apache.org/jira/browse/LUCENE-1238
Project: Lucene - Java
Issue Type: Bug
Affects Versions: 2.4
Reporter: Doron Cohen
Assignee: Doron Cohen
Priority: Minor

Occasionally TestTimeLimitedCollector.testTimeoutMultiThreaded fails, e.g. with this output:

{noformat}
[junit] - Standard Error -
[junit] Exception in thread Thread-97 junit.framework.AssertionFailedError: no hits found!
[junit]     at junit.framework.Assert.fail(Assert.java:47)
[junit]     at junit.framework.Assert.assertTrue(Assert.java:20)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.doTestTimeout(TestTimeLimitedCollector.java:152)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.access$100(TestTimeLimitedCollector.java:38)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector$1.run(TestTimeLimitedCollector.java:231)
[junit] Exception in thread Thread-85 junit.framework.AssertionFailedError: no hits found!
[junit]     at junit.framework.Assert.fail(Assert.java:47)
[junit]     at junit.framework.Assert.assertTrue(Assert.java:20)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.doTestTimeout(TestTimeLimitedCollector.java:152)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.access$100(TestTimeLimitedCollector.java:38)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector$1.run(TestTimeLimitedCollector.java:231)
[junit] -----
[junit] Testcase: testTimeoutMultiThreaded(org.apache.lucene.search.TestTimeLimitedCollector): FAILED
[junit] some threads failed! expected:50 but was:48
[junit] junit.framework.AssertionFailedError: some threads failed! expected:50 but was:48
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.doTestMultiThreads(TestTimeLimitedCollector.java:255)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.testTimeoutMultiThreaded(TestTimeLimitedCollector.java:220)
{noformat}

The problem is either in the test or in TimeLimitedCollector.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Updated: (LUCENE-1238) intermittent failures of TestTimeLimitedCollector.testTimeoutMultiThreaded in nightly tests
[ https://issues.apache.org/jira/browse/LUCENE-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen updated LUCENE-1238:

Lucene Fields: [New, Patch Available] (was: [New])
Attachments: LUCENE-1238.patch
[jira] Updated: (LUCENE-1238) intermittent failures of TestTimeLimitedCollector.testTimeoutMultiThreaded in nightly tests
[ https://issues.apache.org/jira/browse/LUCENE-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen updated LUCENE-1238:

Attachment: LUCENE-1238.patch

The problem was in the test. However, the fix adds a greediness option to the time-limited collector (TLC):

* A greedy TLC, upon timeout, allows the wrapped collector to collect the current doc and only then throws the timeout exception.
* A non-greedy TLC (the default, as before) throws the exception immediately.

For the test, setting the collector to greedy makes it possible to require that at least one doc was collected.

In addition, this patch:

* Adds missing javadocs for the TLC constructor.
* Increases the slack in the test's timeout requirements, to prevent further noise: the TLC is required to time out not too soon and not too late, but on a busy machine the "not too late" part is problematic to test. I considered removing this part (not too late), but decided to leave it in for now.
* Adds a test for the setGreedy() option.

All TLC tests pass.
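The greedy vs. non-greedy timeout semantics described above can be sketched as follows. This is an illustrative, self-contained model, not Lucene's actual 2.4 code; the `Collector` interface and `TimeExceededException` here are stand-ins for the real classes.

```java
import java.util.ArrayList;
import java.util.List;

public class GreedyTimeoutSketch {
    static class TimeExceededException extends RuntimeException {}

    interface Collector { void collect(int doc); }

    static class TimeLimitedCollector implements Collector {
        private final Collector wrapped;
        private final long deadlineMillis;
        private boolean greedy = false;

        TimeLimitedCollector(Collector wrapped, long timeAllowedMillis) {
            this.wrapped = wrapped;
            this.deadlineMillis = System.currentTimeMillis() + timeAllowedMillis;
        }

        void setGreedy(boolean greedy) { this.greedy = greedy; }

        public void collect(int doc) {
            if (System.currentTimeMillis() > deadlineMillis) {
                // A greedy collector lets the wrapped collector keep the
                // in-flight doc before signalling the timeout; a non-greedy
                // one (the default) throws immediately.
                if (greedy) {
                    wrapped.collect(doc);
                }
                throw new TimeExceededException();
            }
            wrapped.collect(doc);
        }
    }

    public static void main(String[] args) {
        List<Integer> hits = new ArrayList<>();
        // Deadline already in the past, so the first collect() times out.
        TimeLimitedCollector tlc = new TimeLimitedCollector(hits::add, -1);
        tlc.setGreedy(true);
        try {
            tlc.collect(42);
        } catch (TimeExceededException expected) {}
        // Greedy mode guarantees the in-flight doc was still collected,
        // which is what lets the test assert "at least one hit".
        System.out.println(hits.size());
    }
}
```

This shows why greediness helps the multi-threaded test: even when the timeout fires before any doc is collected, a greedy collector still delivers one hit.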
[jira] Commented: (LUCENE-1239) IndexWriter deadlock when using ConcurrentMergeScheduler
[ https://issues.apache.org/jira/browse/LUCENE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12579792#action_12579792 ]

Michael Lossos commented on LUCENE-1239:

Here's the related Compass bug, in case this ends up being a problem in Compass's ExecutorMergeScheduler (though that doesn't look to be the case at the moment): http://issues.compass-project.org/browse/CMP-581

Key: LUCENE-1239
URL: https://issues.apache.org/jira/browse/LUCENE-1239
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions: 2.3.1
Environment: Compass 2.0.0M3 (nightly build #57), Lucene 2.3.1, Spring Framework 2.0.7.0
Reporter: Michael Lossos

I'm trying to update our application from Compass 2.0.0M1 with Lucene 2.2 to Compass 2.0.0M3 (latest build) with Lucene 2.3.1. I'm holding all other things constant and only changing the Compass and Lucene jars. While recreating the search index for our data I'm seeing a deadlock in Lucene's IndexWriter: it appears to be waiting on a signal from the merge thread. I've tried to create a simple reproduction case, but to no avail. Doing the exact same steps with Compass 2.0.0M1 and Lucene 2.2 recreates our search index with no problems. That is to say, it's not our code.

In particular, the main thread performing the commit (Lucene document save) from Compass is calling Lucene's IndexWriter.optimize(). We're using Compass's ExecutorMergeScheduler to handle the merging, and it is calling IndexWriter.merge(). The main thread in IndexWriter.optimize() enters the wait() at the bottom of that method and is never notified. I can't tell whether this is because optimizeMergesPending() is returning true incorrectly, or because IndexWriter.merge()'s notifyAll() is being called prematurely.

Looking at the code, it doesn't seem possible for IndexWriter.optimize() to be waiting and miss a notifyAll(), and Lucene's IndexWriter.merge() was recently fixed to always call notifyAll() even on exceptions -- that is, all the relevant IndexWriter code looks properly synchronized. Nevertheless, I'm seeing the deadlock behavior described, and it's reproducible using our app and our test data set. Could someone familiar with IndexWriter's synchronization code take another look at it? I'm sorry that I can't give you a simple reproduction test case.
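The optimize()/merge() handshake discussed above is the standard Java monitor pattern: wait in a loop on a condition, and notifyAll under the same lock whenever the condition may have changed. A minimal self-contained sketch of that pattern (field and method names here are illustrative, not IndexWriter's actual internals):

```java
public class MergeMonitorSketch {
    private int pendingMerges = 2;

    // Caller side (think optimize()): block until no merges are pending.
    synchronized void waitForMerges() throws InterruptedException {
        // Waiting in a loop re-checks the condition after every wakeup,
        // so a notifyAll cannot be "missed" while the lock is held.
        while (pendingMerges > 0) {
            wait();
        }
    }

    // Merge-thread side (think merge()): always notify, even on failure,
    // otherwise the waiter above blocks forever -- the deadlock symptom
    // described in this issue.
    synchronized void mergeFinished() {
        try {
            pendingMerges--;
        } finally {
            notifyAll();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MergeMonitorSketch m = new MergeMonitorSketch();
        Thread merger = new Thread(() -> {
            m.mergeFinished();
            m.mergeFinished();
        });
        merger.start();
        m.waitForMerges();  // returns once both merges have completed
        merger.join();
        System.out.println("done");
    }
}
```

With this structure, the only way the waiter hangs forever is if some path decrements the pending count without notifying, or if a pending merge is never run at all; that second case is what the later comments on this issue zero in on.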
[jira] Created: (LUCENE-1239) IndexWriter deadlock when using ConcurrentMergeScheduler
IndexWriter deadlock when using ConcurrentMergeScheduler

Key: LUCENE-1239
URL: https://issues.apache.org/jira/browse/LUCENE-1239
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions: 2.3.1
Environment: Compass 2.0.0M3 (nightly build #57), Lucene 2.3.1, Spring Framework 2.0.7.0
Reporter: Michael Lossos
[jira] Assigned: (LUCENE-1239) IndexWriter deadlock when using ConcurrentMergeScheduler
[ https://issues.apache.org/jira/browse/LUCENE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-1239:

Assignee: Michael McCandless
[jira] Commented: (LUCENE-1239) IndexWriter deadlock when using ConcurrentMergeScheduler
[ https://issues.apache.org/jira/browse/LUCENE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12579803#action_12579803 ]

Michael McCandless commented on LUCENE-1239:

Is it possible to call IndexWriter.setInfoStream(..), get the hang to happen, and post the resulting output?
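For anyone following along: routing the diagnostic stream to a file makes it easy to attach the trace to the issue afterwards. A small self-contained sketch of that setup; the commented-out `writer.setInfoStream(info)` call is the Lucene 2.3-era API being referred to, and the printed line is a stand-in for what IndexWriter would emit, not real output.

```java
import java.io.File;
import java.io.PrintStream;
import java.nio.file.Files;

public class InfoStreamSketch {
    public static void main(String[] args) throws Exception {
        // Capture diagnostics in a temp file so the trace survives the hang.
        File log = File.createTempFile("index-info", ".log");
        try (PrintStream info = new PrintStream(log)) {
            // With a real writer this would be: writer.setInfoStream(info);
            // Here we just emit a stand-in line to show the plumbing works.
            info.println("IW: now merge");
        }
        System.out.println(Files.readAllLines(log.toPath()).get(0));
    }
}
```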
[jira] Commented: (LUCENE-1239) IndexWriter deadlock when using ConcurrentMergeScheduler
[ https://issues.apache.org/jira/browse/LUCENE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12579819#action_12579819 ]

Michael McCandless commented on LUCENE-1239:

If you replace Compass's ExecutorMergeScheduler with Lucene's ConcurrentMergeScheduler, does the deadlock still happen?

One thing that makes me nervous about ExecutorMergeScheduler is this comment:

// Compass: No need to execute continous merges, we simply reschedule another merge, if there is any, using executor manager

and the corresponding change, which is to schedule a new job instead of using the while loop to run new merges. If I understand that code correctly, the executorManager will re-call the run() method on MergeThread when there is a cascaded merge. But that won't do the right thing, because it will run startMerge rather than the newly returned (cascaded) merge. That would then cause the deadlock, because the cascaded merge is never issued.
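The cascading behavior at issue can be illustrated with a toy model. This sketch is not Compass or Lucene code; `pending`, `getNextMerge`, and `doMerge` are illustrative stand-ins for the writer's pending-merge queue. The point is that a merge thread must loop, pulling the *next* pending merge after each one finishes, because completing a merge can register a new (cascaded) merge; re-running the original merge instead would leave the cascade unissued, and optimize() would wait forever.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class CascadedMergeSketch {
    // Toy stand-in for the writer's pending-merge queue.
    private final Deque<String> pending = new ArrayDeque<>();
    private int completed = 0;

    String getNextMerge() { return pending.poll(); }

    void doMerge(String merge) {
        completed++;
        // A finished merge can cascade: completing "m1" registers "m2".
        if (merge.equals("m1")) pending.add("m2");
    }

    public static void main(String[] args) {
        CascadedMergeSketch w = new CascadedMergeSketch();
        w.pending.add("m1");

        // ConcurrentMergeScheduler-style loop: keep pulling merges until
        // none remain, so the cascaded "m2" is also executed.
        for (String m = w.getNextMerge(); m != null; m = w.getNextMerge()) {
            w.doMerge(m);
        }
        System.out.println(w.completed);  // both m1 and its cascade ran
    }
}
```

Dropping the loop and only ever executing "m1" would leave "m2" stuck in the queue, which is the shape of the bug being hypothesized above.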
[jira] Resolved: (LUCENE-1239) IndexWriter deadlock when using ConcurrentMergeScheduler
[ https://issues.apache.org/jira/browse/LUCENE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Lossos resolved LUCENE-1239.

Resolution: Invalid

You're right: when I use Lucene's ConcurrentMergeScheduler, I don't see the deadlock. I'll bounce this back to Compass for fixing. Thank you Michael for looking into this!
[jira] Commented: (LUCENE-1239) IndexWriter deadlock when using ConcurrentMergeScheduler
[ https://issues.apache.org/jira/browse/LUCENE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12579829#action_12579829 ]

Michael McCandless commented on LUCENE-1239:

Phew :)
WordNet synonyms overhead
Hi,

I am especially interested in the WordNet synonym expansion that was discussed in the Lucene in Action book. Is there anyone here on the list who has experience with this approach? I'm curious about how much the synonym expansion will increase the size of an index. Are there any reliable figures from real-life applications?

Kind regards,
Harald
Re: WordNet synonyms overhead
Harald Näger wrote:

> I am especially interested in the WordNet synonym expansion that was discussed in the Lucene in Action book. Is there anyone here on the list who has experience with this approach? I'm curious about how much the synonym expansion will increase the size of an index. Are there any reliable figures from real-life applications?

Query expansion is better than index expansion: faster to use, smaller index, and less noise when you search.

M.
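The query-expansion approach suggested above keeps the index untouched and rewrites each query term as an OR over its synonyms at search time. A minimal self-contained sketch; the tiny hard-coded synonym map and the textual query syntax are illustrative only (a real system would consult WordNet and build query objects):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class QueryExpansionSketch {
    // Illustrative synonym map; a real system would look these up in WordNet.
    static final Map<String, List<String>> SYNONYMS = Map.of(
        "fast", List.of("quick", "rapid"),
        "car", List.of("auto", "automobile"));

    // Expand each term of a whitespace-separated query into an OR group
    // of the term and its synonyms, AND-ing the groups together.
    static String expand(String query) {
        return Arrays.stream(query.split("\\s+"))
            .map(term -> {
                List<String> syns = SYNONYMS.getOrDefault(term, List.of());
                if (syns.isEmpty()) return term;
                return "(" + term + " OR " + String.join(" OR ", syns) + ")";
            })
            .collect(Collectors.joining(" AND "));
    }

    public static void main(String[] args) {
        System.out.println(expand("fast car"));
    }
}
```

The index stores only the original terms, so its size is unchanged; the cost moves to query time, where the expanded query touches a few extra posting lists.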
SearchTimeout tests was Re: Build failed in Hudson: Lucene-trunk #408
Seems like we will need to explore some timeout values. This is always tricky once you get on Hudson; Solr has similar problems with some of its time-based tests.

On Mar 17, 2008, at 10:55 PM, Apache Hudson Server wrote:

See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/408/changes

Changes:
[doronc] fix formatting in CHANGES.txt to prevent perl errors in creating changes.html.
[mikemccand] LUCENE-1233: correct javadocs

[junit] Testsuite: org.apache.lucene.search.TestTimeLimitedCollector
[junit] Tests run: 5, Failures: 1, Errors: 0, Time elapsed: 13.849 sec
[junit] - Standard Error -
[junit] Exception in thread Thread-97 junit.framework.AssertionFailedError: no hits found!
[junit]     at junit.framework.Assert.fail(Assert.java:47)
[junit]     at junit.framework.Assert.assertTrue(Assert.java:20)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.doTestTimeout(TestTimeLimitedCollector.java:152)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.access$100(TestTimeLimitedCollector.java:38)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector$1.run(TestTimeLimitedCollector.java:231)
[junit] Exception in thread Thread-85 junit.framework.AssertionFailedError: no hits found!
[junit]     at junit.framework.Assert.fail(Assert.java:47)
[junit]     at junit.framework.Assert.assertTrue(Assert.java:20)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.doTestTimeout(TestTimeLimitedCollector.java:152)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.access$100(TestTimeLimitedCollector.java:38)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector$1.run(TestTimeLimitedCollector.java:231)
[junit] -----
[junit] Testcase: testTimeoutMultiThreaded(org.apache.lucene.search.TestTimeLimitedCollector): FAILED
[junit] some threads failed! expected:50 but was:48
[junit] junit.framework.AssertionFailedError: some threads failed! expected:50 but was:48
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.doTestMultiThreads(TestTimeLimitedCollector.java:255)
[junit]     at org.apache.lucene.search.TestTimeLimitedCollector.testTimeoutMultiThreaded(TestTimeLimitedCollector.java:220)
[junit]
[junit] Test org.apache.lucene.search.TestTimeLimitedCollector FAILED

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
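One common way to make time-based assertions survive a loaded CI box, in the spirit of the "explore some timeout values" remark above, is asymmetric slack: check the lower bound ("not too soon") strictly, but allow a generous multiplier on the upper bound ("not too late"), since a busy machine can delay the timeout arbitrarily. An illustrative helper, not the actual test code:

```java
public class TimeoutSlackSketch {
    // Accept a measured duration if it is at least the target timeout and
    // at most timeout * slack; the upper bound is the part that gets noisy
    // on a busy Hudson machine, hence the multiplier.
    static boolean withinSlack(long elapsedMs, long timeoutMs, double slack) {
        return elapsedMs >= timeoutMs && elapsedMs <= (long) (timeoutMs * slack);
    }

    public static void main(String[] args) {
        System.out.println(withinSlack(1200, 1000, 2.0));  // 20% late: tolerated
        System.out.println(withinSlack(900, 1000, 2.0));   // fired too soon: fail
        System.out.println(withinSlack(2500, 1000, 2.0));  // too late even with slack
    }
}
```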
[jira] Commented: (LUCENE-1239) IndexWriter deadlock when using ConcurrentMergeScheduler
[ https://issues.apache.org/jira/browse/LUCENE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12579874#action_12579874 ] Shay Banon commented on LUCENE-1239: Yea, it looks like it is my bad, great catch!. While trying to create a better scheduler (at least in terms of reusing threads instead of creating them), I wondered if there is a chance that the current scheduler can be enhanced to support an extension point for that... . I can give such a refactoring a go if you think it make sense. IndexWriter deadlock when using ConcurrentMergeScheduler Key: LUCENE-1239 URL: https://issues.apache.org/jira/browse/LUCENE-1239 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 2.3.1 Environment: Compass 2.0.0M3 (nightly build #57), Lucene 2.3.1, Spring Framework 2.0.7.0 Reporter: Michael Lossos Assignee: Michael McCandless I'm trying to update our application from Compass 2.0.0M1 with Lucene 2.2 to Compass 2.0.0M3 (latest build) with Lucene 2.3.1. I'm holding all other things constant and only changing the Compass and Lucene jars. I'm recreating the search index for our data and seeing deadlock in Lucene's IndexWriter. It appears to be waiting on a signal from the merge thread. I've tried creating a simple reproduction case for this but to no avail. Doing the exact same steps with Compass 2.0.0M1 and Lucene 2.2 has no problems and recreates our search index. That is to say, it's not our code. In particular, the main thread performing the commit (Lucene document save) from Compass is calling Lucene's IndexWriter.optimize(). We're using Compass's ExecutorMergeScheduler to handle the merging, and it is calling IndexWriter.merge(). The main thread in IndexWriter.optimize() enters the wait() at the bottom of that method and is never notified. I can't tell if this is because optimizeMergesPending() is returning true incorrectly, or if IndexWriter.merge()'s notifyAll() is being called prematurely. 
Looking at the code, it doesn't seem possible for IndexWriter.optimize() to be waiting and miss a notifyAll(), and Lucene's IndexWriter.merge() was recently fixed to always call notifyAll() even on exceptions -- that is, all the relevant IndexWriter code looks properly synchronized. Nevertheless, I'm seeing the deadlock behavior described, and it's reproducible using our app and our test data set. Could someone familiar with IndexWriter's synchronization code take another look at it? I'm sorry that I can't give you a simple reproduction test case. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
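For readers not steeped in IndexWriter's internals, the wait()/notifyAll() contract under discussion can be sketched in isolation. This is a minimal stand-in, not Lucene's actual code; the names pendingMerges and awaitNoPendingMerges are only analogues of optimizeMergesPending() and the wait() at the bottom of optimize(). It shows why, as long as the waiter rechecks its predicate inside the same monitor the notifier updates it under, a notifyAll() cannot simply be "missed":

```java
// Sketch of the guarded-wait idiom (hypothetical names, not Lucene's code).
public class GuardedWait {
    private final Object lock = new Object();
    private int pendingMerges = 1;   // stand-in for optimizeMergesPending()

    // Waiter side, analogous to the wait() at the bottom of optimize().
    public void awaitNoPendingMerges() throws InterruptedException {
        synchronized (lock) {
            while (pendingMerges > 0) {  // predicate rechecked after every wakeup
                lock.wait();
            }
        }
    }

    // Notifier side, analogous to merge() finishing: the state change and the
    // notifyAll() happen under the same monitor, even on exceptions.
    public void mergeFinished() {
        synchronized (lock) {
            pendingMerges--;
            lock.notifyAll();
        }
    }

    public int pending() {
        synchronized (lock) { return pendingMerges; }
    }

    public static void main(String[] args) throws Exception {
        GuardedWait g = new GuardedWait();
        Thread merger = new Thread(g::mergeFinished);
        merger.start();
        g.awaitNoPendingMerges();   // returns once pendingMerges reaches 0
        merger.join();
    }
}
```

If the notify fires before the waiter enters wait(), the predicate is already false and the waiter never blocks; a deadlock would therefore need the predicate itself (the analogue of optimizeMergesPending()) to stay true incorrectly, which matches the reporter's first hypothesis.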
[jira] Commented: (LUCENE-1239) IndexWriter deadlock when using ConcurrentMergeScheduler
[ https://issues.apache.org/jira/browse/LUCENE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579909#action_12579909 ] Michael McCandless commented on LUCENE-1239: {quote} I wondered if there is a chance that the current scheduler can be enhanced to support an extension point for that. I can give such a refactoring a go if you think it makes sense. {quote} That would be much appreciated! You should start from the trunk version of CMS: it has already been somewhat factored to allow subclasses to override things, though I think maybe not quite enough for this case.
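The extension point being discussed could look something like the following sketch (hypothetical names, not the actual ConcurrentMergeScheduler API or the eventual patch): factor thread creation behind an overridable hook, so a subclass can supply pooled or otherwise reused threads instead of creating a fresh one per merge.

```java
// Sketch only: a scheduler that exposes thread creation as a protected hook.
public class PluggableScheduler {

    // Subclasses override this to reuse pooled threads, rename them, etc.
    protected Thread newMergeThread(Runnable merge) {
        Thread t = new Thread(merge, "Lucene Merge Thread");
        t.setDaemon(true);
        return t;
    }

    // The scheduler body only calls the hook, never new Thread() directly.
    public Thread launch(Runnable merge) {
        Thread t = newMergeThread(merge);
        t.start();
        return t;
    }
}
```

A subclass backed by an ExecutorService would override newMergeThread (or a similar submit-style hook) and leave the merge-selection logic in the base class untouched.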
[jira] Commented: (LUCENE-400) NGramFilter -- construct n-grams from a TokenStream
[ https://issues.apache.org/jira/browse/LUCENE-400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579933#action_12579933 ] Steven Rowe commented on LUCENE-400: Re-ping: Otis, do you still plan to commit? NGramFilter -- construct n-grams from a TokenStream --- Key: LUCENE-400 URL: https://issues.apache.org/jira/browse/LUCENE-400 Project: Lucene - Java Issue Type: Improvement Components: Analysis Affects Versions: unspecified Environment: Operating System: All Platform: All Reporter: Sebastian Kirsch Assignee: Otis Gospodnetic Priority: Minor Fix For: 2.4 Attachments: LUCENE-400.patch, NGramAnalyzerWrapper.java, NGramAnalyzerWrapperTest.java, NGramFilter.java, NGramFilterTest.java This filter constructs n-grams (token combinations up to a fixed size, sometimes called shingles) from a token stream. The filter sets start offsets, end offsets and position increments, so highlighting and phrase queries should work. Position increments > 1 in the input stream are replaced by filler tokens (tokens with termText "_" and endOffset - startOffset = 0) in the output n-grams. (Position increments > 1 in the input stream are usually caused by removing some tokens, e.g. stopwords, from a stream.) The filter uses CircularFifoBuffer and UnboundedFifoBuffer from Apache Commons-Collections. Filter, test case and an analyzer are attached.
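The token combinations the filter produces can be illustrated with a simplified stand-alone sketch. The real NGramFilter operates on a Lucene TokenStream and also tracks offsets, position increments, and "_" filler tokens; this just shows the word n-grams (shingles) up to a maximum size:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of shingle construction, not the attached filter.
public class ShingleSketch {

    // Emit every token combination of length 1..maxSize starting at each position.
    public static List<String> shingles(List<String> tokens, int maxSize) {
        List<String> out = new ArrayList<String>();
        for (int start = 0; start < tokens.size(); start++) {
            StringBuilder sb = new StringBuilder();
            for (int len = 1; len <= maxSize && start + len <= tokens.size(); len++) {
                if (len > 1) sb.append(' ');
                sb.append(tokens.get(start + len - 1));
                out.add(sb.toString());
            }
        }
        return out;
    }
}
```

For example, shingles(["please", "divide"], 2) yields ["please", "please divide", "divide"]; phrase queries work because each shingle also carries its source offsets in the real filter.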
Re: SearchTimeout tests was Re: Build failed in Hudson: Lucene-trunk #408
Hi Grant, I looked at this already - see patch in LUCENE-1238 - I thought I'd give it a few more hours for comments and then commit. I believe this should solve the problem; more details in JIRA. BR, Doron On Tue, Mar 18, 2008 at 3:39 PM, Grant Ingersoll [EMAIL PROTECTED] wrote: Seems like we will need to explore some timeout values. This is always tricky when you get on Hudson, as Solr has similar problems with some of its time-based tests. On Mar 17, 2008, at 10:55 PM, Apache Hudson Server wrote: See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/408/changes Changes: [doronc] fix formatting in CHANGES.txt to prevent perl errors in creating changes.html. [mikemccand] LUCENE-1233: correct javadocs [junit] Testsuite: org.apache.lucene.search.TestTimeLimitedCollector [junit] Tests run: 5, Failures: 1, Errors: 0, Time elapsed: 13.849 sec [junit] - Standard Error - [junit] Exception in thread Thread-97 junit.framework.AssertionFailedError: no hits found! [junit] at junit.framework.Assert.fail(Assert.java:47) [junit] at junit.framework.Assert.assertTrue(Assert.java:20) [junit] at org.apache.lucene.search.TestTimeLimitedCollector.doTestTimeout(TestTimeLimitedCollector.java:152) [junit] at org.apache.lucene.search.TestTimeLimitedCollector.access$100(TestTimeLimitedCollector.java:38) [junit] at org.apache.lucene.search.TestTimeLimitedCollector$1.run(TestTimeLimitedCollector.java:231) [junit] Exception in thread Thread-85 junit.framework.AssertionFailedError: no hits found! 
[junit] at junit.framework.Assert.fail(Assert.java:47) [junit] at junit.framework.Assert.assertTrue(Assert.java:20) [junit] at org.apache.lucene.search.TestTimeLimitedCollector.doTestTimeout(TestTimeLimitedCollector.java:152) [junit] at org.apache.lucene.search.TestTimeLimitedCollector.access$100(TestTimeLimitedCollector.java:38) [junit] at org.apache.lucene.search.TestTimeLimitedCollector$1.run(TestTimeLimitedCollector.java:231) [junit] - --- [junit] Testcase: testTimeoutMultiThreaded(org.apache.lucene.search.TestTimeLimitedCollector): FAILED [junit] some threads failed! expected:<50> but was:<48> [junit] junit.framework.AssertionFailedError: some threads failed! expected:<50> but was:<48> [junit] at org.apache.lucene.search.TestTimeLimitedCollector.doTestMultiThreads(TestTimeLimitedCollector.java:255) [junit] at org.apache.lucene.search.TestTimeLimitedCollector.testTimeoutMultiThreaded(TestTimeLimitedCollector.java:220) [junit] Test org.apache.lucene.search.TestTimeLimitedCollector FAILED
[jira] Commented: (LUCENE-1238) intermittent failures of TestTimeLimitedCollector.testTimeoutMultiThreaded in nightly tests
[ https://issues.apache.org/jira/browse/LUCENE-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579960#action_12579960 ] Doron Cohen commented on LUCENE-1238: I intend to commit this later today. (Attachments: LUCENE-1238.patch)
[jira] Resolved: (LUCENE-1238) intermittent failures of TestTimeLimitedCollector.testTimeoutMultiThreaded in nightly tests
[ https://issues.apache.org/jira/browse/LUCENE-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen resolved LUCENE-1238. - Resolution: Fixed Lucene Fields: [Patch Available] (was: [Patch Available, New]) Committed.
[jira] Created: (LUCENE-1240) TermsFilter: reuse TermDocs
TermsFilter: reuse TermDocs --- Key: LUCENE-1240 URL: https://issues.apache.org/jira/browse/LUCENE-1240 Project: Lucene - Java Issue Type: Improvement Components: Search Affects Versions: 2.3.1 Reporter: Trejkaz TermsFilter currently calls termDocs(Term) once per term in the TermsFilter. If we sort the terms it's filtering on, this can be optimised to call termDocs() once and then skip(Term) once per term, which should significantly speed up this filter.
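The shape of the proposed optimisation can be sketched without the Lucene API (hypothetical names, not the actual patch): instead of restarting a dictionary lookup for every filter term, sort the filter terms once and advance two cursors forward in a single pass, the way a reused TermDocs can seek forward from its current position.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a single forward pass over two sorted term sequences.
public class SortedTermScan {

    // Both inputs must be sorted; returns the filter terms present in the index.
    // e.g. matchSorted([a, b, d], [b, c, d]) -> [b, d]
    public static List<String> matchSorted(List<String> indexTerms, List<String> filterTerms) {
        List<String> found = new ArrayList<String>();
        int i = 0, j = 0;
        while (i < indexTerms.size() && j < filterTerms.size()) {
            int cmp = indexTerms.get(i).compareTo(filterTerms.get(j));
            if (cmp < 0) {
                i++;                            // index term before next filter term: skip ahead
            } else if (cmp > 0) {
                j++;                            // filter term absent from the index
            } else {
                found.add(filterTerms.get(j));  // hit: in the real filter, set doc bits here
                i++;
                j++;
            }
        }
        return found;
    }
}
```

Each sequence is traversed at most once, so the cost drops from one lookup per filter term to a single merge-style scan.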
[jira] Updated: (LUCENE-1240) TermsFilter: reuse TermDocs
[ https://issues.apache.org/jira/browse/LUCENE-1240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trejkaz updated LUCENE-1240: Lucene Fields: [New, Patch Available] (was: [New]) Attachments: terms-filter.patch
[jira] Resolved: (LUCENE-1240) TermsFilter: reuse TermDocs
[ https://issues.apache.org/jira/browse/LUCENE-1240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood resolved LUCENE-1240. -- Resolution: Fixed Fix Version/s: 2.3.2 Lucene Fields: [New, Patch Available] (was: [Patch Available, New]) Committed this fix and added a new JUnit test as part of r638631.
Re: Fieldable, AbstractField, Field
: Really, I think we could go just back to a single Field class instead : of the three classes Fieldable, AbstractField and Field. If we had : this then LUCENE-1219 would be easier to cleanly implement. It's probably worth reviewing the original reasons why Fieldable and AbstractField were added... http://issues.apache.org/jira/browse/LUCENE-545 I'm not intimately familiar with most of this, but at its core, the purpose seems primarily related to Fields in *returned* documents after a search has been performed, particularly relating to lazy loading -- so that alternate impls could be returned based on FieldSelector options. I'm not sure how much consideration was given to the impacts on future changes to the API of Documents/Fields being *indexed*. Somewhere I wrote a nice long diatribe on how, in my opinion, the biggest flaw in Lucene's general API was the reuse of Document and Field for two radically different purposes, such that half the methods in each class are meaningless in the other half of the contexts they are used for... I can't find it, but here's a less angry version of the same sentiment, plus some followup discussion... http://www.nabble.com/-jira--Created%3A-%28LUCENE-778%29-Allow-overriding-a-Document-to8406796.html https://issues.apache.org/jira/browse/LUCENE-778 (note that parallel discussion occurred both in email replies and in Jira comments; they are both worth reading) All of this is fixable in Lucene 3.0, where we will be free to change the API; but in the meantime, the fact that 2.3 uses an interface means we are stuck with supporting it without changing it in 2.4, since right now clients can implement their own Fieldable impl and then pass it to Document.add(Fieldable) before indexing the doc. (Things would be a lot easier if the old Document.add(Field) had been left alone and documented as being explicitly for *indexing* docs, while a new method was used for Documents being returned by searches ... 
but that's water under the bridge.) The best short-term approach I can think of for addressing LUCENE-1219 in 2.4: 1) list the new methods in a new interface that extends Fieldable (ByteArrayReuseFieldable or something) 2) add the new methods to AbstractField so that it implements ByteArrayReuseFieldable 3) put an instanceof check for ByteArrayReuseFieldable in DocumentsWriter. It's not pretty, but it's backwards compatible. This reminds me of a slightly off-topic idea that's been floating around in the back of my head for a while, relating to our backwards compatibility commitments and the issues of interfaces and abstract classes (which I haven't thought through all the way, but I'm throwing it out there as long as we're talking about it) ... Committers tend to prefer abstract classes for extension points because it makes it easier to support backwards compatibility in the cases where we want to add methods to extendable APIs and the default behavior for these new methods is simple (or obvious delegation to existing methods), so that people who have written custom impls can upgrade easily without needing to invest time in making changes. But abstract classes can be harder to mock when doing mock testing, and some developers would prefer interfaces that they can implement with their existing classes -- I suspect these people who would prefer interfaces are willing to invest the time to make changes to their impls when upgrading Lucene if the interfaces were to change. Perhaps the solution is a middle ground: altering our APIs such that all extension points we advertise have both an abstract base class as well as an interface, and all methods that take them as arguments use the interface name. Then we relax our backcompat commitments such that we guarantee between minor releases that the interfaces won't change unless the corresponding abstract base class changes to account for it ... 
so if customers subclass the base class their code will continue to work, but if they implement the interface directly, ignoring the base class, they are on their own to ensure their code compiles against future minor versions. Like I said, I haven't thought it through completely, but at first glance it seems like it would give both committers and Lucene users a lot of extra flexibility, without sacrificing much in the way of compatibility commitments. The key would be in adopting it rigorously and religiously. -Hoss
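The three-step approach Hoss outlines can be sketched in miniature (all names here, including ByteArrayReuseFieldable's method, are hypothetical stand-ins, not the real Lucene API): new methods go on a sub-interface, the abstract base class implements them, and the hot path does an instanceof check before using them, so pre-existing Fieldable implementations keep working unchanged.

```java
// Sketch of the capability-interface + instanceof pattern with made-up names.
interface Fieldable {                                    // existing, frozen interface
    String stringValue();
}

interface ByteArrayReuseFieldable extends Fieldable {    // new extension point (step 1)
    byte[] binaryValue(byte[] reuse);
}

public class CapabilityCheck {

    // Analogous to the instanceof check proposed for DocumentsWriter (step 3).
    public static byte[] valueOf(Fieldable f, byte[] reuse) {
        if (f instanceof ByteArrayReuseFieldable) {
            return ((ByteArrayReuseFieldable) f).binaryValue(reuse);  // new fast path
        }
        return f.stringValue().getBytes();                            // legacy path
    }
}
```

Step 2 (having AbstractField implement the new interface) means every subclass of the base class gets the fast path for free, while direct Fieldable implementors silently fall back to the legacy path.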
Re: TokenFilter question
: I was trying to apply both : org.apache.solr.analysis.WordDelimiterFilter and : org.apache.lucene.analysis.ngram.NGramTokenFilter. : : Can I achieve this with Lucene's TokenStream? Sure ... you just have to pick an ordering and wrap one around the other. Solr does this anytime you define an analyzer using a tokenizer and a list of filters. : While thinking about TokenFilters, I came to an idea that : the TokenStream should have a structured representation. I've thought about that once or twice over the years as well... it would make things like multiword synonyms a lot easier to deal with if, instead of a TokenStream, we could have a directed TokenGraph with a single start and a single end (i.e. only one node with no incoming links and only one node with no outgoing links). But even if you had a graph-based API for Analyzers to express the set of tokens found, what would the end product look like? What would the format be of an index that stored Term position information as graph connections (essentially 3-dimensional info) instead of simple numeric position (1-dimensional)? Could it be searched as quickly? Most of the time, things that I think would be easier with a TokenGraph are still feasible using judicious use of positionIncrement, slop, and artificial marker tokens ... with Payloads, even more complex things should move into the realm of practical (but it's likely I'm putting Payloads on too much of a pedestal ... I've never actually tried using them for anything) -Hoss
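"Pick an ordering and wrap one around the other" can be shown with a filter chain in miniature. These are plain-Java stand-ins, not the real WordDelimiterFilter or NGramTokenFilter: each stage consumes the previous stage's output, which is why the order you choose matters.

```java
import java.util.ArrayList;
import java.util.List;

// Toy filter chain illustrating how wrapped token filters compose.
public class FilterChain {

    interface TokenFilter {
        List<String> filter(List<String> tokens);
    }

    // Stand-in for a word delimiter filter: splits on '-'.
    static final TokenFilter SPLIT = tokens -> {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            for (String part : t.split("-")) out.add(part);
        }
        return out;
    };

    // Stand-in for an n-gram filter: emits character bigrams of each token.
    static final TokenFilter BIGRAMS = tokens -> {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            for (int i = 0; i + 2 <= t.length(); i++) out.add(t.substring(i, i + 2));
        }
        return out;
    };

    // Apply the filters in order, each wrapping the previous output.
    public static List<String> analyze(List<String> tokens, TokenFilter... chain) {
        for (TokenFilter f : chain) tokens = f.filter(tokens);
        return tokens;
    }
}
```

Running SPLIT then BIGRAMS n-grams the delimited parts; reversing the order would instead split the n-grams, which is rarely what you want, hence "you just have to pick an ordering."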