RE: Efficient Query Evaluation using a Two-Level Retrieval Process

2009-11-15 Thread Uwe Schindler
I see the attachment... (in java-dev) Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de _ From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Monday, November 16, 2009 8:13 AM To: solr-...@lucene.apache.org C

Re: Efficient Query Evaluation using a Two-Level Retrieval Process

2009-11-15 Thread Shalin Shekhar Mangar
Hey Joaquin, The mailing list strips off attachments. Can you please upload it somewhere and give us the link? On Mon, Nov 16, 2009 at 12:35 PM, J. Delgado wrote: > Please find attached the paper on "Efficient Query Evaluation using a > Two-Level Retrieval Process". I believe that such approach

Build failed in Hudson: Lucene-trunk #1010

2009-11-15 Thread Apache Hudson Server
See -- A timer trigger started this job Building remotely on lucene.zones.apache.org (Solaris 10) Checking out http://svn.apache.org/repos/asf/lucene/java/trunk ERROR: Failed to check out http://

Re: Lucene 3.0.0 RC1

2009-11-15 Thread Robert Muir
FYI, one reason I tossed the idea at Uwe was that it's currently difficult for me to even develop patches for new functionality/3.1 (e.g. there is no Version.LUCENE_31 constant yet). I saw this the other day working on LUCENE-2067... no worries, I do not want to do any heavy committing On Sun, Nov 15

[jira] Commented: (LUCENE-2051) Contrib Analyzer Setters should be deprecated and replace with ctor arguments

2009-11-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778193#action_12778193 ] Robert Muir commented on LUCENE-2051: - {quote} sure, they solve two different things.

Re: Lucene 3.0.0 RC1

2009-11-15 Thread Michael McCandless
+1 to create the branch now. Mike On Sun, Nov 15, 2009 at 4:34 PM, Uwe Schindler wrote: > Hallo Committers, > > I want to start the release process tomorrow. The question: > > Mark Miller created the branch after the release of 2.9.0, but the release > TODO says, that I should create the branch

[jira] Commented: (LUCENE-2051) Contrib Analyzer Setters should be deprecated and replace with ctor arguments

2009-11-15 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778192#action_12778192 ] Simon Willnauer commented on LUCENE-2051: - bq. right, but still, will they this st

[jira] Commented: (LUCENE-2051) Contrib Analyzer Setters should be deprecated and replace with ctor arguments

2009-11-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778190#action_12778190 ] Robert Muir commented on LUCENE-2051: - bq. This is different. the StopawareAnalyzer#ge

[jira] Updated: (LUCENE-2051) Contrib Analyzer Setters should be deprecated and replace with ctor arguments

2009-11-15 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2051: Attachment: LUCENE-2051.patch Attached complete patch. Thanks for the pointer with the bo

Re: Lucene 3.0.0 RC1

2009-11-15 Thread Michael Busch
I think it's okay to create a branch now and agree on a code freeze (except severe bugs or doc patches) on the branch once the first RC is out. And yes, discouraging big commits on trunk during the freeze is always a good idea. Michael On 11/15/09 1:42 PM, Mark Miller wrote: The reason I c

Re: [jira] Issue Comment Edited: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-15 Thread Erick Erickson
That thought occurred to me earlier, but I don't know enough specifics yet. I intend to find out though Erick On Sun, Nov 15, 2009 at 8:46 AM, Robert Muir (JIRA) wrote: > >[ > https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment

[jira] Commented: (LUCENE-2051) Contrib Analyzer Setters should be deprecated and replace with ctor arguments

2009-11-15 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778184#action_12778184 ] Simon Willnauer commented on LUCENE-2051: - bq. should we expose the getDefaultStop

Re: Lucene 3.0.0 RC1

2009-11-15 Thread Mark Miller
The reason I created the branch at the last minute was because the 2.9 release was so large. Having to commit the release check/fix flurry of activity against trunk and a branch would have been quite a pain. The hope was also to keep devs on making the release right, rather than continue with trunk

Lucene 3.0.0 RC1

2009-11-15 Thread Uwe Schindler
Hallo Committers, I want to start the release process tomorrow. The question: Mark Miller created the branch after the release of 2.9.0, but the release TODO says, that I should create the branch before the first RC. Robert also wanted this today, because it would allow us to work for 3.1, while

RE: svn commit: r836387 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src: java/org/apache/lucene/util/ThreadInterruptedException.java test/org/apache/lucene/index/TestIndexWriter.java

2009-11-15 Thread Uwe Schindler
In the backwards branch there was no need to add the ThreadInterruptedException class, which does not exist there (in the BW branch only src/test, but never src/java, should be updated). Catching RuntimeException would have been enough. But it doesn't matter, that was only what I thought first - don't touch

[jira] Commented: (LUCENE-2051) Contrib Analyzer Setters should be deprecated and replace with ctor arguments

2009-11-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778163#action_12778163 ] Robert Muir commented on LUCENE-2051: - simon, should we expose the getDefaultStopSet()

[jira] Assigned: (LUCENE-1154) System Reqs page should be release specific

2009-11-15 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-1154: - Assignee: Uwe Schindler (was: Grant Ingersoll) > System Reqs page should be release spe

[jira] Resolved: (LUCENE-1558) Make IndexReader/Searcher ctors readOnly=true by default

2009-11-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1558. Resolution: Fixed > Make IndexReader/Searcher ctors readOnly=true by default > ---

[jira] Resolved: (LUCENE-2053) When thread is interrupted we should throw a clear exception

2009-11-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2053. Resolution: Fixed OK, trying again! > When thread is interrupted we should throw

Re: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Robert Muir
Uwe, I will throw another twist at it: I do not like the name of Tokenizer.reset(Reader) method. I wish it was called .setReader() or something else, I think it is confusing for Tokenizer to have .reset() and .reset(Reader), when the latter should hardly ever be overridden. Great example of how in

[jira] Commented: (LUCENE-2053) When thread is interrupted we should throw a clear exception

2009-11-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778139#action_12778139 ] Michael McCandless commented on LUCENE-2053: OK I'll commit shortly. Basicall

[jira] Updated: (LUCENE-2051) Contrib Analyzer Setters should be deprecated and replace with ctor arguments

2009-11-15 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2051: Attachment: LUCENE-2051.patch Attached first patch, CJK and Arabic Analyzer still missing.

RE: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Uwe Schindler
Yes, but on the other hand it does not hurt to automatically reset in the analyzer *krr* I do not know how to proceed. I think we should keep it as it has been since the beginning of Lucene (call to reset inside analyzer, QP) and document it correctly. You are right, at the beginning, BaseTokenS

Re: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Robert Muir
OK, at one point I do not think BaseTokenStreamTestCase did. If this is the case, then it's the consumer's responsibility to call reset(), and we should remove the extra reset() calls inside reusableTokenStream() from analyzers that have them... and probably improve the docs of this contract. On Sun, Nov 15, 2

[jira] Updated: (LUCENE-1154) System Reqs page should be release specific

2009-11-15 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1154: -- Attachment: LUCENE-1154-site.patch LUCENE-1154-trunk.patch Here a patch for ve

RE: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Uwe Schindler
Even QueryParser calls reset() as first call. Also BaseTokenStreamTestCase does it. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de _ From: Robert Muir [mailto:rcm...@gmail.com] Sent: Sunday, November 15, 2009 6:14 PM To: java-d

Re: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Robert Muir
Uwe, not so sure it doesn't need to be there, what about other consumers such as QueryParser? On Sun, Nov 15, 2009 at 12:02 PM, Uwe Schindler wrote: > I checked again, reset() on the top filter does not need to be there, as > the indexer calls it automatically as the first call after > reusable

Re: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Robert Muir
right, this is a consistency issue. when reusing token streams, we should call reset(Reader) on the tokenizer, and also call reset() on the chain, and it should be passed down entire chain if all filters call super.reset() the reason you don't see it happening in StandardAnalyzer, is because none
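The reuse contract discussed in this thread can be illustrated with a small, self-contained sketch. It does not use Lucene: `Stream`, `WhitespaceSource`, and `LimitFilter` are hypothetical stand-ins for a token stream, a tokenizer, and a stateful filter, chosen only to show why swapping the reader via `reset(Reader)` is not enough when a filter in the chain keeps per-stream state.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ResetContractSketch {
    /** Hypothetical stand-in for a token stream; NOT the Lucene class. */
    static abstract class Stream {
        abstract String next();      // returns null when exhausted
        void reset() {}              // default: no per-stream state to clear
    }

    /** Hypothetical tokenizer stand-in: splits its input on whitespace. */
    static class WhitespaceSource extends Stream {
        private Reader input;
        WhitespaceSource(Reader input) { this.input = input; }

        /** Swaps the input; plays the role of reset(Reader) in the thread. */
        void reset(Reader newInput) { this.input = newInput; }

        @Override String next() {
            StringBuilder sb = new StringBuilder();
            try {
                int c;
                while ((c = input.read()) != -1) {
                    if (Character.isWhitespace(c)) {
                        if (sb.length() > 0) break;   // end of current token
                    } else {
                        sb.append((char) c);
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return sb.length() == 0 ? null : sb.toString();
        }
    }

    /** Hypothetical filter with per-stream state: emits at most `limit` tokens. */
    static class LimitFilter extends Stream {
        private final Stream in;
        private final int limit;
        private int seen;
        LimitFilter(Stream in, int limit) { this.in = in; this.limit = limit; }

        @Override String next() {
            if (seen >= limit) return null;
            String token = in.next();
            if (token != null) seen++;
            return token;
        }

        @Override void reset() {
            in.reset();   // pass the call down the chain
            seen = 0;     // clear this filter's own state
        }
    }

    static List<String> drain(Stream s) {
        List<String> out = new ArrayList<>();
        for (String t = s.next(); t != null; t = s.next()) out.add(t);
        return out;
    }

    /** Reuses one chain for a second input, with or without reset() on the chain. */
    public static List<String> reuse(boolean resetChain) {
        WhitespaceSource source = new WhitespaceSource(new StringReader("a b c"));
        LimitFilter chain = new LimitFilter(source, 2);
        drain(chain);                              // first document
        source.reset(new StringReader("d e f"));   // new reader for the tokenizer
        if (resetChain) chain.reset();             // reset() on the top of the chain
        return drain(chain);
    }

    public static void main(String[] args) {
        System.out.println("with chain reset:    " + reuse(true));
        System.out.println("without chain reset: " + reuse(false));
    }
}
```

In this sketch the second document yields tokens only when `reset()` is also called on the top of the chain; without it the filter's stale counter suppresses everything. Filters without state, as described for StandardAnalyzer above, would hide the omission.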

Re: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Eran Sevi
OK. Thanks for the clarification. It's a bit confusing - maybe the comments should be updated so other people won't make the same mistake as I did. On Sun, Nov 15, 2009 at 7:02 PM, Uwe Schindler wrote: > I checked again, reset() on the top filter does not need to be there, as > the indexer calls

RE: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Uwe Schindler
I checked again, reset() on the top filter does not need to be there, as the indexer calls it automatically as the first call after reusableTokenStream. For reusing only reset(Reader) must be called. It's a little bit strange that both methods have the same name, the reset(Reader) one has a complet

RE: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Uwe Schindler
It should be there... But it's unimplemented in the TokenFilters used by Standard/Stop Analyzer. But for consistency it should be there. I'll talk with Robert Muir about it. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u

Re: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Eran Sevi
Good point. I missed that part :) Since only the tokenizer uses the reader, we must call it directly. So the reset() on the filteredTokenStream was omitted on purpose because there's no underlying implementation? Or is it really missing? On Sun, Nov 15, 2009 at 6:30 PM, Uwe Schindler wrote: >

[jira] Commented: (LUCENE-2053) When thread is interrupted we should throw a clear exception

2009-11-15 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778122#action_12778122 ] Uwe Schindler commented on LUCENE-2053: --- Patch looks good, even I do not understand

RE: Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Uwe Schindler
It must call both reset() on the top-level TokenStream and reset(Reader) on the Tokenizer. If the latter is not done, how should the TokenStream get its new Reader? - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de _ From: Eran Sevi

Bug in StandardAnalyzer + StopAnalyzer?

2009-11-15 Thread Eran Sevi
Hi, when changing my code to support the not-so-new reusableTokenStream I noticed that in the cases when a SavedStream class was used in an analyzer (Standard,Stop and maybe others as well) the reset() method is called on the tokenizer instead of on the filter. The filter implementation of reset()

Re: A new Lucene Directory available

2009-11-15 Thread Sanne Grinovero
Hi again Earwin, thank you very much for spotting the byte reading issue, it's definitely not as I wanted it. https://jira.jboss.org/jira/browse/ISPN-276 I never tried to defend an improved updates/s ratio, just maybe compared to scheduled rsyncs :-) Our goal is to scale on queries/sec while usag

Re: A new Lucene Directory available

2009-11-15 Thread Earwin Burrfoot
> About the RAMDirectory comparison, as you said yourself the bytes > aren't read constantly but just at index reopen so I wouldn't be too > worried about the "bunch of methods" as they're executed once per > segment loading; The bytes /are/ read constantly (readByte() method). I believe that is th

Re: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Michael McCandless
Oh, yeah. So Erick, or anyone, if you see it fail on BW branch, ignore it for now. I'm going to be offline for a while starting shortly -- Uwe (or anyone) if my last patch on LUCENE-2053 looks OK, feel free to commit it to trunk & BW branch (& roll a new tag). Else I'll commit when I'm back

[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778101#action_12778101 ] Robert Muir commented on LUCENE-2037: - Is there some way to use Junit4 parameterized t

[jira] Issue Comment Edited: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778101#action_12778101 ] Robert Muir edited comment on LUCENE-2037 at 11/15/09 1:45 PM: -

RE: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Uwe Schindler
It may still fail in BW branch. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Sunday, November 15, 2009 2:30 PM > To: java-dev@lucene.apach

[jira] Updated: (LUCENE-2053) When thread is interrupted we should throw a clear exception

2009-11-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2053: --- Attachment: LUCENE-2053.patch Attached my current approach for fixing the test. Inc

Re: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Michael McCandless
Until I can fix it for real, I've disabled the flakey part of the test (committed a little bit ago), so if you svn up you should see no failures from this. Mike On Sun, Nov 15, 2009 at 8:27 AM, Erick Erickson wrote: > Hmmm, I was running into this intermittently last night, but thought it was >

Re: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Michael McCandless
Right, this was an intentional change, to match the semantics of the "normal" InterruptedException. Either interrupt status is set, or an InterruptedException is being thrown, but not both. It's like the olympic torch. There can be only one. Ie, if you interrupt a thread in Lucene, you'll then
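The "either the flag or the exception, but not both" semantics can be observed with the plain JDK, independent of Lucene. The following is a minimal sketch using only `java.lang.Thread`; the class and method names are made up for illustration, and it does not involve Lucene's ThreadInterruptedException. It interrupts a sleeping thread exactly once and records the interrupt flag as seen inside the catch block.

```java
public class InterruptStatusDemo {
    /**
     * Interrupts a sleeping child thread once and returns the interrupt
     * flag the child observed inside its catch block.
     */
    public static boolean flagSeenInCatch() {
        final boolean[] caught = new boolean[1];
        final boolean[] flag = new boolean[1];
        Thread child = new Thread(() -> {
            try {
                Thread.sleep(60_000L);
            } catch (InterruptedException e) {
                caught[0] = true;
                // Thread.sleep clears the interrupted status when it throws.
                flag[0] = Thread.currentThread().isInterrupted();
            }
        });
        child.start();
        child.interrupt();   // a single interrupt, so no double-interrupt race
        try {
            child.join();    // join makes the child's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("calling thread was interrupted", e);
        }
        if (!caught[0]) {
            throw new IllegalStateException("child never saw the interrupt");
        }
        return flag[0];
    }

    public static void main(String[] args) {
        System.out.println("interrupt flag inside catch: " + flagSeenInCatch());
    }
}
```

Because the child is interrupted only once, the flag is already cleared by the time the catch block runs. A second `interrupt()` arriving before the check would set it again, which is the kind of timing problem the intermittent test failure in this thread is attributed to.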

Re: A new Lucene Directory available

2009-11-15 Thread Sanne Grinovero
Hi Earwin, thanks for the insight, as I mentioned I have no proper benchmarks to back my statements but I can see how it behaves, so absolutely I could be too optimistic. They are currently profiling Infinispan and speeding up some internals, so I'll wait for these tasks to finish to begin testing

RE: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Uwe Schindler
In the patch you changed the behaviour from RuntimeException -> ThreadInterruptedException in such a way that the catch blocks no longer contained the Thread.currentThread().interrupt() call. That's what I meant. You catch the exception but no longer explicitly interrupt the thread again. U

[jira] Reopened: (LUCENE-2053) When thread is interrupted we should throw a clear exception

2009-11-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reopened LUCENE-2053: Reopening to address buggy intermittent test failure... > When thread is interrupted

Re: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Michael McCandless
Sorry, what exactly did I change in the patch? The bug in the test is that it simply sleeps for 1 msec, and then interrupts again. In the child thread I assert that interrupt status was cleared on catching the ThreadInterruptedException, but that assert intermittently fails, I'm thinking because

Re: [jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-15 Thread Erick Erickson
Good suggestions, it's really helpful to have someone intimately familiar with the code suggest the next direction. I didn't want to go too far afield for the proof-of-concept, I mostly wanted to have a place to start. LuceneTestCaseJ4 should be useful both as a template and a base to build with. I

RE: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Uwe Schindler
Maybe that was the reason for explicitly interrupting the thread again after catching the InterruptedException. Why did you change this behaviour in the patch? - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > Fr

Re: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Michael McCandless
Yeah it's a thread safety intermittent sort of thing. The main thread is double-interrupt()ing the child thread before the child thread succeeds in raising & catching the ThreadInterruptedException from the first interrupt(). I'm trying to fix it but it's proving devilish ;) Mike On Sun, Nov 15

Re: A new Lucene Directory available

2009-11-15 Thread Sanne Grinovero
Hi Lukas, Our reference during early design was Lucene 2.4.1, but we are looking ahead to compatibility and new tricks. Current trunk is compatible with Lucene's trunk, but I won't close ISPN-275 until it's confirmed against a released Lucene 3.0.0: hopefully this will come before Infinispan 4 rele

RE: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Uwe Schindler
... and now, also the trunk variant passes... very strange. If you run the test alone, it passes, if the whole suite it breaks. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [ma

RE: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Uwe Schindler
Interesting, the test-tag variant passes... - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Sunday, November 15, 2009 11:37 AM > To: java-dev@

Re: A new Lucene Directory available

2009-11-15 Thread Earwin Burrfoot
Terracotta guys "easy-clustered" Lucene a few years ago. I have yet to see even one person saying it worked all right for him. This new directory ain't gonna be faster than RAMDirectory, as syncs on a map don't matter, they are taken once per opened file -> once per reopen, which is not happenin

[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-15 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778086#action_12778086 ] Uwe Schindler commented on LUCENE-2037: --- One thing that would also be good: We have

Re: Build failed in Hudson: Lucene-trunk #1009

2009-11-15 Thread Michael McCandless
Hmmm... the failure is from LUCENE-2053. The test asserts that on getting a ThreadInterruptedException, the interrupt status of the thread is cleared, but in this case it's not. [junit] - Standard Output --- [junit] FAILED; InterruptedException hit but thread.inter