RE: software grants
Hi Grant, I think it is pretty clear that when the code lives in public somewhere else (e.g. SourceForge or Google Code, etc.) it needs to go through a grant. That being said, I'm not particularly concerned about Trie, for the record. Trie was in SourceForge's SVN as part of panFMP, so it lived in public before. The last revision was 342:

http://panfmp.svn.sourceforge.net/viewvc/panfmp/main/trunk/src/de/pangaea/metadataportal/search/TrieRangeQuery.java?revision=315&view=markup&pathrev=342
http://panfmp.svn.sourceforge.net/viewvc/panfmp/main/trunk/src/de/pangaea/metadataportal/utils/TrieUtils.java?revision=308&view=markup&pathrev=342

The first version in Lucene's contrib was a modified version of the above SVN revision (see LUCENE-1470). After that it was deleted from panFMP's SVN, and the new, further optimized Lucene version was used for this project. If you like, we can fill out a software grant to be sure (if it is still possible to do this after the code transfer). I am the only person who must sign the grant on my side. I can do a checkout of these two files, then tar and md5 them. Uwe
Re: broken links when building web-site
Yes, I've seen those too and have always written them off as Forrest errors. I could never track down anything actually wrong on the site, so I ignored it. The broken-links.xml file has been checked in for a good long time, I believe.

On Jul 7, 2009, at 3:00 PM, Uwe Schindler wrote: I tried to build the docs inside trunk and also the docs in the site (https://svn.apache.org/repos/asf/lucene/java/site); both fail to build. The error is the same here (Win XP), except that it says it cannot find the images (which are indeed not available). The last time I generated the site docs was for revision 784758; after that, Grant applied LUCENE-1706. Maybe he forgot to commit some new images for the Lucid Imagination-powered search. But from the change in broken-links.xml, I see that Grant must have seen the same error but ignored it. The docs seem to be correct, so I think this error is not fatal. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

-----Original Message----- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Tuesday, July 07, 2009 8:25 PM To: java-dev@lucene.apache.org Subject: broken links when building web-site

I'm trying to regen the web site docs (w/ Forrest) for LUCENE-1522, but I'm hitting a BUILD FAILED at the end, I think because of these broken links:

X [0] images/instruction_arrow.png BROKEN: /lucene/h2.1522/src/site/src/documentation/content/xdocs/images.instruction_arrow.png (No such file or directory)
X [0] skin/images/current.gif BROKEN: /tango/offload/usr/local/src/apache-forrest-0.8/main/webapp/. (Is a directory)
X [0] skin/images/chapter.gif BROKEN: /tango/offload/usr/local/src/apache-forrest-0.8/main/webapp/. (Is a directory)
X [0] skin/images/page.gif BROKEN: /tango/offload/usr/local/src/apache-forrest-0.8/main/webapp/. (Is a directory)

Does anyone else see this?
Mike

- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org

-- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
Re: software grants
On Tue, Jul 7, 2009 at 10:27 PM, Grant Ingersoll gsing...@apache.org wrote: I think it is pretty clear that when the code lives in the public somewhere else (i.e. source forge or Google code, etc.) it needs to go through a grant. It's not clear to me... I think it's just another factor to consider. It also matters how big a body of code it is, how many people developed it over how long, what licenses were used over its development history, etc. Just because someone may make a patch or feature available on GitHub first does not mean a software grant is automatically needed. -Yonik http://www.lucidimagination.com
[jira] Updated: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-1726: - Attachment: LUCENE-1726.trunk.test.patch I tried the test on trunk and get the same error. They're all docstore related files so maybe extra doc stores are being opened? {code} [junit] MockRAMDirectory: cannot close: there are still open files: {_s4.fdt=2, _g2.fdx=2, _s4.fdx=2, _g2.tvf=2, _dw.fdx=2, _g2.tvd=2, _g2.tvx=2, _ks.tvf=2, _n9.tvx=2, _ks.tvx=2, _n9.fdx=2, _ks.fdx=2, _dw.cfx=1, _n9.tvf=2, _cp.cfx=1, _s4.tvf=2, _dw.tvx=2, _87.fdx=2, _fr.tvx=2, _87.tvf=2, _fr.tvd=2, _87.fdt=2, _ks.tvd=2, _s4.tvd=2, _dw.tvd=2, _n9.fdt=2, _g2.fdt=2, _87.tvd=2, _fr.fdt=2, _dw.fdt=2, _dj.cfx=1, _s4.tvx=2, _ks.fdt=2, _n9.tvd=2, _fr.tvf=2, _fr.fdx=2, _dw.tvf=2, _87.tvx=2} [junit] java.lang.RuntimeException: MockRAMDirectory: cannot close: there are still open files: {_s4.fdt=2, _g2.fdx=2, _s4.fdx=2, _g2.tvf=2, _dw.fdx=2, _g2.tvd=2, _g2.tvx=2, _ks.tvf=2, _n9.tvx=2, _ks.tvx=2, _n9.fdx=2, _ks.fdx=2, _dw.cfx=1, _n9.tvf=2, _cp.cfx=1, _s4.tvf=2, _dw.tvx=2, _87.fdx=2, _fr.tvx=2, _87.tvf=2, _fr.tvd=2, _87.fdt=2, _ks.tvd=2, _s4.tvd=2, _dw.tvd=2, _n9.fdt=2, _g2.fdt=2, _87.tvd=2, _fr.fdt=2, _dw.fdt=2, _dj.cfx=1, _s4.tvx=2, _ks.fdt=2, _n9.tvd=2, _fr.tvf=2, _fr.fdx=2, _dw.tvf=2, _87.tvx=2} [junit] at org.apache.lucene.store.MockRAMDirectory.close(MockRAMDirectory.java:278) [junit] at org.apache.lucene.index.Test1726.testIndexing(Test1726.java:48) [junit] at org.apache.lucene.util.LuceneTestCase.runTest(LuceneTestCase.java:88) {code} IndexWriter.readerPool create new segmentReader outside of sync block - Key: LUCENE-1726 URL: https://issues.apache.org/jira/browse/LUCENE-1726 Project: Lucene - Java Issue Type: Improvement Components: Index Affects Versions: 2.4.1 Reporter: Jason Rutherglen Assignee: Michael McCandless Priority: Trivial Fix For: 3.1 Attachments: LUCENE-1726.patch, LUCENE-1726.patch, LUCENE-1726.patch, LUCENE-1726.patch, 
LUCENE-1726.trunk.test.patch Original Estimate: 48h Remaining Estimate: 48h I think we will want to do something like what field cache does with CreationPlaceholder for IndexWriter.readerPool. Otherwise we have the (I think somewhat problematic) issue of all other readerPool.get* methods waiting for an SR to warm. It would be good to implement this for 2.9. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728823#action_12728823 ] Mark Miller commented on LUCENE-1693: Mr. Busch my friend, I'll buy both you and Uwe *many* beers if you resolve this issue soon! AttributeSource/TokenStream API improvements Key: LUCENE-1693 URL: https://issues.apache.org/jira/browse/LUCENE-1693 Project: Lucene - Java Issue Type: Improvement Components: Analysis Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: 2.9 Attachments: LUCENE-1693.patch, LUCENE-1693.patch, LUCENE-1693.patch, LUCENE-1693.patch, LUCENE-1693.patch, LUCENE-1693.patch, lucene-1693.patch, TestCompatibility.java, TestCompatibility.java, TestCompatibility.java, TestCompatibility.java This patch makes the following improvements to AttributeSource and TokenStream/Filter: - removes the set/getUseNewAPI() methods (including the standard ones). Instead, by default, incrementToken() throws a subclass of UnsupportedOperationException. The indexer initially tries to call incrementToken() once to see if the exception is thrown; if so, it falls back to the old API. - introduces interfaces for all Attributes. The corresponding implementations have the postfix 'Impl', e.g. TermAttribute and TermAttributeImpl. AttributeSource now has a factory for creating the Attribute instances; the default implementation looks for implementing classes with the postfix 'Impl'. Token now implements all 6 TokenAttribute interfaces. - new method added to AttributeSource: addAttributeImpl(AttributeImpl). Using reflection, it walks up the class hierarchy of the passed-in object and finds all interfaces that the class or superclasses implement and that extend the Attribute interface. It then adds the interface-instance mappings to the attribute map for each of the found interfaces.
- AttributeImpl now has a default implementation of toString() that uses reflection to print out the values of the attributes in a default formatting. This makes it a bit easier to implement AttributeImpl, because toString() was declared abstract before. - Cloning is now done much more efficiently in captureState. The method figures out which unique AttributeImpl instances are contained as values in the attributes map, because those are the ones that need to be cloned. It creates a single linked list that supports deep cloning (in the inner class AttributeSource.State). AttributeSource keeps track of when this state changes, i.e. whenever new attributes are added to the AttributeSource. Only in that case will captureState recompute the state; otherwise it will simply clone the precomputed state and return the clone. restoreState(AttributeSource.State) walks the linked list and uses the copyTo() method of AttributeImpl to copy all values over into the attribute that the source stream (e.g. SinkTokenizer) uses. Cloning performance can be greatly improved if multiple AttributeImpl instances are not used in one TokenStream. A user can, e.g., simply add a Token instance to the stream instead of the individual attributes. Or the user could implement a subclass of AttributeImpl that implements exactly the Attribute interfaces needed. I think addAttributeImpl should be considered an expert API, as this manual optimization is only needed if cloning performance is crucial. I ran some quick performance tests using Tee/Sink tokenizers (which do cloning), and performance was roughly 20% faster with the new API. I'll run some more performance tests and post more numbers then. Note also that when we add serialization to the Attributes, e.g. for supporting storing serialized TokenStreams in the index, the serialization should benefit even more significantly from the new API than cloning does.
Also, the TokenStream API does not change, except for the removal of the set/getUseNewAPI methods. So the patches in LUCENE-1460 should still work. All core tests pass; however, I need to update all the documentation and also add some unit tests for the new AttributeSource functionality. So this patch is not ready to commit yet, but I wanted to post it already for some feedback.
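The reflection walk that addAttributeImpl performs, as described above, can be sketched roughly like this. This is a simplified, hypothetical illustration with invented names, not Lucene's actual code (the real version also recurses into superinterfaces):

```java
// Hypothetical sketch: walk up the class hierarchy of an implementation and
// collect every implemented interface that extends a marker Attribute interface.
import java.util.LinkedHashSet;
import java.util.Set;

public class AttributeWalk {
    interface Attribute {}                       // marker, stands in for Lucene's Attribute
    interface TermAttribute extends Attribute { String term(); }
    interface OffsetAttribute extends Attribute { int startOffset(); }

    // A single impl class implementing several Attribute interfaces at once,
    // analogous to Token implementing all 6 TokenAttribute interfaces.
    static class TokenImpl implements TermAttribute, OffsetAttribute {
        public String term() { return "foo"; }
        public int startOffset() { return 0; }
    }

    // Collect all Attribute sub-interfaces implemented anywhere in the hierarchy.
    static Set<Class<?>> findAttributeInterfaces(Class<?> clazz) {
        Set<Class<?>> found = new LinkedHashSet<>();
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            for (Class<?> iface : c.getInterfaces()) {
                if (Attribute.class.isAssignableFrom(iface) && iface != Attribute.class) {
                    found.add(iface);            // would be mapped interface -> instance
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Set<Class<?>> ifaces = findAttributeInterfaces(TokenImpl.class);
        System.out.println(ifaces.size());       // 2: TermAttribute and OffsetAttribute
    }
}
```

Registering one impl under every Attribute interface it implements is what lets addAttribute(TermAttribute.class) and addAttribute(OffsetAttribute.class) later return the same shared instance.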
[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728824#action_12728824 ] Michael McCandless commented on LUCENE-1726: Hmm... I'll dig into this test case.
[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728833#action_12728833 ] Jason Rutherglen commented on LUCENE-1726: Mike, I was wondering if you can recommend techniques or tools for debugging this type of multithreading issue? (i.e. how do you go about figuring this type of issue out?)
[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728853#action_12728853 ] Michael McCandless commented on LUCENE-1726: I don't have any particular tools... First I simplify the test as much as possible while still hitting the failure (eg this failure happens w/ only 2 threads), then see if the error will happen if I turn on IndexWriter's infoStream (it doesn't for this, so far). If so, I scrutinize the series of events to find the hazard; else, I turn off infoStream and add back in a small number of prints, as long as the failure still happens. Often I use a simple Python script that runs the test over and over until a failure happens, saving the log, and then scrutinize that. It's good to start with a rough guess, eg this failure is w/ only doc stores, so it seems likely the merging logic that opens doc stores just before kicking off the merge may be to blame.
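The rerun-until-failure loop Mike describes (his is a simple Python script) might look like this in Java; the command, class name, and log path below are invented examples:

```java
// Hypothetical sketch: run a test command repeatedly until it fails,
// saving the output of the failing run for later scrutiny.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RepeatUntilFailure {
    /** Runs cmd up to maxRuns times; on the first non-zero exit, writes the
     *  combined stdout/stderr to logPath and returns the run number.
     *  Returns -1 if no failure was observed. */
    static int runUntilFailure(String[] cmd, Path logPath, int maxRuns)
            throws IOException, InterruptedException {
        for (int run = 1; run <= maxRuns; run++) {
            Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
            byte[] output = p.getInputStream().readAllBytes();
            if (p.waitFor() != 0) {
                Files.write(logPath, output);   // keep the log of the failing run
                return run;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        // In a real session the command might be something like
        // {"ant", "-Dtestcase=Test1726", "test-core"}; here a trivially
        // failing POSIX command stands in.
        int runs = runUntilFailure(new String[]{"false"}, Paths.get("failure.log"), 5);
        System.out.println(runs);   // 1 on a POSIX system, where `false` exits non-zero
    }
}
```

For intermittent concurrency failures the payoff is the saved log: only the run that actually tripped the race is kept, so the interleaving that caused it can be reconstructed from its prints.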
[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728879#action_12728879 ] Michael Busch commented on LUCENE-1693: Alright, I hope you are coming to Oakland in November! I had a few (literally) sleepless nights last week to meet some internal deadlines, but it looks like I'll now have time to work on Lucene, so I'll continue on this issue tonight!
[jira] Updated: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1726: Attachment: LUCENE-1726.patch OK, the problem happens when a segment is first opened by a merge that doesn't need to merge the doc stores; later, an NRT reader is opened that separately opens the doc stores of the same [pooled] SegmentReader, but then it's the merge that closes the read-only clone of the reader. In this case the separately opened (by the NRT reader) doc stores are not closed by the merge thread. It's the mirror image of LUCENE-1639. I've fixed it by pulling all shared readers in a SegmentReader into a separate static class (CoreReaders). Cloned SegmentReaders share the same instance of this class, so that if a clone later opens the doc stores, any prior ancestor (that the clone was created from) will also close those readers if it's the reader that decRefs to 0. I did something similar for LUCENE-1609 (which I'll now hit conflicts on after committing this... sigh). I plan to commit in a day or so.
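A toy sketch of the shared-core idea (invented names, not the actual patch): all clones of a SegmentReader hold one ref-counted core, so whichever reader decRefs it to zero tears down everything the core has opened, including doc stores a clone opened later:

```java
// Hypothetical sketch: clones share a ref-counted CoreReaders, so the last
// reader to close tears down shared state regardless of which reader opened it.
import java.util.concurrent.atomic.AtomicInteger;

public class SharedCoreSketch {
    static class CoreReaders {
        private final AtomicInteger refCount = new AtomicInteger(1);
        boolean docStoresOpen;          // opened lazily, possibly by a clone
        boolean closed;

        synchronized void openDocStores() { docStoresOpen = true; }
        void incRef() { refCount.incrementAndGet(); }
        void decRef() {
            if (refCount.decrementAndGet() == 0) {
                docStoresOpen = false;  // last reader out closes shared files,
                closed = true;          // including lazily opened doc stores
            }
        }
    }

    static class SegmentReader {
        final CoreReaders core;
        SegmentReader() { core = new CoreReaders(); }
        private SegmentReader(CoreReaders core) { this.core = core; core.incRef(); }
        SegmentReader makeClone() { return new SegmentReader(core); } // clones share the core
        void close() { core.decRef(); }
    }

    public static void main(String[] args) {
        SegmentReader merged = new SegmentReader();  // opened by a merge, no doc stores yet
        SegmentReader nrt = merged.makeClone();      // NRT reader clone of the pooled reader
        nrt.core.openDocStores();                    // clone opens doc stores later
        nrt.close();                                 // one reader closes...
        merged.close();                              // ...last close tears down the core
        System.out.println(merged.core.closed);      // true
    }
}
```

The point of the design is that it no longer matters whether the merge thread or the NRT reader closes last: the doc stores belong to the core, not to whichever reader happened to open them.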
[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728909#action_12728909 ] Jason Rutherglen commented on LUCENE-1726: The test now passes; it needs to go into the patch, perhaps in TestIndexWriterReader? Great work on this; it's easier to understand SegmentReader now that all the shared objects are in one object (CoreReaders). It should make debugging go more smoothly. Is there a reason we're not synchronizing on SR.core in openDocStores? Couldn't we synchronize on core for the cloning methods?
Re: A Comparison of Open Source Search Engines
Interesting, I never realized there was lucene-java-...@apache.org. My thoughts are at http://www.jroller.com/otis/entry/open_source_search_engine_benchmark (and in several comments on the blog itself). Otis - Original Message From: Sean Owen sro...@gmail.com To: lucene-java-...@apache.org Sent: Monday, July 6, 2009 11:06:14 AM Subject: A Comparison of Open Source Search Engines http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/ I imagine many of you already saw this -- Lucene does pretty well in this shootout. The only area it tended to lag, it seems, is memory usage and speed in some cases.
[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728938#action_12728938 ] Michael McCandless commented on LUCENE-1726: bq. Is there a reason we're not synchronizing on SR.core in openDocStores? I was going to say because IW synchronizes, but in fact it doesn't, properly, because when merging we go and open doc stores in an unsynchronized context. So I'll synchronize(core) in SR.openDocStores. bq. Couldn't we synchronize on core for the cloning methods? I don't think that's needed? The core is simply carried over to the newly cloned reader.
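The synchronize(core) idea for SR.openDocStores might be sketched like this (hypothetical and simplified, with invented names): lazy doc-store opening is guarded by a lock on the shared core, so two readers racing to open it can't double-open the files:

```java
// Hypothetical sketch: guard lazy doc-store opening with a lock on the
// shared core object, so concurrent openers open the files exactly once.
public class OpenDocStoresSketch {
    static class Core {
        private boolean docStoresOpen;
        private int openCount;              // how many times the files were opened

        void openDocStores() {
            synchronized (this) {           // all clones lock the same core
                if (!docStoresOpen) {
                    docStoresOpen = true;   // open fdt/fdx/tvx... exactly once
                    openCount++;
                }
            }
        }
        synchronized boolean isOpen() { return docStoresOpen; }
        synchronized int openCount() { return openCount; }
    }

    public static void main(String[] args) throws Exception {
        Core core = new Core();
        // e.g. a merge thread and an NRT-reader thread racing to open doc stores
        Thread t1 = new Thread(core::openDocStores);
        Thread t2 = new Thread(core::openDocStores);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(core.isOpen() + " " + core.openCount());
    }
}
```

Without the lock, both threads could pass the `!docStoresOpen` check before either sets the flag and each would open its own copy of the files, which is exactly the kind of leak the open-files failure above points at.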
Re: A Comparison of Open Source Search Engines
On Mon, Jul 6, 2009 at 6:01 PM, Earwin Burrfoot ear...@gmail.com wrote:
> Anybody knows other interesting open-source search engines?

http://hounder.org
[jira] Commented: (LUCENE-1706) Site search powered by Lucene/Solr
[ https://issues.apache.org/jira/browse/LUCENE-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729007#action_12729007 ]

Grant Ingersoll commented on LUCENE-1706:
-----------------------------------------

Checking...

Site search powered by Lucene/Solr
----------------------------------

                Key: LUCENE-1706
                URL: https://issues.apache.org/jira/browse/LUCENE-1706
            Project: Lucene - Java
         Issue Type: New Feature
           Reporter: Grant Ingersoll
           Assignee: Grant Ingersoll
           Priority: Minor
            Fix For: 2.9
        Attachments: LUCENE-1706.patch, LUCENE-1706.patch

For a number of years now, the Lucene community has been criticized for not eating our own dog food when it comes to search. My company has built and hosts a site search (http://www.lucidimagination.com/search) that is powered by Apache Solr and Lucene, and we'd like to donate its use to the Lucene community. Additionally, it allows one to search all of the Lucene content from a single place, including web, wiki, JIRA and mail archives. See also http://www.lucidimagination.com/search/document/bf22a570bf9385c7/search_on_lucene_apache_org

You can see it live on Mahout, Tika and Solr. Lucid has a fault-tolerant setup with replication and failover, as well as monitoring services in place. We are committed to maintaining and expanding the search capabilities on the site.

The following patch adds a skin to the Forrest site that enables the Lucene site to search Lucene-only content using Lucene/Solr. When a search is submitted, it automatically selects the Lucene facet so that only Lucene content is searched. From there, users can narrow or broaden their search criteria. I plan on committing in 3 or 4 days.
[jira] Commented: (LUCENE-1726) IndexWriter.readerPool create new segmentReader outside of sync block
[ https://issues.apache.org/jira/browse/LUCENE-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729024#action_12729024 ]

Jason Rutherglen commented on LUCENE-1726:
------------------------------------------

{quote}I don't think that's needed? The core is simply carried over to the newly cloned reader.{quote}

Right, however wouldn't it be somewhat cleaner to sync on core for all clone operations, given we don't want those to occur (external to IW) at the same time? Ultimately we want core to be the controller of its resources, rather than the SR being cloned.

I ran the test with the SRMapValue sync code (4 threads), with the sync on SR.core in openDocStores, for 10 minutes on a 2-core Windows XP laptop with Java 6.14, and saw no errors. Then the same with 2 threads for 5 minutes, and no errors. I'll keep running it to see if we can get an error.

I'm still a little confused as to why we're going to see the bug if readerPool.get is syncing on the SRMapValue. I guess there's a slight possibility of the error, and perhaps a more randomized test would produce it.

IndexWriter.readerPool create new segmentReader outside of sync block
---------------------------------------------------------------------

                Key: LUCENE-1726
                URL: https://issues.apache.org/jira/browse/LUCENE-1726
            Project: Lucene - Java
         Issue Type: Improvement
         Components: Index
   Affects Versions: 2.4.1
           Reporter: Jason Rutherglen
           Assignee: Michael McCandless
           Priority: Trivial
            Fix For: 3.1
        Attachments: LUCENE-1726.patch, LUCENE-1726.patch, LUCENE-1726.patch, LUCENE-1726.patch, LUCENE-1726.patch, LUCENE-1726.trunk.test.patch

  Original Estimate: 48h
  Remaining Estimate: 48h

I think we will want to do something like what field cache does with CreationPlaceholder for IndexWriter.readerPool. Otherwise we have the (I think somewhat problematic) issue of all other readerPool.get* methods waiting for an SR to warm. It would be good to implement this for 2.9.
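The CreationPlaceholder idea from the issue description can be sketched as follows. This is a hedged illustration, not Lucene's readerPool code: the names are invented, and a FutureTask stands in for the placeholder. The point is that a thread installs a placeholder with a cheap map operation, then warms the reader outside any pool-wide lock, so concurrent get() calls for other segments never wait on one segment's warm-up; only callers asking for the *same* segment block, and they block on that entry alone.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hedged sketch of a placeholder-based reader pool (illustrative names only).
public class ReaderPoolSketch {
    private final ConcurrentHashMap<String, FutureTask<String>> pool =
            new ConcurrentHashMap<>();

    String get(String segment) throws ExecutionException, InterruptedException {
        FutureTask<String> task = pool.get(segment);
        if (task == null) {
            FutureTask<String> created =
                    new FutureTask<>(() -> warm(segment)); // expensive warm-up
            task = pool.putIfAbsent(segment, created);     // install placeholder
            if (task == null) {
                task = created;
                task.run();    // warm OUTSIDE any pool-wide sync block
            }
        }
        return task.get();     // other callers wait on this entry only
    }

    // Stand-in for opening and warming a SegmentReader.
    private String warm(String segment) {
        return "SegmentReader(" + segment + ")";
    }

    public static void main(String[] args) throws Exception {
        ReaderPoolSketch pool = new ReaderPoolSketch();
        System.out.println(pool.get("_0"));
        System.out.println(pool.get("_0"));  // second call hits the cached entry
    }
}
```

This mirrors what the description says FieldCache does with CreationPlaceholder: the pool-wide critical section shrinks to a single putIfAbsent, and the slow work moves outside it.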
[jira] Commented: (LUCENE-1731) Allow ConstantScoreQuery to use custom rewrite method if using for highlighting
[ https://issues.apache.org/jira/browse/LUCENE-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729034#action_12729034 ]

Mark Miller commented on LUCENE-1731:
-------------------------------------

Hey Ashley,

This was added to the SpanScorer for the Highlighter a while back as part of resolving that Solr issue. Hopefully I will have time to make it the default by 2.9's release, but it's there as an option now if you use the SpanScorer. The issue was: LUCENE-1425 - Add ConstantScore highlighting support to SpanScorer.

Allow ConstantScoreQuery to use custom rewrite method if using for highlighting
-------------------------------------------------------------------------------

                Key: LUCENE-1731
                URL: https://issues.apache.org/jira/browse/LUCENE-1731
            Project: Lucene - Java
         Issue Type: Improvement
         Components: contrib/highlighter
   Affects Versions: 2.4, 2.4.1
           Reporter: Ashley Sole
           Priority: Minor

I'd like to submit a patch for ConstantScoreQuery which simply contains a setter method to state whether it is being used for highlighting or not. If it is being used for highlighting, then the rewrite method can take each of the terms in the filter and create a BooleanQuery to return (if the number of terms in the filter is less than 1024); otherwise it simply uses the old rewrite method. This allows you to highlight up to 1024 terms when using a ConstantScoreQuery, which, since it is a filter, will currently not be highlighted. The idea for this came from Mark Miller's article "Bringing the Highlighter back to Wildcard Queries in Solr 1.4"; I would just like to make it available in core Lucene.
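The rewrite the issue proposes can be sketched as below. This is not the patch or Lucene's code; the types are simplified stand-ins (queries as strings), and the 1024 cutoff mirrors BooleanQuery's default clause limit mentioned in the description. The idea: when highlighting, expand the filter's terms into an OR BooleanQuery the highlighter can walk; past the cutoff, fall back to the opaque constant-score form.

```java
import java.util.List;

// Hedged sketch of the highlighting-aware rewrite (illustrative types only).
public class HighlightRewriteSketch {
    static final int MAX_HIGHLIGHT_TERMS = 1024; // BooleanQuery clause limit

    static String rewriteForHighlighting(List<String> filterTerms) {
        if (filterTerms.size() >= MAX_HIGHLIGHT_TERMS) {
            // Too many clauses: keep the old constant-score rewrite, which
            // the highlighter cannot expand into visible terms.
            return "ConstantScore(filter)";
        }
        // Expand each filter term into an OR clause the highlighter can see.
        return "BooleanQuery(" + String.join(" OR ", filterTerms) + ")";
    }

    public static void main(String[] args) {
        // e.g. the terms a wildcard query like "wild*" might enumerate
        System.out.println(rewriteForHighlighting(List.of("wild", "wilder", "wildest")));
    }
}
```

The trade-off named in the thread is the same one this sketch encodes: term expansion makes highlighting work but is only safe for bounded term counts, which is why the setter-based opt-in (or the SpanScorer option Mark mentions) is needed rather than changing ConstantScoreQuery's rewrite unconditionally.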