[jira] Issue Comment Edited: (SOLR-1395) Integrate Katta
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928464#action_12928464 ] tom liu edited comment on SOLR-1395 at 1/18/11 3:27 AM:

JohnWu, Huang: in the Katta integration, a Solr core plays one of three roles:
# proxy: the query dispatcher, or front-end server. All queries are sent to this proxy, which dispatches them to the subproxies on the Katta cluster nodes. In the proxy, QueryComponent's distributedProcess is executed, with the param isShard=false.
# subproxy: the proxy on a Katta cluster node. Because each node may host more than one core, the subproxy receives the query from the proxy and forwards it to each core. In the subproxy, QueryComponent's distributedProcess is executed, with the param isShard=true.
# querycore: the Solr core that really executes the query. Queries are sent to the querycore, which runs QueryComponent's process method.

So, to run Solr as a cluster/distributed setup, we set up three configurations:
# proxy's solrconfig.xml
{noformat}
<requestHandler name="standard" class="solr.KattaRequestHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="shards">*</str>
  </lst>
</requestHandler>
{noformat}
# subproxy's solrconfig.xml
{noformat}
<requestHandler name="standard" class="solr.SearchHandler" default="true">...</requestHandler>
{noformat}
# querycore's solrconfig.xml
{noformat}
<requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">...</requestHandler>
{noformat}

In Katta's katta.node.properties:
{noformat}
node.server.class=org.apache.solr.katta.DeployableSolrKattaServer
{noformat}

And in the classes dir of the proxy's Solr webapp, please add two files:
# katta.zk.properties
# katta.node.properties

Integrate Katta
---
Key: SOLR-1395
URL: https://issues.apache.org/jira/browse/SOLR-1395
Project: Solr
Issue Type: New Feature
Affects Versions: 1.4
Reporter: Jason Rutherglen
Priority: Minor
Fix For: Next
Attachments: back-end.log, front-end.log, hadoop-core-0.19.0.jar, katta-core-0.6-dev.jar, katta-solrcores.jpg, katta.node.properties, katta.zk.properties, log4j-1.2.13.jar, solr-1395-1431-3.patch, solr-1395-1431-4.patch, solr-1395-1431-katta0.6.patch, solr-1395-1431-katta0.6.patch, solr-1395-1431.patch, solr-1395-katta-0.6.2-1.patch, solr-1395-katta-0.6.2-2.patch, solr-1395-katta-0.6.2-3.patch, solr-1395-katta-0.6.2.patch, SOLR-1395.patch, SOLR-1395.patch, SOLR-1395.patch, test-katta-core-0.6-dev.jar, zkclient-0.1-dev.jar, zookeeper-3.2.1.jar
Original Estimate: 336h
Remaining Estimate: 336h

We'll integrate Katta into Solr so that:
* Distributed search uses Hadoop RPC
* Shard/SolrCore distribution and management
* Zookeeper based failover
* Indexes may be built using Hadoop

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
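The proxy / subproxy / querycore request flow described above can be sketched in plain Python. This is a conceptual model only, with invented names and data structures; the real dispatch happens inside KattaRequestHandler and QueryComponent's distributedProcess/process methods:

```python
# Conceptual sketch of the three-role query flow described in the comment
# above. All names and data structures here are invented for illustration;
# this is not Solr or Katta code.

def query_core(core, q):
    # querycore role: the real Solr core runs QueryComponent.process()
    return [f"{core}:{q}"]  # stand-in for a result list

def subproxy(node_cores, q):
    # subproxy role: distributedProcess with isShard=true; a node may host
    # more than one core, so fan the query out to every core on this node.
    results = []
    for core in node_cores:
        results.extend(query_core(core, q))
    return results

def proxy(cluster, q):
    # proxy role: distributedProcess with isShard=false; dispatch to the
    # subproxy on each Katta cluster node and merge the shard responses.
    merged = []
    for node_cores in cluster.values():
        merged.extend(subproxy(node_cores, q))
    return merged

cluster = {"node1": ["coreA", "coreB"], "node2": ["coreC"]}
print(proxy(cluster, "title:katta"))
```

Here the merge step is a plain concatenation; the real QueryComponent also re-sorts and trims the merged shard responses.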
[jira] Issue Comment Edited: (SOLR-1395) Integrate Katta
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935709#action_12935709 ] tom liu edited comment on SOLR-1395 at 1/18/11 3:28 AM:

JohnWu: my conf is:
{code:xml|title=proxy/solrconfig.xml}
<requestHandler name="standard" class="solr.KattaRequestHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="shards">*</str>
  </lst>
</requestHandler>
{code}
{code:xml|title=subproxy/solrconfig.xml}
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
{code}
{code:xml|title=querycore(shards)/solrconfig.xml}
<requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
{code}
{code:xml|title=zoo.cfg}
clientPort=2181
...
{code}
In Katta/conf and Shards/WEB-INF/classes:
{code:xml|title=katta.zk.properties}
zookeeper.embedded=false
zookeeper.servers=localhost:2181
...
{code}
Re: Let's drop Maven Artifacts !
Hi, the developers list may not be the right place to find strong maven supporters. All developers know lucene from the inside out and are perfectly fine installing lucene from whatever artifact. The people using maven are your end users, who probably don't even subscribe to users@.

Thomas Koch, http://www.koch.ro
[jira] Created: (SOLR-2317) Slaves have leftover index.xxxxx directories, and leftover files in index/ directory
Slaves have leftover index.xxxxx directories, and leftover files in index/ directory
Key: SOLR-2317
URL: https://issues.apache.org/jira/browse/SOLR-2317
Project: Solr
Issue Type: Bug
Affects Versions: 3.1
Reporter: Bill Bell

When replicating, we are getting leftover files on slaves. Some slaves are getting index.number directories with files left over. And more concerning, the index/ directory has leftover files from previous replication runs. This is a pain to keep cleaning up. Bill
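Until the root cause is fixed, the stale snapshot directories can be swept up with a small script. The sketch below is an unofficial workaround, assuming the common data/index plus data/index.NNNN layout and that no replication is in progress while it runs; it does not attempt the harder part, pruning stale files inside the live index/ directory, which would require comparing against the current segments:

```python
# Cleanup sketch for leftover replication directories: keeps the live
# "index" directory and removes any stale "index.<suffix>" siblings.
# Assumes no replication is running; the data-dir layout is assumed,
# not read from Solr itself.
import os
import shutil

def clean_stale_index_dirs(data_dir):
    removed = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        # only directories named index.<something>, never "index" itself
        if os.path.isdir(path) and name.startswith("index."):
            shutil.rmtree(path)
            removed.append(name)
    return removed

if __name__ == "__main__":
    # demo against a throwaway layout
    import tempfile
    root = tempfile.mkdtemp()
    for d in ("index", "index.20110117", "index.20110118"):
        os.makedirs(os.path.join(root, d))
    print(clean_stale_index_dirs(root))  # the two dated directories
    print(os.listdir(root))              # only "index" remains
```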
[jira] Updated: (SOLR-2317) Slaves have leftover index.xxxxx directories, and leftover files in index/ directory
[ https://issues.apache.org/jira/browse/SOLR-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2317:

Description: When replicating, we are getting leftover files on slaves. Some slaves are getting index.number directories with files left over. And more concerning, the index/ directory has leftover files from previous replication runs. This is a pain to keep cleaning up. Bill
[jira] Commented: (SOLR-2317) Slaves have leftover index.xxxxx directories, and leftover files in index/ directory
[ https://issues.apache.org/jira/browse/SOLR-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983072#action_12983072 ] Bill Bell commented on SOLR-2317:
---
This is running Windows 2008 R2. We are using native locking on the master and slave. Running Jetty 6.
[jira] Commented: (SOLR-1395) Integrate Katta
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983076#action_12983076 ] tom liu commented on SOLR-1395:
---
Sorry, the above comments had an error: in querycore(shards)/solrconfig.xml, the requestHandler must be solr.MultiEmbeddedSearchHandler.
{code:xml|title=querycore(shards)/solrconfig.xml}
<requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
{code}
QueryComponent returns a DocSlice, but XMLWriter or EmbeddedServer returns a SolrDocumentList built from the DocList.
RE: Query parser contract changes?
This turns out to have indeed been due to a recent, but un-announced, index format change. A rebuilt index worked properly. Thanks!
Karl

From: ext karl.wri...@nokia.com [karl.wri...@nokia.com]
Sent: Monday, January 17, 2011 10:53 AM
To: dev@lucene.apache.org
Subject: RE: Query parser contract changes?

Another data point: the standard query parser actually ALSO fails when you do anything other than a *:* query. When you specify a field name, it returns zero results:

root@duck93:/data/solr-dym/solr-dym# curl "http://localhost:8983/solr/nose/standard?q=value_0:a*"
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">7</int><lst name="params"><str name="q">value_0:a*</str></lst></lst><result name="response" numFound="0" start="0"/>
</response>

But:

root@duck93:/data/solr-dym/solr-dym# curl "http://localhost:8983/solr/nose/standard?q=*:*"
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">244</int><lst name="params"><str name="q">*:*</str></lst></lst><result name="response" numFound="59431646" start="0"><doc><str name="latitude">40.55856</str><str name="longitude">44.37457</str><str name="reference">LANGUAGE=und|TYPE=STREET|ADDR_TOWNSHIP_NAME=Armenia|ADDR_COUNTRY_NAME=Armenia|ADDR_STREET_NAME=A329|TITLE=A329, Armenia, Armenia</str></doc><doc><str name="latitude">40.7703</str><str name="longitude">43.838</str><str name="reference">LANGUAGE=und|TYPE=STREET|ADDR_TOWNSHIP_NAME=Armenia|ADDR_COUNTRY_NAME=Armenia|ADDR_STREET_NAME=A330|TITLE=A330, Armenia …

The schema has not changed:

<!-- Level 0 non-language value field -->
<field name="othervalue_0" type="string_idx_normed" required="false"/>

…where string_idx_normed is declared in the following way:

<fieldType name="string_idx_normed" class="solr.TextField" indexed="true" stored="false" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.ICUFoldingFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.ICUFoldingFilterFactory" />
  </analyzer>
</fieldType>

… which shouldn't matter anyway, because even a simple TermQuery returned from my query parser method doesn't work any more.
Karl

From: ext karl.wri...@nokia.com [mailto:karl.wri...@nokia.com]
Sent: Monday, January 17, 2011 10:30 AM
To: dev@lucene.apache.org
Subject: Query parser contract changes?

Hi folks, I'm sorely puzzled by the fact that my QParser implementation ceased to work after the latest Solr/Lucene trunk update. My previous update was about ten days ago, right after Mike made his index changes. The symptom is that, although the query parser is correctly called, and seems to have the right arguments, the Query it returns seems to be ignored. I always get zero results. I eliminated any possibility of error by just hardwiring the return of a TermQuery, and that too always yields zero results. I was able to confirm, using the standard handler with the default query parser, that the index is in fine shape. So I was wondering if the contract for QParser had changed in some subtle way that I missed?
Karl
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 9:33 AM, Thomas Koch tho...@koch.ro wrote:
> Hi, the developers list may not be the right place to find strong maven supporters. All developers know lucene from the inside out and are perfectly fine installing lucene from whatever artifact. The people using maven are your end users, who probably don't even subscribe to users@.

big +1 for this comment! I have to admit that I am not a big maven fan, and each time I have to use it it's a pain in the ass, but it is the de-facto standard for the majority of java projects on this planet, so really there is not much of an option in my opinion. A project like lucene has to release maven artifacts even if it's a pain.

Simon

> Thomas Koch, http://www.koch.ro
Re: Let's drop Maven Artifacts !
Out of curiosity, how did the Maven people integrate Lucene before we had Maven artifacts? To the best of my understanding, we never had proper Maven artifacts (Steve is working on that in LUCENE-2657).

Shai

On Tue, Jan 18, 2011 at 11:03 AM, Simon Willnauer simon.willna...@googlemail.com wrote:
> On Tue, Jan 18, 2011 at 9:33 AM, Thomas Koch tho...@koch.ro wrote:
>> Hi, the developers list may not be the right place to find strong maven supporters. All developers know lucene from the inside out and are perfectly fine installing lucene from whatever artifact. The people using maven are your end users, who probably don't even subscribe to users@.
> big +1 for this comment! I have to admit that I am not a big maven fan, and each time I have to use it it's a pain in the ass, but it is the de-facto standard for the majority of java projects on this planet, so really there is not much of an option in my opinion. A project like lucene has to release maven artifacts even if it's a pain.
> Simon
>> Thomas Koch, http://www.koch.ro
[jira] Updated: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated LUCENE-2657:

Attachment: LUCENE-2657.patch

In this patch:
# {{ant generate-maven-artifacts}} now works the same as it does on trunk without this patch -- using {{maven-ant-tasks}} -- except that instead of using the POM templates, the POMs provided in the patch are used.
# {{ant generate-maven-artifacts}} now functions properly at the top level, from {{lucene/}}, from {{modules/}}, and from {{solr/}}.
# Removed all {{*-source.jar}} and {{*-javadoc.jar}} generation related functionality from the POMs, as well as the {{dist}} profile - the Ant build is responsible for putting together the maven artifacts.
# Removed the POM templates, except for the two required to deploy the {{solr-noggit}} and {{solr-commons-csv}} artifacts from the Ant build.
# Modified the Maven artifact handling in the Ant build, including artifact signing, to be correct.
# Based on feedback from Stevo Slavić (http://www.mail-archive.com/solr-user@lucene.apache.org/msg45656.html), added explicit {{groupId}}s to the POMs that didn't have them, and added explicit {{relativePath}}s to the {{parent}} declarations in all POMs.

I think this patch is ready to be committed to trunk. I'll post a branch_3x version of this patch tomorrow, and then I think the patches on this issue will be complete.
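Item 6 above (explicit groupIds and relativePaths) can be illustrated with a sketch of a child-module POM. The parent artifactId, version, and path here are hypothetical, chosen only to show the two elements the patch adds, not copied from the patch itself:

```xml
<!-- Illustrative child-module POM fragment (names/versions are invented):
     note the explicit relativePath on the parent and the explicit groupId
     on the module, rather than relying on inheritance/defaults. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-parent</artifactId>
    <version>4.0-SNAPSHOT</version>
    <relativePath>../pom.xml</relativePath>
  </parent>
  <groupId>org.apache.lucene</groupId> <!-- explicit, not inherited -->
  <artifactId>lucene-core</artifactId>
</project>
```

An explicit relativePath lets Maven resolve the parent from the checkout rather than a repository, which matters when the parent POM has not yet been installed locally.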
Replace Maven POM templates with full POMs, and change documentation accordingly
Key: LUCENE-2657
URL: https://issues.apache.org/jira/browse/LUCENE-2657
Project: Lucene - Java
Issue Type: Improvement
Components: Build
Affects Versions: 3.1, 4.0
Reporter: Steven Rowe
Assignee: Steven Rowe
Fix For: 3.1, 4.0
Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch

The current Maven POM templates only contain dependency information, the bare bones necessary for uploading artifacts to the Maven repository. The full Maven POMs in the attached patch include the information necessary to run a multi-module Maven build, in addition to serving the same purpose as the current POM templates.

Several dependencies are not available through public maven repositories. A profile in the top-level POM can be activated to install these dependencies from the various {{lib/}} directories into your local repository. From the top-level directory:
{code}
mvn -N -Pbootstrap install
{code}
Once these non-Maven dependencies have been installed, to run all Lucene/Solr tests via Maven's surefire plugin, and populate your local repository with all artifacts, from the top level directory, run:
{code}
mvn install
{code}
When one Lucene/Solr module depends on another, the dependency is declared on the *artifact(s)* produced by the other module and deposited in your local repository, rather than on the other module's un-jarred compiler output in the {{build/}} directory, so you must run {{mvn install}} on the other module before its changes are visible to the module that depends on it.

To create all the artifacts without running tests:
{code}
mvn -DskipTests install
{code}
I almost always include the {{clean}} phase when I do a build, e.g.:
{code}
mvn -DskipTests clean install
{code}
[jira] Updated: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-2295:
---
Attachment: LUCENE-2295-2-3x.patch

Patch against 3x. Removed the get/set from IWC and changed the code which used it. I also added some clarifying notes to the deprecation note in IW.setMaxFieldLength. I will post a separate patch for trunk, where this setting will be removed altogether.

Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter
---
Key: LUCENE-2295
URL: https://issues.apache.org/jira/browse/LUCENE-2295
Project: Lucene - Java
Issue Type: Improvement
Components: contrib/analyzers
Reporter: Shai Erera
Assignee: Uwe Schindler
Fix For: 3.1, 4.0
Attachments: LUCENE-2295-2-3x.patch, LUCENE-2295-trunk.patch, LUCENE-2295.patch

A spinoff from LUCENE-2294. Instead of asking the user to specify his requested MFL limit on IndexWriter, we can get rid of this setting entirely by providing an Analyzer which will wrap any other Analyzer and its TokenStream with a TokenFilter that keeps track of the number of tokens produced and stops when the limit has been reached. This will remove any count tracking from IW's indexing, which is done even if I specified UNLIMITED for MFL. Let's try to do it for 3.1.
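The core idea of the issue, moving the max-field-length limit out of IndexWriter and into a token-counting filter that can wrap any analyzer, can be modeled language-neutrally. The sketch below is a conceptual Python model, not the actual Lucene TokenFilter API:

```python
# Conceptual model of the proposed MaxFieldLengthAnalyzer: wrap any token
# stream and stop emitting tokens once the limit is reached, so that
# IndexWriter itself never has to count tokens. Not the real Lucene API.

def limit_tokens(token_stream, max_tokens):
    count = 0
    for token in token_stream:
        if count >= max_tokens:
            break  # silently drop the rest of the field, like MFL did
        count += 1
        yield token

def whitespace_analyzer(text):
    # stand-in for whatever wrapped Analyzer produces the token stream
    return iter(text.split())

tokens = list(limit_tokens(whitespace_analyzer("a b c d e"), 3))
print(tokens)  # ['a', 'b', 'c']
```

Because the counting lives in the wrapping filter, an unlimited setting simply means not wrapping at all, which is what removes the per-token bookkeeping from IndexWriter.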
Re: Let's drop Maven Artifacts !
Somehow, they were made available since 2.0 - http://repo2.maven.org/maven2/org/apache/lucene/lucene-core/
The POMs are minimal, sans dependencies, so e.g. if your project depends on lucene-spellchecker, lucene-core won't be transitively included and your build is going to fail (you therefore had to add a dependency on core to your project yourself). But they were enough to download and link jars/sources/javadocs.

On Tue, Jan 18, 2011 at 12:40, Shai Erera ser...@gmail.com wrote:
> Out of curiosity, how did the Maven people integrate Lucene before we had Maven artifacts? To the best of my understanding, we never had proper Maven artifacts (Steve is working on that in LUCENE-2657).
> Shai

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785
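The failure mode described above can be illustrated with a POM fragment: because the old dependency-less POMs declared nothing, a project using the spellchecker had to declare lucene-core explicitly as well. The version number here is illustrative:

```xml
<dependencies>
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-spellchecker</artifactId>
    <version>2.9.4</version>
  </dependency>
  <!-- required explicitly: the old POMs declared no dependencies,
       so lucene-core did not come in transitively -->
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>2.9.4</version>
  </dependency>
</dependencies>
```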
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983113#action_12983113 ] Robert Muir commented on LUCENE-2657:
---
bq. I think this patch is ready to be committed to trunk.

Well, first of all, you obviously worked hard on this, but we need to think this one through before committing. Can we put this code in a separate project that takes care of maven support for lucene?

The problem is there are two camps: die maven die, and maven or die. There will *never* be consensus. The only way for maven to survive is for the users that care about it to support it themselves, just like other packaging systems such as debian, redhat rpm, freebsd/mac ports, etc. etc. that we, lucene, don't deal with. They can't continue to whine to people like me, who don't give a shit about it, to support it and produce its crazy ass complicated artifacts. Instead the people who care about these packaging systems, and know how to make them work, must deal with them.

Personally I really don't like:
* Having two build systems
* Having one build system (ant) rely upon the other (maven) to create release artifacts.

Basically, the ant build system is our build. I think it needs to be able to fully build lucene for a release without involving any other build systems such as Make or Maven.
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983123#action_12983123 ] Steven Rowe commented on LUCENE-2657:
---
bq. Can we put this code in a separate project, that takes care of maven support for lucene?

I'd rather not. The Lucene project has published Maven artifacts since the 1.9.1 release. I think we should continue to do that.

bq. The only way for maven to survive, is for the users that care about it, to support itself, just like other packaging systems such as debian, redhat rpm, freebsd/mac ports, etc etc that we lucene, don't deal with.

OK, those are pretty obviously red herrings. Can we concentrate on the actual issue here without dragging in those extraneous things? Maven artifacts, not those other things, have been provided by Lucene since the 1.9.1 release. We obviously *do* deal with Maven.

bq. They can't continue to whine to people like me, who don't give a shit about it, to support it and produce its crazy ass complicated artifacts.

The latest patch on this issue uses the Ant artifacts directly. POMs are provided. You know, just like it has been since the 1.9.1 release.

bq. Instead the people who care about these packaging systems, and know how to make them work must deal with them.

Um, like the patch on this issue is doing?

bq. Basically, the ant build system is our build. I think it needs to be able to fully build lucene for a release without involving any other build systems such as Make or Maven.

This patch uses the Ant-produced artifacts to prepare for Maven artifact publishing. Maven itself is not invoked in the process. An Ant plugin handles the artifact deployment. I seriously do not understand why this is such a big deal. Why can't we just keep publishing Maven artifacts? You know, like we have for the past 15-20 releases.
Replace Maven POM templates with full POMs, and change documentation accordingly Key: LUCENE-2657 URL: https://issues.apache.org/jira/browse/LUCENE-2657 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Fix For: 3.1, 4.0 Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch The current Maven POM templates only contain dependency information, the bare bones necessary for uploading artifacts to the Maven repository. The full Maven POMs in the attached patch include the information necessary to run a multi-module Maven build, in addition to serving the same purpose as the current POM templates. Several dependencies are not available through public maven repositories. A profile in the top-level POM can be activated to install these dependencies from the various {{lib/}} directories into your local repository. From the top-level directory: {code} mvn -N -Pbootstrap install {code} Once these non-Maven dependencies have been installed, to run all Lucene/Solr tests via Maven's surefire plugin, and populate your local repository with all artifacts, from the top level directory, run: {code} mvn install {code} When one Lucene/Solr module depends on another, the dependency is declared on the *artifact(s)* produced by the other module and deposited in your local repository, rather than on the other module's un-jarred compiler output in the {{build/}} directory, so you must run {{mvn install}} on the other module before its changes are visible to the module that depends on it. 
To create all the artifacts without running tests: {code} mvn -DskipTests install {code} I almost always include the {{clean}} phase when I do a build, e.g.: {code} mvn -DskipTests clean install {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
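The bootstrap profile mentioned in the issue description could look roughly like the sketch below in the top-level POM. The coordinates and jar names here are placeholders, not the actual ones from the patch; the point is just the pattern of binding {{install:install-file}} executions to a profile:

```xml
<!-- Sketch of a bootstrap profile: installs non-Maven jars from lib/
     into the local repository. The groupId/artifactId/version/file
     values are illustrative placeholders. -->
<profiles>
  <profile>
    <id>bootstrap</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-install-plugin</artifactId>
          <executions>
            <execution>
              <id>install-example-lib</id>
              <phase>install</phase>
              <goals><goal>install-file</goal></goals>
              <configuration>
                <groupId>org.example</groupId>
                <artifactId>example-lib</artifactId>
                <version>1.0</version>
                <packaging>jar</packaging>
                <file>lib/example-lib-1.0.jar</file>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
```

Activating the profile with {{mvn -N -Pbootstrap install}} then runs each such execution once against the top-level project only ({{-N}} skips the sub-modules).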
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 5:29 AM, Hardy Ferentschik s...@ferentschik.de wrote: It also means that someone outside the dev community will at some stage create some pom files and upload the artifact to a (semi-) public repository. This sounds great! this is how open source works, those who care about it, will make it happen! - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983135#action_12983135 ] Chris Male commented on LUCENE-2657: I'm a little lost at what this patch introduces that is imposing? Ant itself has maven support as part of its trunk code base so its clearly not too imposing for them. Is your issue that this patch introduces things that get in your way somehow with using ant to do builds? or are you against committing this due to your general concerns with Maven?
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983141#action_12983141 ] Chris Male commented on LUCENE-2657: Alright I can appreciate your concern. I think comparing Maven to RPM or FreeBSD ports is going a little far, but I can understand the point you're making. What if this were committed so that those of us who do understand maven and do like using it, could? This issue about whether maven artifacts need to then be released or not can be part of a greater discussion (as is already taking place). By committing this we then make it easier for someone else outside of the project to create the correct artifacts which are then available from the central maven repository, if thats the decision thats made which is also the one you support.
[jira] Commented: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983142#action_12983142 ] Shai Erera commented on LUCENE-2584: On one hand, it's good to add the files to a Set, so that we can be sure they are added uniquely. On the other hand though, if we expect files are added properly, then adding to the set is redundant. Since this code is executed once per SI instance, I think explicitly adding to a Set is better. Note that while the assert you added will work, if someone runs without assertions he may get duplicate file names, if indeed they are added twice. I think that it's not so crucial to know that the same file was added twice, it's a very unlikely bug, but it is crucial that files() returns unique names. So can you please use a Set in the method instead of the assert (like it's done on trunk). Also, while you're at it, the method doesn't have javadocs - they appear in regular comments. Can you convert them to javadocs (there is a warning there about not modifying the returned List, but it's not visible as javadocs :). Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException --- Key: LUCENE-2584 URL: https://issues.apache.org/jira/browse/LUCENE-2584 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 2.9, 2.9.1, 2.9.2, 2.9.3, 3.0, 3.0.1, 3.0.2 Reporter: Alexander Kanarsky Priority: Minor Fix For: 3.1, 4.0 Attachments: LUCENE-2584-branch_3x.patch, LUCENE-2584-lucene-2_9.patch, LUCENE-2584-lucene-3_0.patch The multi-threaded call of the files() in SegmentInfo could lead to the ConcurrentModificationException if one thread has not finished adding to the ArrayList (files) yet while the other thread already obtained it as cached (see below). This is a rare exception, but it would be nice to fix.
I see the code is no longer problematic in the trunk (and others ported from flex_1458), looks like it was fixed while implementing post 3.x features. The fix to 3.x and 2.9.x branches could be the same - create the files set first and populate it, and then assign to the member variable at the end of the method. This will resolve the issue. I could prepare the patch for 2.9.4 and 3.x, if needed. -- INFO: [19] webapp= path=/replication params={command=fetchindex&wt=javabin} status=0 QTime=1 Jul 30, 2010 9:13:05 AM org.apache.solr.core.SolrCore execute INFO: [19] webapp= path=/replication params={command=details&wt=javabin} status=0 QTime=24 Jul 30, 2010 9:13:05 AM org.apache.solr.handler.ReplicationHandler doFetch SEVERE: SnapPull failed java.util.ConcurrentModificationException at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) at java.util.AbstractList$Itr.next(AbstractList.java:343) at java.util.AbstractCollection.addAll(AbstractCollection.java:305) at org.apache.lucene.index.SegmentInfos.files(SegmentInfos.java:826) at org.apache.lucene.index.DirectoryReader$ReaderCommit.&lt;init&gt;(DirectoryReader.java:916) at org.apache.lucene.index.DirectoryReader.getIndexCommit(DirectoryReader.java:856) at org.apache.solr.search.SolrIndexReader.getIndexCommit(SolrIndexReader.java:454) at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:261) at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264) at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:146)
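The "populate the set first, assign to the member variable at the end" fix described above can be sketched in plain Java. The class and file names below are illustrative, not the actual SegmentInfo code; the pattern is what matters: the cached list is only published once it is fully built, so a concurrent caller sees either null or a complete immutable list, never one still being mutated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the proposed fix: build locally, publish last.
class FileListCache {
    private volatile List<String> cachedFiles; // published via volatile

    List<String> files() {
        List<String> cached = cachedFiles;
        if (cached != null) {
            return cached; // fully built by construction
        }
        // A Set guarantees unique names even if a file is added twice.
        Set<String> files = new LinkedHashSet<>();
        files.add("_0.fdt");
        files.add("_0.fdx");
        files.add("_0.fdt"); // a duplicate add is now harmless
        List<String> result =
            Collections.unmodifiableList(new ArrayList<>(files));
        cachedFiles = result; // assign only the finished list
        return result;
    }
}
```

Compare this with the buggy shape: assigning the member variable first and then adding to it lets another thread iterate the half-filled list, which is exactly the ConcurrentModificationException in the trace above.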
[jira] Updated: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-2295: --- Attachment: LUCENE-2295-2-trunk.patch Patch against trunk - removes maxFieldLength handling from all the code. Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter --- Key: LUCENE-2295 URL: https://issues.apache.org/jira/browse/LUCENE-2295 Project: Lucene - Java Issue Type: Improvement Components: contrib/analyzers Reporter: Shai Erera Assignee: Uwe Schindler Fix For: 3.1, 4.0 Attachments: LUCENE-2295-2-3x.patch, LUCENE-2295-2-trunk.patch, LUCENE-2295-trunk.patch, LUCENE-2295.patch A spinoff from LUCENE-2294. Instead of asking the user to specify on IndexWriter his requested MFL limit, we can get rid of this setting entirely by providing an Analyzer which will wrap any other Analyzer and its TokenStream with a TokenFilter that keeps track of the number of tokens produced and stop when the limit has reached. This will remove any count tracking in IW's indexing, which is done even if I specified UNLIMITED for MFL. Let's try to do it for 3.1. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
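The idea behind the issue (wrap any analyzer's token stream with a filter that counts tokens and stops at the limit, so IndexWriter needs no counting of its own) can be illustrated with a plain Java iterator, leaving the actual Lucene TokenStream API out of it. The class name is made up for the sketch:

```java
import java.util.Iterator;

// Concept sketch of MaxFieldLength-as-a-filter: the wrapper tracks
// how many tokens it has produced and simply stops at the limit.
// The consumer (the indexer) does no counting at all.
class LimitTokenIterator implements Iterator<String> {
    private final Iterator<String> in;
    private final int maxTokens;
    private int produced;

    LimitTokenIterator(Iterator<String> in, int maxTokens) {
        this.in = in;
        this.maxTokens = maxTokens;
    }

    @Override public boolean hasNext() {
        return produced < maxTokens && in.hasNext();
    }

    @Override public String next() {
        produced++;
        return in.next();
    }
}
```

With an effectively unlimited {{maxTokens}} the wrapper adds no behavior, which is the point of removing the MFL bookkeeping from IW even for UNLIMITED.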
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983154#action_12983154 ] Michael McCandless commented on LUCENE-2324: I ran a quick perf test here: I built the 10M Wikipedia index, Standard codec, using 6 threads. Trunk took 541.6 sec; RT took 518.2 sec (only a bit faster), but the test wasn't really fair because it flushed @ docCount=12870. But I can't test flush by RAM -- that's not working yet on RT right? (The search results matched, which is nice!) Then I ran a single-threaded test. Trunk took 1097.1 sec and RT took 1040.5 sec -- a bit faster! Presumably in the noise (we don't expect a speedup?), but excellent that it's not slower... I think we lost infoStream output on the details of flushing? I can't see when which DWPTs are flushing... Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out See LUCENE-2293 for motivation and more details. I'm copying here Mike's summary he posted on 2293: Change the approach for how we buffer in RAM to a more isolated approach, whereby IW has N fully independent RAM segments in-process and when a doc needs to be indexed it's added to one of them. Each segment would also write its own doc stores and normal segment merging (not the inefficient merge we now do on flush) would merge them. This should be a good simplification in the chain (eg maybe we can remove the *PerThread classes). 
The segments can flush independently, letting us make much better concurrent use of IO & CPU.
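The structure Mike summarizes (each indexing thread buffers into its own private segment, and each buffer can flush on its own) can be sketched in a few lines of self-contained Java. The names here are illustrative stand-ins for DocumentsWriterPerThread, not the branch's actual classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of per-thread write buffers: each thread owns a private
// buffer, so buffering never contends across threads, and any one
// buffer can be flushed without stalling the others.
class PerThreadBuffers {
    private final ConcurrentMap<Long, List<String>> buffers =
        new ConcurrentHashMap<>();

    void addDocument(String doc) {
        buffers.computeIfAbsent(Thread.currentThread().getId(),
                                id -> new ArrayList<>())
               .add(doc); // list is only mutated by its owning thread
    }

    // Remove and return one thread's buffer, standing in for a flush.
    List<String> flush(long threadId) {
        List<String> flushed = buffers.remove(threadId);
        return flushed == null ? new ArrayList<>() : flushed;
    }
}
```

This is also why per-thread flush triggers differ from the global docCount trigger seen in the perf test: each buffer fills at its own rate.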
[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges
[ https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983155#action_12983155 ] Michael McCandless commented on LUCENE-2856: The ReaderEvent is never generated? Is that still work-in-progress? When would this be invoked? Only if IW is pooling readers? Maybe we should hold off on that for a separate issue? Why were the added checks needed in SegmentInfo? Oh I see, it's because you compute the sizeInBytes of the merged segment before the merge completes... hmm. I think I'd prefer that this SegmentInfo not be published until the Type == COMPLETE. How come merge is not also final in MergeEvent? I agree we should change the name. IndexEventListener? I don't think we need CompositeSegmentListener? Why not an API to just add/remove listeners? Also: are we sure this belongs in IWC? This is analogous to infoStream, which is on IW. It's not a config parameter that affects indexing. Should we also track segment flushed/aborted events? Can you add some jdocs and mark the API as experimental? Create IndexWriter event listener, specifically for merges -- Key: LUCENE-2856 URL: https://issues.apache.org/jira/browse/LUCENE-2856 Project: Lucene - Java Issue Type: Improvement Components: Index Affects Versions: 4.0 Reporter: Jason Rutherglen Attachments: LUCENE-2856.patch, LUCENE-2856.patch, LUCENE-2856.patch The issue will allow users to monitor merges occurring within IndexWriter using a callback notifier event listener. This can be used by external applications such as Solr to monitor large segment merges. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
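The "API to just add/remove listeners" Mike asks about, with the IndexEventListener name floated in the comment, could look roughly like this. The event payload and the notify call site are illustrative, not the patch's actual shape:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch of an add/remove listener API for index events, in place of
// a single CompositeSegmentListener set on the config.
interface IndexEventListener {
    void onMergeComplete(String mergedSegmentName, long sizeInBytes);
}

class IndexEventNotifier {
    // Copy-on-write list: cheap, safe iteration during notification,
    // concurrent add/remove without locking the notify path.
    private final List<IndexEventListener> listeners =
        new CopyOnWriteArrayList<>();

    void addListener(IndexEventListener l)    { listeners.add(l); }
    void removeListener(IndexEventListener l) { listeners.remove(l); }

    // Would be invoked by the writer only once the merge is COMPLETE,
    // per the review comment about not publishing the SegmentInfo early.
    void fireMergeComplete(String segment, long sizeInBytes) {
        for (IndexEventListener l : listeners) {
            l.onMergeComplete(segment, sizeInBytes);
        }
    }
}
```

Putting the notifier on IW rather than IWC would mirror the infoStream precedent raised in the comment, since it is observability rather than an indexing parameter.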
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983158#action_12983158 ] Chris Male commented on LUCENE-2657: That was basically what I was getting at (perhaps not clearly enough). Would a satisfactory compromise be to view this patch as adding development support for maven, which is not to do with whether maven artifacts are released or not? The discussion about release process, artifacts and build system flamewars can then happen outside of this.
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983159#action_12983159 ] Michael McCandless commented on LUCENE-2474: bq. Still, I think that using CopyOnWriteArrayList is best here. OK I'll switch back to COWAL... it makes me nervous though. I like being defensive and the added cost of CHM iteration really should be negligible here. {quote} I'd like even more for there to be just a single CopyOnWriteArrayList per top-level reader that is then propagated to all sub/segment readers, including new ones on a reopen. But I guess Mike indicated that was currently too hard/hairy. {quote} This did get hairy... eg if you make a MultiReader (or ParallelReader) w/ subs... what should happen to their listeners? Ie what if the subs already have listeners enrolled? It also spooked me that apps may think they have to re-register after re-open (if we stick w/ ArrayList) since then the list'd just grow... it's trappy. And, if you pull an NRT reader from IW (which is what reopen does under the hood for an NRT reader), how to share its listeners? Ie, we'd have to add a setter to IW as well, so it's also single source (propagates on reopen). This is why I fell back to a simple static as the baby step for now. {quote} The static is really non-optimal though - among other problems, it requires systems with multiple readers (and wants to do different things with different readers, such as maintain separate caches) to figure out what top-level reader a segment reader is associated with. And given that we are dealing with IndexReader instances in the callbacks, and not ReaderContext objects, this seems impossible? {quote} ReaderContext doesn't really make sense here? Ie, the listener is invoked when any/all composite readers sharing a given segment have now closed (ie when the RC for that segment's core drops to 0), or when a composite reader is closed. 
Also, in practice, is it really so hard for the app to figure out which SR goes to which of their caches? Isn't this typically a containsKey against the app level caches...? Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch, LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
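The app-side pattern Mike describes (a custom cache keyed by the reader's cache key, purged from the eviction callback, with the "containsKey against the app level caches" check) can be sketched as below. The CacheEvictionListener hookup is hypothetical here; the reader key is just an Object:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of an application cache keyed by getFieldCacheKey-style
// reader keys, purged when the eviction callback fires for that key.
class PerReaderCache {
    private final Map<Object, String> cache = new ConcurrentHashMap<>();

    void put(Object readerCacheKey, String value) {
        cache.put(readerCacheKey, value);
    }

    String get(Object readerCacheKey) {
        return cache.get(readerCacheKey);
    }

    // Invoked by the (hypothetical) CacheEvictionListener once all
    // composite readers sharing this segment's core have closed.
    void onClose(Object readerCacheKey) {
        // the containsKey check is implicit: remove is a no-op for
        // keys this particular cache never held
        cache.remove(readerCacheKey);
    }

    int size() { return cache.size(); }
}
```

This also shows why the callback hands over the reader (key) rather than a ReaderContext: the cache only needs the key to know whether the eviction concerns it.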
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983160#action_12983160 ] Earwin Burrfoot commented on LUCENE-2657: - bq. we need to be very clear and it has no effect on artifacts I feel something was missed in the heat of debate. Eg: bq. The latest patch on this release uses the Ant artifacts directly. bq. This patch uses the Ant-produced artifacts to prepare for Maven artifact publishing. bq. Maven itself is not invoked in the process. An Ant plugin handles the artifact deployment. I will now try to decipher these quotes. It seems the patch takes the artifacts produced by Ant, as a part of our usual (and only) build process, and shoves it down Maven repository's throat along with a bunch of pom-descriptors. Nothing else is happening. Also, after everything that has been said, I think nobody in his right mind will *force* anyone to actually use the Ant target in question as a part of release. But it's nice to have it around, in case some user-friendly commiter would like to push (I'd like to reiterate - ant generated) artifacts into Maven.
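The "Ant plugin handles the artifact deployment" step quoted above can be sketched with the Maven Ant Tasks antlib, roughly as follows. The pom file, jar path, target name, and repository URL are placeholders, not the actual build.xml from the patch:

```xml
<!-- Sketch of deploying Ant-built jars via the Maven Ant Tasks.
     File names, the target name, and the repository URL are
     illustrative placeholders. -->
<project xmlns:artifact="antlib:org.apache.maven.artifact.ant">
  <target name="deploy-to-maven">
    <artifact:pom id="core-pom" file="dist/lucene-core.pom.xml"/>
    <artifact:deploy file="dist/lucene-core.jar">
      <remoteRepository url="https://repository.example.org/releases"/>
      <pom refid="core-pom"/>
    </artifact:deploy>
  </target>
</project>
```

Note that Maven itself never runs here: Ant pairs each already-built jar with its POM descriptor and uploads both, which is the whole of what "shoves it down Maven repository's throat" amounts to.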
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983161#action_12983161 ] Robert Muir commented on LUCENE-2657: - Chris: well that's the problem with maven, it tries to be too many things, a dependency management tool, a packaging system, a build system, ... So, that's why I said we have to just be very clear about which exact scope of maven we are discussing. If the patch presented here is against /dev-tools, and is to assist developers who like maven, then as I said before I am totally ok with this, but I'm only speaking for myself. Because maven is so many things, and due to Earwin's confusion, I think it would be good in general to add a README.txt to dev-tools anyway, that states what exactly it is (tools to assist lucene/solr developers, that aren't supported, it's not a bug if they stop working, and they will be deleted if they rot). Separately what you said about other code in trunk is totally true... for example it's my opinion that there is a lot of code in lucene's contrib that should be moved out to something like apache-extras... currently lucene's contrib has to compile and pass tests or the build fails... there is definitely some stuff in there that is more sandboxy, slows down lucene core development, but itself isn't getting much maintenance other than devs doing the minimum work to make them pass tests... and we should keep other options in mind for stuff like this.
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983162#action_12983162 ] Earwin Burrfoot commented on LUCENE-2657: Thanks, but I'm not the one confused here. : )
Re: Let's drop Maven Artifacts !
More than one build tool is not the way to go, I believe everyone agrees on that, and that it's not the issue. Have you guys at least considered making a switch to a build tool that knows how to produce Maven artifacts (or enhancing the existing one to take care of that)? E.g. ant+ivy, gradle, maven itself. IMO, switching to a modern build tool, or enhancing the existing one to produce Maven artifacts, is at the moment in the best interest of any open source project, including this one; it will benefit the project's users/contributors, its developers, and the project as a whole:
- official project binaries will (continue to) be available to as large a user base as possible, so you'll get more potential testers/bug reporters, more potential contributors, and more potential commercial/paying customers, which will raise project quality, bring new ideas, and finance future development
- modern build tools have declarative dependency management, so it will be easier to develop and contribute; at the least, one won't have to wait for dependency libs to be downloaded together with the sources every time the project is checked out, and you will not have to manually download new/updated 3rd-party dependencies, just change the build script/metadata
- modern build tools try to be, and mostly are, non-intrusive, and promote good proven solutions like a standard project structure/layout, so it's easier to get started and become productive on such projects compared to projects with a custom layout
- modern build tools are better integrated with current development infrastructure tools, like IDEs and continuous integration servers.
This switch would also make it easier to maintain project metadata and keep it DRY, so that publishing Maven artifacts, even if it is decided not to make that part of the main release process, can be done with little effort and enough credibility.
If the attitude of "who cares about the project's Maven artifact consumers, regardless of the size of that community" is accepted as the official project stance, and the size of the project community is not considered a project asset, I don't understand why the project is being published under an open source license. Regards, Stevo. On Tue, Jan 18, 2011 at 11:50 AM, Robert Muir rcm...@gmail.com wrote: On Tue, Jan 18, 2011 at 5:29 AM, Hardy Ferentschik s...@ferentschik.de wrote: It also means that someone outside the dev community will at some stage create some pom files and upload the artifact to a (semi-) public repository. This sounds great! this is how open source works, those who care about it, will make it happen!
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983163#action_12983163 ] Chris Male commented on LUCENE-2657: Ant does many things too, and we use it in a specific way, so I see no problem defining what we intend our Maven support to be for. So I'm sensing some consensus (fortunately I spoke too soon before) that we target this toward being a development tool which is not forced upon any users or release managers. Is this okay with you, Steven? A README.txt describing the scope of dev-tools sounds appropriate irrespective of what happens here. I certainly wasn't aware of what their maintenance plan was.
[jira] Resolved: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-2295. Resolution: Fixed Committed revision 1060340 (trunk). Committed revision 1060342 (3x). Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter --- Key: LUCENE-2295 URL: https://issues.apache.org/jira/browse/LUCENE-2295 Project: Lucene - Java Issue Type: Improvement Components: contrib/analyzers Reporter: Shai Erera Assignee: Uwe Schindler Fix For: 3.1, 4.0 Attachments: LUCENE-2295-2-3x.patch, LUCENE-2295-2-trunk.patch, LUCENE-2295-trunk.patch, LUCENE-2295.patch A spinoff from LUCENE-2294. Instead of asking the user to specify his requested MFL limit on IndexWriter, we can get rid of this setting entirely by providing an Analyzer which will wrap any other Analyzer and its TokenStream with a TokenFilter that keeps track of the number of tokens produced and stops when the limit has been reached. This will remove any count tracking in IW's indexing, which is done even if I specified UNLIMITED for MFL. Let's try to do it for 3.1.
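The wrapping idea above, a filter that counts tokens and stops when the limit is reached, can be sketched generically. This is an illustration only, not the committed Lucene code: `Iterator<String>` stands in for a TokenStream, and `LimitingTokenIterator` stands in for the limiting TokenFilter.

```java
import java.util.Iterator;

// A wrapper that delegates to an underlying token source but produces at most
// maxTokens tokens, mirroring how a limiting TokenFilter would wrap the
// TokenStream of any Analyzer.
class LimitingTokenIterator implements Iterator<String> {
    private final Iterator<String> input;
    private final int maxTokens;
    private int produced = 0;

    LimitingTokenIterator(Iterator<String> input, int maxTokens) {
        this.input = input;
        this.maxTokens = maxTokens;
    }

    public boolean hasNext() {
        // Stop as soon as the configured limit is reached, regardless of how
        // many tokens the wrapped source could still produce.
        return produced < maxTokens && input.hasNext();
    }

    public String next() {
        produced++;
        return input.next();
    }
}
```

Because the counting lives entirely in the wrapper, the consumer (in the issue's terms, IndexWriter) needs no per-field length tracking of its own.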
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 6:56 AM, Stevo Slavić ssla...@gmail.com wrote: More than one build tools is not way to go, I believe everyone agrees on that, and that it's not an issue. Have you guys at least considered making a switch to a build tool that knows to produce maven artifacts (or enhancing exiting one to take care of that)? E.g. ant+ivy, gradle, maven itself. I think it's important to look at the build system as supporting development too, but most features being developed today are against Lucene's core, which has no dependencies at all. For example, our ant build supports rapidly running the core tests, splitting them across different JVMs in parallel; I've looked at the support for parallel testing in other build systems like Maven, and I think ours is significantly better for our tests. This compile-test-debug lifecycle is important, and for the Lucene core tests it's very fast. So while I might agree with you that for something like Solr development ant+ivy is perhaps worth considering, I think it's overkill and would be a step backwards for Lucene; we would only slow down development.
Re: Windows test failure VelocityResponseWriter, unmodified trunk.
Yep, already tried a fresh checkout before sending the e-mail. At first glance this looks like a classpath issue, hopefully just on my machine, but it was late last night and I wanted to give someone a chance to pipe up with "Ooops, I was changing that and..". Yes, I'm lazy when I can be. Er... efficient, that is. Erick On Tue, Jan 18, 2011 at 12:19 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Jan 17, 2011 at 10:42 PM, Erick Erickson erickerick...@gmail.com wrote: Hmmm, a fresh, unmodified checkout of Solr will fail on my Windows7 box if I run ant -Dtestcase=VelocityResponseWriterTest test. It succeeds on my Mac. Anyone got a clue? Or should I look into it? Of course it succeeds in IntelliJ. My windows laptop took a vacation (a permanent one) so I can't verify. But when I see NoSuchMethod runtime exceptions, I usually try a fresh checkout first. It's sometimes just stuff not getting cleaned up properly. -Yonik http://www.lucidimagination.com
[jira] Updated: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-445: Attachment: SOLR-445-3_x.patch SOLR-445.patch I think it's ready for review, both trunk and 3_x. Would someone look this over and commit it if they think it's ready? Note to self: do NOT call initCore in a test case just because you need a different schema. The problem I was having with running tests was because I needed a schema file with a required field so I naively called initCore with schema11.xml in spite of the fact that @BeforeClass called it with just schema.xml. Which apparently does bad things with the state of *something* and caused other tests to fail... I can get TestDistributedSearch to fail on unchanged source code simply by calling initCore with schema11.xml and doing nothing else in a new test case in BasicFunctionalityTest. So I put my new tests that required schema11 in a new file instead. The XML file attached is not intended to be committed, it is just a convenience for anyone checking out this patch to run against a Solr instance to see what is returned. This seems to return the data in the SolrJ case as well. NOTE: This does change the behavior of Solr. Without this patch, the first document that is incorrect stops processing. Now, it continues merrily on adding documents as it can. Is this desirable behavior? It would be easy to abort on first error if that's the consensus, and I could take some tedious record-keeping out. I think there's no big problem with continuing on, since the state of committed documents is indeterminate already when errors occur so worrying about this should be part of a bigger issue. 
XmlUpdateRequestHandler bad documents mid batch aborts rest of batch Key: SOLR-445 URL: https://issues.apache.org/jira/browse/SOLR-445 Project: Solr Issue Type: Bug Components: update Affects Versions: 1.3 Reporter: Will Johnson Assignee: Erick Erickson Fix For: Next Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, solr-445.xml Has anyone run into the problem of handling bad documents / failures mid batch? I.e.:
{code}
<add>
  <doc>
    <field name="id">1</field>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="myDateField">I_AM_A_BAD_DATE</field>
  </doc>
  <doc>
    <field name="id">3</field>
  </doc>
</add>
{code}
Right now solr adds the first doc and then aborts. It would seem like it should either fail the entire batch or log a message/return a code and then continue on to add doc 3. Option 1 would seem to be much harder to accomplish and possibly require more memory, while Option 2 would require more information to come back from the API. I'm about to dig into this but I thought I'd ask to see if anyone had any suggestions, thoughts or comments.
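The "continue on to add doc 3" behavior discussed above can be sketched as follows. This is a hypothetical stand-in for the patched handler logic, not the actual SOLR-445 code; `addDocument` and its bad-date check are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

class BatchAdder {
    // Returns the indexes of documents that failed, instead of aborting the
    // whole batch at the first bad document.
    static List<Integer> addAll(List<String> docs) {
        List<Integer> failed = new ArrayList<Integer>();
        for (int i = 0; i < docs.size(); i++) {
            try {
                addDocument(docs.get(i));
            } catch (IllegalArgumentException e) {
                failed.add(i); // record the failure and continue with doc i+1
            }
        }
        return failed;
    }

    // Hypothetical validation stand-in for the real indexing call.
    static void addDocument(String doc) {
        if (doc.contains("I_AM_A_BAD_DATE")) {
            throw new IllegalArgumentException("bad date field in: " + doc);
        }
        // index the document here
    }
}
```

This is "Option 2" from the description: each failure is recorded and reported back, and every well-formed document in the batch still gets added.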
Re: Windows test failure VelocityResponseWriter, unmodified trunk.
Robert: Thanks, that's just the kind of hint I was looking for. I'll be able to spend some time on this a bit later. Erick On Tue, Jan 18, 2011 at 7:55 AM, Robert Muir rcm...@gmail.com wrote: Erick, I think I know the problem: see https://issues.apache.org/jira/browse/SOLR-2303 perhaps the issue is somehow not fixed though. Feel free to re-open it and we can try to get to the bottom of it... But I suspect it has to do with log4j jars being in ant's classpath, and somewhere in solr's build it must be adding ant's classpath to the junit runtime classpath... I know I cleared this up for lucene but perhaps I missed a spot for solr.
[jira] Assigned: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera reassigned LUCENE-2584: -- Assignee: Shai Erera Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException --- Key: LUCENE-2584 URL: https://issues.apache.org/jira/browse/LUCENE-2584 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 2.9, 2.9.1, 2.9.2, 2.9.3, 3.0, 3.0.1, 3.0.2 Reporter: Alexander Kanarsky Assignee: Shai Erera Priority: Minor Fix For: 3.1, 4.0 Attachments: LUCENE-2584-branch_3x.patch, LUCENE-2584-lucene-2_9.patch, LUCENE-2584-lucene-3_0.patch A multi-threaded call of files() in SegmentInfo could lead to a ConcurrentModificationException if one thread has not yet finished adding to the ArrayList (files) while another thread has already obtained it as cached (see below). This is a rare exception, but it would be nice to fix. I see the code is no longer problematic in trunk (and other branches ported from flex_1458); it looks like it was fixed while implementing post-3.x features. The fix to the 3.x and 2.9.x branches could be the same: create the files set first and populate it, and then assign it to the member variable at the end of the method. This will resolve the issue. I could prepare the patch for 2.9.4 and 3.x, if needed.
--
{noformat}
INFO: [19] webapp= path=/replication params={command=fetchindex&wt=javabin} status=0 QTime=1
Jul 30, 2010 9:13:05 AM org.apache.solr.core.SolrCore execute
INFO: [19] webapp= path=/replication params={command=details&wt=javabin} status=0 QTime=24
Jul 30, 2010 9:13:05 AM org.apache.solr.handler.ReplicationHandler doFetch
SEVERE: SnapPull failed
java.util.ConcurrentModificationException
        at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
        at java.util.AbstractList$Itr.next(AbstractList.java:343)
        at java.util.AbstractCollection.addAll(AbstractCollection.java:305)
        at org.apache.lucene.index.SegmentInfos.files(SegmentInfos.java:826)
        at org.apache.lucene.index.DirectoryReader$ReaderCommit.<init>(DirectoryReader.java:916)
        at org.apache.lucene.index.DirectoryReader.getIndexCommit(DirectoryReader.java:856)
        at org.apache.solr.search.SolrIndexReader.getIndexCommit(SolrIndexReader.java:454)
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:261)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
        at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:146)
{noformat}
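The populate-then-publish fix described above can be sketched as follows. `FileCache` is a hypothetical stand-in for SegmentInfo and the file names are made up; the point is that the shared field is assigned only after the local set is fully populated, so a concurrent reader never iterates a half-built collection.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class FileCache {
    // Published only once fully built; volatile gives safe publication.
    private volatile Set<String> cachedFiles;

    Set<String> files() {
        Set<String> cached = cachedFiles;
        if (cached != null) {
            return cached; // readers only ever see a complete set
        }
        // Build into a local set first...
        Set<String> building = new HashSet<String>();
        building.add("_0.fdt"); // made-up file names for illustration
        building.add("_0.fdx");
        // ...and assign to the member variable only at the end of the method.
        cachedFiles = Collections.unmodifiableSet(building);
        return cachedFiles;
    }
}
```

Two racing threads may each build their own local set, but whichever assignment wins, every caller observes a finished collection, which removes the ConcurrentModificationException window.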
[jira] Updated: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-2584: --- Attachment: LUCENE-2584.patch Patch against 3x - fixes the bug according to Alexander's other patch (but uses HashSet all the way), and I added a CHANGES entry and test case to TestSegmentInfo. I plan to commit this soon and also backport to 3.0 and 2.9
[jira] Commented: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983183#action_12983183 ] Michael McCandless commented on LUCENE-2584: Patch looks good Shai! I don't think you need to backport to 2.9/3.0 immediately (unless you really want to!)? We can backport if/when we do another release...
[jira] Commented: (LUCENE-2472) The terms index divisor in IW should be set via IWC not via getReader
[ https://issues.apache.org/jira/browse/LUCENE-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983189#action_12983189 ] Shai Erera commented on LUCENE-2472: This is already set on IWC (set/getReaderTermsIndexDivisor). So I guess all that's needed is to deprecate IW.getReader(int) on 3x and remove from trunk? The terms index divisor in IW should be set via IWC not via getReader - Key: LUCENE-2472 URL: https://issues.apache.org/jira/browse/LUCENE-2472 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.1, 4.0 The getReader call gives a false sense of security... since if deletions have already been applied (and IW is pooling) the readers have already been loaded with a divisor of 1. Better to set the divisor up front in IWC.
Re: Let's drop Maven Artifacts !
It seems to me that if we have a fix for the things that ail our Maven support (Steve's work), then that isn't a reason for holding up a release, and we should just keep the POMs, as a significant number of users consume Lucene that way (via the central repository). I agree that we should not switch our build system, but supporting the POMs is no different from supporting the IntelliJ/Eclipse generation tools (both are problematic since they are not automated). On Jan 18, 2011, at 7:48 AM, Robert Muir wrote: On Tue, Jan 18, 2011 at 6:56 AM, Stevo Slavić ssla...@gmail.com wrote: More than one build tools is not way to go, I believe everyone agrees on that, and that it's not an issue. Have you guys at least considered making a switch to a build tool that knows to produce maven artifacts (or enhancing exiting one to take care of that)? E.g. ant+ivy, gradle, maven itself. I think its important to look at the build system as supporting development too, but most features being developed today are against lucene's core: which has no dependencies at all. For example, our ant build supports rapidly running the core tests (splitting them across different jvms in parallel: i've looked at the support for parallel testing in other build systems like maven and I think ours is significantly better for our tests). This compile-test-debug lifecycle is important, for the lucene core tests its very fast. So while I might agree with you that for something like Solr development, perhaps ant+ivy is something worth considering, I think its overkill and would be a step backwards for lucene, we would only slow down development. -- Grant Ingersoll http://www.lucidimagination.com
[jira] Commented: (LUCENE-2374) Add reflection API to AttributeSource/AttributeImpl
[ https://issues.apache.org/jira/browse/LUCENE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983195#action_12983195 ] Uwe Schindler commented on LUCENE-2374: --- In my opinion, there is lots of code duplication in the unmaintainable analysis.jsp. I think we should open a new issue to remove it and replace it with an XSL, or alternatively make its internal functionality backed by FieldAnalysisRequestHandler. Add reflection API to AttributeSource/AttributeImpl --- Key: LUCENE-2374 URL: https://issues.apache.org/jira/browse/LUCENE-2374 Project: Lucene - Java Issue Type: Improvement Components: contrib/analyzers Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.1, 4.0 Attachments: LUCENE-2374-3x.patch, LUCENE-2374-3x.patch, LUCENE-2374-3x.patch, shot1.png, shot2.png, shot3.png, shot4.png AttributeSource/TokenStream inspection in Solr needs to have some insight into the contents of AttributeImpls. As LUCENE-2302 has some problems with toString() [which is not structured and conflicts with CharSequence's definition for CharTermAttribute], I propose a simple API that gets a default implementation in AttributeImpl (just like toString() currently): - Iterator<Map.Entry<String,?>> AttributeImpl.contentsIterator() returns an iterator (for most attributes it's a singleton) of key-value pairs, e.g. term->foobar, startOffset->Integer.valueOf(0), ... - AttributeSource gets the same method; it just concatenates the iterators of each AttributeImpl from getAttributeImplsIterator(). No backwards-compatibility problems occur, as the default toString() method will work like before (it just gets the iterator and lists the entries), but we simply remove the documentation for the format. (Char)TermAttribute gets a special impl of toString() according to CharSequence and a corresponding iterator. I also want to remove the abstract hashCode() and equals() methods from AttributeImpl, as they are not needed and just create work for the implementor.
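The proposed contentsIterator() could look roughly like this minimal sketch. The class and field names here are hypothetical; in the proposal the method lives on AttributeImpl, with per-attribute overrides.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for a term attribute that exposes its contents as structured
// key/value pairs instead of a free-form toString(). Most attributes would
// yield a single entry, as the proposal notes.
class TermAttributeSketch {
    String term = "foobar";

    Iterator<Map.Entry<String, ?>> contentsIterator() {
        Map<String, Object> contents = new LinkedHashMap<String, Object>();
        contents.put("term", term);
        // Copy into a list so the returned iterator is detached from the map.
        return new ArrayList<Map.Entry<String, ?>>(contents.entrySet()).iterator();
    }
}
```

An AttributeSource-level implementation would then simply chain the iterators of all its AttributeImpls, which is what makes a generic default toString() possible.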
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 17:00, Robert Muir rcm...@gmail.com wrote: On Tue, Jan 18, 2011 at 8:54 AM, Grant Ingersoll gsing...@apache.org wrote: It seems to me that if we have a fix for the things that ail our Maven support (Steve's work), that it isn't then the reason for holding up a release and we should just keep them as there are a significant number of users who consume Lucene that way (via the central repository). I agree that we should not switch our build system, but supporting the POMs is no different than supporting the IntelliJ/Eclipse generation tools (they are both problematic since they are not automated) its totally different in every way! we don't release the intellij/eclipse stuff, its for internal use only. additionally, there are no release artifacts generated by these Latest code from LUCENE-2657 does not generate any new artifacts. It uploads those you already have (built via ant) to the repo. -- Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com) Phone: +7 (495) 683-567-4 ICQ: 104465785
Exception hit on 3_0 branch
Hi I ran tests on 3_0 branch and hit this: [junit] Testcase: testRankByte(org.apache.lucene.search.function.TestFieldScoreQuery): Caused an ERROR [junit] null [junit] java.util.ConcurrentModificationException [junit] at java.util.WeakHashMap$HashIterator.next(WeakHashMap.java:169) [junit] at org.apache.lucene.search.FieldCacheImpl.getCacheEntries(FieldCacheImpl.java:75) [junit] at org.apache.lucene.util.LuceneTestCase.assertSaneFieldCaches(LuceneTestCase.java:133) [junit] at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:100) [junit] at org.apache.lucene.search.function.FunctionTestSetup.tearDown(FunctionTestSetup.java:86) [junit] at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:216) I couldn't reproduce it the second time I ran the test (test only and all tests), and I don't know if it applies to 3x/trunk too. I can dig into it later, but I'm sending it to the list in case someone wants to look at it before. I see that the method is called from tearDown(), and the ConcurrentModificationException suggests someone added to the set while someone else iterated over it -- could it be that the tests step on each other somehow? Shai
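The ConcurrentModificationException in the trace above is the classic fail-fast iterator failure: the collection is structurally modified while an iteration over it is in flight. A minimal stand-alone illustration of the mechanism (plain JDK collections, not the Lucene code itself):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Demonstrates the failure mode: mutating a map while iterating over its
// key set makes the iterator throw ConcurrentModificationException. A single
// thread suffices; with two threads the failure is simply less deterministic.
class CmeDemo {
    static boolean triggersCme() {
        Map<String, Integer> cache = new HashMap<>();
        cache.put("a", 1);
        cache.put("b", 2);
        try {
            for (String key : cache.keySet()) {
                cache.put(key + "'", 0); // structural modification mid-iteration
            }
        } catch (ConcurrentModificationException e) {
            return true; // fail-fast iterator detected the modification
        }
        return false;
    }
}
```

In the test-framework case the WeakHashMap backing the FieldCache was presumably being populated by one test path while assertSaneFieldCaches iterated it in tearDown().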
[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983208#action_12983208 ] Salman Akram commented on SOLR-1604: I tried the patch with the latest non-grayed file, but inOrder still doesn't seem to have any impact. Results for "a b"~5 and "b a"~5 are still different. Also, any feedback about CommonGrams integration? Thanks a lot for all the help! Wildcards, ORs etc inside Phrase Queries Key: SOLR-1604 URL: https://issues.apache.org/jira/browse/SOLR-1604 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Ahmet Arslan Priority: Minor Fix For: Next Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports wildcards, ORs, ranges, fuzzies inside phrase queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-2584. Resolution: Fixed Fix Version/s: (was: 4.0) 3.0.4 2.9.5 Lucene Fields: [New, Patch Available] (was: [New]) Committed revision 1060358 (3x). Committed revision 1060391 (3.0). Committed revision 1060398 (2.9). Thanks Alexander! Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException --- Key: LUCENE-2584 URL: https://issues.apache.org/jira/browse/LUCENE-2584 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 2.9, 2.9.1, 2.9.2, 2.9.3, 3.0, 3.0.1, 3.0.2 Reporter: Alexander Kanarsky Assignee: Shai Erera Priority: Minor Fix For: 2.9.5, 3.0.4, 3.1 Attachments: LUCENE-2584-branch_3x.patch, LUCENE-2584-lucene-2_9.patch, LUCENE-2584-lucene-3_0.patch, LUCENE-2584.patch A multi-threaded call of files() in SegmentInfo could lead to a ConcurrentModificationException if one thread has not finished adding to the ArrayList (files) while another thread has already obtained it as cached (see below). This is a rare exception, but it would be nice to fix. I see the code is no longer problematic in trunk (and the parts ported from flex_1458); it looks like it was fixed while implementing post-3.x features. The fix to the 3.x and 2.9.x branches could be the same - create the files set first and populate it, and then assign it to the member variable at the end of the method. This will resolve the issue. I could prepare the patch for 2.9.4 and 3.x, if needed. 
-- INFO: [19] webapp= path=/replication params={command=fetchindexwt=javabin} status=0 QTime=1 Jul 30, 2010 9:13:05 AM org.apache.solr.core.SolrCore execute INFO: [19] webapp= path=/replication params={command=detailswt=javabin} status=0 QTime=24 Jul 30, 2010 9:13:05 AM org.apache.solr.handler.ReplicationHandler doFetch SEVERE: SnapPull failed java.util.ConcurrentModificationException at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) at java.util.AbstractList$Itr.next(AbstractList.java:343) at java.util.AbstractCollection.addAll(AbstractCollection.java:305) at org.apache.lucene.index.SegmentInfos.files(SegmentInfos.java:826) at org.apache.lucene.index.DirectoryReader$ReaderCommit.init(DirectoryReader.java:916) at org.apache.lucene.index.DirectoryReader.getIndexCommit(DirectoryReader.java:856) at org.apache.solr.search.SolrIndexReader.getIndexCommit(SolrIndexReader.java:454) at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:261) at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264) at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:146) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
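The fix described above (build the collection fully first, then assign it to the member variable as the last step) is a common safe-publication pattern. A simplified stand-alone sketch of the idea follows; the class, field, and file names are illustrative, not the actual SegmentInfo code:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the LUCENE-2584 fix idea: other threads must never observe a
// partially populated cache. Build a local collection, then publish it with
// a single write as the very last step.
class FileListCache {
    private volatile List<String> cachedFiles; // assigned only when complete

    List<String> files() {
        List<String> cached = cachedFiles;
        if (cached != null) {
            return cached; // safe: the published list is never mutated again
        }
        // Populate a local collection first...
        Set<String> building = new HashSet<>();
        building.add("_0.fdt"); // stand-ins for the real per-segment file names
        building.add("_0.fdx");
        List<String> result = new ArrayList<>(building);
        // ...and only then publish it.
        cachedFiles = result;
        return result;
    }
}
```

The buggy variant assigned the member variable first and then kept adding to it, so a concurrent reader could iterate the list mid-population and hit the ConcurrentModificationException shown in the stack trace.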
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983209#action_12983209 ] Jason Rutherglen commented on LUCENE-2324: -- bq. I can't test flush by RAM - that's not working yet on RT right? Right, we're only flushing by doc count, so we could be flushing segments that are too small? However we can see some of the concurrency gains by not sync'ing on IW and allowing documents updates to continue while flushing. Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out See LUCENE-2293 for motivation and more details. I'm copying here Mike's summary he posted on 2293: Change the approach for how we buffer in RAM to a more isolated approach, whereby IW has N fully independent RAM segments in-process and when a doc needs to be indexed it's added to one of them. Each segment would also write its own doc stores and normal segment merging (not the inefficient merge we now do on flush) would merge them. This should be a good simplification in the chain (eg maybe we can remove the *PerThread classes). The segments can flush independently, letting us make much better concurrent use of IO CPU. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2316) SynonymFilterFactory should ensure synonyms argument is provided.
[ https://issues.apache.org/jira/browse/SOLR-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983214#action_12983214 ] David Smiley commented on SOLR-2316: Both. SynonymFilterFactory should ensure synonyms argument is provided. - Key: SOLR-2316 URL: https://issues.apache.org/jira/browse/SOLR-2316 Project: Solr Issue Type: Improvement Components: Schema and Analysis Reporter: David Smiley Priority: Minor Fix For: 3.1 Attachments: 2316.patch If for some reason the synonyms attribute is not present on the filter factory configuration, a latent NPE will eventually show up during indexing/searching. Instead a helpful error should be thrown at initialization. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
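The requested behavior is the usual fail-fast pattern: validate required init arguments when the factory is configured, so a missing setting produces a descriptive error instead of a latent NPE during indexing or search. A hedged stand-alone sketch of that pattern (not the actual SynonymFilterFactory code):

```java
import java.util.Map;

// Fail-fast initialization: a missing required argument produces a clear
// error at configuration time rather than a NullPointerException later.
class SketchSynonymFilterFactory {
    private String synonyms;

    void init(Map<String, String> args) {
        synonyms = args.get("synonyms");
        if (synonyms == null) {
            throw new IllegalArgumentException(
                "Missing required argument 'synonyms' for "
                    + getClass().getSimpleName());
        }
    }
}
```

With this check, a misconfigured schema fails at core startup with a message naming the missing attribute, which is far easier to diagnose.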
[jira] Commented: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983215#action_12983215 ] Simon Rosenthal commented on SOLR-445: -- bq. Don't allow autocommits during an update. Simple. Or, rather, all update requests block at the beginning during an autocommit. If an update request has too many documents, don't do so many documents in an update. (Lance) Lance - How do you (dynamically) disable autocommits during a specific update? That functionality would also be useful in other use cases, but that's another issue. bq. NOTE: This does change the behavior of Solr. Without this patch, the first document that is incorrect stops processing. Now, it continues merrily on adding documents as it can. Is this desirable behavior? It would be easy to abort on first error if that's the consensus, and I could take some tedious record-keeping out. I think there's no big problem with continuing on, since the state of committed documents is already indeterminate when errors occur, so worrying about this should be part of a bigger issue. I think it should be an option, if possible. I can see use cases where abort-on-first-error is desirable, but also situations where you know one or two documents may be erroneous, and it's worth continuing on in order to index the other 99%. XmlUpdateRequestHandler bad documents mid batch aborts rest of batch Key: SOLR-445 URL: https://issues.apache.org/jira/browse/SOLR-445 Project: Solr Issue Type: Bug Components: update Affects Versions: 1.3 Reporter: Will Johnson Assignee: Erick Erickson Fix For: Next Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, solr-445.xml Has anyone run into the problem of handling bad documents / failures mid batch? Ie:

<add>
  <doc>
    <field name="id">1</field>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="myDateField">I_AM_A_BAD_DATE</field>
  </doc>
  <doc>
    <field name="id">3</field>
  </doc>
</add>

Right now solr adds the first doc and then aborts. 
It would seem like it should either fail the entire batch or log a message/return a code and then continue on to add doc 3. Option 1 would seem to be much harder to accomplish and possibly require more memory while Option 2 would require more information to come back from the API. I'm about to dig into this but I thought I'd ask to see if anyone had any suggestions, thoughts or comments. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
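Option 2 discussed above (continue past a bad document, recording failures, instead of aborting the batch) can be sketched as a control-flow skeleton. Document parsing is faked with a simple validity check, so this only illustrates the record-and-continue bookkeeping, not Solr's actual update handler:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of continue-on-error batch indexing: each bad document is recorded
// and skipped, and the rest of the batch is still added.
class BatchAdder {
    final List<String> added = new ArrayList<>();
    final List<String> failed = new ArrayList<>();

    void addBatch(List<String> docs) {
        for (String doc : docs) {
            try {
                if (doc.contains("BAD")) { // stand-in for a parse/validation error
                    throw new IllegalArgumentException("bad field value in " + doc);
                }
                added.add(doc);
            } catch (IllegalArgumentException e) {
                failed.add(doc); // record and keep going instead of aborting
            }
        }
    }
}
```

Reporting the `failed` list back in the response is the extra API surface the original report anticipated; abort-on-first-error would just be a `return` (or rethrow) in the catch block, which is why making it an option is cheap.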
Lucene-Solr-tests-only-trunk - Build # 3881 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3881/ No tests ran. Build Log (for compile errors): [...truncated 62 lines...] + JAVADOCS_ARTIFACTS=/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/javadocs + set +x Checking for files containing nocommit (exits build with failure if list is non-empty): + cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout + JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant clean Buildfile: build.xml clean: clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build clean: clean: [echo] Building analyzers-common... clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/common [echo] Building analyzers-icu... clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/icu [echo] Building analyzers-phonetic... clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/phonetic [echo] Building analyzers-smartcn... clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/smartcn [echo] Building analyzers-stempel... clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/stempel [echo] Building benchmark... 
clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/build clean-contrib: clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/analysis-extras/build [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/analysis-extras/lucene-libs clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/clustering/build clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/dataimporthandler/target clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/extraction/build clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build BUILD SUCCESSFUL Total time: 3 seconds + cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene + JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib Buildfile: build.xml jflex-uptodate-check: jflex-notice: javacc-uptodate-check: javacc-notice: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. 
[echo] clover: common.compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java [javac] Compiling 507 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:73: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] public boolean onOrAfter(Version other) { [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/index/IndexWriter.java:985: method does not override a method from its superclass [javac] @Override [javac]^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] int getColumn(); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:41: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] int getLine(); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 1 error [...truncated 10 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2374) Add reflection API to AttributeSource/AttributeImpl
[ https://issues.apache.org/jira/browse/LUCENE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983224#action_12983224 ] Mark Miller commented on LUCENE-2374: - Agreed Uwe. Add reflection API to AttributeSource/AttributeImpl --- Key: LUCENE-2374 URL: https://issues.apache.org/jira/browse/LUCENE-2374 Project: Lucene - Java Issue Type: Improvement Components: contrib/analyzers Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.1, 4.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983225#action_12983225 ] Ahmet Arslan commented on SOLR-1604: When you add debugQuery=on to your search URL, you should see something like this in the debug section: spanNear([text:a, text:b], 5, false) - the false here means an un-ordered phrase query. Do you see it? I will look into CommonGrams this weekend. Wildcards, ORs etc inside Phrase Queries Key: SOLR-1604 URL: https://issues.apache.org/jira/browse/SOLR-1604 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Ahmet Arslan Priority: Minor Fix For: Next -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
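The third argument in spanNear([text:a, text:b], 5, false) is the inOrder flag. Its effect can be illustrated with a toy matcher over a token list; this is only a sketch of the semantics (it considers just the first occurrence of each term), not Lucene's span machinery:

```java
import java.util.List;

// Toy proximity matcher: the two terms must occur with at most `slop`
// intervening positions; with inOrder=true, `a` must also appear before `b`.
class NearMatcher {
    static boolean near(List<String> tokens, String a, String b,
                        int slop, boolean inOrder) {
        int ia = tokens.indexOf(a);
        int ib = tokens.indexOf(b);
        if (ia < 0 || ib < 0) {
            return false; // a term is missing entirely
        }
        int gap = Math.abs(ia - ib) - 1; // positions between the two terms
        if (gap > slop) {
            return false;
        }
        return !inOrder || ia < ib; // unordered matches either order
    }
}
```

With inOrder=false, a document containing "b ... a" still matches the query for "a b"~5, which is exactly why the ordered and unordered variants return different result sets.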
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983234#action_12983234 ] Ryan McKinley commented on LUCENE-2657: --- Steve, great work with this patch -- it takes care of all the previous concerns about our problematic maven support. With this patch, we now have: * testable maven artifacts * easy repo distribution * ant is still *the* build system The RM can choose to ignore the generate-maven-artifacts target and let someone else push the artifacts. As with most religious conflicts -- I hope the resolution is not conversion, but rather something that lets everyone live (work) in peace. Replace Maven POM templates with full POMs, and change documentation accordingly Key: LUCENE-2657 URL: https://issues.apache.org/jira/browse/LUCENE-2657 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Fix For: 3.1, 4.0 Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch The current Maven POM templates only contain dependency information, the bare bones necessary for uploading artifacts to the Maven repository. The full Maven POMs in the attached patch include the information necessary to run a multi-module Maven build, in addition to serving the same purpose as the current POM templates. Several dependencies are not available through public maven repositories. A profile in the top-level POM can be activated to install these dependencies from the various {{lib/}} directories into your local repository. 
From the top-level directory: {code} mvn -N -Pbootstrap install {code} Once these non-Maven dependencies have been installed, to run all Lucene/Solr tests via Maven's surefire plugin, and populate your local repository with all artifacts, from the top level directory, run: {code} mvn install {code} When one Lucene/Solr module depends on another, the dependency is declared on the *artifact(s)* produced by the other module and deposited in your local repository, rather than on the other module's un-jarred compiler output in the {{build/}} directory, so you must run {{mvn install}} on the other module before its changes are visible to the module that depends on it. To create all the artifacts without running tests: {code} mvn -DskipTests install {code} I almost always include the {{clean}} phase when I do a build, e.g.: {code} mvn -DskipTests clean install {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Let's drop Maven Artifacts !
I still don't see why you care so much. You have people willing to maintain it and it is no sweat off your back and it is used by a pretty large chunk of downstream users. And don't tell me it is what holds up releases b/c it simply isn't true. On Jan 18, 2011, at 9:12 AM, Robert Muir wrote: On Tue, Jan 18, 2011 at 9:10 AM, Earwin Burrfoot ear...@gmail.com wrote: Latest code from LUCENE-2657 does not generate any new artifacts. It uploads those you already have (built via ant) to the repo. yep, thats releasing artifacts. thats the whole point of this email thread (read the title, thanks) the intellij/eclipse stuff is just unreleased stuff that sits in our SVN. it doesnt get uploaded anywhere. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 10:53 AM, Grant Ingersoll gsing...@apache.org wrote: I still don't see why you care so much. You have people willing to maintain it and it is no sweat off your back and it is used by a pretty large chunk of downstream users. And don't tell me it is what holds up releases b/c it simply isn't true. it is what holds up releases. the last time i brought up releasing, it was totally destroyed because of maven. the RM shouldn't have to deal with 2 build systems, packaging systems, and repository hell, and that's what maven artifacts require. If there is a large chunk of downstream users, then they can handle this downstream, it doesn't need to be in lucene, just like we don't deal with other packaging systems. Unfortunately there is a very loud minority that care about maven, most of us that think the situation is ridiculous have totally given up arguing about it, except me, i don't want to put out a shitty release with broken maven artifacts like in the past, i'd rather let some downstream project deal with maven instead. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-2755) Some improvements to CMS
[ https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2755. Resolution: Fixed Some improvements to CMS Key: LUCENE-2755 URL: https://issues.apache.org/jira/browse/LUCENE-2755 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 3.1, 4.0 Attachments: LUCENE-2755.patch While running optimize on a large index, I've noticed several things that got me to read CMS code more carefully, and find these issues: * CMS may hold onto a merge if maxMergeCount is hit. That results in the MergeThreads taking merges from the IndexWriter until they are exhausted, and only then that blocked merge will run. I think it's unnecessary that that merge will be blocked. * CMS sorts merges by segments size, doc-based and not bytes-based. Since the default MP is LogByteSizeMP, and I hardly believe people care about doc-based size segments anymore, I think we should switch the default impl. There are two ways to make it extensible, if we want: ** Have an overridable member/method in CMS that you can extend and override - easy. ** Have OneMerge be comparable and let the MP determine the order (e.g. by bytes, docs, calibrate deletes etc.). Better, but will need to tap into several places in the code, so more risky and complicated. On the go, I'd like to add some documentation to CMS - it's not very easy to read and follow. I'll work on a patch. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
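The bytes-based ordering suggested in the issue above (sort pending merges by byte size rather than doc count) can be sketched with a comparator over a minimal stand-in class. PendingMerge and its field names are illustrative stand-ins for Lucene's MergePolicy.OneMerge, not the real API:

```java
import java.util.Comparator;
import java.util.List;

// Sketch: order pending merges by their total byte size, largest first,
// instead of by document count (the doc-based default being questioned).
class PendingMerge {
    final String name;
    final long sizeInBytes;
    final int docCount; // kept only for contrast with the doc-based default

    PendingMerge(String name, long sizeInBytes, int docCount) {
        this.name = name;
        this.sizeInBytes = sizeInBytes;
        this.docCount = docCount;
    }

    static void sortBySizeDesc(List<PendingMerge> merges) {
        merges.sort(
            Comparator.comparingLong((PendingMerge m) -> m.sizeInBytes)
                      .reversed());
    }
}
```

This also shows why doc count is a poor proxy: a segment with many tiny documents can be byte-wise smaller than one with few large documents, so the two orderings can disagree completely. Making OneMerge comparable, as the issue suggests, would let the MergePolicy plug in exactly such a comparator.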
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983246#action_12983246 ] Michael Busch commented on LUCENE-2324: --- {quote} I ran a quick perf test here: I built the 10M Wikipedia index, Standard codec, using 6 threads. Trunk took 541.6 sec; RT took 518.2 sec (only a bit faster), but the test wasn't really fair because it flushed @ docCount=12870. {quote} Thanks for running the tests! Hmm that's a bit disappointing - we were hoping for more speedup. Flushing by docCount is currently per DWPT, so every initial segment in your test had 12870 docs. I guess there's a lot of merging happening. Maybe you could rerun with higher docCount? bq. But I can't test flush by RAM - that's not working yet on RT right? True. I'm going to add that soonish. There's one thread-safety bug related to deletes that needs to be fixed too. {quote} Then I ran a single-threaded test. Trunk took 1097.1 sec and RT took 1040.5 sec - a bit faster! Presumably in the noise (we don't expect a speedup?), but excellent that it's not slower... {quote} Yeah I didn't expect much speedup - cool! :) Maybe because some code is gone, like the WaitQueue, not sure how much overhead that added in the single-threaded case. {quote} I think we lost infoStream output on the details of flushing? I can't see when which DWPTs are flushing... {quote} Oh yeah, good point, I'll add some infoStream messages to DWPT! 
Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983247#action_12983247 ] Chris A. Mattmann commented on LUCENE-2657: --- +1 for Steve's patch, great work and you beat me to it. Replace Maven POM templates with full POMs, and change documentation accordingly Key: LUCENE-2657 URL: https://issues.apache.org/jira/browse/LUCENE-2657 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Fix For: 3.1, 4.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 11:12 AM, Robert Muir wrote: there is a very loud minority that care about maven, most of us that think the situation is ridiculous have totally given up arguing about it, except me, i don't want to put out a shitty release with broken maven artifacts like in the past, i'd rather let some downstream project deal with maven instead. +1. What a fantastic idea for an Apache Extras project :) I'll open my arms to first class maven the first time it sees the light of consensus ;) - Mark
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 11:12 AM, Robert Muir wrote: On Tue, Jan 18, 2011 at 10:53 AM, Grant Ingersoll gsing...@apache.org wrote: I still don't see why you care so much. You have people willing to maintain it and it is no sweat off your back and it is used by a pretty large chunk of downstream users. And don't tell me it is what holds up releases b/c it simply isn't true. it is what holds up releases. the last time i brought up releasing, it was totally destroyed because of maven. I'll grant you it held up the last release _ONCE WE DECIDED TO RELEASE_, but don't act like it is why we don't release very often, because it isn't. the RM shouldn't have to deal with 2 build systems, packaging systems, and repository hell, and that's what maven artifacts require. And Steve has said he would fix it and it won't require two build systems, so your main complaint is solved.
[jira] Commented: (LUCENE-2472) The terms index divisor in IW should be set via IWC not via getReader
[ https://issues.apache.org/jira/browse/LUCENE-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983252#action_12983252 ] Michael McCandless commented on LUCENE-2472: bq. So I guess all that's needed is to deprecate IW.getReader(int) on 3x and remove from trunk? +1 Though, it's already removed on trunk. So we just need to deprecate on 3.x... The terms index divisor in IW should be set via IWC not via getReader - Key: LUCENE-2472 URL: https://issues.apache.org/jira/browse/LUCENE-2472 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.1, 4.0 The getReader call gives a false sense of security... since if deletions have already been applied (and IW is pooling) the readers have already been loaded with a divisor of 1. Better to set the divisor up front in IWC.
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 11:27 AM, Mark Miller markrmil...@gmail.com wrote: I'll open my arms to first class maven the first time it sees the light of consensus ;) thats the main thing missing from releasing maven artifacts... looking at previous threads I don't really see consensus that we need to do this.
Re: Lucene-Solr-tests-only-trunk - Build # 3864 - Failure
This was caused by a latent bug in PrefixCodedTermsReader... But, I'm about to replace that w/ BlockTermsReader, so I'll leave this bug there... Mike On Mon, Jan 17, 2011 at 2:05 AM, Apache Hudson Server hud...@hudson.apache.org wrote: Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3864/ 1 tests failed. REGRESSION: org.apache.lucene.util.automaton.fst.TestFSTs.testRealTerms Error Message: null Stack Trace: junit.framework.AssertionFailedError at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1127) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1059) at org.apache.lucene.index.codecs.intblock.FixedIntBlockIndexInput$Index.read(FixedIntBlockIndexInput.java:167) at org.apache.lucene.index.codecs.sep.SepPostingsReaderImpl.readTerm(SepPostingsReaderImpl.java:167) at org.apache.lucene.index.codecs.pulsing.PulsingPostingsReaderImpl.readTerm(PulsingPostingsReaderImpl.java:135) at org.apache.lucene.index.codecs.PrefixCodedTermsReader$FieldReader$SegmentTermsEnum.next(PrefixCodedTermsReader.java:508) at org.apache.lucene.index.codecs.PrefixCodedTermsReader$FieldReader$SegmentTermsEnum.seek(PrefixCodedTermsReader.java:431) at org.apache.lucene.index.TermsEnum.seek(TermsEnum.java:68) at org.apache.lucene.util.automaton.fst.TestFSTs.testRealTerms(TestFSTs.java:1016) Build Log (for compile errors): [...truncated 2947 lines...]
[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983260#action_12983260 ] Erick Erickson commented on SOLR-2303: -- I am Officially Confused, but the culprit appears to be log4j-over-slf4j-1.5.5.jar
3_x has:
- log4j jars in solr/contrib/extraction and solr/contrib/clustering
- a bunch of slf4j jars in solr/lib (but NOT log4j-over-slf4j-1.5.5.jar, see below)
All tests succeed just fine.
Trunk has:
- no log4j jars in contrib
- the same slf4j jars as in 3_x BUT ALSO log4j-over-slf4j-1.5.5.jar
VelocityResponseWriterTest fails.
In trunk, removing log4j-over-slf4j-1.5.5.jar allows VelocityResponseWriterTest and all other tests to succeed. In 3_x, removing the log4j jars from solr/contrib makes no difference, all tests pass. So I propose that the fix for this is to remove the log4j files from 3_x and the log4j-over-slf4j-1.5.5.jar from trunk. Should I create a patch? And do patches actually remove jars like this? remove unnecessary (and problematic) log4j jars in contribs --- Key: SOLR-2303 URL: https://issues.apache.org/jira/browse/SOLR-2303 Project: Solr Issue Type: Improvement Components: Build Reporter: Robert Muir Fix For: 4.0 Attachments: SOLR-2303.patch In solr 4.0 there is log4j-over-slf4j. But if you have log4j jars also in the classpath (e.g. contrib/extraction, contrib/clustering) you can get strange errors such as: java.lang.NoSuchMethodError: org.apache.log4j.Logger.setAdditivity(Z)V So I think we should remove the log4j jars in these contribs, all tests pass with them removed.
[jira] Reopened: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reopened SOLR-2303: -- See previous comment, I believe that there are some jars in Solr that need to be removed.
[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983265#action_12983265 ] Robert Muir commented on SOLR-2303: --- Erick, actually i think the issue is that log4j-over-slf4j conflicts with log4j, if log4j is in the classpath. The problem is that currently, the solr build runs tests with whatever is in ant's classpath. This is why the tests pass for you, even if you remove all logging jars, but this is obviously bad as its not really a repeatable build. So to fix this, we need to use includeantruntime=no in the junit tasks, and also not include ${java.class.path} in the test classpath. instead, we explicitly include the ant libs we supply (especially since we extend some of them for testing). This might make some warnings or even errors for ant 1.8 users, but I think thats ok.
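Robert's suggestion would look roughly like the following in a build file. This is a hedged sketch only: the target structure, property names (`test.classpath`, `common.dir`, etc.), and jar versions here are illustrative, not copied from the actual Solr build.xml.

```xml
<!-- Sketch: run junit without ant's own runtime on the classpath,
     supplying the ant/junit jars we ship explicitly instead of
     inheriting whatever happens to be in ${java.class.path}. -->
<junit includeantruntime="no" fork="yes">
  <classpath>
    <!-- project classes and test dependencies (hypothetical refid) -->
    <path refid="test.classpath"/>
    <!-- the ant libs we supply ourselves -->
    <pathelement path="${common.dir}/lib/ant-1.7.1.jar"/>
    <pathelement path="${common.dir}/lib/ant-junit-1.7.1.jar"/>
    <pathelement path="${common.dir}/lib/junit-4.7.jar"/>
  </classpath>
  <batchtest todir="${junit.output.dir}">
    <fileset dir="${tests.src.dir}" includes="**/Test*.java"/>
  </batchtest>
</junit>
```

With `includeantruntime="no"`, the build is repeatable because the test classpath contains only jars checked into the tree, which is exactly why stray log4j jars stop leaking in.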
[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983266#action_12983266 ] Mark Miller commented on SOLR-2303: --- Hey Erick, If I remember right, log4j-over-slf4j is in there for proper zookeeper logging (hoping they switch to slf4j). Rather than dropping it, we should likely try and figure out how to keep it and fix the issue - as suggested by Robert.
RE: Let's drop Maven Artifacts !
On 1/18/2011 at 11:34 AM, Robert Muir wrote: On Tue, Jan 18, 2011 at 11:27 AM, Mark Miller markrmil...@gmail.com wrote: I'll open my arms to first class maven the first time it sees the light of consensus ;) thats the main thing missing from releasing maven artifacts... looking at previous threads I don't really see consensus that we need to do this. I think there is consensus that the RM does not have to release Maven artifacts. There clearly is no consensus for removing Maven support from Lucene. Unfortunately there is a very loud minority that care about maven I would wager that there is a sizable silent *majority* of users who literally depend on Lucene's Maven artifacts. Steve
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 12:03 PM, Steven A Rowe sar...@syr.edu wrote: There clearly is no consensus for removing Maven support from Lucene. and see there is my problem, there was no consensus to begin with, now suddenly its de-facto required. Maven is quite an insidious computer virus. Unfortunately there is a very loud minority that care about maven I would wager that there is a sizable silent *majority* of users who literally depend on Lucene's Maven artifacts. I can't help but remind myself, this is the same argument Oracle offered up for the whole hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus.
Re: Let's drop Maven Artifacts !
On 1/18/11 9:13 AM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. Well everyone using ant+ivy or maven as their build system likely consumes artifacts from maven repos. I'm surprised you're so much against keeping to publish. I too really really want to keep ant as Lucene's build tool. Maven has made me suicidal in the past. But I don't want to stop publishing artifacts to commonly used repos. I guess we could try to figure out how many people download the artifacts from m2 repos. Maybe they have download statistics? But then what? What number would justify stopping to publish? Michael
RE: Let's drop Maven Artifacts !
On 1/18/2011 at 12:14 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 12:03 PM, Steven A Rowe sar...@syr.edu wrote: There clearly is no consensus for removing Maven support from Lucene. and see there is my problem, there was no consensus to begin with, now suddenly its de-facto required. Maven is quite an insidious computer virus. So you think you personally have the power to remove functionality from Lucene that has the support of multiple committers? Unfortunately there is a very loud minority that care about maven I would wager that there is a sizable silent *majority* of users who literally depend on Lucene's Maven artifacts. I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. In summary: you claim a silent majority (of devs) in favor of your position, and I claim a silent majority (of users) in favor of mine. Your move: my majority, of which I have no proof, has no standing. Sweet. I dunno - why are we at war? Why is it so damn important that you *remove* functionality that devs care about and will support? Steve
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983316#action_12983316 ] Michael McCandless commented on LUCENE-2324: The branch is looking very nice!! Very clean :) Random comments: Why does DW.anyDeletions need to be sync'd? Missing headers on at least DocumentsWriterPerThreadPool, ThreadAffinityDWTP. IWC.setIndexerThreadPool's javadoc is stale. On ThreadAffinityDWTP... it may be better if we had a single queue, where threads wait in line, if no DWPT is available? And when a DWPT finishes it then notifies any waiting threads? (Ie, instead of queue-per-DWPT). I see the fieldInfos.update(dwpt.getFieldInfos()) (in DW.updateDocument) -- is there a risk that two threads bring a new field into existence at the same time, but w/ different config? Eg one doc omitsTFAP and the other doesn't? Or, on flush, does each DWPT use its private FieldInfos to correctly flush the segment? (Hmm: do we seed each DWPT w/ the original FieldInfos created by IW on init?). How are we handling the case of open IW, do delete-by-term but no added docs? Does DW.pushDeletes really need to sync on IW? BufferedDeletes is sync'd already. DW.substractFlushedDocs is mis-spelled (not sure it's used though). In DW.deleteTerms... shouldn't we skip a DWPT if it has no buffered docs? Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out See LUCENE-2293 for motivation and more details.
I'm copying here Mike's summary he posted on 2293: Change the approach for how we buffer in RAM to a more isolated approach, whereby IW has N fully independent RAM segments in-process and when a doc needs to be indexed it's added to one of them. Each segment would also write its own doc stores and normal segment merging (not the inefficient merge we now do on flush) would merge them. This should be a good simplification in the chain (eg maybe we can remove the *PerThread classes). The segments can flush independently, letting us make much better concurrent use of IO & CPU.
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 12:30 PM, Michael Busch wrote: I guess we could try to figure out how many people download the artifacts from m2 repos. Maybe they have download statistics? But then what? What number would justify stopping to publish? Michael Realistically, I would expect that Maven artifacts would still be published, even if we kick them out of the Lucene project to Apache extras. If some of the people care as much as they say they do, they will figure out how to make poms and whatever downstream, and a Committer into Maven will put them on the official Apache repo. It will just more truly not be a concern to the rest of us. - Mark
[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983317#action_12983317 ] Erick Erickson commented on SOLR-2303: -- Ah, I think the light finally dawns. And helps explain why I'm getting different results on different machines/environments. There's a reason they don't often let me near build systems. Ok, splendid. I suggested removing things to see if it was a bad idea. It is. Almost. So does it still make sense to remove the log4j jars in contrib in the 3_x branch? Robert: I did as you suggested, and of course started getting ClassNotFound errors for JUnitTestRunner and so on. So I included these lines in Solr's build.xml:
<pathelement path="${common-solr.dir}/../lucene/lib/ant-junit-1.7.1.jar" />
<pathelement path="${common-solr.dir}/../lucene/lib/ant-1.7.1.jar" />
<pathelement path="${common-solr.dir}/../lucene/lib/junit-4.7.jar" />
in place of java.class.path, and all is well. Is this the path you'd go down? I'm not very comfortable having Solr reach over into Lucene, but what do I know? It should be fairly obvious by now that I'm not very ant-sophisticated; is there a preferred way of doing this? Because if this is OK, it seems we should also remove junit-4.7.jar from ../solr/lib and point anything that needs it to ../lucene/lib as well. I'm currently testing similar changes on the 3_x build with log4j files removed. But that worked before as well. Let me know
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 12:28 PM, Steven A Rowe wrote: On 1/18/2011 at 12:14 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 12:03 PM, Steven A Rowe sar...@syr.edu wrote: There clearly is no consensus for removing Maven support from Lucene. and see there is my problem, there was no consensus to begin with, now suddenly its de-facto required. Maven is quite an insidious computer virus. So you think you personally have the power to remove functionality from Lucene that has the support of multiple committers? If he thought that, he would have removed maven from svn by now! From my point of view, but perhaps I misremember: At some point, Grant or someone put in some Maven poms. I don't think anyone else really paid attention. Later, as we did releases, and saw and dealt with these poms, most of us commented against Maven support. It just feels to me like it slipped in - and really its the type of thing that should have been more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene IMO. To my knowledge, the majority of core developers do not want maven in the build and/or frown on dealing with Maven. We could always have a little vote to gauge numbers - I just have not wanted to rush to another vote thread myself ;) Users are important too - but they don't get official votes - it's up to each of us to consider the User feelings/vote in our opinions/votes as we see fit IMO. - Mark
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983346#action_12983346 ] Michael Busch commented on LUCENE-2324: --- bq. Why does DW.anyDeletions need to be sync'd? Hmm good point. Actually only the call to DW.pendingDeletes.any() needs to be synced, but not the loop that calls the DWPTs. {quote} In ThreadAffinityDWTP... it may be better if we had a single queue, where threads wait in line, if no DWPT is available? And when a DWPT finishes it then notifies any waiting threads? (Ie, instead of queue-per-DWPT). {quote} Whole foods instead of safeway? :) Yeah that would be fairer. A large doc (= a full cart) wouldn't block unlucky other docs. I'll make that change, good idea! {quote} I see the fieldInfos.update(dwpt.getFieldInfos()) (in DW.updateDocument) - is there a risk that two threads bring a new field into existence at the same time, but w/ different config? Eg one doc omitsTFAP and the other doesn't? Or, on flush, does each DWPT use its private FieldInfos to correctly flush the segment? (Hmm: do we seed each DWPT w/ the original FieldInfos created by IW on init?). {quote} Every DWPT has its own private FieldInfos. When a segment is flushed the DWPT uses its private FI and then it updates the original DW.fieldInfos (from IW), which is a synchronized call. The only consumer of DW.getFieldInfos() is SegmentMerger in IW. Hmm, given that IW.flush() isn't synchronized anymore I assume this can lead into a problem? E.g. the SegmentMerger gets a FieldInfos that's newer than the list of segments it's trying to flush? bq. How are we handling the case of open IW, do delete-by-term but no added docs? DW has a SegmentDeletes (pendingDeletes) which gets pushed to the last segment. We only add delTerms to DW.pendingDeletes if we couldn't push it to any DWPT. Btw. I think the whole pushDeletes business isn't working correctly yet, I'm looking into it.
I need to understand the code that coalesces the deletes better. bq. In DW.deleteTerms... shouldn't we skip a DWPT if it has no buffered docs? Yeah, I did that already, but not committed yet.
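Mike's single-queue suggestion above could be sketched like this. This is a hypothetical illustration with made-up names, not the branch's actual classes: idle writers sit in one shared queue, and indexing threads block until any writer becomes free, instead of waiting on a per-DWPT queue.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch (not Lucene code) of a single-queue writer pool.
class WriterPool<W> {
    private final BlockingQueue<W> idle = new LinkedBlockingQueue<>();

    WriterPool(List<W> writers) {
        idle.addAll(writers);
    }

    // Blocks until some writer is checked back in; the FIFO hand-off
    // means one thread stuck on a huge document ("a full cart") cannot
    // starve threads that only need a writer briefly.
    W checkOut() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    // Returning the writer wakes exactly one waiting thread, if any.
    void checkIn(W writer) {
        idle.add(writer);
    }
}
```

The trade-off versus the queue-per-DWPT ThreadAffinityDWTP design is that thread-to-writer affinity is lost, so any locality benefit of a thread repeatedly hitting the same DWPT goes away in exchange for fairness.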
[jira] Resolved: (LUCENE-2472) The terms index divisor in IW should be set via IWC not via getReader
[ https://issues.apache.org/jira/browse/LUCENE-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-2472. Resolution: Fixed Fix Version/s: (was: 4.0) You're right Mike. I committed the deprecation note in revision 1060545. The terms index divisor in IW should be set via IWC not via getReader - Key: LUCENE-2472 URL: https://issues.apache.org/jira/browse/LUCENE-2472 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.1 The getReader call gives a false sense of security... since if deletions have already been applied (and IW is pooling) the readers have already been loaded with a divisor of 1. Better to set the divisor up front in IWC.
Re: Let's drop Maven Artifacts !
On 1/18/11 10:44 AM, Mark Miller wrote: From my point of view, but perhaps I misremember: At some point, Grant or someone put in some Maven poms. I did. :) It was a ton of work and especially getting the maven-ant-tasks to work was a nightmare! I don't think anyone else really paid attention. All those patches were attached to a jira issue, and the issue was open for a while, with people asking for published maven artifacts. Later, as we did releases, and saw and dealt with these poms, most of us commented against Maven support. So can you explain what the problem with the maven support is? Isn't it enough to just call the ant target and copying the generated files somewhere? When I did releases I never thought it made the release any harder. Just two additional easy steps. It just feels to me like it slipped in - and really its the type of thing that should have been more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene IMO. To my knowledge, the majority of core developers do not want maven in the build and/or frown on dealing with Maven. We could always have a little vote to gauge numbers - I just have not wanted to rush to another vote thread myself ;) Users are important too - but they don't get official votes - it's up to each of us to consider the User feelings/vote in our opinions/votes as we see fit IMO. - Mark
RE: Let's drop Maven Artifacts !
On 1/18/2011 at 1:45 PM, Mark Miller wrote: At some point, Grant or someone put in some Maven poms. I don't think anyone else really paid attention. Later, as we did releases, and saw and dealt with these poms, most of us commented against Maven support. It just feels to me like it slipped in - and really its the type of thing that should have been more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene IMO. Lucene's policy is commit-then-review, and lazy consensus is the rule, right?
[jira] Created: (LUCENE-2873) TestIndexWriterReader fails: too many open files
TestIndexWriterReader fails: too many open files Key: LUCENE-2873 URL: https://issues.apache.org/jira/browse/LUCENE-2873 Project: Lucene - Java Issue Type: Bug Affects Versions: 3.1 Environment: java version "1.6.0" Java(TM) SE Runtime Environment (build pxi3260sr9-20101125_01(SR9)) IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux x86-32 jvmxi3260sr9-20101124_69295 (JIT enabled, AOT enabled) J9VM - 20101124_069295 JIT - r9_20101028_17488ifx2 GC - 20101027_AA) JCL - 20101119_01 Reporter: Robert Muir {noformat} [junit] Testsuite: org.apache.lucene.index.TestIndexWriterReader [junit] Testcase: testAddIndexesAndDoDeletesThreads(org.apache.lucene.index.TestIndexWriterReader): Caused an ERROR [junit] /home/cron/branch_3x/lucene/build/test/6/test7430286492423218781tmp/_90.prx (Too many open files) [junit] java.io.FileNotFoundException: /home/cron/branch_3x/lucene/build/test/6/test7430286492423218781tmp/_90.prx (Too many open files) [junit] at java.io.RandomAccessFile.open(Native Method) [junit] at java.io.RandomAccessFile.<init>(RandomAccessFile.java:229) [junit] at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:69) [junit] at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:90) [junit] at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:91) [junit] at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:78) [junit] at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:353) [junit] at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:358) [junit] at org.apache.lucene.store.Directory.openInput(Directory.java:139) [junit] at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:135) [junit] at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:583) [junit] at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:561) [junit] at
org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:101) [junit] at org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:27) [junit] at org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:78) [junit] at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:697) [junit] at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:72) [junit] at org.apache.lucene.index.IndexReader.open(IndexReader.java:344) [junit] at org.apache.lucene.index.IndexReader.open(IndexReader.java:230) [junit] at org.apache.lucene.index.TestIndexWriterReader.testAddIndexesAndDoDeletesThreads(TestIndexWriterReader.java:381) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1007) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:939) [junit] [junit] [junit] Tests run: 18, Failures: 0, Errors: 1, Time elapsed: 13.56 sec [junit] [junit] - Standard Error - [junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterReader -Dtestmethod=testAddIndexesAndDoDeletesThreads -Dtests.seed=-7781539944268912038:-6865031686554264582 [junit] NOTE: test params are: locale=ar_SD, timezone=Asia/Almaty [junit] NOTE: all tests run in this JVM: [junit] [TestMergeSchedulerExternal, TestCharFilter, TestISOLatin1AccentFilter, TestCharTermAttributeImpl, TestDoc, TestFieldsReader, TestFilterIndexReader, TestIndexWriterReader] [junit] NOTE: Linux 2.6.32-24-generic x86/IBM Corporation 1.6.0 (32-bit)/cpus=1,threads=3,free=6495816,total=11920384 [junit] - --- [junit] TEST org.apache.lucene.index.TestIndexWriterReader FAILED {noformat}
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 2:37 PM, Steven A Rowe wrote: On 1/18/2011 at 1:45 PM, Mark Miller wrote: At some point, Grant or someone put in some Maven poms. I don't think anyone else really paid attention. Later, as we did releases, and saw and dealt with these poms, most of us commented against Maven support. It just feels to me like it slipped in - and really it's the type of thing that should have been more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene IMO. Lucene's policy is commit-then-review, and lazy consensus is the rule, right? Right - clearly this is not some sneaky or underhanded thing that happened. Certainly this is how a lot of legit things happen. The only reason I feel it was more of a Maven sneaking in thing is that in IRC I have learned how many active core devs really didn't want Maven in the build at a later time. I think we just didn't really know what was happening / paid attention. I don't mean to characterize incorrectly. If you asked me back then, I probably would not have understood the consequences whatsoever and said, please go ahead! Patches welcome. People's opinions have shifted though - we have more committers now - perhaps the pro-Maven side is larger than the against side now. Just stating things as I roughly knew them - happy to see things cleared up, fine-tuned. - Mark
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 2:41 PM, Michael Busch wrote: So can you explain what the problem with the maven support is? Isn't it enough to just call the ant target and copy the generated files somewhere? When I did releases I never thought it made the release any harder. Just two additional easy steps. Robert and I have gone over this a fair amount in previous exchanges I think, if you really want to know the particulars. Suffice it to say: the problems so far have not been large, but it feels like the likelihood of future larger problems is growing; if you ask people that seem to like/care about Maven support, the problems are probably not really a problem or are easily addressable; if you ask people that dislike/don't want Maven, the problems are probably just not worth ever having to run into when we are still convinced this could be handled downstream. If I remember right, a large reason Robert is against is that he doesn't want to sign/support/endorse something he doesn't understand or care about as a Release Manager? But that's probably a major simplification of his previous arguments. And the pro-Maven team has offered their counters to that. - Mark
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 2:57 PM, Mark Miller markrmil...@gmail.com wrote: If I remember right, a large reason Robert is against is that he doesn't want to sign/support/endorse something he doesn't understand or care about as a Release Manager? But that's probably a major simplification of his previous arguments. And the pro Maven team has offered their counters to that. Well, I definitely don't want to produce a jacked-up release. And in the last 99-email maven thread I listed a reference to how many of the previous releases have had various bugs/problems with maven. The problem is, as it is in our code now, there is no way to verify these magical files will actually work. And yet we all just ignore the fact we are probably shipping broken artifacts and go with the release anyway? (separately, for reference I know that Uwe has the releasing down to an art and is probably the sole person here that could actually do a release without having maven jacked up, so he isn't included) But for the rest of us, we don't understand maven. Why can't it be handled downstream? And it sets a tone for future things, for instance *the most popular issue* in lucene: it's not flexible indexing, it's not realtime search, it's not column stride fields, it's... make Lucene an OSGI bundle? https://issues.apache.org/jira/browse/LUCENE?report=com.atlassian.jira.plugin.system.project:popularissues-panel Anyway I think we are making a search engine library, and if someone else can deal with these hassles, they should. We should focus on search engine stuff and getting out solid releases.
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 3:11 PM, Michael Busch busch...@gmail.com wrote: I'm not sure what's so complicated or mysterious about maven artifacts. A maven artifact consists of normal jar file(s) plus a POM file containing some metadata, like the artifact name and group. It's the POM files that cause problems and reported bugs. I don't think they are simple at all; in fact I think they are more complicated than ant build.xml files!
[jira] Commented: (LUCENE-2755) Some improvements to CMS
[ https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983359#action_12983359 ] Robert Muir commented on LUCENE-2755: - Mike, fyi it looks like we are hung again in hudson: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3866/ Not sure if it's the same deadlock you found. Some improvements to CMS Key: LUCENE-2755 URL: https://issues.apache.org/jira/browse/LUCENE-2755 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 3.1, 4.0 Attachments: LUCENE-2755.patch While running optimize on a large index, I've noticed several things that got me to read CMS code more carefully, and find these issues: * CMS may hold onto a merge if maxMergeCount is hit. That results in the MergeThreads taking merges from the IndexWriter until they are exhausted, and only then that blocked merge will run. I think it's unnecessary that that merge will be blocked. * CMS sorts merges by segments size, doc-based and not bytes-based. Since the default MP is LogByteSizeMP, and I hardly believe people care about doc-based size segments anymore, I think we should switch the default impl. There are two ways to make it extensible, if we want: ** Have an overridable member/method in CMS that you can extend and override - easy. ** Have OneMerge be comparable and let the MP determine the order (e.g. by bytes, docs, calibrate deletes etc.). Better, but will need to tap into several places in the code, so more risky and complicated. On the go, I'd like to add some documentation to CMS - it's not very easy to read and follow. I'll work on a patch.
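The second bullet above (byte-based vs. doc-based merge ordering) can be illustrated with a stdlib-only sketch; the Merge class below is a hypothetical stand-in for MergePolicy.OneMerge, not the real Lucene API. The two orderings disagree whenever segments mix many small documents with few large ones:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a pending merge: it carries both a document
// count and an on-disk size in bytes, so we can compare the two orderings.
class Merge {
    final String name;
    final int docCount;
    final long sizeInBytes;

    Merge(String name, int docCount, long sizeInBytes) {
        this.name = name;
        this.docCount = docCount;
        this.sizeInBytes = sizeInBytes;
    }
}

public class MergeOrderDemo {
    public static void main(String[] args) {
        // A million tiny docs occupying 50 MB vs. ten thousand big docs
        // occupying 500 MB: doc-based and byte-based orders flip.
        List<Merge> merges = Arrays.asList(
            new Merge("tiny-docs", 1_000_000, 50L * 1024 * 1024),
            new Merge("big-docs", 10_000, 500L * 1024 * 1024));

        merges.sort(Comparator.comparingInt((Merge m) -> m.docCount));
        System.out.println("by docs: " + merges.get(0).name);

        merges.sort(Comparator.comparingLong((Merge m) -> m.sizeInBytes));
        System.out.println("by bytes: " + merges.get(0).name);
    }
}
```

Since the default merge policy (LogByteSizeMergePolicy) already reasons in bytes, sorting pending merges by bytes keeps the scheduler's ordering consistent with the policy's notion of segment size.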
[jira] Reopened: (LUCENE-2755) Some improvements to CMS
[ https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reopened LUCENE-2755: - I am reopening just so we don't miss fixing the deadlock... it's hung in the same exact part of the tests as earlier today so I think it's somehow related...
Re: Let's drop Maven Artifacts !
To follow up Steven: Yes - Maven is part of Lucene now - it got in with lazy consensus or whatever method - and now it's basically a first class citizen. I would have to get consensus to drop it much more than you would have to get consensus to keep it. This is exactly why I don't want it to stick around or grow when it could be a downstream project. All of this continued Maven work just looks like more stuff we will have to maintain/support in the future, it seems to me. Honestly though - if it looks like the majority are for Maven - I drop my objection. - Mark On Jan 18, 2011, at 2:45 PM, Mark Miller wrote: On Jan 18, 2011, at 2:37 PM, Steven A Rowe wrote: On 1/18/2011 at 1:45 PM, Mark Miller wrote: At some point, Grant or someone put in some Maven poms. I don't think anyone else really paid attention. Later, as we did releases, and saw and dealt with these poms, most of us commented against Maven support. It just feels to me like it slipped in - and really it's the type of thing that should have been more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene IMO. Lucene's policy is commit-then-review, and lazy consensus is the rule, right? Right - clearly this is not some sneaky or underhanded thing that happened. Certainly this is how a lot of legit things happen. The only reason I feel it was more of a Maven sneaking in thing is that in IRC I have learned how many active core devs really didn't want Maven in the build at a later time. I think we just didn't really know what was happening / paid attention. I don't mean to characterize incorrectly. If you asked me back then, I probably would not have understood the consequences whatsoever and said, please go ahead! Patches welcome. People's opinions have shifted though - we have more committers now - perhaps the pro-Maven side is larger than the against side now. Just stating things as I roughly knew them - happy to see things cleared up, fine-tuned.
- Mark
[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983367#action_12983367 ] Robert Muir commented on SOLR-2303: --- bq. OK, scratch the notion of removing the junit-4.7.jar file from Solr, the test cases...er...stop compiling. But the rest still stands. {quote} <pathelement path="${common-solr.dir}/../lucene/lib/ant-junit-1.7.1.jar"/> <pathelement path="${common-solr.dir}/../lucene/lib/ant-1.7.1.jar"/> <pathelement path="${common-solr.dir}/../lucene/lib/junit-4.7.jar"/> in place of java.class.path and all is well. Is this the path you'd go down? I'm not very comfortable having Solr reach over into Lucene, but what do I know? {quote} Yeah, in general it would be good to explicitly include ant, ant-junit, and junit in our classpath for tests. I know I fooled with trying to do this across all of lucene and solr; there are some twists: * when the clover build is enabled, we have to actually use the ant runtime/java.class.path, because clover injects itself via ant's classpath via -lib. There might be a better way to configure clover to avoid this, but failing that we have to sometimes support throwing ant's classpath into the classpath like we do now. * the contrib/ant gets tricky (I don't remember why) especially with clover enabled :) * finally, ant 1.8 support might break, since we specifically include ant 1.7 stuff in our lib. But it's generally what we want: better to have a reliable classpath in our build/tests than to compile/test with whatever version of ant the person happens to be using. Ant gets angry if you try to put the ant 1.7 jar into an ant 1.8 runtime... the same situation exists for compilation actually, but I *think* I fixed that one...
you would have to re-check :) remove unnecessary (and problematic) log4j jars in contribs --- Key: SOLR-2303 URL: https://issues.apache.org/jira/browse/SOLR-2303 Project: Solr Issue Type: Improvement Components: Build Reporter: Robert Muir Assignee: Erick Erickson Fix For: 4.0 Attachments: SOLR-2303.patch In solr 4.0 there is log4j-over-slf4j. But if you have log4j jars also in the classpath (e.g. contrib/extraction, contrib/clustering) you can get strange errors such as: java.lang.NoSuchMethodError: org.apache.log4j.Logger.setAdditivity(Z)V So I think we should remove the log4j jars in these contribs, all tests pass with them removed.
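A build.xml fragment along the lines Robert describes might look like this. This is a sketch only: the path id is an invented name, the jar locations are taken from the pathelements quoted above, and the clover/ant-1.8 caveats he raises would still apply:

```xml
<!-- Hypothetical sketch: pin the test classpath to the jars shipped in
     lucene/lib instead of inheriting java.class.path from the running ant.
     The id "test.classpath" is an assumption, not the project's real name. -->
<path id="test.classpath">
  <pathelement path="${common-solr.dir}/../lucene/lib/ant-1.7.1.jar"/>
  <pathelement path="${common-solr.dir}/../lucene/lib/ant-junit-1.7.1.jar"/>
  <pathelement path="${common-solr.dir}/../lucene/lib/junit-4.7.jar"/>
</path>
```

The trade-off is the one named in the comment: a pinned classpath is reproducible across developer machines, but clover-enabled builds still need ant's own runtime classpath injected.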
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 12:13 PM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal?
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 18, 2011, at 12:13 PM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal? I don't think they are that secret; you can look at the last maven discussion and see several other committers who spoke up against it. They are just sick of the discussion, I gather, and have given up fighting it. The problem, again, is the magical special artifacts. I don't see consensus here for maven... when you have it, get back to me.
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 18, 2011, at 12:13 PM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal? http://www.lucidimagination.com/search/document/474564645f673fbb/discussion_about_release_frequency You can look there, and see the responses of several other committers about maven. I think I like Yonik's comment best: Maven is not a part of the release process, if you think it should be, maybe you should call a vote?
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 3:55 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 18, 2011, at 12:13 PM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal? I don't think they are that secret, you can look at the last maven discussion and see several other committers who spoke up against it. they are just sick of the discussion i gather and have given up fighting it. Wow, so who is the vocal minority now? The problem again, is the magical special artifacts. I dont see consensus here for maven... when you have it, get back to me. As I see it, you have you, Shai and Miller (and Yonik, likely from the last go around). On the Maven side, you have me, Steve, McKinley and Busch, plus some users/contributors. In other words, I don't see consensus for dropping it. When you have it, get back to me.
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 4:06 PM, Grant Ingersoll gsing...@apache.org wrote: In other words, I don't see consensus for dropping it. When you have it, get back to me. That's not how things are added to the release process. So currently, maven is not included in the release process. I don't care if your poll on the users list has 100% of users checking maven; you biased your poll already by mentioning that it's because we are considering dropping maven support at the start of the email, so it's total garbage. There's a lot of totally insane things I could poll the user list and get lots of responses for, that I think the devs would disagree with.
Re: Let's drop Maven Artifacts !
It's sad how aggressive these discussions get. There's really no reason. On 1/18/11 1:10 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 4:06 PM, Grant Ingersoll gsing...@apache.org wrote: In other words, I don't see consensus for dropping it. When you have it, get back to me. That's not how things are added to the release process. So currently, maven is not included in the release process. I don't care if your poll on the users list has 100% of users checking maven, you biased your poll already by mentioning that it's because we are considering dropping maven support at the start of the email, so it's total garbage. There's a lot of totally insane things I could poll the user list and get lots of responses for, that I think the devs would disagree with.
[jira] Assigned: (SOLR-2307) PHPSerialized fails with sharded queries
[ https://issues.apache.org/jira/browse/SOLR-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man reassigned SOLR-2307: -- Assignee: Hoss Man PHPSerialized fails with sharded queries Key: SOLR-2307 URL: https://issues.apache.org/jira/browse/SOLR-2307 Project: Solr Issue Type: Bug Components: Response Writers Affects Versions: 1.3, 1.4.1 Reporter: Antonio Verni Assignee: Hoss Man Priority: Minor Attachments: PHPSerializedResponseWriter.java.patch, PHPSerializedResponseWriter.java.patch, PHPSerializedResponseWriter.java.patch, TestPHPSerializedResponseWriter.java, TestPHPSerializedResponseWriter.java Solr throws a java.lang.IllegalArgumentException: Map size must not be negative exception when using the PHP Serialized response writer with sharded queries. To reproduce the issue, start your preferred example and try the following query: http://localhost:8983/solr/select/?q=*:*&wt=phps&shards=localhost:8983/solr,localhost:8983/solr It is caused by the JSONWriter implementation of writeSolrDocumentList and writeSolrDocument. Overriding these two methods in the PHPSerializedResponseWriter to handle the SolrDocument size seems to solve the issue. Attached my patch made against trunk rev 1055588. cheers, Antonio
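The underlying constraint is that PHP's serialization format writes the entry count up front (a:N:{...}), so a writer cannot emit a map whose size is unknown or reported as negative. Below is a minimal stdlib-only sketch, not the actual Solr patch; phpStr and phpMap are invented helpers that derive the count from the actual entries rather than trusting a size hint:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PhpMapDemo {
    // PHP-serialize a string: s:LEN:"str";
    static String phpStr(String s) {
        return "s:" + s.getBytes().length + ":\"" + s + "\";";
    }

    // PHP-serialize a string->string map: a:N:{key;value;...}.
    // N comes from map.size(), i.e. the real entry count, so it can never
    // be negative regardless of what any upstream size hint says.
    static String phpMap(Map<String, String> m) {
        StringBuilder sb = new StringBuilder("a:").append(m.size()).append(":{");
        for (Map.Entry<String, String> e : m.entrySet()) {
            sb.append(phpStr(e.getKey())).append(phpStr(e.getValue()));
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("name", "foo");
        System.out.println(phpMap(doc));
    }
}
```

This is why a writer for this format has to buffer or count the entries before emitting the header, unlike JSON where entries can be streamed without a length prefix.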
Re: Let's drop Maven Artifacts !
Why not vote for or against 'maven artifacts'? http://www.doodle.com/2qp35b42vstivhvx I'm using lucene+solr a lot of times via maven. Elasticsearch uses lucene via gradle. Solandra uses lucene via ivy and so on ;) So maven artifacts are not only very handy for maven folks. But I think no artifacts would be better than broken ones. Why not try to 'switch' to the ivy build system? It's ant but handles dependencies better IMO. Regards, Peter. On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 18, 2011, at 12:13 PM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal? I don't think they are that secret, you can look at the last maven discussion and see several other committers who spoke up against it. they are just sick of the discussion i gather and have given up fighting it. The problem again, is the magical special artifacts. I dont see consensus here for maven... when you have it, get back to me.
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 4:10 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 4:06 PM, Grant Ingersoll gsing...@apache.org wrote: In other words, I don't see consensus for dropping it. When you have it, get back to me. Thats not how things are added to the release process. So currently, maven is not included in the release process. I don't care if your poll on the users list has 100% of users checking maven, you biased your poll already by mentioning that its because we are considering dropping maven support at the start of the email, so its total garbage. Sorry, I'm not a professional poll writer. Even if I didn't include it, it would take all of a half of a second for someone to figure it out. As you can see by the responses, though, I think people are simply answering it. It's just software and we have people willing to maintain the Maven stuff. I simply don't get what the big deal is in keeping something that people find useful and has (enough) committer support.
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 4:38 PM, Grant Ingersoll gsing...@apache.org wrote: It's just software and we have people willing to maintain the Maven stuff. I simply don't get what the big deal is in keeping something that people find useful and has (enough) committer support. Why not call a committer vote then? [] -- maintain maven ourselves instead of working on search features, and slower releases. [] -- let others maintain maven downstream, instead we work on search features, and faster releases.
[jira] Resolved: (SOLR-2307) PHPSerialized fails with sharded queries
[ https://issues.apache.org/jira/browse/SOLR-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-2307. Resolution: Fixed Fix Version/s: 4.0 3.1 Committed revision 1060585. -- trunk Committed revision 1060589. - 3x. thanks again for the great patch Antonio
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 4:41 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 4:38 PM, Grant Ingersoll gsing...@apache.org wrote: It's just software and we have people willing to maintain the Maven stuff. I simply don't get what the big deal is in keeping something that people find useful and has (enough) committer support. Why not call a committer vote then? [] -- maintain maven ourselves instead of working on search features, and slower releases. Wow, so having Maven releases is why we take 6-10 months to release? Give me a break. The only thing that is slower (arguably) is the building of the release itself. We have had Maven support for a long time and it has never been brought up until you did that it was the cause. The cause is, was and always will be that we innovate at a pretty rapid pace and always have the mindset to get just one more set of features/fixes into the next release. [] -- let others maintain maven downstream, instead we work on search features, and faster releases.
[jira] Updated: (LUCENE-2844) benchmark geospatial performance based on geonames.org
[ https://issues.apache.org/jira/browse/LUCENE-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated LUCENE-2844: - Attachment: benchmark-geo.patch This is an update to the patch which considers the move of the benchmark contrib to /modules/benchmark. It also includes GeoNamesSetSolrAnalyzerTask which will use Solr's field-specific analyzer. It's very much tied to these set of classes in the patch. There are ASF headers now too. benchmark geospatial performance based on geonames.org -- Key: LUCENE-2844 URL: https://issues.apache.org/jira/browse/LUCENE-2844 Project: Lucene - Java Issue Type: New Feature Components: contrib/benchmark Reporter: David Smiley Priority: Minor Fix For: 4.0 Attachments: benchmark-geo.patch, benchmark-geo.patch Until now (with this patch), the benchmark contrib module did not include a means to test geospatial data. This patch includes some new files and changes to existing ones. Here is a summary of what is being added in this patch per file (all files below are within the benchmark contrib module) along with my notes: Changes: * build.xml -- Add dependency on Lucene's spatial module and Solr. ** It was a real pain to figure out the convoluted ant build system to make this work, and I doubt I did it the proper way. ** Rob Muir thought it would be a good idea to make the benchmark contrib module be top level module (i.e. be alongside analysis) so that it can depend on everything. http://lucene.472066.n3.nabble.com/Re-Geospatial-search-in-Lucene-Solr-tp2157146p2157824.html I agree * ReadTask.java -- Added a search.useHitTotal boolean option that will use the total hits number for reporting purposes, instead of the existing behavior. ** The existing behavior (i.e. when search.useHitTotal=false) doesn't look very useful since the response integer is the sum of several things instead of just one thing. I don't see how anyone makes use of it. 
Note that on my local system, I also changed ReportTask RepSelectByPrefTask to not include the '-' every other line, and also changed Format.java to not use commas in the numbers. These changes are to make copy-pasting into excel more streamlined. New Files: * geoname-spatial.alg -- my algorithm file. ** Note the :0 trailing the Populate sequence. This is a trick I use to skip building the index, since it takes a while to build and I'm not interested in benchmarking index construction. You'll want to set this to :1 and then subsequently put it back for further runs as long as you keep the doc.geo.schemaField or any other configuration elements affecting index the same. ** In the patch, doc.geo.schemaField=geohash but unless you're tinkering with SOLR-2155, you'll probably want to set this to latlon * GeoNamesContentSource.java -- a ContentSource for a geonames.org data file (either a single country like US.txt or allCountries.txt). ** Uses a subclass of DocData to store all the fields. The existing DocData wasn't very applicable to data that is not composed of a title and body. ** Doesn't reuse the docdata parameter to getNextDocData(); a new one is created every time. ** Only supports content.source.forever=false * GeoNamesDocMaker.java -- a subclass of DocMaker that works very differently than the existing DocMaker. ** Instead of assuming that each line from geonames.org will correspond to one Lucene document, this implementation supports, via configuration, creating a variable number of documents, each with a variable number of points taken randomly from a GeoNamesContentSource. ** doc.geo.docsToGenerate: The number of documents to generate. If blank it defaults to the number of rows in GeoNamesContentSource. ** doc.geo.avgPlacesPerDoc: The average number of places to be added to a document. A random number between 0 and one less than twice this amount is chosen on a per document basis. If this is set to 1, then exactly one is always used. 
In order to support a value greater than 1, use the geohash field type and incorporate SOLR-2155 (the geohash prefix technique).
** doc.geo.oneDocPerPlace: Whether at most one document should use the same place. In other words: can more than one document have the same place? If so, set this to false.
** doc.geo.schemaField: References a field name in schema.xml. The field should implement SpatialQueryable.
* GeoPerfData.java -- a singleton storing data in memory that is shared by GeoNamesDocMaker.java and GeoQueryMaker.java.
** content.geo.zeroPopSubst: If a population <= 0 is encountered, use this population value instead. Default is 100.
** content.geo.maxPlaces: A limit on the number of rows read in.
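To make the knobs above concrete, a geoname-spatial.alg could wire them together roughly like this. This is a hedged sketch, not the file from the patch: the class package names, the property values, and the task sequence are illustrative assumptions; only the doc.geo.* and content.geo.* property names come from the description above.

{noformat}
# Content source: a geonames.org data file (settings are illustrative)
content.source=org.apache.lucene.benchmark.byTask.feeds.GeoNamesContentSource
content.source.forever=false
content.geo.zeroPopSubst=100
content.geo.maxPlaces=100000

# Document construction
doc.maker=org.apache.lucene.benchmark.byTask.feeds.GeoNamesDocMaker
doc.geo.docsToGenerate=10000
doc.geo.avgPlacesPerDoc=1
doc.geo.oneDocPerPlace=true
doc.geo.schemaField=latlon

# The trailing ":0" skips the slow index build on repeat runs;
# set it to ":1" whenever doc.geo.* (or anything else affecting the index) changes.
{ "Populate" CreateIndex { AddDoc } : * CloseIndex } : 0
{noformat}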
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 4:50 PM, Grant Ingersoll gsing...@apache.org wrote:

> On Jan 18, 2011, at 4:41 PM, Robert Muir wrote:
>> On Tue, Jan 18, 2011 at 4:38 PM, Grant Ingersoll gsing...@apache.org wrote:
>>> It's just software, and we have people willing to maintain the Maven stuff. I simply don't get what the big deal is in keeping something that people find useful and has (enough) committer support.
>>
>> Why not call a committer vote then?
>> [ ] -- maintain maven ourselves instead of working on search features, and slower releases.
>
> Wow, so having Maven releases is why we take 6-10 months to release? Give me a break. The only thing that is slower (arguably) is the building of the release itself. We have had Maven support for a long time, and it was never brought up as the cause until you did so. The cause is, was, and always will be that we innovate at a pretty rapid pace and always have the mindset to get just one more set of features/fixes into the next release.

In my opinion it is just a part of it; I think I detailed this here:
http://www.lucidimagination.com/search/document/474564645f673fbb/discussion_about_release_frequency
(That discussion was subsequently sidetracked and completely dominated by Maven, so I gave up, until Shai recently brought up the idea of trying to do a release again.)

I think that the release process is too complicated, and doing things to simplify it, such as pushing Maven downstream, would help a lot. Furthermore, I had this to say about Maven once it completely took over the discussion: since I have been around, the Maven artifacts seem to have been wrong in nearly every release [1], including even bugfix releases. If I am going to be the one making artifacts, I want them to be right.
[1]:
Lucene/Solr 3.x, 4.0: SOLR-2041, SOLR-2055
Solr 1.4.1: SOLR-1977
Solr 1.4: SOLR-981
Lucene 2.9.1, 3.0: LUCENE-2107
Lucene 2.9.0: LUCENE-1927
Lucene 2.4: LUCENE-1525