[jira] Issue Comment Edited: (SOLR-1395) Integrate Katta

2011-01-18 Thread tom liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928464#action_12928464
 ] 

tom liu edited comment on SOLR-1395 at 1/18/11 3:27 AM:


JohnWu, Huang:

In the Katta integration, a Solr core plays one of three roles:
# proxy: the query dispatcher, or front-end server.
All queries are sent to this proxy, which dispatches them to the subproxies on the
Katta cluster nodes.
On the proxy, QueryComponent's distributedProcess is executed, with the
param isShard=false.
# subproxy: the proxy on a Katta cluster node.
Because each node may host more than one core, the subproxy receives
queries from the proxy and forwards them to the appropriate cores.
On the subproxy, QueryComponent's distributedProcess is executed, with
the param isShard=true.
# querycore: the Solr core that actually executes the query.
Queries sent to a querycore are handled by QueryComponent's process method.

So, to run Solr as a distributed cluster, we set up three configurations:
# proxy's solrconfig.xml 
{noformat}
<requestHandler name="standard" class="solr.KattaRequestHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="shards">*</str>
  </lst>
</requestHandler>
{noformat}
# subproxy's solrconfig.xml
{noformat}
<requestHandler name="standard" class="solr.SearchHandler" default="true">...</requestHandler>
{noformat}
# querycore's solrconfig.xml
{noformat}
<requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">...</requestHandler>
{noformat}

In Katta's katta.node.properties:
node.server.class=org.apache.solr.katta.DeployableSolrKattaServer

And in the classes dir of the proxy's Solr webapp,
please add two files:
# katta.zk.properties
# katta.node.properties
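Putting the pieces above together, the two files might look like this (the property values are taken from the settings quoted elsewhere in this thread; the ZooKeeper address assumes a single-host setup):

```properties
# katta.zk.properties -- point Katta at an external ZooKeeper ensemble
zookeeper.embedded=false
zookeeper.servers=localhost:2181

# katta.node.properties -- use the Solr-aware Katta node implementation
node.server.class=org.apache.solr.katta.DeployableSolrKattaServer
```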

 Integrate Katta
 ---

 Key: SOLR-1395
 URL: https://issues.apache.org/jira/browse/SOLR-1395
 Project: Solr
  Issue Type: New Feature
Affects Versions: 1.4
Reporter: Jason Rutherglen
Priority: Minor
 Fix For: Next

 Attachments: back-end.log, front-end.log, hadoop-core-0.19.0.jar, 
 katta-core-0.6-dev.jar, katta-solrcores.jpg, katta.node.properties, 
 katta.zk.properties, log4j-1.2.13.jar, solr-1395-1431-3.patch, 
 solr-1395-1431-4.patch, solr-1395-1431-katta0.6.patch, 
 solr-1395-1431-katta0.6.patch, solr-1395-1431.patch, 
 solr-1395-katta-0.6.2-1.patch, solr-1395-katta-0.6.2-2.patch, 
 solr-1395-katta-0.6.2-3.patch, solr-1395-katta-0.6.2.patch, SOLR-1395.patch, 
 SOLR-1395.patch, SOLR-1395.patch, test-katta-core-0.6-dev.jar, 
 zkclient-0.1-dev.jar, zookeeper-3.2.1.jar

   Original Estimate: 336h
  Remaining Estimate: 336h

 We'll integrate Katta into Solr so that:
 * Distributed search uses Hadoop RPC
 * Shard/SolrCore distribution and management
 * Zookeeper based failover
 * Indexes may be built using Hadoop

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (SOLR-1395) Integrate Katta

2011-01-18 Thread tom liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935709#action_12935709
 ] 

tom liu edited comment on SOLR-1395 at 1/18/11 3:28 AM:


JohnWu:

my conf is:
{code:xml|title=proxy/solrconfig.xml}
<requestHandler name="standard" class="solr.KattaRequestHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="shards">*</str>
  </lst>
</requestHandler>
{code}

{code:xml|title=subproxy/solrconfig.xml}
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
{code}

{code:xml|title=querycore(shards)/solrconfig.xml}
<requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
{code}

{code:title=zoo.cfg}
clientPort=2181
...
{code}

In Katta/conf and Shards/WEB-INF/classes:
{code:title=katta.zk.properties}
zookeeper.embedded=false
zookeeper.servers=localhost:2181
...
{code}





Re: Let's drop Maven Artifacts !

2011-01-18 Thread Thomas Koch
Hi,

the developers list may not be the right place to find strong maven 
supporters. All developers know Lucene inside out and are perfectly fine 
installing Lucene from whatever artifact.
Those people using Maven are your end users, who probably don't even 
subscribe to users@.

Thomas Koch, http://www.koch.ro




[jira] Created: (SOLR-2317) Slaves have leftover index.xxxxx directories, and leftover files in index/ directory

2011-01-18 Thread Bill Bell (JIRA)
Slaves have leftover index.x directories, and leftover files in index/ 
directory


 Key: SOLR-2317
 URL: https://issues.apache.org/jira/browse/SOLR-2317
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.1
Reporter: Bill Bell


When replicating, we are getting leftover files on slaves. Some slaves are 
getting index.<number> directories with files left over. And more concerning, the 
index/ directory has leftover files from previous replication runs.

This is a pain to keep cleaning up.
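As a stopgap (not a fix), the leftover snapshot directories can be pruned by script. This is only a sketch: it assumes the slave's data directory contains an index.properties file whose index= entry names the live index directory (Solr replication writes this file), and that everything else matching index.* is stale. Verify that assumption on your own slaves before running anything like this.

```shell
# Sketch: remove leftover index.NNN directories on a slave, keeping only
# the directory named by the index= entry in index.properties.
cleanup_leftover_indexes() {
  data_dir="$1"
  # Which directory is live, according to the replication bookkeeping file?
  keep=$(sed -n 's/^index=//p' "$data_dir/index.properties" 2>/dev/null)
  # Bail out rather than delete everything if the file is missing/empty.
  [ -n "$keep" ] || return 1
  for d in "$data_dir"/index.*; do
    [ -d "$d" ] || continue                     # skip plain files
    [ "$(basename "$d")" = "$keep" ] && continue  # skip the live index
    rm -rf "$d"
  done
}
```

It does not address the second symptom (stale files inside index/ itself), which needs the actual replication bug fixed.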

Bill





[jira] Updated: (SOLR-2317) Slaves have leftover index.xxxxx directories, and leftover files in index/ directory

2011-01-18 Thread Bill Bell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bell updated SOLR-2317:


Description: 
When replicating, we are getting leftover files on slaves. Some slaves are 
getting index.<number> directories with files left over. And more concerning, the 
index/ directory has leftover files from previous replication runs.

This is a pain to keep cleaning up.

Bill








[jira] Commented: (SOLR-2317) Slaves have leftover index.xxxxx directories, and leftover files in index/ directory

2011-01-18 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983072#action_12983072
 ] 

Bill Bell commented on SOLR-2317:
-

This is running Windows 2008 R2. We are using Native Locking on the master and 
slave. Running Jetty 6.





[jira] Commented: (SOLR-1395) Integrate Katta

2011-01-18 Thread tom liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983076#action_12983076
 ] 

tom liu commented on SOLR-1395:
---

Sorry, the comments above contain an error:
in querycore(shards)/solrconfig.xml, the requestHandler must be 
solr.MultiEmbeddedSearchHandler.
{code:xml|title=querycore(shards)/solrconfig.xml}
<requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
{code}

QueryComponent returns a DocSlice, but XMLWriter or EmbeddedServer returns a 
SolrDocumentList built from the DocList.





RE: Query parser contract changes?

2011-01-18 Thread karl.wright
This turns out to have indeed been due to a recent, but unannounced, index 
format change.  A rebuilt index worked properly.

Thanks!
Karl


From: ext karl.wri...@nokia.com [karl.wri...@nokia.com]
Sent: Monday, January 17, 2011 10:53 AM
To: dev@lucene.apache.org
Subject: RE: Query parser contract changes?

Another data point: the standard query parser actually ALSO fails when you do 
anything other than a *:* query.  When you specify a field name, it returns 
zero results:

root@duck93:/data/solr-dym/solr-dym# curl 
"http://localhost:8983/solr/nose/standard?q=value_0:a*"
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">7</int><lst name="params"><str name="q">value_0:a*</str></lst></lst><result name="response" numFound="0" start="0"/>
</response>

But:

root@duck93:/data/solr-dym/solr-dym# curl 
"http://localhost:8983/solr/nose/standard?q=*:*"
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">244</int><lst name="params"><str name="q">*:*</str></lst></lst><result name="response" numFound="59431646" start="0"><doc><str name="latitude">40.55856</str><str name="longitude">44.37457</str><str name="reference">LANGUAGE=und|TYPE=STREET|ADDR_TOWNSHIP_NAME=Armenia|ADDR_COUNTRY_NAME=Armenia|ADDR_STREET_NAME=A329|TITLE=A329, Armenia, Armenia</str></doc><doc><str name="latitude">40.7703</str><str name="longitude">43.838</str><str name="reference">LANGUAGE=und|TYPE=STREET|ADDR_TOWNSHIP_NAME=Armenia|ADDR_COUNTRY_NAME=Armenia|ADDR_STREET_NAME=A330|TITLE=A330, Armenia
…

The schema has not changed:

<!-- Level 0 non-language value field -->
<field name="othervalue_0" type="string_idx_normed" required="false"/>

…where string_idx_normed is declared in the following way:

<fieldType name="string_idx_normed" class="solr.TextField"
    indexed="true" stored="false" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.ICUFoldingFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.ICUFoldingFilterFactory" />
  </analyzer>
</fieldType>

… which shouldn’t matter anyway, because even a simple TermQuery returned from my 
query parser method doesn’t work any more.

Karl

From: ext karl.wri...@nokia.com [mailto:karl.wri...@nokia.com]
Sent: Monday, January 17, 2011 10:30 AM
To: dev@lucene.apache.org
Subject: Query parser contract changes?

Hi folks,

I’m sorely puzzled by the fact that my QParser implementation ceased to work 
after the latest Solr/Lucene trunk update.  My previous update was about ten 
days ago, right after Mike made his index changes.

The symptom is that, although the query parser is correctly called, and seems 
to have the right arguments, the Query it is returning seems to be ignored.  I 
always get zero results.  I eliminated any possibility of error by just 
hardwiring the return of a TermQuery, and that too always yields zero results.

I was able to confirm, using the standard handler with the default query 
parser, that the index is in fine shape.  So I was wondering if the contract 
for QParser had changed in some subtle way that I missed?
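For context, a custom QParser like the one described is registered in solrconfig.xml via a queryParser plugin declaration and selected per request with defType or local params. The names below are illustrative, not Karl's actual plugin:

```xml
<!-- Plugin name and class are hypothetical examples -->
<queryParser name="myparser" class="com.example.MyQParserPlugin"/>
```

It can then be exercised with, e.g., q={!myparser}foo or defType=myparser, which is a quick way to confirm the parser is being invoked at all.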

Karl





Re: Let's drop Maven Artifacts !

2011-01-18 Thread Simon Willnauer
On Tue, Jan 18, 2011 at 9:33 AM, Thomas Koch tho...@koch.ro wrote:
 Hi,

 the developers list may not be the right place to find strong maven
 supporters. All developers know lucene from inside out and are perfectly fine
 to install lucene from whatever artifact.
 Those people using maven are your end users, that probably don't even
 subscribe to users@.

big +1 for this comment! I have to admit that I am not a big Maven fan, 
and each time I have to use it it's a pain in the ass, but it is the 
de-facto standard for the majority of Java projects on this planet, so 
there is really not much of an option in my opinion. A project like 
Lucene has to release Maven artifacts even if it's a pain.

Simon

 Thomas Koch, http://www.koch.ro







Re: Let's drop Maven Artifacts !

2011-01-18 Thread Shai Erera
Out of curiosity, how did the Maven people integrate Lucene before we had
Maven artifacts? To the best of my understanding, we never had proper Maven
artifacts (Steve is working on that in LUCENE-2657).

Shai




[jira] Updated: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe updated LUCENE-2657:


Attachment: LUCENE-2657.patch

In this patch:

# {{ant generate-maven-artifacts}} now works the same as it does on trunk 
without this patch -- using {{maven-ant-tasks}} -- except that instead of using 
the POM templates, the POMs provided in the patch are used.
# {{ant generate-maven-artifacts}} now functions properly at the top level, 
from {{lucene/}}, from {{modules/}}, and from {{solr/}}.
# Removed all {{*-source.jar}} and {{*-javadoc.jar}} generation related 
functionality from the POMs, as well as the {{dist}} profile - the Ant build is 
responsible for putting together the maven artifacts.
# Removed the POM templates, except for the two required to deploy the 
{{solr-noggit}} and {{solr-commons-csv}} artifacts from the Ant build.
# Modified the Maven artifact handling in the Ant build, including artifact 
signing, to be correct.
# Based on feedback from Stevo Slavić 
(http://www.mail-archive.com/solr-user@lucene.apache.org/msg45656.html), added 
explicit {{groupId}}'s to the POMs that didn't have them, and added explicit 
{{relativePath}}'s to the {{parent}} declarations in all POMs.
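A sketch of the kind of parent declaration the last point describes (the artifactId and version here are illustrative placeholders, not taken from the patch):

{code:xml}
<parent>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-parent</artifactId>
  <version>4.0-SNAPSHOT</version>
  <!-- explicit relativePath, per the feedback noted above -->
  <relativePath>../pom.xml</relativePath>
</parent>
{code}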

I think this patch is ready to be committed to trunk.

I'll post a branch_3x version of this patch tomorrow, and then I think the 
patches on this issue will be complete.

 Replace Maven POM templates with full POMs, and change documentation 
 accordingly
 

 Key: LUCENE-2657
 URL: https://issues.apache.org/jira/browse/LUCENE-2657
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.1, 4.0
Reporter: Steven Rowe
Assignee: Steven Rowe
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch


 The current Maven POM templates only contain dependency information, the bare 
 bones necessary for uploading artifacts to the Maven repository.
 The full Maven POMs in the attached patch include the information necessary 
 to run a multi-module Maven build, in addition to serving the same purpose as 
 the current POM templates.
 Several dependencies are not available through public maven repositories.  A 
 profile in the top-level POM can be activated to install these dependencies 
 from the various {{lib/}} directories into your local repository.  From the 
 top-level directory:
 {code}
 mvn -N -Pbootstrap install
 {code}
 Once these non-Maven dependencies have been installed, to run all Lucene/Solr 
 tests via Maven's surefire plugin, and populate your local repository with 
 all artifacts, from the top level directory, run:
 {code}
 mvn install
 {code}
 When one Lucene/Solr module depends on another, the dependency is declared on 
 the *artifact(s)* produced by the other module and deposited in your local 
 repository, rather than on the other module's un-jarred compiler output in 
 the {{build/}} directory, so you must run {{mvn install}} on the other module 
 before its changes are visible to the module that depends on it.
 To create all the artifacts without running tests:
 {code}
 mvn -DskipTests install
 {code}
 I almost always include the {{clean}} phase when I do a build, e.g.:
 {code}
 mvn -DskipTests clean install
 {code}




[jira] Updated: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter

2011-01-18 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-2295:
---

Attachment: LUCENE-2295-2-3x.patch

Patch against 3x. Removed the get/set from IWC and changed code which used it. 
I also added some clarifying notes to the deprecation note in 
IW.setMaxFieldLength.

I will post a separate patch for trunk where this setting will be removed 
altogether.

 Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the 
 same functionality as MaxFieldLength provided on IndexWriter
 ---

 Key: LUCENE-2295
 URL: https://issues.apache.org/jira/browse/LUCENE-2295
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/analyzers
Reporter: Shai Erera
Assignee: Uwe Schindler
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2295-2-3x.patch, LUCENE-2295-trunk.patch, 
 LUCENE-2295.patch


 A spinoff from LUCENE-2294. Instead of asking the user to specify his 
 requested MFL limit on IndexWriter, we can get rid of this setting entirely 
 by providing an Analyzer which will wrap any other Analyzer and its 
 TokenStream with a TokenFilter that keeps track of the number of tokens 
 produced and stops when the limit has been reached.
 This will remove any count tracking in IW's indexing, which is done even if I 
 specified UNLIMITED for MFL.
 Let's try to do it for 3.1.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Earwin Burrfoot
Somehow, they were made available since 2.0
- http://repo2.maven.org/maven2/org/apache/lucene/lucene-core/

The POMs are minimal, sans dependencies, so e.g. if your project
depends on lucene-spellchecker, lucene-core won't be transitively
included and your build is going to fail (you therefore had to add a
dependency on the core to your project yourself).
But they were enough to download and link jars/sources/javadocs.
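Concretely, with those minimal POMs a consumer had to declare both artifacts explicitly; a sketch (the version number is illustrative):

{code:xml}
<dependencies>
  <!-- lucene-spellchecker does not pull lucene-core transitively
       with the minimal POMs, so both must be declared -->
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-spellchecker</artifactId>
    <version>3.0.3</version>
  </dependency>
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>3.0.3</version>
  </dependency>
</dependencies>
{code}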





-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785




[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983113#action_12983113
 ] 

Robert Muir commented on LUCENE-2657:
-

bq. I think this patch is ready to be committed to trunk.

Well, first of all, you obviously worked hard on this, but we need to think this 
one through before committing.
Can we put this code in a separate project that takes care of Maven support 
for Lucene?
The problem is there are two camps: "die maven die" and "maven or die". There 
will *never* be consensus.

The only way for Maven to survive is for the users that care about it to 
support it themselves, just like other packaging systems 
such as Debian, Red Hat RPM, FreeBSD/Mac ports, etc., that we, Lucene, don't 
deal with.
They can't continue to whine to people like me, who don't give a shit about it, 
to support it and produce its crazy-ass complicated artifacts.

Instead, the people who care about these packaging systems, and know how to 
make them work, must deal with them.

Personally I really don't like:
* Having two build systems
* Having one build system (Ant) rely upon the other (Maven) to create release 
artifacts.

Basically, the Ant build system is our build. I think it needs to be able to 
fully build Lucene for a release without involving any other build systems, 
such as Make or Maven.






[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983123#action_12983123
 ] 

Steven Rowe commented on LUCENE-2657:
-

bq. Can we put this code in a separate project, that takes care of maven 
support for lucene?

I'd rather not.  The Lucene project has published Maven artifacts since the 
1.9.1 release.  I think we should continue to do that.

bq. The only way for maven to survive, is for the users that care about it, to 
support itself, just like other packaging systems such as debian, redhat rpm, 
freebsd/mac ports, etc etc that we lucene, don't deal with.

OK, those are pretty obviously red herrings.  Can we concentrate on the actual 
issue here without dragging in those extraneous things?  Maven artifacts, not 
those other things, have been provided by Lucene since the 1.9.1 release.  We 
obviously *do* deal with Maven.

bq. They can't continue to whine to people like me, who don't give a shit about 
it, to support it and produce its crazy ass complicated artifacts.

The latest patch on this issue uses the Ant artifacts directly.  POMs are 
provided.  You know, just as they have been since the 1.9.1 release.

bq. Instead the people who care about these packaging systems, and know how to 
make them work must deal with them.

Um, like the patch on this issue is doing?

bq. Basically, the ant build system is our build. I think it needs to be able 
to fully build lucene for a release without involving any other build systems 
such as Make or Maven.

This patch uses the Ant-produced artifacts to prepare for Maven artifact 
publishing.  Maven itself is not invoked in the process.  An Ant plugin handles 
the artifact deployment.

I seriously do not understand why this is such a big deal.  Why can't we just 
keep publishing Maven artifacts?  You know, like we have for the past 15-20 
releases.

 Replace Maven POM templates with full POMs, and change documentation 
 accordingly
 

 Key: LUCENE-2657
 URL: https://issues.apache.org/jira/browse/LUCENE-2657
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.1, 4.0
Reporter: Steven Rowe
Assignee: Steven Rowe
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch


 The current Maven POM templates only contain dependency information, the bare 
 bones necessary for uploading artifacts to the Maven repository.
 The full Maven POMs in the attached patch include the information necessary 
 to run a multi-module Maven build, in addition to serving the same purpose as 
 the current POM templates.
 Several dependencies are not available through public maven repositories.  A 
 profile in the top-level POM can be activated to install these dependencies 
 from the various {{lib/}} directories into your local repository.  From the 
 top-level directory:
 {code}
 mvn -N -Pbootstrap install
 {code}
 Once these non-Maven dependencies have been installed, to run all Lucene/Solr 
 tests via Maven's surefire plugin, and populate your local repository with 
 all artifacts, from the top level directory, run:
 {code}
 mvn install
 {code}
 When one Lucene/Solr module depends on another, the dependency is declared on 
 the *artifact(s)* produced by the other module and deposited in your local 
 repository, rather than on the other module's un-jarred compiler output in 
 the {{build/}} directory, so you must run {{mvn install}} on the other module 
 before its changes are visible to the module that depends on it.
 To create all the artifacts without running tests:
 {code}
 mvn -DskipTests install
 {code}
 I almost always include the {{clean}} phase when I do a build, e.g.:
 {code}
 mvn -DskipTests clean install
 {code}




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 5:29 AM, Hardy Ferentschik s...@ferentschik.de wrote:

 It also means that someone outside the dev community will at some stage
 create some
 pom files and upload the artifact to a (semi-) public repository.

This sounds great! This is how open source works: those who care about it 
will make it happen!




[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983135#action_12983135
 ] 

Chris Male commented on LUCENE-2657:


I'm a little lost about what this patch introduces that is so imposing. Ant 
itself has Maven support as part of its trunk code base, so it's clearly not 
too imposing for them.

Is your issue that this patch introduces things that somehow get in your way 
when using Ant to do builds, or are you against committing this due to your 
general concerns with Maven?





[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983141#action_12983141
 ] 

Chris Male commented on LUCENE-2657:


Alright, I can appreciate your concern.  I think comparing Maven to RPM or 
FreeBSD ports is going a little far, but I can understand the point you're 
making.

What if this were committed so that those of us who do understand Maven and do 
like using it could use it?
The question of whether Maven artifacts then need to be released or not can be 
part of a greater discussion (as is already taking place).

By committing this we make it easier for someone outside the project to create 
the correct artifacts, which would then be available from the central Maven 
repository, if that's the decision that's made, which is also the one you 
support.





[jira] Commented: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException

2011-01-18 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983142#action_12983142
 ] 

Shai Erera commented on LUCENE-2584:


On one hand, it's good to add the files to a Set, so that we can be sure they 
are added uniquely. On the other hand, if we trust that files are added 
properly, then adding them to a Set is redundant. Since this code is executed 
only once per SI instance, I think explicitly adding to a Set is better.

Note that while the assert you added will work, someone who runs without 
assertions may get duplicate file names if they really are added twice. I 
think it's not so crucial to know that the same file was added twice (that's 
a very unlikely bug), but it is crucial that files() return unique names.

So can you please use a Set in the method instead of the assert (like it's done 
on trunk)? Also, while you're at it, the method doesn't have javadocs - they 
appear as regular comments. Can you convert them to javadocs? (There is a 
warning there about not modifying the returned List, but it's not visible as 
javadocs :).)
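The publish-only-when-complete pattern suggested here can be sketched with a simplified, self-contained model (the class name, field names, and file names below are illustrative, not the actual Lucene 2.9/3.x code):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified model of the proposed SegmentInfo.files() fix: collect names
// into a Set so duplicates are dropped even without assertions, and assign
// the cached field only once the list is fully built.
class SegmentInfoSketch {
    private volatile List<String> cachedFiles; // published only when complete

    public List<String> files() {
        List<String> cached = cachedFiles;
        if (cached != null) {
            return cached;
        }
        // A Set guarantees unique names even if something adds a file twice.
        Set<String> names = new LinkedHashSet<>();
        names.add("_1.fdt");
        names.add("_1.fdx");
        names.add("_1.fdt"); // duplicate add is silently ignored
        // Publish in one step: concurrent callers see either null (and
        // rebuild) or a complete list, never one still being appended to.
        List<String> result = new ArrayList<>(names);
        cachedFiles = result;
        return result;
    }
}
```

A caller iterating the returned list can then never race with the population loop, which is the ConcurrentModificationException the issue describes.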

 Concurrency issues in SegmentInfo.files() could lead to 
 ConcurrentModificationException
 ---

 Key: LUCENE-2584
 URL: https://issues.apache.org/jira/browse/LUCENE-2584
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.9, 2.9.1, 2.9.2, 2.9.3, 3.0, 3.0.1, 3.0.2
Reporter: Alexander Kanarsky
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2584-branch_3x.patch, 
 LUCENE-2584-lucene-2_9.patch, LUCENE-2584-lucene-3_0.patch


 The multi-threaded call of files() in SegmentInfo could lead to a 
 ConcurrentModificationException if one thread has not yet finished adding to 
 the ArrayList (files) while another thread has already obtained it as 
 cached (see below). This is a rare exception, but it would be nice to fix. I 
 see the code is no longer problematic in trunk (and in branches ported from 
 flex_1458); it looks like it was fixed while implementing post-3.x features. 
 The fix for the 3.x and 2.9.x branches could be the same - create the files 
 set first and populate it, and then assign it to the member variable at the 
 end of the method. This will resolve the issue. I could prepare the patch for 
 2.9.4 and 3.x, if needed.
 --
 INFO: [19] webapp= path=/replication params={command=fetchindex&wt=javabin} 
 status=0 QTime=1
 Jul 30, 2010 9:13:05 AM org.apache.solr.core.SolrCore execute
 INFO: [19] webapp= path=/replication params={command=details&wt=javabin} 
 status=0 QTime=24
 Jul 30, 2010 9:13:05 AM org.apache.solr.handler.ReplicationHandler doFetch
 SEVERE: SnapPull failed
 java.util.ConcurrentModificationException
 at 
 java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
 at java.util.AbstractList$Itr.next(AbstractList.java:343)
 at java.util.AbstractCollection.addAll(AbstractCollection.java:305)
 at org.apache.lucene.index.SegmentInfos.files(SegmentInfos.java:826)
 at 
 org.apache.lucene.index.DirectoryReader$ReaderCommit.init(DirectoryReader.java:916)
 at 
 org.apache.lucene.index.DirectoryReader.getIndexCommit(DirectoryReader.java:856)
 at 
 org.apache.solr.search.SolrIndexReader.getIndexCommit(SolrIndexReader.java:454)
 at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:261)
 at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
 at 
 org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:146)




[jira] Updated: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter

2011-01-18 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-2295:
---

Attachment: LUCENE-2295-2-trunk.patch

Patch against trunk - removes maxFieldLength handling from all the code.

 Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the 
 same functionality as MaxFieldLength provided on IndexWriter
 ---

 Key: LUCENE-2295
 URL: https://issues.apache.org/jira/browse/LUCENE-2295
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/analyzers
Reporter: Shai Erera
Assignee: Uwe Schindler
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2295-2-3x.patch, LUCENE-2295-2-trunk.patch, 
 LUCENE-2295-trunk.patch, LUCENE-2295.patch


 A spinoff from LUCENE-2294. Instead of asking the user to specify his 
 requested MFL limit on IndexWriter, we can get rid of this setting entirely 
 by providing an Analyzer which wraps any other Analyzer and its 
 TokenStream with a TokenFilter that keeps track of the number of tokens 
 produced and stops when the limit has been reached.
 This will remove any count tracking from IW's indexing, which is currently 
 done even if UNLIMITED is specified for MFL.
 Let's try to do it for 3.1.
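The counting-filter idea can be sketched with a simplified, self-contained model, using an Iterator of strings in place of a real Lucene TokenStream (all names below are illustrative, not the patch's actual classes):

```java
import java.util.Iterator;

// A wrapping "filter" that stops producing tokens once a limit is reached,
// so the consumer (the indexer) never needs to do its own count tracking.
// Tokens are modeled as plain strings rather than Lucene attributes.
class LimitingTokenIterator implements Iterator<String> {
    private final Iterator<String> in;
    private final int maxTokens;
    private int produced = 0;

    LimitingTokenIterator(Iterator<String> in, int maxTokens) {
        this.in = in;
        this.maxTokens = maxTokens;
    }

    @Override
    public boolean hasNext() {
        // Report exhaustion as soon as the limit is hit, exactly as a
        // TokenFilter's incrementToken() would return false.
        return produced < maxTokens && in.hasNext();
    }

    @Override
    public String next() {
        produced++;
        return in.next();
    }
}
```

In the real patch the same logic would live in a TokenFilter's incrementToken(), with the wrapping Analyzer applying it to whatever TokenStream the delegate Analyzer produces.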




[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2011-01-18 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983154#action_12983154
 ] 

Michael McCandless commented on LUCENE-2324:


I ran a quick perf test here: I built the 10M Wikipedia index,
Standard codec, using 6 threads.  Trunk took 541.6 sec; RT took 518.2
sec (only a bit faster), but the test wasn't really fair because it
flushed @ docCount=12870.

But I can't test flushing by RAM -- that's not working yet on RT, right?

(The search results matched, which is nice!)

Then I ran a single-threaded test.  Trunk took 1097.1 sec and RT took
1040.5 sec -- a bit faster!  Presumably in the noise (we don't expect
a speedup?), but excellent that it's not slower...

I think we lost infoStream output on the details of flushing?  I can't
see when or which DWPTs are flushing...


 Per thread DocumentsWriters that write their own private segments
 -

 Key: LUCENE-2324
 URL: https://issues.apache.org/jira/browse/LUCENE-2324
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael Busch
Assignee: Michael Busch
Priority: Minor
 Fix For: Realtime Branch

 Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, 
 LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, 
 LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch, 
 lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out


 See LUCENE-2293 for motivation and more details.
 I'm copying here Mike's summary he posted on 2293:
 Change the approach for how we buffer in RAM to a more isolated
 approach, whereby IW has N fully independent RAM segments
 in-process and when a doc needs to be indexed it's added to one of
 them. Each segment would also write its own doc stores and
 normal segment merging (not the inefficient merge we now do on
 flush) would merge them. This should be a good simplification in
 the chain (eg maybe we can remove the *PerThread classes). The
 segments can flush independently, letting us make much better
 concurrent use of IO & CPU.




[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges

2011-01-18 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983155#action_12983155
 ] 

Michael McCandless commented on LUCENE-2856:


The ReaderEvent is never generated?  Is that still work-in-progress?
When would this be invoked? Only if IW is pooling readers?  Maybe we
should hold off on that for a separate issue?

Why were the added checks needed in SegmentInfo?  Oh I see, it's
because you compute the sizeInBytes of the merged segment before the
merge completes... hmm.  I think I'd prefer that this SegmentInfo not
be published until the Type == COMPLETE.

How come merge is not also final in MergeEvent?

I agree we should change the name.  IndexEventListener?

I don't think we need CompositeSegmentListener?  Why not an API to
just add/remove listeners?  Also: are we sure this belongs in IWC?
This is analogous to infoStream, which is on IW.  It's not a config
parameter that affects indexing.
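The add/remove-listener API suggested above might look like the following sketch (IndexEventListener, WriterSketch, and the method names are hypothetical illustrations, not Lucene's actual API):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical event-listener interface; a real version would carry a
// richer event object (segment info, merge details, event type).
interface IndexEventListener {
    void onMergeCompleted(String mergedSegmentName);
}

class WriterSketch {
    // CopyOnWriteArrayList lets event dispatch iterate safely while
    // listeners are concurrently added or removed from other threads.
    private final List<IndexEventListener> listeners = new CopyOnWriteArrayList<>();

    void addEventListener(IndexEventListener l) { listeners.add(l); }

    void removeEventListener(IndexEventListener l) { listeners.remove(l); }

    // Called by the writer when a merge finishes; notifies all listeners.
    void mergeFinished(String segmentName) {
        for (IndexEventListener l : listeners) {
            l.onMergeCompleted(segmentName);
        }
    }
}
```

Keeping the listener list on the writer itself (rather than in the config) matches the infoStream analogy made above: listeners observe runtime behavior and are not an indexing parameter.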

Should we also track segment flushed/aborted events?

Can you add some jdocs and mark the API as experimental?


 Create IndexWriter event listener, specifically for merges
 --

 Key: LUCENE-2856
 URL: https://issues.apache.org/jira/browse/LUCENE-2856
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 4.0
Reporter: Jason Rutherglen
 Attachments: LUCENE-2856.patch, LUCENE-2856.patch, LUCENE-2856.patch


 The issue will allow users to monitor merges occurring within IndexWriter 
 using a callback notifier event listener.  This can be used by external 
 applications such as Solr to monitor large segment merges.




[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983158#action_12983158
 ] 

Chris Male commented on LUCENE-2657:


That was basically what I was getting at (perhaps not clearly enough).

Would a satisfactory compromise be to view this patch as adding development 
support for Maven, independent of whether Maven artifacts are released or 
not?

The discussion about the release process, artifacts, and build-system 
flamewars can then happen outside of this.





[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)

2011-01-18 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983159#action_12983159
 ] 

Michael McCandless commented on LUCENE-2474:


bq. Still, I think that using CopyOnWriteArrayList is best here.

OK I'll switch back to COWAL... it makes me nervous though.  I like
being defensive and the added cost of CHM iteration really should be
negligible here.

{quote}
I'd like even more for there to be just a single CopyOnWriteArrayList per 
top-level reader that is then propagated to all sub/segment readers, 
including new ones on a reopen. But I guess Mike indicated that was currently 
too hard/hairy.
{quote}

This did get hairy... eg if you make a MultiReader (or ParallelReader)
w/ subs... what should happen to their listeners?  Ie what if the subs
already have listeners enrolled?

It also spooked me that apps may think they have to re-register after
re-open (if we stick w/ ArrayList) since then the list'd just grow...
it's trappy.

And, if you pull an NRT reader from IW (which is what reopen does
under the hood for an NRT reader), how to share its listeners?  Ie,
we'd have to add a setter to IW as well, so it's also single source
(propagates on reopen).

This is why I fell back to a simple static as the baby step for now.

{quote}
The static is really non-optimal though - among other problems, it requires 
systems with multiple readers (and wants to do different things with different 
readers, such as maintain separate caches) to figure out what top-level reader 
a segment reader is associated with. And given that we are dealing with 
IndexReader instances in the callbacks, and not ReaderContext objects, this 
seems impossible?
{quote}

ReaderContext doesn't really make sense here?

Ie, the listener is invoked when any/all composite readers sharing a
given segment have now closed (ie when the RC for that segment's core
drops to 0), or when a composite reader is closed.

Also, in practice, is it really so hard for the app to figure out
which SR goes to which of their caches?  Isn't this typically a
containsKey against the app level caches...?


 Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean 
 custom caches that use the IndexReader (getFieldCacheKey)
 

 Key: LUCENE-2474
 URL: https://issues.apache.org/jira/browse/LUCENE-2474
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Shay Banon
 Attachments: LUCENE-2474.patch, LUCENE-2474.patch


 Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean 
 custom caches that use the IndexReader (getFieldCacheKey).
 A spinoff of https://issues.apache.org/jira/browse/LUCENE-2468. Basically, it 
 makes a lot of sense to cache things based on IndexReader#getFieldCacheKey; 
 even Lucene itself uses it, for example, with CachingWrapperFilter. 
 FieldCache relies on being called explicitly to purge its cache when possible 
 (which is tricky to know from the outside, especially when using NRT - 
 reader attack of the clones).
 The provided patch allows plugging in a CacheEvictionListener which will be 
 called when the cache should be purged for an IndexReader.
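The pattern the description outlines can be sketched in a self-contained way (ReaderSketch and the listener wiring below are illustrative models, not the patch's actual classes):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical eviction callback: invoked with the reader's cache key when
// the reader closes, so external caches can purge their entry eagerly
// instead of waiting for weak references or manual bookkeeping.
interface CacheEvictionListener {
    void onClose(Object cacheKey);
}

class ReaderSketch {
    // Stands in for IndexReader#getFieldCacheKey's identity object.
    private final Object cacheKey = new Object();
    private final List<CacheEvictionListener> listeners = new ArrayList<>();

    Object getCacheKey() { return cacheKey; }

    void addEvictionListener(CacheEvictionListener l) { listeners.add(l); }

    // On close, notify every listener so caches keyed by getCacheKey()
    // can drop their entries immediately.
    void close() {
        for (CacheEvictionListener l : listeners) {
            l.onClose(cacheKey);
        }
    }
}
```

An application cache keyed by the reader's cache key can then register `cache::remove` as the listener and never leak entries for closed readers.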

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983160#action_12983160
 ] 

Earwin Burrfoot commented on LUCENE-2657:
-

bq. we need to be very clear  and it has no effect on artifacts
I feel something was missed in the heat of debate. Eg:
bq. The latest patch on this release uses the Ant artifacts directly.
bq. This patch uses the Ant-produced artifacts to prepare for Maven artifact 
publishing. 
bq. Maven itself is not invoked in the process. An Ant plugin handles the 
artifact deployment.
I will now try to decipher these quotes.
It seems the patch takes the artifacts produced by Ant, as a part of our usual 
(and only) build process, and shoves them down the Maven repository's throat 
along with a bunch of POM descriptors.
Nothing else is happening.

Also, after everything that has been said, I think nobody in his right mind 
will *force* anyone to actually use the Ant target in question as part of a 
release. But it's nice to have it around, in case some user-friendly committer 
would like to push the (I'd like to reiterate - Ant-generated) artifacts into 
Maven.





[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983161#action_12983161
 ] 

Robert Muir commented on LUCENE-2657:
-

Chris: well thats the problem with maven, it tries to be too many things, a 
dependency management tool,
a packaging system, a build system, ...

So, thats why I said we have to just be very clear about which exact scope of 
maven we are discussing.
If the patch presented here is against /dev-tools, and is to assist developers 
who like maven, then as
I said before I am totally ok with this, but I'm only speaking for myself.

Because Maven is so many things, and given Earwin's confusion, I think it 
would be good in general to 
add a README.txt to dev-tools anyway, stating exactly what it is: tools to 
assist Lucene/Solr developers,
which aren't supported, whose breakage is not a bug, and which will be deleted 
if they rot.

Separately, what you said about other code in trunk is totally true. For 
example, in my opinion there is 
a lot of code in Lucene's contrib that should be moved out to something like 
apache-extras. Currently Lucene's 
contrib has to compile and pass tests or the build fails, yet some of the stuff 
in there is more
sandboxy: it slows down Lucene core development but isn't itself getting much 
maintenance beyond devs
doing the minimum work to keep its tests passing. We should keep other 
options in mind for stuff like this.



 Replace Maven POM templates with full POMs, and change documentation 
 accordingly
 

 Key: LUCENE-2657
 URL: https://issues.apache.org/jira/browse/LUCENE-2657
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.1, 4.0
Reporter: Steven Rowe
Assignee: Steven Rowe
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch


 The current Maven POM templates only contain dependency information, the bare 
 bones necessary for uploading artifacts to the Maven repository.
 The full Maven POMs in the attached patch include the information necessary 
 to run a multi-module Maven build, in addition to serving the same purpose as 
 the current POM templates.
 Several dependencies are not available through public maven repositories.  A 
 profile in the top-level POM can be activated to install these dependencies 
 from the various {{lib/}} directories into your local repository.  From the 
 top-level directory:
 {code}
 mvn -N -Pbootstrap install
 {code}
 Once these non-Maven dependencies have been installed, to run all Lucene/Solr 
 tests via Maven's surefire plugin, and populate your local repository with 
 all artifacts, from the top level directory, run:
 {code}
 mvn install
 {code}
 When one Lucene/Solr module depends on another, the dependency is declared on 
 the *artifact(s)* produced by the other module and deposited in your local 
 repository, rather than on the other module's un-jarred compiler output in 
 the {{build/}} directory, so you must run {{mvn install}} on the other module 
 before its changes are visible to the module that depends on it.
 To create all the artifacts without running tests:
 {code}
 mvn -DskipTests install
 {code}
 I almost always include the {{clean}} phase when I do a build, e.g.:
 {code}
 mvn -DskipTests clean install
 {code}




[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983162#action_12983162
 ] 

Earwin Burrfoot commented on LUCENE-2657:
-

Thanks, but I'm not the one confused here. : )





Re: Let's drop Maven Artifacts !

2011-01-18 Thread Stevo Slavić
Using more than one build tool is not the way to go; I believe everyone
agrees on that, and that it's not the issue.

Have you at least considered switching to a build tool that knows how to
produce Maven artifacts, or enhancing the existing one to take care of
that? E.g. Ant+Ivy, Gradle, or Maven itself.

IMO, switching to a modern build tool, or enhancing the existing one to
produce Maven artifacts, is in the best interest of any open source
project, including this one. It would benefit the project's
users, contributors, developers, and the project as a whole:
- official project binaries will (continue to) be available to as
large a user base as possible, so you'll get more potential testers/bug
reporters, more potential contributors, and more potential
commercial/paying customers, which will raise project quality, bring
new ideas, and finance future development
- modern build tools have declarative dependency management, so it will
be easier to develop and contribute: you won't have to wait for
dependency libs to be downloaded together with the sources every time
the project is checked out, and you won't have to manually download
new/updated 3rd-party dependencies; you just change the build script/metadata
- modern build tools try to be, and mostly are, non-intrusive, and
promote good, proven solutions like a standard project structure/layout,
so it's easier to get started and become productive on such projects compared
to projects with a custom layout
- modern build tools are better integrated with current development
infrastructure tools, like IDEs and continuous integration servers.
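As an illustration of the declarative dependency management point above (a hypothetical POM fragment, not taken from any actual Lucene/Solr patch): upgrading a third-party library becomes a one-line metadata edit rather than a manual jar download.

```xml
<!-- Hypothetical POM fragment: the dependency is declared, not checked in;
     bumping the version is a metadata edit, and the build tool fetches
     (and caches) the jar automatically. -->
<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>4.8.2</version>
  <scope>test</scope>
</dependency>
```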

This switch would also make it easier to maintain project metadata and
keep it DRY, so that publishing Maven artifacts, even if it is decided
that this should not be part of the main release process, can be done
with little effort and enough credibility.

If a "who cares about Maven artifact consumers, regardless of the size
of that community" attitude becomes the accepted and official project
stance, and the size of the project community is not considered a
project asset, then I don't understand why the project is being
published under an open source license.

Regards,
Stevo.


On Tue, Jan 18, 2011 at 11:50 AM, Robert Muir rcm...@gmail.com wrote:
 On Tue, Jan 18, 2011 at 5:29 AM, Hardy Ferentschik s...@ferentschik.de 
 wrote:

 It also means that someone outside the dev community will at some stage
 create some
 pom files and upload the artifact to a (semi-) public repository.

 This sounds great! This is how open source works: those who care about
 it will make it happen!

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org






[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983163#action_12983163
 ] 

Chris Male commented on LUCENE-2657:


Ant does many things too, and we use it in a specific way, so I see no problem 
defining what we intend our Maven support to be for.

So I'm sensing some consensus (fortunately, since I spoke too soon before) that 
we should target this toward being a development tool which is not
forced upon any users / release managers.

Is this okay with you, Steven?

A README.txt describing the scope of dev-tools sounds appropriate 
irrespective of what happens here.  I certainly wasn't aware of what
the maintenance plan was.





[jira] Resolved: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter

2011-01-18 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-2295.


Resolution: Fixed

Committed revision 1060340 (trunk).
Committed revision 1060342 (3x).

 Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the 
 same functionality as MaxFieldLength provided on IndexWriter
 ---

 Key: LUCENE-2295
 URL: https://issues.apache.org/jira/browse/LUCENE-2295
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/analyzers
Reporter: Shai Erera
Assignee: Uwe Schindler
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2295-2-3x.patch, LUCENE-2295-2-trunk.patch, 
 LUCENE-2295-trunk.patch, LUCENE-2295.patch


 A spinoff from LUCENE-2294. Instead of asking the user to specify on 
 IndexWriter his requested MFL limit, we can get rid of this setting entirely 
 by providing an Analyzer which will wrap any other Analyzer and its 
 TokenStream with a TokenFilter that keeps track of the number of tokens 
 produced and stop when the limit has reached.
 This will remove any count tracking in IW's indexing, which is done even if I 
 specified UNLIMITED for MFL.
 Let's try to do it for 3.1.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 6:56 AM, Stevo Slavić ssla...@gmail.com wrote:
 More than one build tools is not way to go, I believe everyone agrees
 on that, and that it's not an issue.

 Have you guys at least considered making a switch to a build tool that
 knows to produce maven artifacts (or enhancing exiting one to take
 care of that)? E.g. ant+ivy, gradle, maven itself.


I think it's important to look at the build system as supporting
development too, and most features being developed today are against
Lucene's core, which has no dependencies at all.

For example, our Ant build supports rapidly running the core tests,
splitting them across different JVMs in parallel; I've looked at the
support for parallel testing in other build systems like Maven and I
think ours is significantly better for our tests.

This compile-test-debug lifecycle is important, and for the Lucene core
tests it's very fast.

So while I might agree with you that for something like Solr
development ant+ivy is worth considering, I think it's overkill and
would be a step backwards for Lucene; we would only
slow down development.
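The parallel test splitting Robert mentions can be reduced to a simple idea: partition the suite's test classes across worker JVMs. Lucene's actual Ant machinery is more elaborate; this is only an illustrative sketch with made-up names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: round-robin partition of test class names across N
// workers, the way a parallel test runner splits work across JVMs.
// TestPartitioner is a hypothetical name, not part of the Lucene build.
class TestPartitioner {
    static List<List<String>> partition(List<String> tests, int workers) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            buckets.add(new ArrayList<>());
        }
        // Deal the tests out like cards so every worker gets a similar load.
        for (int i = 0; i < tests.size(); i++) {
            buckets.get(i % workers).add(tests.get(i));
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> tests = List.of("TestA", "TestB", "TestC", "TestD", "TestE");
        List<List<String>> buckets = partition(tests, 2);
        System.out.println(buckets.get(0).size() + " " + buckets.get(1).size()); // prints "3 2"
    }
}
```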




Re: Windows test failure VelocityResponseWriter, unmodified trunk.

2011-01-18 Thread Erick Erickson
Yep, already tried a fresh checkout before sending the e-mail. At first
glance this looks like a classpath issue, hopefully just on my machine,
but it was late last night and I wanted to give someone a chance to pipe
up with "Ooops, I was changing that and..." Yes, I'm lazy when I can
be. Er... efficient, that is.

Erick

On Tue, Jan 18, 2011 at 12:19 AM, Yonik Seeley
yo...@lucidimagination.comwrote:

 On Mon, Jan 17, 2011 at 10:42 PM, Erick Erickson
 erickerick...@gmail.com wrote:
  H, a fresh, unmodified checkout of Solr will fail on my Windows7 box
 if
  I run ant -Dtestcase=VelocityResponseWriterTest test. It succeeds on my
  Mac. Anyone got a clue? Or should I look into it? Of course it succeeds
 in
  IntelliJ. S

 My windows laptop took a vacation (a permanent one) so I can't verify.
 But  when I see NoSuchMethod runtime exceptions, I usually try a fresh
 checkout first.  It's sometimes just stuff not getting cleaned up
 properly.

 -Yonik
 http://www.lucidimagination.com

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




[jira] Updated: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch

2011-01-18 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-445:


Attachment: SOLR-445-3_x.patch
SOLR-445.patch

I think it's ready for review, both trunk and 3_x. Would someone look this over 
and commit it if they think it's ready?

Note to self: do NOT call initCore in a test case just because you need a 
different schema.

The problem I was having with running tests was because I needed a schema file 
with a required field, so I naively called initCore with schema11.xml in spite 
of the fact that @BeforeClass had already called it with just schema.xml. That 
apparently does bad things to the state of *something* and caused other tests 
to fail... I can get TestDistributedSearch to fail on unchanged source code 
simply by calling initCore with schema11.xml and doing nothing else in a new 
test case in BasicFunctionalityTest. So I put my new tests that require 
schema11 in a new file instead.

The XML file attached is not intended to be committed, it is just a convenience 
for anyone checking out this patch to run against a Solr instance to see what 
is returned.

This seems to return the data in the SolrJ case as well.

NOTE: This does change the behavior of Solr. Without this patch, the first 
incorrect document stops processing. Now, it continues merrily on, adding 
documents as it can. Is this desirable behavior? It would be easy to abort on 
the first error if that's the consensus, and I could take some tedious 
record-keeping out. I think there's no big problem with continuing on, since 
the state of committed documents is already indeterminate when errors occur, so 
worrying about this should be part of a bigger issue.
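The record-keeping mentioned above can be sketched generically; the names below are hypothetical and this is not Solr's actual update-handler code, just the continue-on-error pattern under discussion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of mid-batch error handling: instead of aborting on
// the first bad document, record which ones failed and keep going, so the
// response can report per-document status. BatchAdder is a made-up name.
class BatchAdder {
    static List<Integer> addAll(List<String> docs) {
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            try {
                // Stand-in for a parse/validation error such as a bad date field.
                if (docs.get(i).contains("BAD")) {
                    throw new IllegalArgumentException("bad doc at index " + i);
                }
                // ...index the document here...
            } catch (RuntimeException e) {
                failed.add(i); // remember the failure, continue the batch
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<Integer> failed = addAll(List.of("id=1", "id=2 BAD_DATE", "id=3"));
        System.out.println(failed); // prints [1]
    }
}
```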

 XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
 

 Key: SOLR-445
 URL: https://issues.apache.org/jira/browse/SOLR-445
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 1.3
Reporter: Will Johnson
Assignee: Erick Erickson
 Fix For: Next

 Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, 
 solr-445.xml


 Has anyone run into the problem of handling bad documents / failures mid 
 batch?  I.e.:
 {code}
 <add>
   <doc>
     <field name="id">1</field>
   </doc>
   <doc>
     <field name="id">2</field>
     <field name="myDateField">I_AM_A_BAD_DATE</field>
   </doc>
   <doc>
     <field name="id">3</field>
   </doc>
 </add>
 {code}
 Right now Solr adds the first doc and then aborts.  It would seem like it 
 should either fail the entire batch, or log a message/return a code and then 
 continue on to add doc 3.  Option 1 would seem to be much harder to 
 accomplish and possibly require more memory, while Option 2 would require more 
 information to come back from the API.  I'm about to dig into this, but I 
 thought I'd ask to see if anyone had any suggestions, thoughts or comments.




Re: Windows test failure VelocityResponseWriter, unmodified trunk.

2011-01-18 Thread Erick Erickson
Robert:

Thanks, that's just the kind of hint I was looking for. I'll be able to
spend
some time on this a bit later.

Erick

On Tue, Jan 18, 2011 at 7:55 AM, Robert Muir rcm...@gmail.com wrote:

 Erick, I think i know the problem: see
 https://issues.apache.org/jira/browse/SOLR-2303

 perhaps the issue is somehow not fixed though. feel free to re-open
 it and we can try to get to the bottom of it...
 But i suspect it has to do with log4j jars being in ant's classpath,
 and somewhere in solr's build it must be adding ant's classpath to the
 junit runtime classpath... i know i cleared this up for lucene but
 perhaps i missed a spot for solr.

 On Tue, Jan 18, 2011 at 7:50 AM, Erick Erickson erickerick...@gmail.com
 wrote:
  Yep, already tried a fresh checkout before sending the e-mail. At first
  glance this looks like a classpath issue hopefully just on my machine,
  but it was late last night and I wanted to give someone a chance to pipe
  up with Ooops, I was changing that and.. Yes, I'm lazy when I can
  be. Er... Efficient that is.
  Erick
 
  On Tue, Jan 18, 2011 at 12:19 AM, Yonik Seeley 
 yo...@lucidimagination.com
  wrote:
 
  On Mon, Jan 17, 2011 at 10:42 PM, Erick Erickson
  erickerick...@gmail.com wrote:
   H, a fresh, unmodified checkout of Solr will fail on my Windows7
 box
   if
   I run ant -Dtestcase=VelocityResponseWriterTest test. It succeeds on
   my
   Mac. Anyone got a clue? Or should I look into it? Of course it
 succeeds
   in
   IntelliJ. S
 
  My windows laptop took a vacation (a permanent one) so I can't verify.
  But  when I see NoSuchMethod runtime exceptions, I usually try a fresh
  checkout first.  It's sometimes just stuff not getting cleaned up
  properly.
 
  -Yonik
  http://www.lucidimagination.com
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




[jira] Assigned: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException

2011-01-18 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera reassigned LUCENE-2584:
--

Assignee: Shai Erera

 Concurrency issues in SegmentInfo.files() could lead to 
 ConcurrentModificationException
 ---

 Key: LUCENE-2584
 URL: https://issues.apache.org/jira/browse/LUCENE-2584
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.9, 2.9.1, 2.9.2, 2.9.3, 3.0, 3.0.1, 3.0.2
Reporter: Alexander Kanarsky
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2584-branch_3x.patch, 
 LUCENE-2584-lucene-2_9.patch, LUCENE-2584-lucene-3_0.patch


 The multi-threaded call of files() in SegmentInfo could lead to a 
 ConcurrentModificationException if one thread has not yet finished adding to 
 the ArrayList (files) while another thread has already obtained it as 
 cached (see below). This is a rare exception, but it would be nice to fix it. I 
 see the code is no longer problematic in trunk (and in branches ported from 
 flex_1458); it looks like it was fixed while implementing post-3.x features. 
 The fix for the 3.x and 2.9.x branches could be the same: create the files set 
 first, populate it, and then assign it to the member variable at the end of 
 the method. This will resolve the issue. I could prepare the patch for 2.9.4 
 and 3.x, if needed.
 --
 INFO: [19] webapp= path=/replication params={command=fetchindex&wt=javabin} 
 status=0 QTime=1
 Jul 30, 2010 9:13:05 AM org.apache.solr.core.SolrCore execute
 INFO: [19] webapp= path=/replication params={command=details&wt=javabin} 
 status=0 QTime=24
 Jul 30, 2010 9:13:05 AM org.apache.solr.handler.ReplicationHandler doFetch
 SEVERE: SnapPull failed
 java.util.ConcurrentModificationException
 at 
 java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
 at java.util.AbstractList$Itr.next(AbstractList.java:343)
 at java.util.AbstractCollection.addAll(AbstractCollection.java:305)
 at org.apache.lucene.index.SegmentInfos.files(SegmentInfos.java:826)
 at 
 org.apache.lucene.index.DirectoryReader$ReaderCommit.<init>(DirectoryReader.java:916)
 at 
 org.apache.lucene.index.DirectoryReader.getIndexCommit(DirectoryReader.java:856)
 at 
 org.apache.solr.search.SolrIndexReader.getIndexCommit(SolrIndexReader.java:454)
 at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:261)
 at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
 at 
 org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:146)
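The fix described in the issue text above (build the set fully, then assign it in one step) can be sketched as follows. SegmentFiles and its contents are a hypothetical simplification for illustration, not Lucene's actual SegmentInfo code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical simplification of the described fix: populate a local set
// first, then publish it with a single assignment at the end of the method,
// so a concurrent caller either sees null (and recomputes) or the fully
// built set -- never a partially filled collection being iterated.
class SegmentFiles {
    private volatile Set<String> cachedFiles; // published only once complete

    Set<String> files() {
        Set<String> cached = cachedFiles;
        if (cached != null) {
            return cached; // safe: already fully populated when published
        }
        Set<String> files = new HashSet<>(); // local until complete
        files.add("_0.fdt");
        files.add("_0.fdx");
        cachedFiles = files; // single assignment: the publish point
        return files;
    }

    public static void main(String[] args) {
        SegmentFiles si = new SegmentFiles();
        System.out.println(si.files().size()); // prints 2
    }
}
```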




[jira] Updated: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException

2011-01-18 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-2584:
---

Attachment: LUCENE-2584.patch

Patch against 3x: it fixes the bug along the lines of Alexander's other patch 
(but uses a HashSet all the way), and adds a CHANGES entry and a test case in 
TestSegmentInfo. I plan to commit this soon and also backport to 3.0 and 2.9.





[jira] Commented: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException

2011-01-18 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983183#action_12983183
 ] 

Michael McCandless commented on LUCENE-2584:


Patch looks good Shai!

I don't think you need to backport to 2.9/3.0 immediately (unless you really 
want to!). We can backport if/when we do another release...





[jira] Commented: (LUCENE-2472) The terms index divisor in IW should be set via IWC not via getReader

2011-01-18 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983189#action_12983189
 ] 

Shai Erera commented on LUCENE-2472:


This is already set on IWC (set/getReaderTermsIndexDivisor).

So I guess all that's needed is to deprecate IW.getReader(int) in 3x and 
remove it from trunk?

 The terms index divisor in IW should be set via IWC not via getReader
 -

 Key: LUCENE-2472
 URL: https://issues.apache.org/jira/browse/LUCENE-2472
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.1, 4.0


 The getReader call gives a false sense of security... since if deletions have 
 already been applied (and IW is pooling) the readers have already been loaded 
 with a divisor of 1.
 Better to set the divisor up front in IWC.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Grant Ingersoll
It seems to me that if we have a fix for the things that ail our Maven support 
(Steve's work), then it isn't a reason to hold up a release, and we should 
just keep the artifacts, as a significant number of users consume Lucene that 
way (via the central repository).  I agree that we should not switch our 
build system, but supporting the POMs is no different from supporting the 
IntelliJ/Eclipse generation tools (they are both problematic since they are 
not automated)   


On Jan 18, 2011, at 7:48 AM, Robert Muir wrote:

 On Tue, Jan 18, 2011 at 6:56 AM, Stevo Slavić ssla...@gmail.com wrote:
 More than one build tool is not the way to go; I believe everyone agrees
 on that, and that it's not an issue.
 
 Have you guys at least considered making a switch to a build tool that
 knows how to produce maven artifacts (or enhancing the existing one to take
 care of that)? E.g. ant+ivy, gradle, maven itself.
 
 
 I think it's important to look at the build system as supporting
 development too, but most features being developed today are against
 lucene's core: which has no dependencies at all.
 
 For example, our ant build supports rapidly running the core tests
 (splitting them across different jvms in parallel: i've looked at the
 support for parallel testing in other build systems like maven and I
 think ours is significantly better for our tests).
 
 This compile-test-debug lifecycle is important; for the lucene core
 tests it's very fast.
 
 So while I might agree with you that for something like Solr
 development, perhaps ant+ivy is something worth considering, I think
 it's overkill and would be a step backwards for lucene; we would only
 slow down development.
 
 

--
Grant Ingersoll
http://www.lucidimagination.com





[jira] Commented: (LUCENE-2374) Add reflection API to AttributeSource/AttributeImpl

2011-01-18 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983195#action_12983195
 ] 

Uwe Schindler commented on LUCENE-2374:
---

In my opinion, there is lots of code duplication in the unmaintainable 
analysis.jsp. I think we should open a new issue to remove it and replace it 
with an XSL, or alternatively make its internal functionality backed by 
FieldAnalysisRequestHandler.

 Add reflection API to AttributeSource/AttributeImpl
 ---

 Key: LUCENE-2374
 URL: https://issues.apache.org/jira/browse/LUCENE-2374
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/analyzers
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2374-3x.patch, LUCENE-2374-3x.patch, 
 LUCENE-2374-3x.patch, shot1.png, shot2.png, shot3.png, shot4.png


 AttributeSource/TokenStream inspection in Solr needs to have some insight 
 into the contents of AttributeImpls. As LUCENE-2302 has some problems with 
 toString() [which is not structured and conflicts with CharSequence's 
 definition for CharTermAttribute], I propose a simple API that gets a default 
 implementation in AttributeImpl (just like toString() currently):
 - Iterator<Map.Entry<String,?>> AttributeImpl.contentsIterator() returns an 
 iterator (for most attributes it's a singleton) of key-value pairs, e.g. 
 term->"foobar", startOffset->Integer.valueOf(0), ...
 - AttributeSource gets the same method; it just concatenates the iterators of 
 each AttributeImpl from getAttributeImplsIterator().
 No backwards compatibility problems occur, as the default toString() method 
 will work like before (it just gets the iterator and lists the entries), but 
 we simply remove the documentation for the format. (Char)TermAttribute gets a 
 special impl of toString() according to CharSequence and a corresponding 
 iterator.
 I also want to remove the abstract hashCode() and equals() methods from 
 AttributeImpl, as they are not needed and just create work for the 
 implementor.
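As a stdlib-only sketch of the proposed contract: TermAttributeStub below is a hypothetical stand-in for a real AttributeImpl, and only the contentsIterator() shape comes from the proposal above; nothing here is Lucene's actual API.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;

public class ContentsIteratorSketch {
    // Hypothetical stand-in for an AttributeImpl that exposes its contents
    // as structured key-value pairs instead of an unstructured toString().
    public static class TermAttributeStub {
        private final String term = "foobar";

        // For most attributes this would be a singleton iterator.
        public Iterator<Map.Entry<String, ?>> contentsIterator() {
            Map.Entry<String, ?> entry =
                new AbstractMap.SimpleImmutableEntry<String, Object>("term", term);
            return Collections.<Map.Entry<String, ?>>singletonList(entry).iterator();
        }
    }

    public static void main(String[] args) {
        Iterator<Map.Entry<String, ?>> it = new TermAttributeStub().contentsIterator();
        Map.Entry<String, ?> e = it.next();
        // Prints the key-value form the proposal describes: term->foobar
        System.out.println(e.getKey() + "->" + e.getValue());
    }
}
```

An AttributeSource-level contentsIterator() would then just concatenate such per-attribute iterators, as the proposal sketches.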




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Earwin Burrfoot
On Tue, Jan 18, 2011 at 17:00, Robert Muir rcm...@gmail.com wrote:
 On Tue, Jan 18, 2011 at 8:54 AM, Grant Ingersoll gsing...@apache.org wrote:
 It seems to me that if we have a fix for the things that ail our Maven 
 support (Steve's work), then it isn't a reason to hold up a release, and we 
 should just keep the artifacts, as a significant number of users consume 
 Lucene that way (via the central repository).  I agree that we should not 
 switch our build system, but supporting the POMs is no different from 
 supporting the IntelliJ/Eclipse generation tools (they are both problematic 
 since they are not automated)


 it's totally different in every way! we don't release the
 intellij/eclipse stuff, it's for internal use only.
 additionally, there are no release artifacts generated by these
Latest code from LUCENE-2657 does not generate any new artifacts. It
uploads those you already have (built via ant) to the repo.


-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785




Exception hit on 3_0 branch

2011-01-18 Thread Shai Erera
Hi

I ran tests on 3_0 branch and hit this:

[junit] Testcase:
testRankByte(org.apache.lucene.search.function.TestFieldScoreQuery):
Caused an ERROR
[junit] null
[junit] java.util.ConcurrentModificationException
[junit] at
java.util.WeakHashMap$HashIterator.next(WeakHashMap.java:169)
[junit] at
org.apache.lucene.search.FieldCacheImpl.getCacheEntries(FieldCacheImpl.java:75)
[junit] at
org.apache.lucene.util.LuceneTestCase.assertSaneFieldCaches(LuceneTestCase.java:133)
[junit] at
org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:100)
[junit] at
org.apache.lucene.search.function.FunctionTestSetup.tearDown(FunctionTestSetup.java:86)
[junit] at
org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:216)

I couldn't reproduce it the second time I ran the test (test only and all
tests), and I don't know if it applies to 3x/trunk too. I can dig into it
later, but sending to the list in case someone wants to look at it before.

I see that the method is called from tearDown(), and the ConcurrentModEx
suggests someone added to the set while someone else iterated over it -- could
it be that the tests step on each other somehow?

Shai
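The exception class itself is easy to reproduce with plain java.util collections. This is only an illustration of the failure mode Shai describes (a map mutated while being iterated), not the actual FieldCache code path:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class CmeSketch {
    public static void main(String[] args) {
        Map<String, Integer> cache = new HashMap<String, Integer>();
        cache.put("a", 1);
        cache.put("b", 2);
        try {
            for (String key : cache.keySet()) {
                // Simulates another thread (or another test) inserting into
                // the cache while we iterate over its keys: the fail-fast
                // iterator detects the structural change on the next next().
                cache.put("c", 3);
            }
        } catch (ConcurrentModificationException expected) {
            System.out.println("caught ConcurrentModificationException");
        }
    }
}
```

In the real failure the iteration and the mutation happen on different threads, so the exception is intermittent rather than deterministic as it is here.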


[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-18 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983208#action_12983208
 ] 

Salman Akram commented on SOLR-1604:


I tried the patch with the latest non-grayed file, but inOrder still doesn't 
seem to have any impact.

Results for "a b"~5 and "b a"~5 are still different.

Also any feedback about CommonGrams integration?

Thanks a lot for all the help!

 Wildcards, ORs etc inside Phrase Queries
 

 Key: SOLR-1604
 URL: https://issues.apache.org/jira/browse/SOLR-1604
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Ahmet Arslan
Priority: Minor
 Fix For: Next

 Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
 ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch


 Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
 wildcards, ORs, ranges, fuzzies inside phrase queries.




[jira] Resolved: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException

2011-01-18 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-2584.


   Resolution: Fixed
Fix Version/s: (was: 4.0)
   3.0.4
   2.9.5
Lucene Fields: [New, Patch Available]  (was: [New])

Committed revision 1060358 (3x).
Committed revision 1060391 (3.0).
Committed revision 1060398 (2.9).

Thanks Alexander !

 Concurrency issues in SegmentInfo.files() could lead to 
 ConcurrentModificationException
 ---

 Key: LUCENE-2584
 URL: https://issues.apache.org/jira/browse/LUCENE-2584
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.9, 2.9.1, 2.9.2, 2.9.3, 3.0, 3.0.1, 3.0.2
Reporter: Alexander Kanarsky
Assignee: Shai Erera
Priority: Minor
 Fix For: 2.9.5, 3.0.4, 3.1

 Attachments: LUCENE-2584-branch_3x.patch, 
 LUCENE-2584-lucene-2_9.patch, LUCENE-2584-lucene-3_0.patch, LUCENE-2584.patch


 The multi-threaded call of files() in SegmentInfo could lead to a 
 ConcurrentModificationException if one thread has not finished adding to 
 the ArrayList (files) while another thread has already obtained it as 
 cached (see below). This is a rare exception, but it would be nice to fix. I 
 see the code is no longer problematic in the trunk (and others ported from 
 flex_1458); it looks like it was fixed while implementing post-3.x features. 
 The fix to the 3.x and 2.9.x branches could be the same - create the files 
 set first, populate it, and then assign it to the member variable at the end 
 of the method. This will resolve the issue. I could prepare the patch for 
 2.9.4 and 3.x, if needed.
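The proposed fix is the classic build-then-publish idiom: fully populate a local collection, then assign it to the shared field in one step. A simplified stdlib sketch, with SegmentInfoStub and its file names as illustrative stand-ins rather than the real SegmentInfo:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SegmentInfoStub {
    // Cached file list; other threads may read this reference concurrently.
    // (volatile is one way to make the publication safe; the original 3.x
    // patch may have relied on different synchronization.)
    private volatile List<String> files;

    public List<String> files() {
        List<String> cached = files;
        if (cached != null) {
            return cached; // another call already computed it
        }
        // Build the complete list locally first...
        List<String> fileList = new ArrayList<String>(
            Arrays.asList("_1.fnm", "_1.frq", "_1.prx"));
        // ...and only then publish it, so no thread can ever observe a
        // partially populated list - the bug described above, where a reader
        // iterated the cached ArrayList while the writer was still adding.
        files = fileList;
        return fileList;
    }

    public static void main(String[] args) {
        System.out.println(new SegmentInfoStub().files());
    }
}
```

The buggy variant assigned the member variable first and then appended to it, which is exactly the window in which addAll() in SegmentInfos.files() could hit the ConcurrentModificationException in the trace below.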
 --
 INFO: [19] webapp= path=/replication params={command=fetchindex&wt=javabin} 
 status=0 QTime=1
 Jul 30, 2010 9:13:05 AM org.apache.solr.core.SolrCore execute
 INFO: [19] webapp= path=/replication params={command=details&wt=javabin} 
 status=0 QTime=24
 Jul 30, 2010 9:13:05 AM org.apache.solr.handler.ReplicationHandler doFetch
 SEVERE: SnapPull failed
 java.util.ConcurrentModificationException
 at 
 java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
 at java.util.AbstractList$Itr.next(AbstractList.java:343)
 at java.util.AbstractCollection.addAll(AbstractCollection.java:305)
 at org.apache.lucene.index.SegmentInfos.files(SegmentInfos.java:826)
 at 
 org.apache.lucene.index.DirectoryReader$ReaderCommit.init(DirectoryReader.java:916)
 at 
 org.apache.lucene.index.DirectoryReader.getIndexCommit(DirectoryReader.java:856)
 at 
 org.apache.solr.search.SolrIndexReader.getIndexCommit(SolrIndexReader.java:454)
 at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:261)
 at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
 at 
 org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:146)




[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2011-01-18 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983209#action_12983209
 ] 

Jason Rutherglen commented on LUCENE-2324:
--

bq. I can't test flush by RAM - that's not working yet on RT right?

Right, we're only flushing by doc count, so we could be flushing segments that 
are too small?  However, we can see some of the concurrency gains by not 
sync'ing on IW and allowing document updates to continue while flushing.

 Per thread DocumentsWriters that write their own private segments
 -

 Key: LUCENE-2324
 URL: https://issues.apache.org/jira/browse/LUCENE-2324
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael Busch
Assignee: Michael Busch
Priority: Minor
 Fix For: Realtime Branch

 Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, 
 LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, 
 LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch, 
 lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out


 See LUCENE-2293 for motivation and more details.
 I'm copying here Mike's summary he posted on 2293:
 Change the approach for how we buffer in RAM to a more isolated
 approach, whereby IW has N fully independent RAM segments
 in-process and when a doc needs to be indexed it's added to one of
 them. Each segment would also write its own doc stores and
 normal segment merging (not the inefficient merge we now do on
 flush) would merge them. This should be a good simplification in
 the chain (eg maybe we can remove the *PerThread classes). The
 segments can flush independently, letting us make much better
 concurrent use of IO & CPU.




[jira] Commented: (SOLR-2316) SynonymFilterFactory should ensure synonyms argument is provided.

2011-01-18 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983214#action_12983214
 ] 

David Smiley commented on SOLR-2316:


Both.

 SynonymFilterFactory should ensure synonyms argument is provided.
 -

 Key: SOLR-2316
 URL: https://issues.apache.org/jira/browse/SOLR-2316
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Reporter: David Smiley
Priority: Minor
 Fix For: 3.1

 Attachments: 2316.patch


 If for some reason the synonyms attribute is not present on the filter 
 factory configuration, a latent NPE will eventually show up during 
 indexing/searching.  Instead a helpful error should be thrown at 
 initialization.




[jira] Commented: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch

2011-01-18 Thread Simon Rosenthal (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983215#action_12983215
 ] 

Simon Rosenthal commented on SOLR-445:
--

bq.  Don't allow autocommits during an update. Simple. Or, rather, all update 
requests block at the beginning during an autocommit. If an update request has 
too many documents, don't do so many documents in an update. (Lance)
Lance - How do you (dynamically) disable autocommits during a specific update? 
(That functionality would also be useful in other use cases, but that's 
another issue.)

bq. NOTE: This does change the behavior of Solr. Without this patch, the first 
document that is incorrect stops processing. Now, it continues merrily on 
adding documents as it can. Is this desirable behavior? It would be easy to 
abort on first error if that's the consensus, and I could take some tedious 
record-keeping out. I think there's no big problem with continuing on, since 
the state of committed documents is indeterminate already when errors occur so 
worrying about this should be part of a bigger issue.

I think it should be an option, if possible. I can see use cases where 
abort-on-first-error is desirable, but also situations where you know one or 
two documents may be erroneous, and it's worth continuing in order to index 
the other 99%.


 XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
 

 Key: SOLR-445
 URL: https://issues.apache.org/jira/browse/SOLR-445
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 1.3
Reporter: Will Johnson
Assignee: Erick Erickson
 Fix For: Next

 Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, 
 solr-445.xml


 Has anyone run into the problem of handling bad documents / failures mid 
 batch.  Ie:
 <add>
   <doc>
     <field name="id">1</field>
   </doc>
   <doc>
     <field name="id">2</field>
     <field name="myDateField">I_AM_A_BAD_DATE</field>
   </doc>
   <doc>
     <field name="id">3</field>
   </doc>
 </add>
 Right now solr adds the first doc and then aborts.  It would seem like it 
 should either fail the entire batch or log a message/return a code and then 
 continue on to add doc 3.  Option 1 would seem to be much harder to 
 accomplish and possibly require more memory while Option 2 would require more 
 information to come back from the API.  I'm about to dig into this but I 
 thought I'd ask to see if anyone had any suggestions, thoughts or comments.   
  




Lucene-Solr-tests-only-trunk - Build # 3881 - Failure

2011-01-18 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3881/

No tests ran.

Build Log (for compile errors):
[...truncated 62 lines...]
+ 
JAVADOCS_ARTIFACTS=/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/javadocs
+ set +x
Checking for files containing nocommit (exits build with failure if list is 
non-empty):
+ cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 
/home/hudson/tools/ant/latest1.7/bin/ant clean
Buildfile: build.xml

clean:

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build

clean:

clean:
 [echo] Building analyzers-common...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/common
 [echo] Building analyzers-icu...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/icu
 [echo] Building analyzers-phonetic...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/phonetic
 [echo] Building analyzers-smartcn...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/smartcn
 [echo] Building analyzers-stempel...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/stempel
 [echo] Building benchmark...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/build

clean-contrib:

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/analysis-extras/build
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/analysis-extras/lucene-libs

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/clustering/build

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/dataimporthandler/target

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/extraction/build

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build

BUILD SUCCESSFUL
Total time: 3 seconds
+ cd 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 
/home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
Buildfile: build.xml

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[mkdir] Created dir: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
[javac] Compiling 507 source files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:73:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   public boolean onOrAfter(Version other) {
[javac]  ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/index/IndexWriter.java:985:
 method does not override a method from its superclass
[javac]   @Override
[javac]^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   int getColumn();
[javac]   ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:41:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   int getLine();
[javac]   ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 1 error
[...truncated 10 lines...]






[jira] Commented: (LUCENE-2374) Add reflection API to AttributeSource/AttributeImpl

2011-01-18 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983224#action_12983224
 ] 

Mark Miller commented on LUCENE-2374:
-

Agreed Uwe.

 Add reflection API to AttributeSource/AttributeImpl
 ---

 Key: LUCENE-2374
 URL: https://issues.apache.org/jira/browse/LUCENE-2374
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/analyzers
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2374-3x.patch, LUCENE-2374-3x.patch, 
 LUCENE-2374-3x.patch, shot1.png, shot2.png, shot3.png, shot4.png


 AttributeSource/TokenStream inspection in Solr needs to have some insight 
 into the contents of AttributeImpls. As LUCENE-2302 has some problems with 
 toString() [which is not structured and conflicts with CharSequence's 
 definition for CharTermAttribute], I propose a simple API that gets a default 
 implementation in AttributeImpl (just like toString() currently):
 - Iterator<Map.Entry<String,?>> AttributeImpl.contentsIterator() returns an 
 iterator (for most attributes it's a singleton) of key-value pairs, e.g. 
 term->"foobar", startOffset->Integer.valueOf(0), ...
 - AttributeSource gets the same method; it just concatenates the iterators of 
 each AttributeImpl from getAttributeImplsIterator().
 No backwards compatibility problems occur, as the default toString() method 
 will work like before (it just gets the iterator and lists the entries), but 
 we simply remove the documentation for the format. (Char)TermAttribute gets a 
 special impl of toString() according to CharSequence and a corresponding 
 iterator.
 I also want to remove the abstract hashCode() and equals() methods from 
 AttributeImpl, as they are not needed and just create work for the 
 implementor.




[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-18 Thread Ahmet Arslan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983225#action_12983225
 ] 

Ahmet Arslan commented on SOLR-1604:


When you add debugQuery=on to your search URL, you should see something like 
this in the debug section:
spanNear([text:a, text:b], 5, false), where false means un-ordered phrase 
query. Do you see it?

I will look into CommonGrams this weekend.

 Wildcards, ORs etc inside Phrase Queries
 

 Key: SOLR-1604
 URL: https://issues.apache.org/jira/browse/SOLR-1604
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Ahmet Arslan
Priority: Minor
 Fix For: Next

 Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
 ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch


 Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
 wildcards, ORs, ranges, fuzzies inside phrase queries.




[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983234#action_12983234
 ] 

Ryan McKinley commented on LUCENE-2657:
---

Steve, great work with this patch -- it takes care of all the previous concerns 
about our problematic maven support.

With this patch, we now have:
 * testable maven artifacts
 * easy repo distribution
 * ant is still *the* build system

The RM can choose to ignore the generate-maven-artifacts target and let someone 
else push the artifacts.  

As with most religious conflicts -- I hope the resolution is not conversion, 
but rather something that lets everyone live (work) in peace.  





 Replace Maven POM templates with full POMs, and change documentation 
 accordingly
 

 Key: LUCENE-2657
 URL: https://issues.apache.org/jira/browse/LUCENE-2657
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.1, 4.0
Reporter: Steven Rowe
Assignee: Steven Rowe
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch


 The current Maven POM templates only contain dependency information, the bare 
 bones necessary for uploading artifacts to the Maven repository.
 The full Maven POMs in the attached patch include the information necessary 
 to run a multi-module Maven build, in addition to serving the same purpose as 
 the current POM templates.
 Several dependencies are not available through public maven repositories.  A 
 profile in the top-level POM can be activated to install these dependencies 
 from the various {{lib/}} directories into your local repository.  From the 
 top-level directory:
 {code}
 mvn -N -Pbootstrap install
 {code}
 Once these non-Maven dependencies have been installed, to run all Lucene/Solr 
 tests via Maven's surefire plugin, and populate your local repository with 
 all artifacts, from the top level directory, run:
 {code}
 mvn install
 {code}
 When one Lucene/Solr module depends on another, the dependency is declared on 
 the *artifact(s)* produced by the other module and deposited in your local 
 repository, rather than on the other module's un-jarred compiler output in 
 the {{build/}} directory, so you must run {{mvn install}} on the other module 
 before its changes are visible to the module that depends on it.
 To create all the artifacts without running tests:
 {code}
 mvn -DskipTests install
 {code}
 I almost always include the {{clean}} phase when I do a build, e.g.:
 {code}
 mvn -DskipTests clean install
 {code}




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Grant Ingersoll
I still don't see why you care so much.  You have people willing to maintain it 
and it is no sweat off your back and it is used by a pretty large chunk of 
downstream users.  And don't tell me it is what holds up releases b/c it simply 
isn't true.


On Jan 18, 2011, at 9:12 AM, Robert Muir wrote:

 On Tue, Jan 18, 2011 at 9:10 AM, Earwin Burrfoot ear...@gmail.com wrote:
 Latest code from LUCENE-2657 does not generate any new artifacts. It
 uploads those you already have (built via ant) to the repo.
 
 
 yep, that's releasing artifacts. that's the whole point of this email
 thread (read the title, thanks)
 
 the intellij/eclipse stuff is just unreleased stuff that sits in our
 SVN. it doesn't get uploaded anywhere.
 
 






Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 10:53 AM, Grant Ingersoll gsing...@apache.org wrote:
 I still don't see why you care so much.  You have people willing to maintain 
 it and it is no sweat off your back and it is used by a pretty large chunk of 
 downstream users.  And don't tell me it is what holds up releases b/c it 
 simply isn't true.


it is what holds up releases. the last time i brought up releasing, it
was totally destroyed because of maven.

the RM shouldn't have to deal with 2 build systems, packaging systems,
and repository hell, and that's what maven artifacts require.

If there is a large chunk of downstream users, then they can handle
this downstream, it doesn't need to be in lucene, just like we don't
deal with other packaging systems.

Unfortunately there is a very loud minority that cares about maven;
most of us who think the situation is ridiculous have totally given
up arguing about it, except me. i don't want to put out a shitty
release with broken maven artifacts like in the past; i'd rather let
some downstream project deal with maven instead.




[jira] Resolved: (LUCENE-2755) Some improvements to CMS

2011-01-18 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2755.


Resolution: Fixed

 Some improvements to CMS
 

 Key: LUCENE-2755
 URL: https://issues.apache.org/jira/browse/LUCENE-2755
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2755.patch


 While running optimize on a large index, I've noticed several things that got 
 me to read CMS code more carefully, and find these issues:
 * CMS may hold onto a merge if maxMergeCount is hit. That results in the 
 MergeThreads taking merges from the IndexWriter until they are exhausted, and 
 only then does that blocked merge run. I think it's unnecessary for that 
 merge to be blocked.
 * CMS sorts merges by segments size, doc-based and not bytes-based. Since the 
 default MP is LogByteSizeMP, and I hardly believe people care about doc-based 
 size segments anymore, I think we should switch the default impl. There are 
 two ways to make it extensible, if we want:
 ** Have an overridable member/method in CMS that you can extend and override 
 - easy.
 ** Have OneMerge be comparable and let the MP determine the order (e.g. by 
 bytes, docs, calibrate deletes etc.). Better, but will need to tap into 
 several places in the code, so more risky and complicated.
 On the go, I'd like to add some documentation to CMS - it's not very easy to 
 read and follow.
 I'll work on a patch.
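
 The byte-based ordering suggested above can be sketched as a Comparator. This 
 is a minimal illustration only: the OneMerge class here is a hypothetical 
 stand-in with invented field names, not Lucene's actual MergePolicy.OneMerge.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for MergePolicy.OneMerge, reduced to the two
// size measures discussed above; field names are illustrative only.
class OneMerge {
    final String name;
    final long estimatedMergeBytes; // bytes-based size (LogByteSizeMP's measure)
    final int totalDocCount;        // doc-based size (CMS's current sort key)

    OneMerge(String name, long bytes, int docs) {
        this.name = name;
        this.estimatedMergeBytes = bytes;
        this.totalDocCount = docs;
    }
}

public class MergeOrdering {
    // Largest-first by bytes, i.e. the bytes-based default proposed above.
    static final Comparator<OneMerge> BY_BYTES =
        Comparator.comparingLong((OneMerge m) -> m.estimatedMergeBytes).reversed();

    public static void main(String[] args) {
        List<OneMerge> merges = new ArrayList<>();
        merges.add(new OneMerge("fewLargeDocs", 10_000_000L, 500));
        merges.add(new OneMerge("manySmallDocs", 2_000_000L, 9_000));
        merges.sort(BY_BYTES);
        // Doc-based ordering would pick "manySmallDocs" first; byte-based
        // ordering runs the genuinely bigger merge first.
        System.out.println(merges.get(0).name); // prints "fewLargeDocs"
    }
}
```

 Letting the MP hand CMS a Comparator like this (rather than hard-coding it in 
 CMS) corresponds to the second, more flexible option above.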

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.





[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2011-01-18 Thread Michael Busch (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983246#action_12983246
 ] 

Michael Busch commented on LUCENE-2324:
---

{quote}
I ran a quick perf test here: I built the 10M Wikipedia index,
Standard codec, using 6 threads. Trunk took 541.6 sec; RT took 518.2
sec (only a bit faster), but the test wasn't really fair because it
flushed @ docCount=12870.
{quote}

Thanks for running the tests!
Hmm that's a bit disappointing - we were hoping for more speedup.  
Flushing by docCount is currently per DWPT, so every initial segment
in your test had 12870 docs. I guess there's a lot of merging happening.

Maybe you could rerun with higher docCount?

bq. But I can't test flush by RAM - that's not working yet on RT right?

True.  I'm going to add that soonish.  There's one thread-safety bug 
related to deletes that needs to be fixed too.

{quote}
Then I ran a single-threaded test. Trunk took 1097.1 sec and RT took
1040.5 sec - a bit faster! Presumably in the noise (we don't expect
a speedup?), but excellent that it's not slower...
{quote}

Yeah I didn't expect much speedup - cool! :)  Maybe because some 
code is gone, like the WaitQueue, not sure how much overhead that 
added in the single-threaded case.

{quote}
I think we lost infoStream output on the details of flushing? I can't
see when which DWPTs are flushing...
{quote}

Oh yeah, good point, I'll add some infoStream messages to DWPT!

 Per thread DocumentsWriters that write their own private segments
 -

 Key: LUCENE-2324
 URL: https://issues.apache.org/jira/browse/LUCENE-2324
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael Busch
Assignee: Michael Busch
Priority: Minor
 Fix For: Realtime Branch

 Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, 
 LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, 
 LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch, 
 lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out


 See LUCENE-2293 for motivation and more details.
 I'm copying here Mike's summary he posted on 2293:
 Change the approach for how we buffer in RAM to a more isolated
 approach, whereby IW has N fully independent RAM segments
 in-process and when a doc needs to be indexed it's added to one of
 them. Each segment would also write its own doc stores and
 normal segment merging (not the inefficient merge we now do on
 flush) would merge them. This should be a good simplification in
 the chain (eg maybe we can remove the *PerThread classes). The
 segments can flush independently, letting us make much better
 concurrent use of IO & CPU.






[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983247#action_12983247
 ] 

Chris A. Mattmann commented on LUCENE-2657:
---

+1 for Steve's patch, great work and you beat me to it. 


 Replace Maven POM templates with full POMs, and change documentation 
 accordingly
 

 Key: LUCENE-2657
 URL: https://issues.apache.org/jira/browse/LUCENE-2657
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.1, 4.0
Reporter: Steven Rowe
Assignee: Steven Rowe
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, 
 LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch


 The current Maven POM templates only contain dependency information, the bare 
 bones necessary for uploading artifacts to the Maven repository.
 The full Maven POMs in the attached patch include the information necessary 
 to run a multi-module Maven build, in addition to serving the same purpose as 
 the current POM templates.
 Several dependencies are not available through public maven repositories.  A 
 profile in the top-level POM can be activated to install these dependencies 
 from the various {{lib/}} directories into your local repository.  From the 
 top-level directory:
 {code}
 mvn -N -Pbootstrap install
 {code}
 Once these non-Maven dependencies have been installed, to run all Lucene/Solr 
 tests via Maven's surefire plugin, and populate your local repository with 
 all artifacts, from the top level directory, run:
 {code}
 mvn install
 {code}
 When one Lucene/Solr module depends on another, the dependency is declared on 
 the *artifact(s)* produced by the other module and deposited in your local 
 repository, rather than on the other module's un-jarred compiler output in 
 the {{build/}} directory, so you must run {{mvn install}} on the other module 
 before its changes are visible to the module that depends on it.
 To create all the artifacts without running tests:
 {code}
 mvn -DskipTests install
 {code}
 I almost always include the {{clean}} phase when I do a build, e.g.:
 {code}
 mvn -DskipTests clean install
 {code}






Re: Let's drop Maven Artifacts !

2011-01-18 Thread Mark Miller

On Jan 18, 2011, at 11:12 AM, Robert Muir wrote:

 
 there is a very loud minority that care about maven,
 most of us that think the situation is ridiculous have totally given
 up arguing about it, except me, i don't want to put out a shitty
 release with broken maven artifacts like in the past, i'd rather let
 some downstream project deal with maven instead.

+1. What a fantastic idea for an Apache Extras project :)

I'll open my arms to first class maven the first time it sees the light of 
consensus ;)

- Mark



Re: Let's drop Maven Artifacts !

2011-01-18 Thread Grant Ingersoll

On Jan 18, 2011, at 11:12 AM, Robert Muir wrote:

 On Tue, Jan 18, 2011 at 10:53 AM, Grant Ingersoll gsing...@apache.org wrote:
 I still don't see why you care so much.  You have people willing to maintain 
 it and it is no sweat off your back and it is used by a pretty large chunk 
 of downstream users.  And don't tell me it is what holds up releases b/c it 
 simply isn't true.
 
 
 it is what holds up releases. the last time i brought up releasing, it
 was totally destroyed because of maven.

I'll grant you it held up the last release _ONCE WE DECIDED TO RELEASE_, but 
don't act like it is why we don't release very often, because it isn't.

 
 the RM shouldn't have to deal with 2 build systems, packaging systems,
 and repository hell, and that's what maven artifacts require.

And Steve has said he would fix it and it won't require two build systems, so 
your main complaint is solved.



[jira] Commented: (LUCENE-2472) The terms index divisor in IW should be set via IWC not via getReader

2011-01-18 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983252#action_12983252
 ] 

Michael McCandless commented on LUCENE-2472:


bq. So I guess all that's needed is to deprecate IW.getReader(int) on 3x and 
remove from trunk?

+1

Though, it's already removed on trunk.  So we just need to deprecate on 3.x...

 The terms index divisor in IW should be set via IWC not via getReader
 -

 Key: LUCENE-2472
 URL: https://issues.apache.org/jira/browse/LUCENE-2472
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.1, 4.0


 The getReader call gives a false sense of security... since if deletions have 
 already been applied (and IW is pooling) the readers have already been loaded 
 with a divisor of 1.
 Better to set the divisor up front in IWC.






Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 11:27 AM, Mark Miller markrmil...@gmail.com wrote:
 I'll open my arms to first class maven the first time it sees the light of 
 consensus ;)

thats the main thing missing from releasing maven artifacts... looking
at previous threads I don't really see consensus that we need to do
this.




Re: Lucene-Solr-tests-only-trunk - Build # 3864 - Failure

2011-01-18 Thread Michael McCandless
This was caused by a latent bug in PrefixCodedTermsReader...

But, I'm about to replace that w/ BlockTermsReader, so I'll leave this
bug there...

Mike

On Mon, Jan 17, 2011 at 2:05 AM, Apache Hudson Server
hud...@hudson.apache.org wrote:
 Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3864/

 1 tests failed.
 REGRESSION:  org.apache.lucene.util.automaton.fst.TestFSTs.testRealTerms

 Error Message:
 null

 Stack Trace:
 junit.framework.AssertionFailedError
        at 
 org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1127)
        at 
 org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1059)
        at 
 org.apache.lucene.index.codecs.intblock.FixedIntBlockIndexInput$Index.read(FixedIntBlockIndexInput.java:167)
        at 
 org.apache.lucene.index.codecs.sep.SepPostingsReaderImpl.readTerm(SepPostingsReaderImpl.java:167)
        at 
 org.apache.lucene.index.codecs.pulsing.PulsingPostingsReaderImpl.readTerm(PulsingPostingsReaderImpl.java:135)
        at 
 org.apache.lucene.index.codecs.PrefixCodedTermsReader$FieldReader$SegmentTermsEnum.next(PrefixCodedTermsReader.java:508)
        at 
 org.apache.lucene.index.codecs.PrefixCodedTermsReader$FieldReader$SegmentTermsEnum.seek(PrefixCodedTermsReader.java:431)
        at org.apache.lucene.index.TermsEnum.seek(TermsEnum.java:68)
        at 
 org.apache.lucene.util.automaton.fst.TestFSTs.testRealTerms(TestFSTs.java:1016)




 Build Log (for compile errors):
 [...truncated 2947 lines...]









[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs

2011-01-18 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983260#action_12983260
 ] 

Erick Erickson commented on SOLR-2303:
--

I am Officially Confused, but the culprit appears to be 
log4j-over-slf4j-1.5.5.jar

3_x has:
   log4j jars in solr/contrib/extraction and solr/contrib/clustering
   a bunch of slf4j jars in solr/lib (but NOT log4j-over-slf4j-1.5.5.jar, see 
below).
   All tests succeed just fine.

Trunk has:
  no log4j jars in contrib
  the same slf4j jars as in 3_x BUT ALSO log4j-over-slf4j-1.5.5.jar
  VelocityResponseWriterTest fails


In trunk, removing log4j-over-slf4j-1.5.5.jar allows VelocityResponseWriterTest 
and all other tests to succeed.

in 3_x, removing the log4j jars from solr/contrib makes no difference, all 
tests pass.

So I propose that the fix for this is to remove the log4j files from 3_x and 
the log4j-over-slf4j-1.5.5.jar from trunk.

Should I create a patch? And do patches actually remove jars like this?

 remove unnecessary (and problematic) log4j jars in contribs
 ---

 Key: SOLR-2303
 URL: https://issues.apache.org/jira/browse/SOLR-2303
 Project: Solr
  Issue Type: Improvement
  Components: Build
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: SOLR-2303.patch


 In solr 4.0 there is log4j-over-slf4j.
 But if you have log4j jars also in the classpath (e.g. contrib/extraction, 
 contrib/clustering) you can get strange errors such as:
 java.lang.NoSuchMethodError: org.apache.log4j.Logger.setAdditivity(Z)V
 So I think we should remove the log4j jars in these contribs, all tests pass 
 with them removed.






[jira] Reopened: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs

2011-01-18 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reopened SOLR-2303:
--


See previous comment, I believe that there are some jars in Solr that need to 
be removed.







[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs

2011-01-18 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983265#action_12983265
 ] 

Robert Muir commented on SOLR-2303:
---

Erick, actually i think the issue is that log4j-over-slf4j conflicts with 
log4j, if log4j is in the classpath.

The problem is that currently, the solr build runs tests with whatever is in 
ant's classpath.
This is why the tests pass for you, even if you remove all logging jars, but 
this is obviously bad as it's not really a repeatable build.

So to fix this, we need to use includeantruntime=no in the junit tasks, and 
also not include $java.class.path in the test classpath.
instead, we explicitly include the ant libs we supply (especially since we 
extend some of them for testing).

This might cause some warnings or even errors for ant 1.8 users, but I think 
that's ok.








[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs

2011-01-18 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983266#action_12983266
 ] 

Mark Miller commented on SOLR-2303:
---

Hey Erick,

If I remember right, log4j-over-slf4j is in there for proper zookeeper logging 
(hoping they switch to slf4j). Rather than dropping it, we should likely try 
and figure out how to keep and fix the issue - as suggested by Robert.







RE: Let's drop Maven Artifacts !

2011-01-18 Thread Steven A Rowe
On 1/18/2011 at 11:34 AM, Robert Muir wrote:
 On Tue, Jan 18, 2011 at 11:27 AM, Mark Miller markrmil...@gmail.com
 wrote:
  I'll open my arms to first class maven the first time it sees the light
  of consensus ;)
 
 thats the main thing missing from releasing maven artifacts... looking
 at previous threads I don't really see consensus that we need to do
 this.

I think there is consensus that the RM does not have to release Maven artifacts.

There clearly is no consensus for removing Maven support from Lucene.  

 Unfortunately there is a very loud minority that care about maven

I would wager that there is a sizable silent *majority* of users who literally 
depend on Lucene's Maven artifacts.

Steve



Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 12:03 PM, Steven A Rowe sar...@syr.edu wrote:

 There clearly is no consensus for removing Maven support from Lucene.

and see there is my problem, there was no consensus to begin with, now
suddenly its de-facto required. Maven is quite an insidious computer
virus.


 Unfortunately there is a very loud minority that care about maven

 I would wager that there is a sizable silent *majority* of users who 
 literally depend on Lucene's Maven artifacts.

I can't help but remind myself, this is the same argument Oracle
offered up for the whole reason hudson debacle
(http://hudson-labs.org/content/whos-driving-thing)

Declaring that I have a secret pocket of users that want XYZ isn't
open source consensus.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Michael Busch

On 1/18/11 9:13 AM, Robert Muir wrote:

I can't help but remind myself, this is the same argument Oracle
offered up for the whole reason hudson debacle
(http://hudson-labs.org/content/whos-driving-thing)

Declaring that I have a secret pocket of users that want XYZ isn't
open source consensus.


Well everyone using ant+ivy or maven as their build system likely
consumes artifacts from maven repos.

I'm surprised you're so much against continuing to publish.  I too really
really want to keep ant as Lucene's build tool.  Maven has made me
suicidal in the past.  But I don't want to stop publishing artifacts
to commonly used repos.

I guess we could try to figure out how many people download the
artifacts from m2 repos.  Maybe they have download statistics?
But then what?  What number would justify no longer publishing?

 Michael




RE: Let's drop Maven Artifacts !

2011-01-18 Thread Steven A Rowe
On 1/18/2011 at 12:14 PM, Robert Muir wrote:
 On Tue, Jan 18, 2011 at 12:03 PM, Steven A Rowe sar...@syr.edu wrote:
 
  There clearly is no consensus for removing Maven support from Lucene.
 
 and see there is my problem, there was no consensus to begin with, now
 suddenly its de-facto required. Maven is quite an insidious computer
 virus.

So you think you personally have the power to remove functionality from Lucene 
that has the support of multiple committers?

  Unfortunately there is a very loud minority that care about maven
 
  I would wager that there is a sizable silent *majority* of users who
 literally depend on Lucene's Maven artifacts.
 
 I can't help but remind myself, this is the same argument Oracle
 offered up for the whole reason hudson debacle
 (http://hudson-labs.org/content/whos-driving-thing)
 
 Declaring that I have a secret pocket of users that want XYZ isn't
 open source consensus.

In summary: you claim a silent majority (of devs) in favor of your position, 
and I claim a silent majority (of users) in favor of mine.  Your move: my 
majority, of which I have no proof, has no standing.  Sweet.

I dunno - why are we at war?  Why is it so damn important that you *remove* 
functionality that devs care about and will support?

Steve


[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2011-01-18 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983316#action_12983316
 ] 

Michael McCandless commented on LUCENE-2324:



The branch is looking very nice!!  Very clean :)

Random comments:

Why does DW.anyDeletions need to be sync'd?

Missing headers on at least DocumentsWriterPerThreadPool,
ThreadAffinityDWTP.

IWC.setIndexerThreadPool's javadoc is stale.

On ThreadAffinityDWTP... it may be better if we had a single queue,
where threads wait in line, if no DWPT is available?  And when a DWPT
finishes it then notifies any waiting threads?  (Ie, instead of queue-per-DWPT).
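
A minimal sketch of that single-queue alternative, with a BlockingQueue as the
shared pool. Class and method names here are invented for illustration; this is
not the branch's actual code, and strings stand in for DWPT objects.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// All indexing threads wait on one shared queue of free DWPTs instead of
// one queue per DWPT, so a thread takes whichever DWPT frees up first.
class DwptPool {
    private final BlockingQueue<String> free; // stand-in for DWPT instances

    DwptPool(int n) {
        free = new ArrayBlockingQueue<>(n);
        for (int i = 0; i < n; i++) free.add("dwpt-" + i);
    }

    // Threads wait in a single line; a large doc holding one DWPT no
    // longer blocks threads that could proceed on a different DWPT.
    String checkout() throws InterruptedException { return free.take(); }

    void checkin(String dwpt) { free.add(dwpt); }
}

public class SingleQueueDemo {
    public static void main(String[] args) throws Exception {
        DwptPool pool = new DwptPool(2);
        String a = pool.checkout();
        String b = pool.checkout();      // pool now empty; next take() would block
        pool.checkin(a);                 // releasing one DWPT...
        String c = pool.checkout();      // ...immediately satisfies the next taker
        System.out.println(c.equals(a)); // prints "true"
    }
}
```

The queue also gives the notify-on-finish behavior for free: checkin() wakes
any thread blocked in take(), so no explicit wait/notify bookkeeping is needed.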

I see the fieldInfos.update(dwpt.getFieldInfos()) (in
DW.updateDocument) -- is there a risk that two threads bring a new
field into existence at the same time, but w/ different config?  Eg
one doc omitsTFAP and the other doesn't?  Or, on flush, does each DWPT
use its private FieldInfos to correctly flush the segment?  (Hmm: do
we seed each DWPT w/ the original FieldInfos created by IW on init?).

How are we handling the case of open IW, do delete-by-term but no
added docs?

Does DW.pushDeletes really need to sync on IW?  BufferedDeletes is
sync'd already.

DW.substractFlushedDocs is mis-spelled (not sure it's used though).

In DW.deleteTerms... shouldn't we skip a DWPT if it has no buffered
docs?








Re: Let's drop Maven Artifacts !

2011-01-18 Thread Mark Miller

On Jan 18, 2011, at 12:30 PM, Michael Busch wrote:
 
 I guess we could try to figure out how many people download the
 artifacts from m2 repos.  Maybe they have download statistics?
 But then what?  What number would justify stopping to publish?
 
 Michael

Realistically, I would expect that Maven artifacts would still be published, 
even if we kick them out of the Lucene project to Apache extras.
If some of the people care as much as they say they do, they will figure out 
how to make poms and whatever downstream, and a committer who's into Maven will 
put them on the official Apache repo. It will then truly not be a concern to 
the rest of us.

- Mark



[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs

2011-01-18 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983317#action_12983317
 ] 

Erick Erickson commented on SOLR-2303:
--

Ah, I think the light finally dawns. And helps explain why I'm getting 
different results on different machines/environments

There's a reason they don't often let me near build systems.

Ok, splendid. I suggested removing things to see if it was a bad idea. It is. 
Almost.

So does it still make sense to remove the log4j jars in contrib in the 3_x 
branch?

Robert:
I did as you suggested, and of course started getting classNotFound errors 
for JUnitTestRunner and so-on. So I included these lines in Solr's build.xml.

<pathelement path="${common-solr.dir}/../lucene/lib/ant-junit-1.7.1.jar"/>
<pathelement path="${common-solr.dir}/../lucene/lib/ant-1.7.1.jar"/>
<pathelement path="${common-solr.dir}/../lucene/lib/junit-4.7.jar"/>
 
in place of java.class.path and all is well. Is this the path you'd go down? 
I'm not very comfortable having Solr reach over into Lucene, but what do I know?

It should be fairly obvious by now that I'm not very ant-sophisticated; is 
there a preferred way of doing this? Because if this is OK, it seems we should 
also remove junit-4.7.jar from ../solr/lib and point anything that needs it 
to ../lucene/lib as well.

I'm currently testing similar changes on the 3_x build with log4j files 
removed. But that worked before as well.

Let me know








Re: Let's drop Maven Artifacts !

2011-01-18 Thread Mark Miller

On Jan 18, 2011, at 12:28 PM, Steven A Rowe wrote:

 On 1/18/2011 at 12:14 PM, Robert Muir wrote:
 On Tue, Jan 18, 2011 at 12:03 PM, Steven A Rowe sar...@syr.edu wrote:
 
 There clearly is no consensus for removing Maven support from Lucene.
 
 and see there is my problem, there was no consensus to begin with, now
 suddenly its de-facto required. Maven is quite an insidious computer
 virus.
 
 So you think you personally have the power to remove functionality from 
 Lucene that has the support of multiple committers?

If he thought that, he would have removed maven from svn by now!

From my point of view, but perhaps I misremember:

At some point, Grant or someone put in some Maven poms. I don't think anyone 
else really paid attention. Later, as we did releases, and saw and dealt with 
these poms, most of us commented against Maven support. It just feels to me 
like it slipped in - and really its the type of thing that should have been 
more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene 
IMO. To my knowledge, the majority of core developers do not want maven in the 
build and/or frown on dealing with Maven. We could always have a little vote to 
gauge numbers - I just have not wanted to rush to another vote thread myself ;) 
Users are important too - but they don't get official votes - it's up to each 
of us to consider the User feelings/vote in our opinions/votes as we see fit 
IMO.

- Mark



[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2011-01-18 Thread Michael Busch (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983346#action_12983346
 ] 

Michael Busch commented on LUCENE-2324:
---

bq. Why does DW.anyDeletions need to be sync'd?

Hmm good point.  Actually only the call to DW.pendingDeletes.any() needs to be 
synced, but not the loop that calls the DWPTs.

{quote}
In ThreadAffinityDWTP... it may be better if we had a single queue,
where threads wait in line, if no DWPT is available? And when a DWPT
finishes it then notifies any waiting threads? (Ie, instead of queue-per-DWPT).
{quote}

Whole Foods instead of Safeway? :)
Yeah that would be fairer.  A large doc (= a full cart) wouldn't block unlucky 
other docs.  I'll make that change, good idea!
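The "single line" idea can be sketched with one fair blocking queue of free writers; everything here (DWPTPoolSketch, the DWPT stand-in) is an invented name for illustration, assuming only that a thread checks a writer out, uses it, and returns it:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch only (DWPT here is a stand-in, not Lucene's class).
// Instead of one wait queue per DWPT, all indexing threads wait on a single
// fair queue of free writers: whichever DWPT frees up first goes to the
// thread that has waited longest, so a large doc ("a full cart") in one
// writer no longer blocks a fixed subset of unlucky threads.
class DWPTPoolSketch {
    static class DWPT {
        int docsIndexed;
        void updateDocument(String doc) { docsIndexed++; }
    }

    private final BlockingQueue<DWPT> freeList;

    DWPTPoolSketch(int numWriters) {
        freeList = new ArrayBlockingQueue<>(numWriters, true); // fair = FIFO waiters
        for (int i = 0; i < numWriters; i++) {
            freeList.add(new DWPT());
        }
    }

    void indexDocument(String doc) {
        DWPT dwpt;
        try {
            dwpt = freeList.take();    // wait in the single line
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        try {
            dwpt.updateDocument(doc);
        } finally {
            freeList.offer(dwpt);      // never fails: capacity == writer count
        }
    }

    int totalDocs() {
        int sum = 0;
        for (DWPT d : freeList) {
            sum += d.docsIndexed;
        }
        return sum;
    }
}
```

With the fairness flag set, `take()` hands each freed writer to the longest-waiting thread, which is the FIFO behavior being proposed over per-DWPT queues.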

{quote}
I see the fieldInfos.update(dwpt.getFieldInfos()) (in
DW.updateDocument) - is there a risk that two threads bring a new
field into existence at the same time, but w/ different config? Eg
one doc omitsTFAP and the other doesn't? Or, on flush, does each DWPT
use its private FieldInfos to correctly flush the segment? (Hmm: do
we seed each DWPT w/ the original FieldInfos created by IW on init?).
{quote}

Every DWPT has its own private FieldInfos.  When a segment is flushed the DWPT 
uses its private FI and then it updates the original DW.fieldInfos (from IW), 
which is a synchronized call.  
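A minimal sketch of that pattern, with made-up names and a boolean standing in for per-field configuration; the first-definition-wins rule here is an assumption for the example, not necessarily Lucene's actual behavior:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only -- FieldInfosSketch is not Lucene's FieldInfos.
// Each DWPT flushes its segment from its own private map; the sole shared
// step is one synchronized update into the writer-global map. Here the
// first definition of a field wins (an assumption for the example).
class FieldInfosSketch {
    final Map<String, Boolean> global = new HashMap<>(); // field name -> e.g. omitsTF

    // The only cross-thread touch point; DWPTs never read each other's maps.
    synchronized void update(Map<String, Boolean> privateInfos) {
        for (Map.Entry<String, Boolean> e : privateInfos.entrySet()) {
            global.putIfAbsent(e.getKey(), e.getValue());
        }
    }
}
```

Under this scheme, two threads bringing the same field into existence with different config resolve deterministically at the synchronized merge point, which is one way to frame the omitTFAP question in the quote.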

The only consumer of DW.getFieldInfos() is SegmentMerger in IW.  Hmm, given 
that IW.flush() isn't synchronized anymore I assume this can lead to a 
problem?  E.g. the SegmentMerger gets a FieldInfos that's newer than the list 
of segments it's trying to flush?

bq. How are we handling the case of open IW, do delete-by-term but no added 
docs?

DW has a SegmentDeletes (pendingDeletes) which gets pushed to the last segment. 
 We only add delTerms to DW.pendingDeletes if we couldn't push it to any DWPT.  
Btw. I think the whole pushDeletes business isn't working correctly yet, I'm 
looking into it.  I need to understand the code that coalesces the deletes 
better. 

bq. In DW.deleteTerms... shouldn't we skip a DWPT if it has no buffered docs?

Yeah, I did that already, but not committed yet.

 Per thread DocumentsWriters that write their own private segments
 -

 Key: LUCENE-2324
 URL: https://issues.apache.org/jira/browse/LUCENE-2324
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael Busch
Assignee: Michael Busch
Priority: Minor
 Fix For: Realtime Branch

 Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, 
 LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, 
 LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch, 
 lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out


 See LUCENE-2293 for motivation and more details.
 I'm copying here Mike's summary he posted on 2293:
 Change the approach for how we buffer in RAM to a more isolated
 approach, whereby IW has N fully independent RAM segments
 in-process and when a doc needs to be indexed it's added to one of
 them. Each segment would also write its own doc stores and
 normal segment merging (not the inefficient merge we now do on
 flush) would merge them. This should be a good simplification in
 the chain (eg maybe we can remove the *PerThread classes). The
 segments can flush independently, letting us make much better
 concurrent use of IO & CPU.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.





[jira] Resolved: (LUCENE-2472) The terms index divisor in IW should be set via IWC not via getReader

2011-01-18 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-2472.


   Resolution: Fixed
Fix Version/s: (was: 4.0)

You're right Mike. I committed the deprecation note to revision 1060545.

 The terms index divisor in IW should be set via IWC not via getReader
 -

 Key: LUCENE-2472
 URL: https://issues.apache.org/jira/browse/LUCENE-2472
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.1


 The getReader call gives a false sense of security... since if deletions have 
 already been applied (and IW is pooling) the readers have already been loaded 
 with a divisor of 1.
 Better to set the divisor up front in IWC.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Michael Busch

On 1/18/11 10:44 AM, Mark Miller wrote:

 From my point of view, but perhaps I misremember:

At some point, Grant or someone put in some Maven poms.
I did. :) It was a ton of work and especially getting the 
maven-ant-tasks to work was a nightmare!



I don't think anyone else really paid attention.


All those patches were attached to a jira issue, and the issue was open 
for a while, with people asking for published maven artifacts.



Later, as we did releases, and saw and dealt with these poms, most of us 
commented against Maven support.


So can you explain what the problem with the maven support is?  Isn't it 
enough to just call the ant target and copying the generated files 
somewhere?  When I did releases I never thought it made the release any 
harder.  Just two additional easy steps.



It just feels to me like it slipped in - and really its the type of thing that 
should have been more discussed and thought out, and perhaps voted upon. Maven 
snuck into Lucene IMO. To my knowledge, the majority of core developers do not 
want maven in the build and/or frown on dealing with Maven. We could always 
have a little vote to gauge numbers - I just have not wanted to rush to another 
vote thread myself ;) Users are important too - but they don't get official 
votes - it's up to each of us to consider the User feelings/vote in our 
opinions/votes as we see fit IMO.

- Mark



RE: Let's drop Maven Artifacts !

2011-01-18 Thread Steven A Rowe
On 1/18/2011 at 1:45 PM, Mark Miller wrote:
 At some point, Grant or someone put in some Maven poms. I don't think
 anyone else really paid attention. Later, as we did releases, and saw and
 dealt with these poms, most of us commented against Maven support. It just
 feels to me like it slipped in - and really its the type of thing that
 should have been more discussed and thought out, and perhaps voted upon.
 Maven snuck into Lucene IMO.

Lucene's policy is commit-then-review, and lazy consensus is the rule, right?


[jira] Created: (LUCENE-2873) TestIndexWriterReader fails: too many open files

2011-01-18 Thread Robert Muir (JIRA)
TestIndexWriterReader fails: too many open files


 Key: LUCENE-2873
 URL: https://issues.apache.org/jira/browse/LUCENE-2873
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.1
 Environment: java version 1.6.0
Java(TM) SE Runtime Environment (build pxi3260sr9-20101125_01(SR9))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux x86-32 
jvmxi3260sr9-20101124_69295 (JIT enabled, AOT enabled)
J9VM - 20101124_069295
JIT  - r9_20101028_17488ifx2
GC   - 20101027_AA)
JCL  - 20101119_01

Reporter: Robert Muir


{noformat}
[junit] Testsuite: org.apache.lucene.index.TestIndexWriterReader
[junit] Testcase: 
testAddIndexesAndDoDeletesThreads(org.apache.lucene.index.TestIndexWriterReader):
 Caused an ERROR
[junit] 
/home/cron/branch_3x/lucene/build/test/6/test7430286492423218781tmp/_90.prx 
(Too many open files)
[junit] java.io.FileNotFoundException: 
/home/cron/branch_3x/lucene/build/test/6/test7430286492423218781tmp/_90.prx 
(Too many open files)
[junit] at java.io.RandomAccessFile.open(Native Method)
[junit] at java.io.RandomAccessFile.<init>(RandomAccessFile.java:229)
[junit] at 
org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:69)
[junit] at 
org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:90)
[junit] at 
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:91)
[junit] at 
org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:78)
[junit] at 
org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:353)
[junit] at 
org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:358)
[junit] at 
org.apache.lucene.store.Directory.openInput(Directory.java:139)
[junit] at 
org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:135)
[junit] at 
org.apache.lucene.index.SegmentReader.get(SegmentReader.java:583)
[junit] at 
org.apache.lucene.index.SegmentReader.get(SegmentReader.java:561)
[junit] at 
org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:101)
[junit] at 
org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:27)
[junit] at 
org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:78)
[junit] at 
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:697)
[junit] at 
org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:72)
[junit] at 
org.apache.lucene.index.IndexReader.open(IndexReader.java:344)
[junit] at 
org.apache.lucene.index.IndexReader.open(IndexReader.java:230)
[junit] at 
org.apache.lucene.index.TestIndexWriterReader.testAddIndexesAndDoDeletesThreads(TestIndexWriterReader.java:381)
[junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1007)
[junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:939)
[junit] 
[junit] 
[junit] Tests run: 18, Failures: 0, Errors: 1, Time elapsed: 13.56 sec
[junit] 
[junit] - Standard Error -
[junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterReader 
-Dtestmethod=testAddIndexesAndDoDeletesThreads 
-Dtests.seed=-7781539944268912038:-6865031686554264582
[junit] NOTE: test params are: locale=ar_SD, timezone=Asia/Almaty
[junit] NOTE: all tests run in this JVM:
[junit] [TestMergeSchedulerExternal, TestCharFilter, 
TestISOLatin1AccentFilter, TestCharTermAttributeImpl, TestDoc, 
TestFieldsReader, TestFilterIndexReader, TestIndexWriterReader]
[junit] NOTE: Linux 2.6.32-24-generic x86/IBM Corporation 1.6.0 
(32-bit)/cpus=1,threads=3,free=6495816,total=11920384
[junit] -  ---
[junit] TEST org.apache.lucene.index.TestIndexWriterReader FAILED
{noformat}





Re: Let's drop Maven Artifacts !

2011-01-18 Thread Mark Miller

On Jan 18, 2011, at 2:37 PM, Steven A Rowe wrote:

 On 1/18/2011 at 1:45 PM, Mark Miller wrote:
 At some point, Grant or someone put in some Maven poms. I don't think
 anyone else really paid attention. Later, as we did releases, and saw and
 dealt with these poms, most of us commented against Maven support. It just
 feels to me like it slipped in - and really its the type of thing that
 should have been more discussed and thought out, and perhaps voted upon.
 Maven snuck into Lucene IMO.
 
 Lucene's policy is commit-then-review, and lazy consensus is the rule, right?

Right - clearly this is not some sneaky or underhanded thing that happened. 
Certainly this is how a lot of legit things happen.

The only reason I feel it was more of a Maven sneaking in thing is that in IRC 
I have learned how many active core devs really didn't want Maven in the build 
at a later time. I think we just didn't really know what was happening / paid 
attention. I don't mean to characterize incorrectly. If you asked me back then, 
I prob would not have understood the consequences whatsoever and said, please 
go ahead! Patches welcome.

People's opinions have shifted though - we have more committers now - perhaps 
the Maven support side is larger than the against now.

Just stating things as I roughly knew them - happy to see things cleared up, 
fine-tuned. 

- Mark



Re: Let's drop Maven Artifacts !

2011-01-18 Thread Mark Miller

On Jan 18, 2011, at 2:41 PM, Michael Busch wrote:

 
 So can you explain what the problem with the maven support is?  Isn't it 
 enough to just call the ant target and copying the generated files somewhere? 
  When I did releases I never thought it made the release any harder.  Just 
 two additional easy steps.
 

Robert and I have gone over this a fair amount in previous exchanges, I think, 
if you really want to know the particulars. Suffice it to say, the problems so 
far have not been large, but it feels like the likelihood of larger future 
problems is growing. If you ask people that seem to like/care about Maven 
support, the problems are probably not really a problem, or easily addressable; 
if you ask people that dislike/don't want Maven, the problems are probably just 
not worth ever having to run into when we are still convinced this could be 
handled downstream.

If I remember right, a large reason Robert is against is that he doesn't want 
to sign/support/endorse something he doesn't understand or care about as a 
Release Manager? But that's probably a major simplification of his previous 
arguments. And the pro-Maven team has offered their counters to that.


- Mark



Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 2:57 PM, Mark Miller markrmil...@gmail.com wrote:

 If I remember right, a large reason Robert is against is that he doesn't want 
 to sign/support/endorse something he doesn't understand or care about as a 
 Release Manager? But thats probably a major simplification of his previous 
 arguments. And the pro Maven team has offered their counters to that.


Well, i definitely don't want to produce a jacked-up release. And I
listed in the last 99-email maven thread, a reference to how many of
the previous releases have had various bugs/problems with maven. The
problem is, as it is in our code now, there is no way to verify these
magical files will actually work. and yet we all just ignore the fact
we are probably shipping broken artifacts and go with the release
anyway?

(separately, for reference i know that Uwe has the releasing down to
an art and is probably the sole person here that could actually do a
release without having maven jacked up, so he isn't included)

But for the rest of us, we don't understand maven. why can't it be
handled downstream?
And it sets a tone for future things, for instance *the most popular
issue* in lucene, its not flexible indexing, its not realtime search,
its not column stride fields, its... make Lucene an OSGI bundle?

https://issues.apache.org/jira/browse/LUCENE?report=com.atlassian.jira.plugin.system.project:popularissues-panel

Anyway i think we are making a search engine library, and if someone
else can deal with these hassles, they should. we should focus on
search engine stuff and getting out solid releases.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 3:11 PM, Michael Busch busch...@gmail.com wrote:
 I'm not sure what's so complicated or mysterious about maven artifacts.  A
 maven artifact consists of normal jar file(s) plus a POM file containing
 some metadata, like the artifact name and group.

its the POM files that cause problems and reported bugs. i don't think
they are simple at all, in fact i think they are more complicated than
ant build.xml files!




[jira] Commented: (LUCENE-2755) Some improvements to CMS

2011-01-18 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983359#action_12983359
 ] 

Robert Muir commented on LUCENE-2755:
-

Mike, fyi it looks like we are hung again in hudson:
https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3866/

Not sure if its the same deadlock you found.

 Some improvements to CMS
 

 Key: LUCENE-2755
 URL: https://issues.apache.org/jira/browse/LUCENE-2755
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2755.patch


 While running optimize on a large index, I've noticed several things that got 
 me to read CMS code more carefully, and find these issues:
 * CMS may hold onto a merge if maxMergeCount is hit. That results in the 
 MergeThreads taking merges from the IndexWriter until they are exhausted, and 
 only then that blocked merge will run. I think it's unnecessary that that 
 merge will be blocked.
 * CMS sorts merges by segments size, doc-based and not bytes-based. Since the 
 default MP is LogByteSizeMP, and I hardly believe people care about doc-based 
 size segments anymore, I think we should switch the default impl. There are 
 two ways to make it extensible, if we want:
 ** Have an overridable member/method in CMS that you can extend and override 
 - easy.
 ** Have OneMerge be comparable and let the MP determine the order (e.g. by 
 bytes, docs, calibrate deletes etc.). Better, but will need to tap into 
 several places in the code, so more risky and complicated.
 On the go, I'd like to add some documentation to CMS - it's not very easy to 
 read and follow.
 I'll work on a patch.
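The first option above (an overridable member/method in CMS) might look roughly like this; SchedulerSketch, OneMergeSketch, and the method names are invented for the example, assuming only that a merge can report its size in bytes and in docs:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch of the overridable-comparator option; SchedulerSketch
// and OneMergeSketch are invented names, not Lucene's CMS or OneMerge.
class SchedulerSketch {
    static class OneMergeSketch {
        final long sizeInBytes;
        final int numDocs;
        OneMergeSketch(long sizeInBytes, int numDocs) {
            this.sizeInBytes = sizeInBytes;
            this.numDocs = numDocs;
        }
    }

    // Default: bytes-based ordering to match LogByteSizeMP; a subclass can
    // override this single method to restore doc-based ordering (or anything else).
    protected Comparator<OneMergeSketch> mergeComparator() {
        return Comparator.comparingLong((OneMergeSketch m) -> m.sizeInBytes);
    }

    List<OneMergeSketch> sortPending(List<OneMergeSketch> pending) {
        List<OneMergeSketch> sorted = new ArrayList<>(pending);
        sorted.sort(mergeComparator());
        return sorted;
    }
}
```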




[jira] Reopened: (LUCENE-2755) Some improvements to CMS

2011-01-18 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reopened LUCENE-2755:
-


I am reopening just so we don't miss fixing the deadlock... it's hung in the 
same exact part of the tests as earlier
today so I think it's somehow related...

 Some improvements to CMS
 

 Key: LUCENE-2755
 URL: https://issues.apache.org/jira/browse/LUCENE-2755
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2755.patch


 While running optimize on a large index, I've noticed several things that got 
 me to read CMS code more carefully, and find these issues:
 * CMS may hold onto a merge if maxMergeCount is hit. That results in the 
 MergeThreads taking merges from the IndexWriter until they are exhausted, and 
 only then that blocked merge will run. I think it's unnecessary that that 
 merge will be blocked.
 * CMS sorts merges by segments size, doc-based and not bytes-based. Since the 
 default MP is LogByteSizeMP, and I hardly believe people care about doc-based 
 size segments anymore, I think we should switch the default impl. There are 
 two ways to make it extensible, if we want:
 ** Have an overridable member/method in CMS that you can extend and override 
 - easy.
 ** Have OneMerge be comparable and let the MP determine the order (e.g. by 
 bytes, docs, calibrate deletes etc.). Better, but will need to tap into 
 several places in the code, so more risky and complicated.
 On the go, I'd like to add some documentation to CMS - it's not very easy to 
 read and follow.
 I'll work on a patch.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Mark Miller
To follow up Steven:

Yes - Maven is part of Lucene now - it got in with lazy consensus or whatever 
method - and now it's basically a first class citizen. I would have to get 
consensus to drop it much more than you would have to get consensus to keep it. 
This is exactly why I don't want it to stick around or grow when it could be a 
downstream project. All of this continued Maven work just looks like more stuff 
we will have to maintain/support in the future, it seems to me.

Honestly though - if it looks like the majority are for Maven - I drop my 
objection.

- Mark


On Jan 18, 2011, at 2:45 PM, Mark Miller wrote:

 
 On Jan 18, 2011, at 2:37 PM, Steven A Rowe wrote:
 
 On 1/18/2011 at 1:45 PM, Mark Miller wrote:
 At some point, Grant or someone put in some Maven poms. I don't think
 anyone else really paid attention. Later, as we did releases, and saw and
 dealt with these poms, most of us commented against Maven support. It just
 feels to me like it slipped in - and really its the type of thing that
 should have been more discussed and thought out, and perhaps voted upon.
 Maven snuck into Lucene IMO.
 
 Lucene's policy is commit-then-review, and lazy consensus is the rule, right?
 
 Right - clearly this is not some sneaky or underhanded thing that happened. 
 Certainly this is how a lot of legit things happen.
 
 The only reason I feel it was more of a Maven sneaking in thing is that in 
 IRC I have learned how many active core devs really didn't want Maven in the 
 build at a later time. I think we just didn't really know what was happening 
 / paid attention. I don't mean to characterize incorrectly. If you asked me 
 back then, I prob would not have understood the consequences whatsoever and 
 said, please go ahead! Patches welcome.
 
 People's opinions have shifted though - we have more committers now - perhaps 
 the Maven support side is larger than the against now.
 
 Just stating things as I roughly knew them - happy to see things cleared up, 
 fine-tuned. 
 
 - Mark





[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs

2011-01-18 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983367#action_12983367
 ] 

Robert Muir commented on SOLR-2303:
---

bq. OK, scratch the notion of removing the junit-4.7.jar file from Solr, the 
test cases...er...stop compiling. But the rest still stands.

{quote}
<pathelement path="${common-solr.dir}/../lucene/lib/ant-junit-1.7.1.jar"/>
<pathelement path="${common-solr.dir}/../lucene/lib/ant-1.7.1.jar"/>
<pathelement path="${common-solr.dir}/../lucene/lib/junit-4.7.jar"/>

in place of java.class.path and all is well. Is this the path you'd go down? 
I'm not very comfortable having Solr reach over into Lucene, but what do I know?
{quote}


Yeah, in general it would be good to explicitly include ant, ant-junit, and 
junit into our classpath for tests.
I know i fooled with trying to do this across all of lucene and solr, there are 
some twists:
* when the clover build is enabled, we have to actually use the ant 
runtime/java.class.path, because clover injects itself via ant's classpath via 
-lib. There
might be a better way to configure clover to avoid this, but failing that we 
have to sometimes support throwing ant's classpath into the classpath like we 
do now.
* the contrib/ant gets tricky (i dont remember why) especially with clover 
enabled :)
* finally, ant 1.8 support might break, since we specifically include ant 1.7 
stuff in our lib. But its generally what we want, better to have a reliable 
classpath in
our build/tests than to compile/test with whatever version of ant the person 
happens to be using. Ant gets angry if you try to put ant 1.7.jar into an ant 
1.8 runtime...

the same situation exists for compilation actually, but I *think* i fixed that 
one... you would have to re-check :)


 remove unnecessary (and problematic) log4j jars in contribs
 ---

 Key: SOLR-2303
 URL: https://issues.apache.org/jira/browse/SOLR-2303
 Project: Solr
  Issue Type: Improvement
  Components: Build
Reporter: Robert Muir
Assignee: Erick Erickson
 Fix For: 4.0

 Attachments: SOLR-2303.patch


 In solr 4.0 there is log4j-over-slf4j.
 But if you have log4j jars also in the classpath (e.g. contrib/extraction, 
 contrib/clustering) you can get strange errors such as:
 java.lang.NoSuchMethodError: org.apache.log4j.Logger.setAdditivity(Z)V
 So I think we should remove the log4j jars in these contribs, all tests pass 
 with them removed.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Grant Ingersoll


On Jan 18, 2011, at 12:13 PM, Robert Muir wrote:

 I can't help but remind myself, this is the same argument Oracle
 offered up for the whole reason hudson debacle
 (http://hudson-labs.org/content/whos-driving-thing)
 
 Declaring that I have a secret pocket of users that want XYZ isn't
 open source consensus.



You were very quick to cite your own secret pocket of users when you called 
those who support it the vocal minority.  So, if you want to continue baiting 
the discussion we can, but as I see it, we have committers willing to support 
it, so what's the big deal?



Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote:


 On Jan 18, 2011, at 12:13 PM, Robert Muir wrote:

 I can't help but remind myself, this is the same argument Oracle
 offered up for the whole reason hudson debacle
 (http://hudson-labs.org/content/whos-driving-thing)

 Declaring that I have a secret pocket of users that want XYZ isn't
 open source consensus.



 You were very quick to cite your own secret pocket of users when you called 
 those who support it the vocal minority.  So, if you want to continue 
 baiting the discussion we can, but as I see it, we have committers willing to 
 support it, so what's the big deal?

I don't think they are that secret, you can look at the last maven
discussion and see several other committers who spoke up against it.
they are just sick of the discussion i gather and have given up
fighting it.

The problem again, is the magical special artifacts.

I dont see consensus here for maven... when you have it, get back to me.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote:


 On Jan 18, 2011, at 12:13 PM, Robert Muir wrote:

 I can't help but remind myself, this is the same argument Oracle
 offered up for the whole reason hudson debacle
 (http://hudson-labs.org/content/whos-driving-thing)

 Declaring that I have a secret pocket of users that want XYZ isn't
 open source consensus.



 You were very quick to cite your own secret pocket of users when you called 
 those who support it the vocal minority.  So, if you want to continue 
 baiting the discussion we can, but as I see it, we have committers willing to 
 support it, so what's the big deal?

http://www.lucidimagination.com/search/document/474564645f673fbb/discussion_about_release_frequency

You can look there, and see the responses of several other committers
about maven.

I think i like Yonik's comment best: Maven is not a part of the
release process, if you think it should be, maybe you should call a
vote?




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Grant Ingersoll


On Jan 18, 2011, at 3:55 PM, Robert Muir wrote:

 On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote:
 
 
 On Jan 18, 2011, at 12:13 PM, Robert Muir wrote:
 
 I can't help but remind myself, this is the same argument Oracle
 offered up for the whole reason hudson debacle
 (http://hudson-labs.org/content/whos-driving-thing)
 
 Declaring that I have a secret pocket of users that want XYZ isn't
 open source consensus.
 
 
 
 You were very quick to cite your own secret pocket of users when you called 
 those who support it the vocal minority.  So, if you want to continue 
 baiting the discussion we can, but as I see it, we have committers willing 
 to support it, so what's the big deal?
 
 I don't think they are that secret, you can look at the last maven
 discussion and see several other committers who spoke up against it.
 they are just sick of the discussion i gather and have given up
 fighting it.

Wow, so who is the vocal minority now?  

 
 The problem again, is the magical special artifacts.
 
 I dont see consensus here for maven... when you have it, get back to me.

As I see, it you have you, Shai and Miller (and Yonik, likely from the last go 
around).  On the Maven side, you have me, Steve, McKinley and Busch, plus some 
users/contributors. 

In other words, I don't see consensus for dropping it.  When you have it, get 
back to me.  



Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 4:06 PM, Grant Ingersoll gsing...@apache.org wrote:

 In other words, I don't see consensus for dropping it.  When you have it, get 
 back to me.

That's not how things are added to the release process.
So currently, maven is not included in the release process.

I don't care if your poll on the users list has 100% of users checking
maven, you biased your poll already by mentioning that its because we
are considering dropping maven support at the start of the email, so
its total garbage.

There's a lot of totally insane things I could poll the user list and
get lots of responses for, that I think the devs would disagree with.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Michael Busch

It's sad how aggressive these discussions get.  There's really no reason.

On 1/18/11 1:10 PM, Robert Muir wrote:

On Tue, Jan 18, 2011 at 4:06 PM, Grant Ingersoll gsing...@apache.org wrote:

In other words, I don't see consensus for dropping it.  When you have it, get 
back to me.

Thats not how things are added to the release process.
So currently, maven is not included in the release process.

I don't care if your poll on the users list has 100% of users checking
maven, you biased your poll already by mentioning that its because we
are considering dropping maven support at the start of the email, so
its total garbage.

There's a lot of totally insane things I could poll the user list and
get lots of responses for, that I think the devs would disagree with.




[jira] Assigned: (SOLR-2307) PHPSerialized fails with sharded queries

2011-01-18 Thread Hoss Man (JIRA)

 [ https://issues.apache.org/jira/browse/SOLR-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man reassigned SOLR-2307:
--

Assignee: Hoss Man

 PHPSerialized fails with sharded queries
 

 Key: SOLR-2307
 URL: https://issues.apache.org/jira/browse/SOLR-2307
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 1.3, 1.4.1
Reporter: Antonio Verni
Assignee: Hoss Man
Priority: Minor
 Attachments: PHPSerializedResponseWriter.java.patch, 
 PHPSerializedResponseWriter.java.patch, 
 PHPSerializedResponseWriter.java.patch, TestPHPSerializedResponseWriter.java, 
 TestPHPSerializedResponseWriter.java


 Solr throws a java.lang.IllegalArgumentException (Map size must not be
 negative) when using the PHP Serialized response writer with sharded
 queries.
 To reproduce the issue, start your preferred example and try the following
 query:
 http://localhost:8983/solr/select/?q=*:*&wt=phps&shards=localhost:8983/solr,localhost:8983/solr
 It is caused by the JSONWriter implementation of writeSolrDocumentList and
 writeSolrDocument. Overriding these two methods in the
 PHPSerializedResponseWriter to handle the SolrDocument size seems to solve
 the issue.
 Attached my patch made against trunk rev 1055588.
 cheers,
 Antonio
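The failure mode described in the report can be illustrated with a minimal, hypothetical sketch of PHP's serialization format. This is not Solr's PHPSerializedResponseWriter code; the class and method names below are invented for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Minimal sketch of PHP-style map serialization. Class and method names
 * are invented for illustration; this is NOT Solr's response writer code.
 */
public class PhpSerializeSketch {

    // PHP's serialize() emits an array as a:<count>:{...}, so the element
    // count must be known before any entry is written.
    static String serializeMap(Map<String, Object> doc) {
        StringBuilder sb = new StringBuilder();
        sb.append("a:").append(doc.size()).append(":{");
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            writeString(sb, e.getKey());
            Object v = e.getValue();
            if (v instanceof Integer) {
                sb.append("i:").append(v).append(';');
            } else {
                writeString(sb, String.valueOf(v));
            }
        }
        return sb.append('}').toString();
    }

    // Strings are emitted as s:<length>:"<value>"; (length assumed equal
    // to the char count here, i.e. ASCII-only input).
    static void writeString(StringBuilder sb, String s) {
        sb.append("s:").append(s.length()).append(":\"").append(s).append("\";");
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", 1);
        doc.put("name", "solr");
        System.out.println(serializeMap(doc));
        // prints: a:2:{s:2:"id";i:1;s:4:"name";s:4:"solr";}
    }
}
```

Because the format requires the entry count up front, a writer path that cannot determine a SolrDocument's size and passes a negative placeholder would be consistent with the reported "Map size must not be negative" check firing on sharded responses.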

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.





Re: Let's drop Maven Artifacts !

2011-01-18 Thread Peter Karich
 Why not vote for or against 'maven artifacts'?

http://www.doodle.com/2qp35b42vstivhvx

I'm using lucene+solr a lot via maven.
Elasticsearch uses lucene via gradle.
Solandra uses lucene via ivy and so on ;)
So maven artifacts are not only very handy for maven folks.
But I think no artifacts would be better than broken ones.

Why not try to 'switch' to the ivy build system? It's ant but handles
dependencies better IMO.

Regards,
Peter.

 On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote:

 On Jan 18, 2011, at 12:13 PM, Robert Muir wrote:

 I can't help but remind myself that this is the same argument Oracle
 offered up during the whole hudson debacle
 (http://hudson-labs.org/content/whos-driving-thing)

 Declaring that I have a secret pocket of users that want XYZ isn't
 open source consensus.


 You were very quick to cite your own secret pocket of users when you called 
 those who support it the vocal minority.  So, if you want to continue 
 baiting the discussion we can, but as I see it, we have committers willing 
 to support it, so what's the big deal?
 I don't think they are that secret; you can look at the last maven
 discussion and see several other committers who spoke up against it.
 They are just sick of the discussion, I gather, and have given up
 fighting it.

 The problem, again, is the magical special artifacts.

 I don't see consensus here for maven... when you have it, get back to me.




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Grant Ingersoll

On Jan 18, 2011, at 4:10 PM, Robert Muir wrote:

 On Tue, Jan 18, 2011 at 4:06 PM, Grant Ingersoll gsing...@apache.org wrote:
 
 In other words, I don't see consensus for dropping it.  When you have it, 
 get back to me.
 
 That's not how things are added to the release process.
 So currently, maven is not included in the release process.
 
 I don't care if your poll on the users list has 100% of users checking
 maven, you biased your poll already by mentioning that it's because we
 are considering dropping maven support at the start of the email, so
 it's total garbage.

Sorry, I'm not a professional poll writer.  Even if I didn't include it, it 
would take all of half a second for someone to figure it out.  As you can see 
by the responses, though, I think people are simply answering it.

It's just software and we have people willing to maintain the Maven stuff.  I 
simply don't get what the big deal is in keeping something that people find 
useful and has (enough) committer support.



Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 4:38 PM, Grant Ingersoll gsing...@apache.org wrote:

 It's just software and we have people willing to maintain the Maven stuff.  I 
 simply don't get what the big deal is in keeping something that people find 
 useful and has (enough) committer support.

Why not call a committer vote then?

[] -- maintain maven ourselves instead of working on search features,
and slower releases.
[] -- let others maintain maven downstream, instead we work on search
features, and faster releases.




[jira] Resolved: (SOLR-2307) PHPSerialized fails with sharded queries

2011-01-18 Thread Hoss Man (JIRA)

 [ https://issues.apache.org/jira/browse/SOLR-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man resolved SOLR-2307.


   Resolution: Fixed
Fix Version/s: 4.0
   3.1

Committed revision 1060585. -- trunk
Committed revision 1060589. -- 3x.

Thanks again for the great patch, Antonio.


 PHPSerialized fails with sharded queries
 

 Key: SOLR-2307
 URL: https://issues.apache.org/jira/browse/SOLR-2307
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 1.3, 1.4.1
Reporter: Antonio Verni
Assignee: Hoss Man
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: PHPSerializedResponseWriter.java.patch, 
 PHPSerializedResponseWriter.java.patch, 
 PHPSerializedResponseWriter.java.patch, SOLR-2307.patch, 
 TestPHPSerializedResponseWriter.java, TestPHPSerializedResponseWriter.java


 Solr throws a java.lang.IllegalArgumentException (Map size must not be
 negative) when using the PHP Serialized response writer with sharded
 queries.
 To reproduce the issue, start your preferred example and try the following
 query:
 http://localhost:8983/solr/select/?q=*:*&wt=phps&shards=localhost:8983/solr,localhost:8983/solr
 It is caused by the JSONWriter implementation of writeSolrDocumentList and
 writeSolrDocument. Overriding these two methods in the
 PHPSerializedResponseWriter to handle the SolrDocument size seems to solve
 the issue.
 Attached my patch made against trunk rev 1055588.
 cheers,
 Antonio




Re: Let's drop Maven Artifacts !

2011-01-18 Thread Grant Ingersoll

On Jan 18, 2011, at 4:41 PM, Robert Muir wrote:

 On Tue, Jan 18, 2011 at 4:38 PM, Grant Ingersoll gsing...@apache.org wrote:
 
 It's just software and we have people willing to maintain the Maven stuff.  
 I simply don't get what the big deal is in keeping something that people 
 find useful and has (enough) committer support.
 
 Why not call a committer vote then?
 
 [] -- maintain maven ourselves instead of working on search features,
 and slower releases.

Wow, so having Maven releases is why we take 6-10 months to release?  Give me a 
break.  The only thing that is slower (arguably) is the building of the release 
itself.  We have had Maven support for a long time, and it was never brought up 
as the cause until you did so.  The cause is, was, and always will be that we 
innovate at a pretty rapid pace and always have the mindset to get just one 
more set of features/fixes into the next release.




 [] -- let others maintain maven downstream, instead we work on search
 features, and faster releases.





[jira] Updated: (LUCENE-2844) benchmark geospatial performance based on geonames.org

2011-01-18 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-2844:
-

Attachment: benchmark-geo.patch

This is an update to the patch which accounts for the move of the benchmark 
contrib to /modules/benchmark.  It also includes GeoNamesSetSolrAnalyzerTask, 
which will use Solr's field-specific analyzer.  It's very much tied to this set 
of classes in the patch.  There are ASF headers now too.

 benchmark geospatial performance based on geonames.org
 --

 Key: LUCENE-2844
 URL: https://issues.apache.org/jira/browse/LUCENE-2844
 Project: Lucene - Java
  Issue Type: New Feature
  Components: contrib/benchmark
Reporter: David Smiley
Priority: Minor
 Fix For: 4.0

 Attachments: benchmark-geo.patch, benchmark-geo.patch


 Until now (with this patch), the benchmark contrib module did not include a 
 means to test geospatial data.  This patch includes some new files and 
 changes to existing ones.  Here is a summary of what is being added in this 
 patch per file (all files below are within the benchmark contrib module) 
 along with my notes:
 Changes:
 * build.xml -- Add dependency on Lucene's spatial module and Solr.
 ** It was a real pain to figure out the convoluted ant build system to make 
 this work, and I doubt I did it the proper way.  
 ** Rob Muir thought it would be a good idea to make the benchmark contrib 
 module be top level module (i.e. be alongside analysis) so that it can depend 
 on everything.  
 http://lucene.472066.n3.nabble.com/Re-Geospatial-search-in-Lucene-Solr-tp2157146p2157824.html
   I agree 
 * ReadTask.java -- Added a search.useHitTotal boolean option that will use 
 the total hits number for reporting purposes, instead of the existing 
 behavior.
 ** The existing behavior (i.e. when search.useHitTotal=false) doesn't look 
 very useful since the response integer is the sum of several things instead 
 of just one thing.  I don't see how anyone makes use of it.
 Note that on my local system, I also changed ReportTask & RepSelectByPrefTask 
 to not include the '-' every other line, and also changed Format.java to not 
 use commas in the numbers.  These changes are to make copy-pasting into excel 
 more streamlined.
 New Files:
 * geoname-spatial.alg -- my algorithm file.
 **  Note the :0 trailing the Populate sequence.  This is a trick I use to 
 skip building the index, since it takes a while to build and I'm not 
 interested in benchmarking index construction.  You'll want to set this to :1 
 and then subsequently put it back for further runs as long as you keep the 
 doc.geo.schemaField or any other configuration elements affecting index the 
 same.
 ** In the patch, doc.geo.schemaField=geohash but unless you're tinkering with 
 SOLR-2155, you'll probably want to set this to latlon
 * GeoNamesContentSource.java -- a ContentSource for a geonames.org data file 
 (either a single country like US.txt or allCountries.txt).
 ** Uses a subclass of DocData to store all the fields.  The existing DocData 
 wasn't very applicable to data that is not composed of a title and body.
 ** Doesn't reuse the docdata parameter to getNextDocData(); a new one is 
 created every time.
 ** Only supports content.source.forever=false
 * GeoNamesDocMaker.java -- a subclass of DocMaker that works very differently 
 than the existing DocMaker.
 ** Instead of assuming that each line from geonames.org will correspond to 
 one Lucene document, this implementation supports, via configuration, 
 creating a variable number of documents, each with a variable number of 
 points taken randomly from a GeoNamesContentSource.
 ** doc.geo.docsToGenerate:  The number of documents to generate.  If blank it 
 defaults to the number of rows in GeoNamesContentSource.
 ** doc.geo.avgPlacesPerDoc: The average number of places to be added to a 
 document.  A random number between 0 and one less than twice this amount is 
 chosen on a per document basis.  If this is set to 1, then exactly one is 
 always used.  In order to support a value greater than 1, use the geohash 
 field type and incorporate SOLR-2155 (geohash prefix technique).
 ** doc.geo.oneDocPerPlace: Whether at most one document should use the same 
 place.  In other words, can more than one document have the same place?  If 
 so, set this to false.
 ** doc.geo.schemaField: references a field name in schema.xml.  The field 
 should implement SpatialQueryable.
 * GeoPerfData.java: This class is a singleton storing data in memory that is 
 shared by GeoNamesDocMaker.java and GeoQueryMaker.java.
 ** content.geo.zeroPopSubst: if a population is encountered that is <= 0, 
 then use this population value instead.  Default is 100.
 ** content.geo.maxPlaces: A limit on the number of rows read in 
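The doc.geo.avgPlacesPerDoc sampling described above can be sketched as follows. This is a hypothetical reading of the description, not the patch's actual code; the class and method names are invented.

```java
import java.util.Random;

/**
 * Hypothetical sketch of the doc.geo.avgPlacesPerDoc sampling described
 * above; names are invented and this is not the patch's actual code.
 */
public class PlacesPerDocSketch {

    // Draw uniformly from 0 .. (2*avg - 1) inclusive, i.e. "between 0 and
    // one less than twice this amount"; the mean is avg - 0.5, roughly the
    // configured average. avg == 1 always yields exactly one place.
    static int placesForDoc(int avgPlacesPerDoc, Random rnd) {
        if (avgPlacesPerDoc == 1) {
            return 1;
        }
        return rnd.nextInt(2 * avgPlacesPerDoc);
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int i = 0; i < 5; i++) {
            System.out.println(placesForDoc(3, rnd)); // values in 0..5
        }
    }
}
```

Per the description above, using a value greater than 1 in practice additionally requires the geohash field type with SOLR-2155 applied.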

Re: Let's drop Maven Artifacts !

2011-01-18 Thread Robert Muir
On Tue, Jan 18, 2011 at 4:50 PM, Grant Ingersoll gsing...@apache.org wrote:

 On Jan 18, 2011, at 4:41 PM, Robert Muir wrote:

 On Tue, Jan 18, 2011 at 4:38 PM, Grant Ingersoll gsing...@apache.org wrote:

 It's just software and we have people willing to maintain the Maven stuff.  
 I simply don't get what the big deal is in keeping something that people 
 find useful and has (enough) committer support.

 Why not call a committer vote then?

 [] -- maintain maven ourselves instead of working on search features,
 and slower releases.

 Wow, so having Maven releases is why we take 6-10 months to release?  Give me 
 a break.  The only thing that is slower (arguably) is the building of the 
 release itself.  We have had Maven support for a long time, and it was never 
 brought up as the cause until you did so.  The cause is, was, and always will 
 be that we innovate at a pretty rapid pace and always have the mindset to 
 get just one more set of features/fixes into the next release.


In my opinion it is just a part of it; I think I detailed this here:
http://www.lucidimagination.com/search/document/474564645f673fbb/discussion_about_release_frequency

(This discussion was subsequently sidetracked and dominated completely
by maven, so I gave up, until Shai brought up the idea again recently
of trying to do a release)

I think that the release process is too complicated, and doing things
to simplify it, such as pushing maven downstream would help a lot.

Furthermore I had this to say about maven once it completely took over
the discussion:

Since I have been around, it seems the maven is wrong in nearly every
release [1], including even bugfix releases.
If I am going to be the one making artifacts, I want them to be right.

[1]:
Lucene/Solr 3.x, 4.0: SOLR-2041, SOLR-2055
Solr 1.4.1: SOLR-1977
Solr 1.4: SOLR-981
Lucene 2.9.1, 3.0: LUCENE-2107
Lucene 2.9.0:  LUCENE-1927
Lucene 2.4: LUCENE-1525



