[jira] Issue Comment Edited: (SOLR-1395) Integrate Katta
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928464#action_12928464 ] tom liu edited comment on SOLR-1395 at 1/18/11 3:27 AM:

JohnWu, Huang: in the Katta integration, a Solr core plays one of three roles:
# proxy: the query dispatcher, or front-end server. All queries are sent to this proxy, which dispatches them to the subproxies on the Katta cluster nodes. In the proxy, QueryComponent's distributedProcess is executed, with the param isShard=false.
# subproxy: the proxy on a Katta cluster node. Because each node may host more than one core, the subproxy receives the query from the proxy and forwards it to each core. In the subproxy, QueryComponent's distributedProcess is executed, with the param isShard=true.
# querycore: the Solr core that really executes the query. Queries are sent to the querycore, which runs QueryComponent's process method.

So, to run Solr as a cluster/distributed setup, we set up three configurations:
# proxy's solrconfig.xml
{noformat}
<requestHandler name="standard" class="solr.KattaRequestHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="shards">*</str>
  </lst>
</requestHandler>
{noformat}
# subproxy's solrconfig.xml
{noformat}
<requestHandler name="standard" class="solr.SearchHandler" default="true">...</requestHandler>
{noformat}
# querycore's solrconfig.xml
{noformat}
<requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">...</requestHandler>
{noformat}

In Katta's katta.node.properties:
{noformat}
node.server.class=org.apache.solr.katta.DeployableSolrKattaServer
{noformat}

And in the classes dir of the proxy's Solr webapp, please add two files:
# katta.zk.properties
# katta.node.properties

Integrate Katta
---
Key: SOLR-1395
URL: https://issues.apache.org/jira/browse/SOLR-1395
Project: Solr
Issue Type: New Feature
Affects Versions: 1.4
Reporter: Jason Rutherglen
Priority: Minor
Fix For: Next
Attachments: back-end.log, front-end.log, hadoop-core-0.19.0.jar, katta-core-0.6-dev.jar, katta-solrcores.jpg, katta.node.properties, katta.zk.properties, log4j-1.2.13.jar, solr-1395-1431-3.patch, solr-1395-1431-4.patch, solr-1395-1431-katta0.6.patch, solr-1395-1431-katta0.6.patch, solr-1395-1431.patch, solr-1395-katta-0.6.2-1.patch, solr-1395-katta-0.6.2-2.patch, solr-1395-katta-0.6.2-3.patch, solr-1395-katta-0.6.2.patch, SOLR-1395.patch, SOLR-1395.patch, SOLR-1395.patch, test-katta-core-0.6-dev.jar, zkclient-0.1-dev.jar, zookeeper-3.2.1.jar
Original Estimate: 336h
Remaining Estimate: 336h

We'll integrate Katta into Solr so that:
* Distributed search uses Hadoop RPC
* Shard/SolrCore distribution and management
* Zookeeper based failover
* Indexes may be built using Hadoop

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
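The proxy / subproxy / querycore request flow described above can be sketched in plain Python. This is a conceptual model only, with invented names and data structures; the real dispatch happens inside KattaRequestHandler and QueryComponent's distributedProcess/process methods:

```python
# Conceptual sketch of the three-role query flow described in the comment
# above. All names and data structures here are invented for illustration;
# this is not Solr or Katta code.

def query_core(core, q):
    # querycore role: the real Solr core runs QueryComponent.process()
    return [f"{core}:{q}"]  # stand-in for a result list

def subproxy(node_cores, q):
    # subproxy role: distributedProcess with isShard=true; a node may host
    # more than one core, so fan the query out to every core on this node.
    results = []
    for core in node_cores:
        results.extend(query_core(core, q))
    return results

def proxy(cluster, q):
    # proxy role: distributedProcess with isShard=false; dispatch to the
    # subproxy on each Katta cluster node and merge the shard responses.
    merged = []
    for node_cores in cluster.values():
        merged.extend(subproxy(node_cores, q))
    return merged

cluster = {"node1": ["coreA", "coreB"], "node2": ["coreC"]}
print(proxy(cluster, "title:katta"))
```

Here the merge step is a plain concatenation; the real QueryComponent also re-sorts and trims the merged shard responses.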
[jira] Issue Comment Edited: (SOLR-1395) Integrate Katta
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935709#action_12935709 ] tom liu edited comment on SOLR-1395 at 1/18/11 3:28 AM:

JohnWu: my conf is:
{code:xml|title=proxy/solrconfig.xml}
<requestHandler name="standard" class="solr.KattaRequestHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="shards">*</str>
  </lst>
</requestHandler>
{code}
{code:xml|title=subproxy/solrconfig.xml}
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
{code}
{code:xml|title=querycore(shards)/solrconfig.xml}
<requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
{code}
{code:xml|title=zoo.cfg}
clientPort=2181
...
{code}
In Katta/conf and Shards/WEB-INF/classes:
{code:xml|title=katta.zk.properties}
zookeeper.embedded=false
zookeeper.servers=localhost:2181
...
{code}
Re: Let's drop Maven Artifacts !
Hi, the developers list may not be the right place to find strong maven supporters. All developers know lucene from the inside out and are perfectly fine installing lucene from whatever artifact. The people using maven are your end users, who probably don't even subscribe to users@.

Thomas Koch, http://www.koch.ro
[jira] Created: (SOLR-2317) Slaves have leftover index.xxxxx directories, and leftover files in index/ directory
Slaves have leftover index.xxxxx directories, and leftover files in index/ directory
Key: SOLR-2317
URL: https://issues.apache.org/jira/browse/SOLR-2317
Project: Solr
Issue Type: Bug
Affects Versions: 3.1
Reporter: Bill Bell

When replicating, we are getting leftover files on slaves. Some slaves are getting index.number directories with files left over. And more concerning, the index/ directory has leftover files from previous replication runs. This is a pain to keep cleaning up. Bill
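Until the root cause is fixed, the stale snapshot directories can be swept up with a small script. The sketch below is an unofficial workaround, assuming the common data/index plus data/index.NNNN layout and that no replication is in progress while it runs; it does not attempt the harder part, pruning stale files inside the live index/ directory, which would require comparing against the current segments:

```python
# Cleanup sketch for leftover replication directories: keeps the live
# "index" directory and removes any stale "index.<suffix>" siblings.
# Assumes no replication is running; the data-dir layout is assumed,
# not read from Solr itself.
import os
import shutil

def clean_stale_index_dirs(data_dir):
    removed = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        # only directories named index.<something>, never "index" itself
        if os.path.isdir(path) and name.startswith("index."):
            shutil.rmtree(path)
            removed.append(name)
    return removed

if __name__ == "__main__":
    # demo against a throwaway layout
    import tempfile
    root = tempfile.mkdtemp()
    for d in ("index", "index.20110117", "index.20110118"):
        os.makedirs(os.path.join(root, d))
    print(clean_stale_index_dirs(root))  # the two dated directories
    print(os.listdir(root))              # only "index" remains
```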
[jira] Updated: (SOLR-2317) Slaves have leftover index.xxxxx directories, and leftover files in index/ directory
[ https://issues.apache.org/jira/browse/SOLR-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2317:

Description: When replicating, we are getting leftover files on slaves. Some slaves are getting index.number directories with files left over. And more concerning, the index/ directory has leftover files from previous replication runs. This is a pain to keep cleaning up. Bill
[jira] Commented: (SOLR-2317) Slaves have leftover index.xxxxx directories, and leftover files in index/ directory
[ https://issues.apache.org/jira/browse/SOLR-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983072#action_12983072 ] Bill Bell commented on SOLR-2317:
---
This is running Windows 2008 R2. We are using native locking on the master and slave. Running Jetty 6.
[jira] Commented: (SOLR-1395) Integrate Katta
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983076#action_12983076 ] tom liu commented on SOLR-1395:
---
Sorry, the above comments had an error: in querycore(shards)/solrconfig.xml, the requestHandler must be solr.MultiEmbeddedSearchHandler.
{code:xml|title=querycore(shards)/solrconfig.xml}
<requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
{code}
QueryComponent returns a DocSlice, but XMLWriter or EmbeddedServer returns a SolrDocumentList built from the DocList.
RE: Query parser contract changes?
This turns out to have indeed been due to a recent, but un-announced, index format change. A rebuilt index worked properly. Thanks!
Karl

From: ext karl.wri...@nokia.com [karl.wri...@nokia.com]
Sent: Monday, January 17, 2011 10:53 AM
To: dev@lucene.apache.org
Subject: RE: Query parser contract changes?

Another data point: the standard query parser actually ALSO fails when you do anything other than a *:* query. When you specify a field name, it returns zero results:

root@duck93:/data/solr-dym/solr-dym# curl "http://localhost:8983/solr/nose/standard?q=value_0:a*"
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">7</int><lst name="params"><str name="q">value_0:a*</str></lst></lst><result name="response" numFound="0" start="0"/>
</response>

But:

root@duck93:/data/solr-dym/solr-dym# curl "http://localhost:8983/solr/nose/standard?q=*:*"
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">244</int><lst name="params"><str name="q">*:*</str></lst></lst><result name="response" numFound="59431646" start="0"><doc><str name="latitude">40.55856</str><str name="longitude">44.37457</str><str name="reference">LANGUAGE=und|TYPE=STREET|ADDR_TOWNSHIP_NAME=Armenia|ADDR_COUNTRY_NAME=Armenia|ADDR_STREET_NAME=A329|TITLE=A329, Armenia, Armenia</str></doc><doc><str name="latitude">40.7703</str><str name="longitude">43.838</str><str name="reference">LANGUAGE=und|TYPE=STREET|ADDR_TOWNSHIP_NAME=Armenia|ADDR_COUNTRY_NAME=Armenia|ADDR_STREET_NAME=A330|TITLE=A330, Armenia …

The schema has not changed:

<!-- Level 0 non-language value field -->
<field name="othervalue_0" type="string_idx_normed" required="false"/>

…where string_idx_normed is declared in the following way:

<fieldType name="string_idx_normed" class="solr.TextField" indexed="true" stored="false" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.ICUFoldingFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.ICUFoldingFilterFactory" />
  </analyzer>
</fieldType>

… which shouldn't matter anyway, because even a simple TermQuery returned from my query parser method doesn't work any more.
Karl

From: ext karl.wri...@nokia.com [mailto:karl.wri...@nokia.com]
Sent: Monday, January 17, 2011 10:30 AM
To: dev@lucene.apache.org
Subject: Query parser contract changes?

Hi folks, I'm sorely puzzled by the fact that my QParser implementation ceased to work after the latest Solr/Lucene trunk update. My previous update was about ten days ago, right after Mike made his index changes. The symptom is that, although the query parser is correctly called, and seems to have the right arguments, the Query it returns seems to be ignored. I always get zero results. I eliminated any possibility of error by just hardwiring the return of a TermQuery, and that too always yields zero results. I was able to confirm, using the standard handler with the default query parser, that the index is in fine shape. So I was wondering if the contract for QParser had changed in some subtle way that I missed?
Karl
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 9:33 AM, Thomas Koch tho...@koch.ro wrote:
> Hi, the developers list may not be the right place to find strong maven supporters. All developers know lucene from the inside out and are perfectly fine installing lucene from whatever artifact. The people using maven are your end users, who probably don't even subscribe to users@.

big +1 for this comment! I have to admit that I am not a big maven fan, and each time I have to use it it's a pain in the ass, but it is the de-facto standard for the majority of java projects on this planet, so really there is not much of an option in my opinion. A project like lucene has to release maven artifacts even if it's a pain.

Simon

> Thomas Koch, http://www.koch.ro
Re: Let's drop Maven Artifacts !
Out of curiosity, how did the Maven people integrate Lucene before we had Maven artifacts? To the best of my understanding, we never had proper Maven artifacts (Steve is working on that in LUCENE-2657).

Shai

On Tue, Jan 18, 2011 at 11:03 AM, Simon Willnauer simon.willna...@googlemail.com wrote:
> On Tue, Jan 18, 2011 at 9:33 AM, Thomas Koch tho...@koch.ro wrote:
>> Hi, the developers list may not be the right place to find strong maven supporters. All developers know lucene from the inside out and are perfectly fine installing lucene from whatever artifact. The people using maven are your end users, who probably don't even subscribe to users@.
> big +1 for this comment! I have to admit that I am not a big maven fan, and each time I have to use it it's a pain in the ass, but it is the de-facto standard for the majority of java projects on this planet, so really there is not much of an option in my opinion. A project like lucene has to release maven artifacts even if it's a pain.
> Simon
>> Thomas Koch, http://www.koch.ro
[jira] Updated: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated LUCENE-2657:

Attachment: LUCENE-2657.patch

In this patch:
# {{ant generate-maven-artifacts}} now works the same as it does on trunk without this patch -- using {{maven-ant-tasks}} -- except that instead of using the POM templates, the POMs provided in the patch are used.
# {{ant generate-maven-artifacts}} now functions properly at the top level, from {{lucene/}}, from {{modules/}}, and from {{solr/}}.
# Removed all {{*-source.jar}} and {{*-javadoc.jar}} generation related functionality from the POMs, as well as the {{dist}} profile - the Ant build is responsible for putting together the maven artifacts.
# Removed the POM templates, except for the two required to deploy the {{solr-noggit}} and {{solr-commons-csv}} artifacts from the Ant build.
# Modified the Maven artifact handling in the Ant build, including artifact signing, to be correct.
# Based on feedback from Stevo Slavić (http://www.mail-archive.com/solr-user@lucene.apache.org/msg45656.html), added explicit {{groupId}}s to the POMs that didn't have them, and added explicit {{relativePath}}s to the {{parent}} declarations in all POMs.

I think this patch is ready to be committed to trunk. I'll post a branch_3x version of this patch tomorrow, and then I think the patches on this issue will be complete.
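Item 6 above (explicit groupIds and relativePaths) can be illustrated with a sketch of a child-module POM. The parent artifactId, version, and path here are hypothetical, chosen only to show the two elements the patch adds, not copied from the patch itself:

```xml
<!-- Illustrative child-module POM fragment (names/versions are invented):
     note the explicit relativePath on the parent and the explicit groupId
     on the module, rather than relying on inheritance/defaults. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-parent</artifactId>
    <version>4.0-SNAPSHOT</version>
    <relativePath>../pom.xml</relativePath>
  </parent>
  <groupId>org.apache.lucene</groupId> <!-- explicit, not inherited -->
  <artifactId>lucene-core</artifactId>
</project>
```

An explicit relativePath lets Maven resolve the parent from the checkout rather than a repository, which matters when the parent POM has not yet been installed locally.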
Replace Maven POM templates with full POMs, and change documentation accordingly
Key: LUCENE-2657
URL: https://issues.apache.org/jira/browse/LUCENE-2657
Project: Lucene - Java
Issue Type: Improvement
Components: Build
Affects Versions: 3.1, 4.0
Reporter: Steven Rowe
Assignee: Steven Rowe
Fix For: 3.1, 4.0
Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch

The current Maven POM templates only contain dependency information, the bare bones necessary for uploading artifacts to the Maven repository. The full Maven POMs in the attached patch include the information necessary to run a multi-module Maven build, in addition to serving the same purpose as the current POM templates.

Several dependencies are not available through public maven repositories. A profile in the top-level POM can be activated to install these dependencies from the various {{lib/}} directories into your local repository. From the top-level directory:
{code}
mvn -N -Pbootstrap install
{code}
Once these non-Maven dependencies have been installed, to run all Lucene/Solr tests via Maven's surefire plugin, and populate your local repository with all artifacts, from the top level directory, run:
{code}
mvn install
{code}
When one Lucene/Solr module depends on another, the dependency is declared on the *artifact(s)* produced by the other module and deposited in your local repository, rather than on the other module's un-jarred compiler output in the {{build/}} directory, so you must run {{mvn install}} on the other module before its changes are visible to the module that depends on it.

To create all the artifacts without running tests:
{code}
mvn -DskipTests install
{code}
I almost always include the {{clean}} phase when I do a build, e.g.:
{code}
mvn -DskipTests clean install
{code}
[jira] Updated: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-2295:
---
Attachment: LUCENE-2295-2-3x.patch

Patch against 3x. Removed the get/set from IWC and changed the code which used it. I also added some clarifying notes to the deprecation note in IW.setMaxFieldLength. I will post a separate patch for trunk, where this setting will be removed altogether.

Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter
---
Key: LUCENE-2295
URL: https://issues.apache.org/jira/browse/LUCENE-2295
Project: Lucene - Java
Issue Type: Improvement
Components: contrib/analyzers
Reporter: Shai Erera
Assignee: Uwe Schindler
Fix For: 3.1, 4.0
Attachments: LUCENE-2295-2-3x.patch, LUCENE-2295-trunk.patch, LUCENE-2295.patch

A spinoff from LUCENE-2294. Instead of asking the user to specify his requested MFL limit on IndexWriter, we can get rid of this setting entirely by providing an Analyzer which will wrap any other Analyzer and its TokenStream with a TokenFilter that keeps track of the number of tokens produced and stops when the limit has been reached. This will remove any count tracking from IW's indexing, which is done even if I specified UNLIMITED for MFL. Let's try to do it for 3.1.
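The core idea of the issue, moving the max-field-length limit out of IndexWriter and into a token-counting filter that can wrap any analyzer, can be modeled language-neutrally. The sketch below is a conceptual Python model, not the actual Lucene TokenFilter API:

```python
# Conceptual model of the proposed MaxFieldLengthAnalyzer: wrap any token
# stream and stop emitting tokens once the limit is reached, so that
# IndexWriter itself never has to count tokens. Not the real Lucene API.

def limit_tokens(token_stream, max_tokens):
    count = 0
    for token in token_stream:
        if count >= max_tokens:
            break  # silently drop the rest of the field, like MFL did
        count += 1
        yield token

def whitespace_analyzer(text):
    # stand-in for whatever wrapped Analyzer produces the token stream
    return iter(text.split())

tokens = list(limit_tokens(whitespace_analyzer("a b c d e"), 3))
print(tokens)  # ['a', 'b', 'c']
```

Because the counting lives in the wrapping filter, an unlimited setting simply means not wrapping at all, which is what removes the per-token bookkeeping from IndexWriter.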
Re: Let's drop Maven Artifacts !
Somehow, they were made available since 2.0 - http://repo2.maven.org/maven2/org/apache/lucene/lucene-core/
The POMs are minimal, sans dependencies, so e.g. if your project depends on lucene-spellchecker, lucene-core won't be transitively included and your build is going to fail (you therefore had to add a dependency on core to your project yourself). But they were enough to download and link jars/sources/javadocs.

On Tue, Jan 18, 2011 at 12:40, Shai Erera ser...@gmail.com wrote:
> Out of curiosity, how did the Maven people integrate Lucene before we had Maven artifacts? To the best of my understanding, we never had proper Maven artifacts (Steve is working on that in LUCENE-2657).
> Shai

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785
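The failure mode described above can be illustrated with a POM fragment: because the old dependency-less POMs declared nothing, a project using the spellchecker had to declare lucene-core explicitly as well. The version number here is illustrative:

```xml
<dependencies>
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-spellchecker</artifactId>
    <version>2.9.4</version>
  </dependency>
  <!-- required explicitly: the old POMs declared no dependencies,
       so lucene-core did not come in transitively -->
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>2.9.4</version>
  </dependency>
</dependencies>
```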
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983113#action_12983113 ] Robert Muir commented on LUCENE-2657:
---
bq. I think this patch is ready to be committed to trunk.

Well, first of all, you obviously worked hard on this, but we need to think this one through before committing. Can we put this code in a separate project that takes care of maven support for lucene?

The problem is there are two camps: die maven die, and maven or die. There will *never* be consensus. The only way for maven to survive is for the users that care about it to support it themselves, just like other packaging systems such as debian, redhat rpm, freebsd/mac ports, etc. etc. that we, lucene, don't deal with. They can't continue to whine to people like me, who don't give a shit about it, to support it and produce its crazy ass complicated artifacts. Instead the people who care about these packaging systems, and know how to make them work, must deal with them.

Personally I really don't like:
* Having two build systems
* Having one build system (ant) rely upon the other (maven) to create release artifacts.

Basically, the ant build system is our build. I think it needs to be able to fully build lucene for a release without involving any other build systems such as Make or Maven.
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983123#action_12983123 ] Steven Rowe commented on LUCENE-2657:
---
bq. Can we put this code in a separate project, that takes care of maven support for lucene?

I'd rather not. The Lucene project has published Maven artifacts since the 1.9.1 release. I think we should continue to do that.

bq. The only way for maven to survive, is for the users that care about it, to support itself, just like other packaging systems such as debian, redhat rpm, freebsd/mac ports, etc etc that we lucene, don't deal with.

OK, those are pretty obviously red herrings. Can we concentrate on the actual issue here without dragging in those extraneous things? Maven artifacts, not those other things, have been provided by Lucene since the 1.9.1 release. We obviously *do* deal with Maven.

bq. They can't continue to whine to people like me, who don't give a shit about it, to support it and produce its crazy ass complicated artifacts.

The latest patch on this issue uses the Ant artifacts directly. POMs are provided. You know, just like it has been since the 1.9.1 release.

bq. Instead the people who care about these packaging systems, and know how to make them work must deal with them.

Um, like the patch on this issue is doing?

bq. Basically, the ant build system is our build. I think it needs to be able to fully build lucene for a release without involving any other build systems such as Make or Maven.

This patch uses the Ant-produced artifacts to prepare for Maven artifact publishing. Maven itself is not invoked in the process. An Ant plugin handles the artifact deployment. I seriously do not understand why this is such a big deal. Why can't we just keep publishing Maven artifacts? You know, like we have for the past 15-20 releases.
Replace Maven POM templates with full POMs, and change documentation accordingly Key: LUCENE-2657 URL: https://issues.apache.org/jira/browse/LUCENE-2657 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Fix For: 3.1, 4.0 Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch The current Maven POM templates only contain dependency information, the bare bones necessary for uploading artifacts to the Maven repository. The full Maven POMs in the attached patch include the information necessary to run a multi-module Maven build, in addition to serving the same purpose as the current POM templates. Several dependencies are not available through public maven repositories. A profile in the top-level POM can be activated to install these dependencies from the various {{lib/}} directories into your local repository. From the top-level directory: {code} mvn -N -Pbootstrap install {code} Once these non-Maven dependencies have been installed, to run all Lucene/Solr tests via Maven's surefire plugin, and populate your local repository with all artifacts, from the top level directory, run: {code} mvn install {code} When one Lucene/Solr module depends on another, the dependency is declared on the *artifact(s)* produced by the other module and deposited in your local repository, rather than on the other module's un-jarred compiler output in the {{build/}} directory, so you must run {{mvn install}} on the other module before its changes are visible to the module that depends on it. 
To create all the artifacts without running tests: {code} mvn -DskipTests install {code} I almost always include the {{clean}} phase when I do a build, e.g.: {code} mvn -DskipTests clean install {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
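The bootstrap profile mentioned in the issue description could look roughly like the sketch below in the top-level POM. The coordinates and jar names here are placeholders, not the actual ones from the patch; the point is just the pattern of binding {{install:install-file}} executions to a profile:

```xml
<!-- Sketch of a bootstrap profile: installs non-Maven jars from lib/
     into the local repository. The groupId/artifactId/version/file
     values are illustrative placeholders. -->
<profiles>
  <profile>
    <id>bootstrap</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-install-plugin</artifactId>
          <executions>
            <execution>
              <id>install-example-lib</id>
              <phase>install</phase>
              <goals><goal>install-file</goal></goals>
              <configuration>
                <groupId>org.example</groupId>
                <artifactId>example-lib</artifactId>
                <version>1.0</version>
                <packaging>jar</packaging>
                <file>lib/example-lib-1.0.jar</file>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
```

Activating the profile with {{mvn -N -Pbootstrap install}} then runs each such execution once against the top-level project only ({{-N}} skips the sub-modules).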
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 5:29 AM, Hardy Ferentschik s...@ferentschik.de wrote: It also means that someone outside the dev community will at some stage create some pom files and upload the artifact to a (semi-) public repository. This sounds great! this is how open source works, those who care about it, will make it happen! - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983135#action_12983135 ] Chris Male commented on LUCENE-2657: I'm a little lost at what this patch introduces that is imposing? Ant itself has maven support as part of its trunk code base so its clearly not too imposing for them. Is your issue that this patch introduces things that get in your way somehow with using ant to do builds? or are you against committing this due to your general concerns with Maven?
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983141#action_12983141 ] Chris Male commented on LUCENE-2657: Alright I can appreciate your concern. I think comparing Maven to RPM or FreeBSD ports is going a little far, but I can understand the point you're making. What if this were committed so that those of us who do understand maven and do like using it, could? This issue about whether maven artifacts need to then be released or not can be part of a greater discussion (as is already taking place). By committing this we then make it easier for someone else outside of the project to create the correct artifacts which are then available from the central maven repository, if thats the decision thats made which is also the one you support.
[jira] Commented: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983142#action_12983142 ] Shai Erera commented on LUCENE-2584: On one hand, it's good to add the files to a Set, so that we can be sure they are added uniquely. On the other hand though, if we expect files are added properly, then adding to the set is redundant. Since this code is executed once per SI instance, I think explicitly adding to a Set is better. Note that while the assert you added will work, if someone runs without assertions he may get duplicate file names, if indeed they are added twice. I think that it's not so crucial to know that the same file was added twice, it's a very unlikely bug, but it is crucial that files() returns unique names. So can you please use a Set in the method instead of the assert (like it's done on trunk). Also, while you're at it, the method doesn't have javadocs - they appear in regular comments. Can you convert them to javadocs (there is a warning there about not modifying the returned List, but it's not visible as javadocs :). Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException --- Key: LUCENE-2584 URL: https://issues.apache.org/jira/browse/LUCENE-2584 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 2.9, 2.9.1, 2.9.2, 2.9.3, 3.0, 3.0.1, 3.0.2 Reporter: Alexander Kanarsky Priority: Minor Fix For: 3.1, 4.0 Attachments: LUCENE-2584-branch_3x.patch, LUCENE-2584-lucene-2_9.patch, LUCENE-2584-lucene-3_0.patch The multi-threaded call of the files() in SegmentInfo could lead to the ConcurrentModificationException if one thread has not finished adding to the ArrayList (files) yet while the other thread already obtained it as cached (see below). This is a rare exception, but it would be nice to fix.
I see the code is no longer problematic in the trunk (and others ported from flex_1458), looks like it was fixed while implementing post 3.x features. The fix to 3.x and 2.9.x branches could be the same - create the files set first and populate it, and then assign to the member variable at the end of the method. This will resolve the issue. I could prepare the patch for 2.9.4 and 3.x, if needed. -- INFO: [19] webapp= path=/replication params={command=fetchindex&wt=javabin} status=0 QTime=1 Jul 30, 2010 9:13:05 AM org.apache.solr.core.SolrCore execute INFO: [19] webapp= path=/replication params={command=details&wt=javabin} status=0 QTime=24 Jul 30, 2010 9:13:05 AM org.apache.solr.handler.ReplicationHandler doFetch SEVERE: SnapPull failed java.util.ConcurrentModificationException at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) at java.util.AbstractList$Itr.next(AbstractList.java:343) at java.util.AbstractCollection.addAll(AbstractCollection.java:305) at org.apache.lucene.index.SegmentInfos.files(SegmentInfos.java:826) at org.apache.lucene.index.DirectoryReader$ReaderCommit.&lt;init&gt;(DirectoryReader.java:916) at org.apache.lucene.index.DirectoryReader.getIndexCommit(DirectoryReader.java:856) at org.apache.solr.search.SolrIndexReader.getIndexCommit(SolrIndexReader.java:454) at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:261) at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264) at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:146)
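The "populate the set first, assign to the member variable at the end" fix described above can be sketched in plain Java. The class and file names below are illustrative, not the actual SegmentInfo code; the pattern is what matters: the cached list is only published once it is fully built, so a concurrent caller sees either null or a complete immutable list, never one still being mutated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the proposed fix: build locally, publish last.
class FileListCache {
    private volatile List<String> cachedFiles; // published via volatile

    List<String> files() {
        List<String> cached = cachedFiles;
        if (cached != null) {
            return cached; // fully built by construction
        }
        // A Set guarantees unique names even if a file is added twice.
        Set<String> files = new LinkedHashSet<>();
        files.add("_0.fdt");
        files.add("_0.fdx");
        files.add("_0.fdt"); // a duplicate add is now harmless
        List<String> result =
            Collections.unmodifiableList(new ArrayList<>(files));
        cachedFiles = result; // assign only the finished list
        return result;
    }
}
```

Compare this with the buggy shape: assigning the member variable first and then adding to it lets another thread iterate the half-filled list, which is exactly the ConcurrentModificationException in the trace above.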
[jira] Updated: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-2295: --- Attachment: LUCENE-2295-2-trunk.patch Patch against trunk - removes maxFieldLength handling from all the code. Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter --- Key: LUCENE-2295 URL: https://issues.apache.org/jira/browse/LUCENE-2295 Project: Lucene - Java Issue Type: Improvement Components: contrib/analyzers Reporter: Shai Erera Assignee: Uwe Schindler Fix For: 3.1, 4.0 Attachments: LUCENE-2295-2-3x.patch, LUCENE-2295-2-trunk.patch, LUCENE-2295-trunk.patch, LUCENE-2295.patch A spinoff from LUCENE-2294. Instead of asking the user to specify on IndexWriter his requested MFL limit, we can get rid of this setting entirely by providing an Analyzer which will wrap any other Analyzer and its TokenStream with a TokenFilter that keeps track of the number of tokens produced and stop when the limit has reached. This will remove any count tracking in IW's indexing, which is done even if I specified UNLIMITED for MFL. Let's try to do it for 3.1. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
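The idea behind the issue (wrap any analyzer's token stream with a filter that counts tokens and stops at the limit, so IndexWriter needs no counting of its own) can be illustrated with a plain Java iterator, leaving the actual Lucene TokenStream API out of it. The class name is made up for the sketch:

```java
import java.util.Iterator;

// Concept sketch of MaxFieldLength-as-a-filter: the wrapper tracks
// how many tokens it has produced and simply stops at the limit.
// The consumer (the indexer) does no counting at all.
class LimitTokenIterator implements Iterator<String> {
    private final Iterator<String> in;
    private final int maxTokens;
    private int produced;

    LimitTokenIterator(Iterator<String> in, int maxTokens) {
        this.in = in;
        this.maxTokens = maxTokens;
    }

    @Override public boolean hasNext() {
        return produced < maxTokens && in.hasNext();
    }

    @Override public String next() {
        produced++;
        return in.next();
    }
}
```

With an effectively unlimited {{maxTokens}} the wrapper adds no behavior, which is the point of removing the MFL bookkeeping from IW even for UNLIMITED.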
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983154#action_12983154 ] Michael McCandless commented on LUCENE-2324: I ran a quick perf test here: I built the 10M Wikipedia index, Standard codec, using 6 threads. Trunk took 541.6 sec; RT took 518.2 sec (only a bit faster), but the test wasn't really fair because it flushed @ docCount=12870. But I can't test flush by RAM -- that's not working yet on RT right? (The search results matched, which is nice!) Then I ran a single-threaded test. Trunk took 1097.1 sec and RT took 1040.5 sec -- a bit faster! Presumably in the noise (we don't expect a speedup?), but excellent that it's not slower... I think we lost infoStream output on the details of flushing? I can't see when which DWPTs are flushing... Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out See LUCENE-2293 for motivation and more details. I'm copying here Mike's summary he posted on 2293: Change the approach for how we buffer in RAM to a more isolated approach, whereby IW has N fully independent RAM segments in-process and when a doc needs to be indexed it's added to one of them. Each segment would also write its own doc stores and normal segment merging (not the inefficient merge we now do on flush) would merge them. This should be a good simplification in the chain (eg maybe we can remove the *PerThread classes). 
The segments can flush independently, letting us make much better concurrent use of IO & CPU.
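The structure Mike summarizes (each indexing thread buffers into its own private segment, and each buffer can flush on its own) can be sketched in a few lines of self-contained Java. The names here are illustrative stand-ins for DocumentsWriterPerThread, not the branch's actual classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of per-thread write buffers: each thread owns a private
// buffer, so buffering never contends across threads, and any one
// buffer can be flushed without stalling the others.
class PerThreadBuffers {
    private final ConcurrentMap<Long, List<String>> buffers =
        new ConcurrentHashMap<>();

    void addDocument(String doc) {
        buffers.computeIfAbsent(Thread.currentThread().getId(),
                                id -> new ArrayList<>())
               .add(doc); // list is only mutated by its owning thread
    }

    // Remove and return one thread's buffer, standing in for a flush.
    List<String> flush(long threadId) {
        List<String> flushed = buffers.remove(threadId);
        return flushed == null ? new ArrayList<>() : flushed;
    }
}
```

This is also why per-thread flush triggers differ from the global docCount trigger seen in the perf test: each buffer fills at its own rate.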
[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges
[ https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983155#action_12983155 ] Michael McCandless commented on LUCENE-2856: The ReaderEvent is never generated? Is that still work-in-progress? When would this be invoked? Only if IW is pooling readers? Maybe we should hold off on that for a separate issue? Why were the added checks needed in SegmentInfo? Oh I see, it's because you compute the sizeInBytes of the merged segment before the merge completes... hmm. I think I'd prefer that this SegmentInfo not be published until the Type == COMPLETE. How come merge is not also final in MergeEvent? I agree we should change the name. IndexEventListener? I don't think we need CompositeSegmentListener? Why not an API to just add/remove listeners? Also: are we sure this belongs in IWC? This is analogous to infoStream, which is on IW. It's not a config parameter that affects indexing. Should we also track segment flushed/aborted events? Can you add some jdocs and mark the API as experimental? Create IndexWriter event listener, specifically for merges -- Key: LUCENE-2856 URL: https://issues.apache.org/jira/browse/LUCENE-2856 Project: Lucene - Java Issue Type: Improvement Components: Index Affects Versions: 4.0 Reporter: Jason Rutherglen Attachments: LUCENE-2856.patch, LUCENE-2856.patch, LUCENE-2856.patch The issue will allow users to monitor merges occurring within IndexWriter using a callback notifier event listener. This can be used by external applications such as Solr to monitor large segment merges. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
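The "API to just add/remove listeners" Mike asks about, with the IndexEventListener name floated in the comment, could look roughly like this. The event payload and the notify call site are illustrative, not the patch's actual shape:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch of an add/remove listener API for index events, in place of
// a single CompositeSegmentListener set on the config.
interface IndexEventListener {
    void onMergeComplete(String mergedSegmentName, long sizeInBytes);
}

class IndexEventNotifier {
    // Copy-on-write list: cheap, safe iteration during notification,
    // concurrent add/remove without locking the notify path.
    private final List<IndexEventListener> listeners =
        new CopyOnWriteArrayList<>();

    void addListener(IndexEventListener l)    { listeners.add(l); }
    void removeListener(IndexEventListener l) { listeners.remove(l); }

    // Would be invoked by the writer only once the merge is COMPLETE,
    // per the review comment about not publishing the SegmentInfo early.
    void fireMergeComplete(String segment, long sizeInBytes) {
        for (IndexEventListener l : listeners) {
            l.onMergeComplete(segment, sizeInBytes);
        }
    }
}
```

Putting the notifier on IW rather than IWC would mirror the infoStream precedent raised in the comment, since it is observability rather than an indexing parameter.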
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983158#action_12983158 ] Chris Male commented on LUCENE-2657: That was basically what I was getting at (perhaps not clearly enough). Would a satisfactory compromise be to view this patch as adding development support for maven, which is not to do with whether maven artifacts are released or not? The discussion about release process, artifacts and build system flamewars can then happen outside of this.
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983159#action_12983159 ] Michael McCandless commented on LUCENE-2474: bq. Still, I think that using CopyOnWriteArrayList is best here. OK I'll switch back to COWAL... it makes me nervous though. I like being defensive and the added cost of CHM iteration really should be negligible here. {quote} I'd like even more for there to be just a single CopyOnWriteArrayList per top-level reader that is then propagated to all sub/segment readers, including new ones on a reopen. But I guess Mike indicated that was currently too hard/hairy. {quote} This did get hairy... eg if you make a MultiReader (or ParallelReader) w/ subs... what should happen to their listeners? Ie what if the subs already have listeners enrolled? It also spooked me that apps may think they have to re-register after re-open (if we stick w/ ArrayList) since then the list'd just grow... it's trappy. And, if you pull an NRT reader from IW (which is what reopen does under the hood for an NRT reader), how to share its listeners? Ie, we'd have to add a setter to IW as well, so it's also single source (propagates on reopen). This is why I fell back to a simple static as the baby step for now. {quote} The static is really non-optimal though - among other problems, it requires systems with multiple readers (and wants to do different things with different readers, such as maintain separate caches) to figure out what top-level reader a segment reader is associated with. And given that we are dealing with IndexReader instances in the callbacks, and not ReaderContext objects, this seems impossible? {quote} ReaderContext doesn't really make sense here? Ie, the listener is invoked when any/all composite readers sharing a given segment have now closed (ie when the RC for that segment's core drops to 0), or when a composite reader is closed. 
Also, in practice, is it really so hard for the app to figure out which SR goes to which of their caches? Isn't this typically a containsKey against the app level caches...? Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch, LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
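The app-side pattern Mike describes (a custom cache keyed by the reader's cache key, purged from the eviction callback, with the "containsKey against the app level caches" check) can be sketched as below. The CacheEvictionListener hookup is hypothetical here; the reader key is just an Object:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of an application cache keyed by getFieldCacheKey-style
// reader keys, purged when the eviction callback fires for that key.
class PerReaderCache {
    private final Map<Object, String> cache = new ConcurrentHashMap<>();

    void put(Object readerCacheKey, String value) {
        cache.put(readerCacheKey, value);
    }

    String get(Object readerCacheKey) {
        return cache.get(readerCacheKey);
    }

    // Invoked by the (hypothetical) CacheEvictionListener once all
    // composite readers sharing this segment's core have closed.
    void onClose(Object readerCacheKey) {
        // the containsKey check is implicit: remove is a no-op for
        // keys this particular cache never held
        cache.remove(readerCacheKey);
    }

    int size() { return cache.size(); }
}
```

This also shows why the callback hands over the reader (key) rather than a ReaderContext: the cache only needs the key to know whether the eviction concerns it.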
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983160#action_12983160 ] Earwin Burrfoot commented on LUCENE-2657: - bq. we need to be very clear and it has no effect on artifacts I feel something was missed in the heat of debate. Eg: bq. The latest patch on this release uses the Ant artifacts directly. bq. This patch uses the Ant-produced artifacts to prepare for Maven artifact publishing. bq. Maven itself is not invoked in the process. An Ant plugin handles the artifact deployment. I will now try to decipher these quotes. It seems the patch takes the artifacts produced by Ant, as a part of our usual (and only) build process, and shoves it down Maven repository's throat along with a bunch of pom-descriptors. Nothing else is happening. Also, after everything that has been said, I think nobody in his right mind will *force* anyone to actually use the Ant target in question as a part of release. But it's nice to have it around, in case some user-friendly commiter would like to push (I'd like to reiterate - ant generated) artifacts into Maven.
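The "Ant plugin handles the artifact deployment" step quoted above can be sketched with the Maven Ant Tasks antlib, roughly as follows. The pom file, jar path, target name, and repository URL are placeholders, not the actual build.xml from the patch:

```xml
<!-- Sketch of deploying Ant-built jars via the Maven Ant Tasks.
     File names, the target name, and the repository URL are
     illustrative placeholders. -->
<project xmlns:artifact="antlib:org.apache.maven.artifact.ant">
  <target name="deploy-to-maven">
    <artifact:pom id="core-pom" file="dist/lucene-core.pom.xml"/>
    <artifact:deploy file="dist/lucene-core.jar">
      <remoteRepository url="https://repository.example.org/releases"/>
      <pom refid="core-pom"/>
    </artifact:deploy>
  </target>
</project>
```

Note that Maven itself never runs here: Ant pairs each already-built jar with its POM descriptor and uploads both, which is the whole of what "shoves it down Maven repository's throat" amounts to.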
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983161#action_12983161 ] Robert Muir commented on LUCENE-2657: - Chris: well that's the problem with maven, it tries to be too many things, a dependency management tool, a packaging system, a build system, ... So, that's why I said we have to just be very clear about which exact scope of maven we are discussing. If the patch presented here is against /dev-tools, and is to assist developers who like maven, then as I said before I am totally ok with this, but I'm only speaking for myself. Because maven is so many things, and due to Earwin's confusion, I think it would be good in general to add a README.txt to dev-tools anyway, that states what exactly it is (tools to assist lucene/solr developers, that aren't supported, it's not a bug if they stop working, and they will be deleted if they rot). Separately what you said about other code in trunk is totally true... for example it's my opinion that there is a lot of code in lucene's contrib that should be moved out to something like apache-extras... currently lucene's contrib has to compile and pass tests or the build fails... there is definitely some stuff in there that is more sandboxy, slows down lucene core development, but itself isn't getting much maintenance other than devs doing the minimum work to make them pass tests... and we should keep other options in mind for stuff like this.
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983162#action_12983162 ] Earwin Burrfoot commented on LUCENE-2657: Thanks, but I'm not the one confused here. : )
Re: Let's drop Maven Artifacts !
More than one build tool is not the way to go, I believe everyone agrees on that, and that it's not the issue. Have you guys at least considered making a switch to a build tool that knows how to produce Maven artifacts (or enhancing the existing one to take care of that)? E.g. ant+ivy, gradle, maven itself. IMO, switching to a modern build tool, or enhancing the existing one to produce Maven artifacts, is at the moment in the best interest of any open source project, including this one; it will benefit the project's users/contributors, its developers, and the project as a whole:
- official project binaries will (continue to) be available to as large a user base as possible, so you'll get more potential testers/bug reporters, more potential contributors, and more potential commercial/paying customers, which will raise project quality, bring new ideas, and finance future development
- modern build tools have declarative dependency management, so it will be easier to develop and contribute; at the least, one won't have to wait for dependency libs to be downloaded together with the sources every time the project is checked out, and you will not have to manually download new/updated 3rd-party dependencies, just change the build script/metadata
- modern build tools try to be, and mostly are, non-intrusive, and promote good proven solutions like a standard project structure/layout, so it's easier to get started and become productive on such projects compared to projects with a custom layout
- modern build tools are better integrated with current development infrastructure tools, like IDEs and continuous integration servers.
This switch would also make it easier to maintain project metadata and keep it DRY, so that publishing Maven artifacts, even if it is decided not to make that part of the main release process, can be done with little effort and enough credibility.
If the attitude of "who cares about the project's Maven artifact consumers, regardless of the size of that community" is accepted as the official project stance, and the size of the project community is not considered a project asset, I don't understand why the project is being published under an open source license. Regards, Stevo. On Tue, Jan 18, 2011 at 11:50 AM, Robert Muir rcm...@gmail.com wrote: On Tue, Jan 18, 2011 at 5:29 AM, Hardy Ferentschik s...@ferentschik.de wrote: It also means that someone outside the dev community will at some stage create some pom files and upload the artifact to a (semi-) public repository. This sounds great! this is how open source works, those who care about it, will make it happen!
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983163#action_12983163 ] Chris Male commented on LUCENE-2657: Ant does many things too, and we use it in a specific way, so I see no problem defining what we intend our Maven support to be for. So I'm sensing some consensus (fortunately I spoke too soon before) that we target this toward being a development tool which is not forced upon any users or release managers. Is this okay with you, Steven? A README.txt describing the scope of dev-tools sounds appropriate irrespective of what happens here. I certainly wasn't aware of what their maintenance plan was.
[jira] Resolved: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter
[ https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-2295. Resolution: Fixed Committed revision 1060340 (trunk). Committed revision 1060342 (3x). Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter --- Key: LUCENE-2295 URL: https://issues.apache.org/jira/browse/LUCENE-2295 Project: Lucene - Java Issue Type: Improvement Components: contrib/analyzers Reporter: Shai Erera Assignee: Uwe Schindler Fix For: 3.1, 4.0 Attachments: LUCENE-2295-2-3x.patch, LUCENE-2295-2-trunk.patch, LUCENE-2295-trunk.patch, LUCENE-2295.patch A spinoff from LUCENE-2294. Instead of asking the user to specify his requested MFL limit on IndexWriter, we can get rid of this setting entirely by providing an Analyzer which will wrap any other Analyzer and its TokenStream with a TokenFilter that keeps track of the number of tokens produced and stops when the limit has been reached. This will remove any count tracking in IW's indexing, which is done even if I specified UNLIMITED for MFL. Let's try to do it for 3.1.
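The wrapping idea above, a filter that counts tokens and stops when the limit is reached, can be sketched generically. This is an illustration only, not the committed Lucene code: `Iterator<String>` stands in for a TokenStream, and `LimitingTokenIterator` stands in for the limiting TokenFilter.

```java
import java.util.Iterator;

// A wrapper that delegates to an underlying token source but produces at most
// maxTokens tokens, mirroring how a limiting TokenFilter would wrap the
// TokenStream of any Analyzer.
class LimitingTokenIterator implements Iterator<String> {
    private final Iterator<String> input;
    private final int maxTokens;
    private int produced = 0;

    LimitingTokenIterator(Iterator<String> input, int maxTokens) {
        this.input = input;
        this.maxTokens = maxTokens;
    }

    public boolean hasNext() {
        // Stop as soon as the configured limit is reached, regardless of how
        // many tokens the wrapped source could still produce.
        return produced < maxTokens && input.hasNext();
    }

    public String next() {
        produced++;
        return input.next();
    }
}
```

Because the counting lives entirely in the wrapper, the consumer (in the issue's terms, IndexWriter) needs no per-field length tracking of its own.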
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 6:56 AM, Stevo Slavić ssla...@gmail.com wrote: More than one build tools is not way to go, I believe everyone agrees on that, and that it's not an issue. Have you guys at least considered making a switch to a build tool that knows to produce maven artifacts (or enhancing exiting one to take care of that)? E.g. ant+ivy, gradle, maven itself. I think it's important to look at the build system as supporting development too, but most features being developed today are against Lucene's core, which has no dependencies at all. For example, our ant build supports rapidly running the core tests, splitting them across different JVMs in parallel; I've looked at the support for parallel testing in other build systems like Maven, and I think ours is significantly better for our tests. This compile-test-debug lifecycle is important, and for the Lucene core tests it's very fast. So while I might agree with you that for something like Solr development ant+ivy is perhaps worth considering, I think it's overkill and would be a step backwards for Lucene; we would only slow down development.
Re: Windows test failure VelocityResponseWriter, unmodified trunk.
Yep, already tried a fresh checkout before sending the e-mail. At first glance this looks like a classpath issue, hopefully just on my machine, but it was late last night and I wanted to give someone a chance to pipe up with "Ooops, I was changing that and..". Yes, I'm lazy when I can be. Er... efficient, that is. Erick On Tue, Jan 18, 2011 at 12:19 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Jan 17, 2011 at 10:42 PM, Erick Erickson erickerick...@gmail.com wrote: Hmmm, a fresh, unmodified checkout of Solr will fail on my Windows7 box if I run ant -Dtestcase=VelocityResponseWriterTest test. It succeeds on my Mac. Anyone got a clue? Or should I look into it? Of course it succeeds in IntelliJ. My windows laptop took a vacation (a permanent one) so I can't verify. But when I see NoSuchMethod runtime exceptions, I usually try a fresh checkout first. It's sometimes just stuff not getting cleaned up properly. -Yonik http://www.lucidimagination.com
[jira] Updated: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-445: Attachment: SOLR-445-3_x.patch SOLR-445.patch I think it's ready for review, both trunk and 3_x. Would someone look this over and commit it if they think it's ready? Note to self: do NOT call initCore in a test case just because you need a different schema. The problem I was having with running tests was because I needed a schema file with a required field so I naively called initCore with schema11.xml in spite of the fact that @BeforeClass called it with just schema.xml. Which apparently does bad things with the state of *something* and caused other tests to fail... I can get TestDistributedSearch to fail on unchanged source code simply by calling initCore with schema11.xml and doing nothing else in a new test case in BasicFunctionalityTest. So I put my new tests that required schema11 in a new file instead. The XML file attached is not intended to be committed, it is just a convenience for anyone checking out this patch to run against a Solr instance to see what is returned. This seems to return the data in the SolrJ case as well. NOTE: This does change the behavior of Solr. Without this patch, the first document that is incorrect stops processing. Now, it continues merrily on adding documents as it can. Is this desirable behavior? It would be easy to abort on first error if that's the consensus, and I could take some tedious record-keeping out. I think there's no big problem with continuing on, since the state of committed documents is indeterminate already when errors occur so worrying about this should be part of a bigger issue. 
XmlUpdateRequestHandler bad documents mid batch aborts rest of batch Key: SOLR-445 URL: https://issues.apache.org/jira/browse/SOLR-445 Project: Solr Issue Type: Bug Components: update Affects Versions: 1.3 Reporter: Will Johnson Assignee: Erick Erickson Fix For: Next Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, solr-445.xml Has anyone run into the problem of handling bad documents / failures mid batch? I.e.:
{code}
<add>
  <doc>
    <field name="id">1</field>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="myDateField">I_AM_A_BAD_DATE</field>
  </doc>
  <doc>
    <field name="id">3</field>
  </doc>
</add>
{code}
Right now solr adds the first doc and then aborts. It would seem like it should either fail the entire batch or log a message/return a code and then continue on to add doc 3. Option 1 would seem to be much harder to accomplish and possibly require more memory, while Option 2 would require more information to come back from the API. I'm about to dig into this but I thought I'd ask to see if anyone had any suggestions, thoughts or comments.
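The "continue on to add doc 3" behavior discussed above can be sketched as follows. This is a hypothetical stand-in for the patched handler logic, not the actual SOLR-445 code; `addDocument` and its bad-date check are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

class BatchAdder {
    // Returns the indexes of documents that failed, instead of aborting the
    // whole batch at the first bad document.
    static List<Integer> addAll(List<String> docs) {
        List<Integer> failed = new ArrayList<Integer>();
        for (int i = 0; i < docs.size(); i++) {
            try {
                addDocument(docs.get(i));
            } catch (IllegalArgumentException e) {
                failed.add(i); // record the failure and continue with doc i+1
            }
        }
        return failed;
    }

    // Hypothetical validation stand-in for the real indexing call.
    static void addDocument(String doc) {
        if (doc.contains("I_AM_A_BAD_DATE")) {
            throw new IllegalArgumentException("bad date field in: " + doc);
        }
        // index the document here
    }
}
```

This is "Option 2" from the description: each failure is recorded and reported back, and every well-formed document in the batch still gets added.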
Re: Windows test failure VelocityResponseWriter, unmodified trunk.
Robert: Thanks, that's just the kind of hint I was looking for. I'll be able to spend some time on this a bit later. Erick On Tue, Jan 18, 2011 at 7:55 AM, Robert Muir rcm...@gmail.com wrote: Erick, I think I know the problem: see https://issues.apache.org/jira/browse/SOLR-2303 perhaps the issue is somehow not fixed though. Feel free to re-open it and we can try to get to the bottom of it... But I suspect it has to do with log4j jars being in ant's classpath, and somewhere in solr's build it must be adding ant's classpath to the junit runtime classpath... I know I cleared this up for lucene but perhaps I missed a spot for solr.
[jira] Assigned: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera reassigned LUCENE-2584: -- Assignee: Shai Erera Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException --- Key: LUCENE-2584 URL: https://issues.apache.org/jira/browse/LUCENE-2584 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 2.9, 2.9.1, 2.9.2, 2.9.3, 3.0, 3.0.1, 3.0.2 Reporter: Alexander Kanarsky Assignee: Shai Erera Priority: Minor Fix For: 3.1, 4.0 Attachments: LUCENE-2584-branch_3x.patch, LUCENE-2584-lucene-2_9.patch, LUCENE-2584-lucene-3_0.patch A multi-threaded call of files() in SegmentInfo could lead to a ConcurrentModificationException if one thread has not yet finished adding to the ArrayList (files) while another thread has already obtained it as cached (see below). This is a rare exception, but it would be nice to fix. I see the code is no longer problematic in trunk (and other branches ported from flex_1458); it looks like it was fixed while implementing post-3.x features. The fix to the 3.x and 2.9.x branches could be the same: create the files set first and populate it, and then assign it to the member variable at the end of the method. This will resolve the issue. I could prepare the patch for 2.9.4 and 3.x, if needed.
--
{noformat}
INFO: [19] webapp= path=/replication params={command=fetchindex&wt=javabin} status=0 QTime=1
Jul 30, 2010 9:13:05 AM org.apache.solr.core.SolrCore execute
INFO: [19] webapp= path=/replication params={command=details&wt=javabin} status=0 QTime=24
Jul 30, 2010 9:13:05 AM org.apache.solr.handler.ReplicationHandler doFetch
SEVERE: SnapPull failed
java.util.ConcurrentModificationException
        at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
        at java.util.AbstractList$Itr.next(AbstractList.java:343)
        at java.util.AbstractCollection.addAll(AbstractCollection.java:305)
        at org.apache.lucene.index.SegmentInfos.files(SegmentInfos.java:826)
        at org.apache.lucene.index.DirectoryReader$ReaderCommit.<init>(DirectoryReader.java:916)
        at org.apache.lucene.index.DirectoryReader.getIndexCommit(DirectoryReader.java:856)
        at org.apache.solr.search.SolrIndexReader.getIndexCommit(SolrIndexReader.java:454)
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:261)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
        at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:146)
{noformat}
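The populate-then-publish fix described above can be sketched as follows. `FileCache` is a hypothetical stand-in for SegmentInfo and the file names are made up; the point is that the shared field is assigned only after the local set is fully populated, so a concurrent reader never iterates a half-built collection.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class FileCache {
    // Published only once fully built; volatile gives safe publication.
    private volatile Set<String> cachedFiles;

    Set<String> files() {
        Set<String> cached = cachedFiles;
        if (cached != null) {
            return cached; // readers only ever see a complete set
        }
        // Build into a local set first...
        Set<String> building = new HashSet<String>();
        building.add("_0.fdt"); // made-up file names for illustration
        building.add("_0.fdx");
        // ...and assign to the member variable only at the end of the method.
        cachedFiles = Collections.unmodifiableSet(building);
        return cachedFiles;
    }
}
```

Two racing threads may each build their own local set, but whichever assignment wins, every caller observes a finished collection, which removes the ConcurrentModificationException window.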
[jira] Updated: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-2584: --- Attachment: LUCENE-2584.patch Patch against 3x - fixes the bug according to Alexander's other patch (but uses HashSet all the way), and I added a CHANGES entry and test case to TestSegmentInfo. I plan to commit this soon and also backport to 3.0 and 2.9
[jira] Commented: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983183#action_12983183 ] Michael McCandless commented on LUCENE-2584: Patch looks good Shai! I don't think you need to backport to 2.9/3.0 immediately (unless you really want to!)? We can backport if/when we do another release...
[jira] Commented: (LUCENE-2472) The terms index divisor in IW should be set via IWC not via getReader
[ https://issues.apache.org/jira/browse/LUCENE-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983189#action_12983189 ] Shai Erera commented on LUCENE-2472: This is already set on IWC (set/getReaderTermsIndexDivisor). So I guess all that's needed is to deprecate IW.getReader(int) on 3x and remove from trunk? The terms index divisor in IW should be set via IWC not via getReader - Key: LUCENE-2472 URL: https://issues.apache.org/jira/browse/LUCENE-2472 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.1, 4.0 The getReader call gives a false sense of security... since if deletions have already been applied (and IW is pooling) the readers have already been loaded with a divisor of 1. Better to set the divisor up front in IWC.
Re: Let's drop Maven Artifacts !
It seems to me that if we have a fix for the things that ail our Maven support (Steve's work), then that isn't a reason for holding up a release, and we should just keep the POMs, as a significant number of users consume Lucene that way (via the central repository). I agree that we should not switch our build system, but supporting the POMs is no different from supporting the IntelliJ/Eclipse generation tools (both are problematic since they are not automated). On Jan 18, 2011, at 7:48 AM, Robert Muir wrote: On Tue, Jan 18, 2011 at 6:56 AM, Stevo Slavić ssla...@gmail.com wrote: More than one build tools is not way to go, I believe everyone agrees on that, and that it's not an issue. Have you guys at least considered making a switch to a build tool that knows to produce maven artifacts (or enhancing exiting one to take care of that)? E.g. ant+ivy, gradle, maven itself. I think its important to look at the build system as supporting development too, but most features being developed today are against lucene's core: which has no dependencies at all. For example, our ant build supports rapidly running the core tests (splitting them across different jvms in parallel: i've looked at the support for parallel testing in other build systems like maven and I think ours is significantly better for our tests). This compile-test-debug lifecycle is important, for the lucene core tests its very fast. So while I might agree with you that for something like Solr development, perhaps ant+ivy is something worth considering, I think its overkill and would be a step backwards for lucene, we would only slow down development. -- Grant Ingersoll http://www.lucidimagination.com
[jira] Commented: (LUCENE-2374) Add reflection API to AttributeSource/AttributeImpl
[ https://issues.apache.org/jira/browse/LUCENE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983195#action_12983195 ] Uwe Schindler commented on LUCENE-2374: --- In my opinion, there is lots of code duplication in the unmaintainable analysis.jsp. I think we should open a new issue to remove it and replace it with an XSL, or alternatively make its internal functionality backed by FieldAnalysisRequestHandler. Add reflection API to AttributeSource/AttributeImpl --- Key: LUCENE-2374 URL: https://issues.apache.org/jira/browse/LUCENE-2374 Project: Lucene - Java Issue Type: Improvement Components: contrib/analyzers Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.1, 4.0 Attachments: LUCENE-2374-3x.patch, LUCENE-2374-3x.patch, LUCENE-2374-3x.patch, shot1.png, shot2.png, shot3.png, shot4.png AttributeSource/TokenStream inspection in Solr needs to have some insight into the contents of AttributeImpls. As LUCENE-2302 has some problems with toString() [which is not structured and conflicts with CharSequence's definition for CharTermAttribute], I propose a simple API that gets a default implementation in AttributeImpl (just like toString() currently): - Iterator<Map.Entry<String,?>> AttributeImpl.contentsIterator() returns an iterator (for most attributes it's a singleton) of key-value pairs, e.g. term->foobar, startOffset->Integer.valueOf(0), ... - AttributeSource gets the same method; it just concatenates the iterators of each AttributeImpl from getAttributeImplsIterator(). No backwards-compatibility problems occur, as the default toString() method will work like before (it just gets the iterator and lists the entries), but we simply remove the documentation for the format. (Char)TermAttribute gets a special impl of toString() according to CharSequence and a corresponding iterator. I also want to remove the abstract hashCode() and equals() methods from AttributeImpl, as they are not needed and just create work for the implementor.
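The proposed contentsIterator() could look roughly like this minimal sketch. The class and field names here are hypothetical; in the proposal the method lives on AttributeImpl, with per-attribute overrides.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for a term attribute that exposes its contents as structured
// key/value pairs instead of a free-form toString(). Most attributes would
// yield a single entry, as the proposal notes.
class TermAttributeSketch {
    String term = "foobar";

    Iterator<Map.Entry<String, ?>> contentsIterator() {
        Map<String, Object> contents = new LinkedHashMap<String, Object>();
        contents.put("term", term);
        // Copy into a list so the returned iterator is detached from the map.
        return new ArrayList<Map.Entry<String, ?>>(contents.entrySet()).iterator();
    }
}
```

An AttributeSource-level implementation would then simply chain the iterators of all its AttributeImpls, which is what makes a generic default toString() possible.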
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 17:00, Robert Muir rcm...@gmail.com wrote: On Tue, Jan 18, 2011 at 8:54 AM, Grant Ingersoll gsing...@apache.org wrote: It seems to me that if we have a fix for the things that ail our Maven support (Steve's work), that it isn't then the reason for holding up a release and we should just keep them as there are a significant number of users who consume Lucene that way (via the central repository). I agree that we should not switch our build system, but supporting the POMs is no different than supporting the IntelliJ/Eclipse generation tools (they are both problematic since they are not automated) its totally different in every way! we don't release the intellij/eclipse stuff, its for internal use only. additionally, there are no release artifacts generated by these Latest code from LUCENE-2657 does not generate any new artifacts. It uploads those you already have (built via ant) to the repo. -- Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com) Phone: +7 (495) 683-567-4 ICQ: 104465785
Exception hit on 3_0 branch
Hi I ran tests on 3_0 branch and hit this: [junit] Testcase: testRankByte(org.apache.lucene.search.function.TestFieldScoreQuery): Caused an ERROR [junit] null [junit] java.util.ConcurrentModificationException [junit] at java.util.WeakHashMap$HashIterator.next(WeakHashMap.java:169) [junit] at org.apache.lucene.search.FieldCacheImpl.getCacheEntries(FieldCacheImpl.java:75) [junit] at org.apache.lucene.util.LuceneTestCase.assertSaneFieldCaches(LuceneTestCase.java:133) [junit] at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:100) [junit] at org.apache.lucene.search.function.FunctionTestSetup.tearDown(FunctionTestSetup.java:86) [junit] at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:216) I couldn't reproduce it the second time I ran the test (test only and all tests), and I don't know if it applies to 3x/trunk too. I can dig into it later, but I'm sending it to the list in case someone wants to look at it before. I see that the method is called from tearDown(), and the ConcurrentModificationException suggests someone added to the set while someone else iterated over it -- could it be that the tests step on each other somehow? Shai
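The ConcurrentModificationException in the trace above is the classic fail-fast iterator failure: the collection is structurally modified while an iteration over it is in flight. A minimal stand-alone illustration of the mechanism (plain JDK collections, not the Lucene code itself):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Demonstrates the failure mode: mutating a map while iterating over its
// key set makes the iterator throw ConcurrentModificationException. A single
// thread suffices; with two threads the failure is simply less deterministic.
class CmeDemo {
    static boolean triggersCme() {
        Map<String, Integer> cache = new HashMap<>();
        cache.put("a", 1);
        cache.put("b", 2);
        try {
            for (String key : cache.keySet()) {
                cache.put(key + "'", 0); // structural modification mid-iteration
            }
        } catch (ConcurrentModificationException e) {
            return true; // fail-fast iterator detected the modification
        }
        return false;
    }
}
```

In the test-framework case the WeakHashMap backing the FieldCache was presumably being populated by one test path while assertSaneFieldCaches iterated it in tearDown().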
[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983208#action_12983208 ] Salman Akram commented on SOLR-1604: I tried the patch with the latest non-grayed file, but inOrder still doesn't seem to have any impact. Results for "a b"~5 and "b a"~5 are still different. Also, any feedback about CommonGrams integration? Thanks a lot for all the help! Wildcards, ORs etc inside Phrase Queries Key: SOLR-1604 URL: https://issues.apache.org/jira/browse/SOLR-1604 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Ahmet Arslan Priority: Minor Fix For: Next Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports wildcards, ORs, ranges, fuzzies inside phrase queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-2584) Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/LUCENE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-2584. Resolution: Fixed Fix Version/s: (was: 4.0) 3.0.4 2.9.5 Lucene Fields: [New, Patch Available] (was: [New]) Committed revision 1060358 (3x). Committed revision 1060391 (3.0). Committed revision 1060398 (2.9). Thanks Alexander! Concurrency issues in SegmentInfo.files() could lead to ConcurrentModificationException --- Key: LUCENE-2584 URL: https://issues.apache.org/jira/browse/LUCENE-2584 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 2.9, 2.9.1, 2.9.2, 2.9.3, 3.0, 3.0.1, 3.0.2 Reporter: Alexander Kanarsky Assignee: Shai Erera Priority: Minor Fix For: 2.9.5, 3.0.4, 3.1 Attachments: LUCENE-2584-branch_3x.patch, LUCENE-2584-lucene-2_9.patch, LUCENE-2584-lucene-3_0.patch, LUCENE-2584.patch A multi-threaded call of files() in SegmentInfo could lead to a ConcurrentModificationException if one thread has not finished adding to the ArrayList (files) while another thread has already obtained it as cached (see below). This is a rare exception, but it would be nice to fix. I see the code is no longer problematic in trunk (and the parts ported from flex_1458); it looks like it was fixed while implementing post-3.x features. The fix to the 3.x and 2.9.x branches could be the same - create the files set first and populate it, and then assign it to the member variable at the end of the method. This will resolve the issue. I could prepare the patch for 2.9.4 and 3.x, if needed. 
-- INFO: [19] webapp= path=/replication params={command=fetchindexwt=javabin} status=0 QTime=1 Jul 30, 2010 9:13:05 AM org.apache.solr.core.SolrCore execute INFO: [19] webapp= path=/replication params={command=detailswt=javabin} status=0 QTime=24 Jul 30, 2010 9:13:05 AM org.apache.solr.handler.ReplicationHandler doFetch SEVERE: SnapPull failed java.util.ConcurrentModificationException at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) at java.util.AbstractList$Itr.next(AbstractList.java:343) at java.util.AbstractCollection.addAll(AbstractCollection.java:305) at org.apache.lucene.index.SegmentInfos.files(SegmentInfos.java:826) at org.apache.lucene.index.DirectoryReader$ReaderCommit.init(DirectoryReader.java:916) at org.apache.lucene.index.DirectoryReader.getIndexCommit(DirectoryReader.java:856) at org.apache.solr.search.SolrIndexReader.getIndexCommit(SolrIndexReader.java:454) at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:261) at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264) at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:146) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
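The fix described above (build the collection fully first, then assign it to the member variable as the last step) is a common safe-publication pattern. A simplified stand-alone sketch of the idea follows; the class, field, and file names are illustrative, not the actual SegmentInfo code:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the LUCENE-2584 fix idea: other threads must never observe a
// partially populated cache. Build a local collection, then publish it with
// a single write as the very last step.
class FileListCache {
    private volatile List<String> cachedFiles; // assigned only when complete

    List<String> files() {
        List<String> cached = cachedFiles;
        if (cached != null) {
            return cached; // safe: the published list is never mutated again
        }
        // Populate a local collection first...
        Set<String> building = new HashSet<>();
        building.add("_0.fdt"); // stand-ins for the real per-segment file names
        building.add("_0.fdx");
        List<String> result = new ArrayList<>(building);
        // ...and only then publish it.
        cachedFiles = result;
        return result;
    }
}
```

The buggy variant assigned the member variable first and then kept adding to it, so a concurrent reader could iterate the list mid-population and hit the ConcurrentModificationException shown in the stack trace.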
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983209#action_12983209 ] Jason Rutherglen commented on LUCENE-2324: -- bq. I can't test flush by RAM - that's not working yet on RT right? Right, we're only flushing by doc count, so we could be flushing segments that are too small? However we can see some of the concurrency gains by not sync'ing on IW and allowing documents updates to continue while flushing. Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out See LUCENE-2293 for motivation and more details. I'm copying here Mike's summary he posted on 2293: Change the approach for how we buffer in RAM to a more isolated approach, whereby IW has N fully independent RAM segments in-process and when a doc needs to be indexed it's added to one of them. Each segment would also write its own doc stores and normal segment merging (not the inefficient merge we now do on flush) would merge them. This should be a good simplification in the chain (eg maybe we can remove the *PerThread classes). The segments can flush independently, letting us make much better concurrent use of IO CPU. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2316) SynonymFilterFactory should ensure synonyms argument is provided.
[ https://issues.apache.org/jira/browse/SOLR-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983214#action_12983214 ] David Smiley commented on SOLR-2316: Both. SynonymFilterFactory should ensure synonyms argument is provided. - Key: SOLR-2316 URL: https://issues.apache.org/jira/browse/SOLR-2316 Project: Solr Issue Type: Improvement Components: Schema and Analysis Reporter: David Smiley Priority: Minor Fix For: 3.1 Attachments: 2316.patch If for some reason the synonyms attribute is not present on the filter factory configuration, a latent NPE will eventually show up during indexing/searching. Instead a helpful error should be thrown at initialization. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
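The requested behavior is the usual fail-fast pattern: validate required init arguments when the factory is configured, so a missing setting produces a descriptive error instead of a latent NPE during indexing or search. A hedged stand-alone sketch of that pattern (not the actual SynonymFilterFactory code):

```java
import java.util.Map;

// Fail-fast initialization: a missing required argument produces a clear
// error at configuration time rather than a NullPointerException later.
class SketchSynonymFilterFactory {
    private String synonyms;

    void init(Map<String, String> args) {
        synonyms = args.get("synonyms");
        if (synonyms == null) {
            throw new IllegalArgumentException(
                "Missing required argument 'synonyms' for "
                    + getClass().getSimpleName());
        }
    }
}
```

With this check, a misconfigured schema fails at core startup with a message naming the missing attribute, which is far easier to diagnose.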
[jira] Commented: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983215#action_12983215 ] Simon Rosenthal commented on SOLR-445: -- bq. Don't allow autocommits during an update. Simple. Or, rather, all update requests block at the beginning during an autocommit. If an update request has too many documents, don't do so many documents in an update. (Lance) Lance - How do you (dynamically) disable autocommits during a specific update? That functionality would also be useful in other use cases, but that's another issue. bq. NOTE: This does change the behavior of Solr. Without this patch, the first document that is incorrect stops processing. Now, it continues merrily on adding documents as it can. Is this desirable behavior? It would be easy to abort on first error if that's the consensus, and I could take some tedious record-keeping out. I think there's no big problem with continuing on, since the state of committed documents is already indeterminate when errors occur, so worrying about this should be part of a bigger issue. I think it should be an option, if possible. I can see use cases where abort-on-first-error is desirable, but also situations where you know one or two documents may be erroneous, and it's worth continuing on in order to index the other 99%. XmlUpdateRequestHandler bad documents mid batch aborts rest of batch Key: SOLR-445 URL: https://issues.apache.org/jira/browse/SOLR-445 Project: Solr Issue Type: Bug Components: update Affects Versions: 1.3 Reporter: Will Johnson Assignee: Erick Erickson Fix For: Next Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, solr-445.xml Has anyone run into the problem of handling bad documents / failures mid batch? Ie:

<add>
  <doc>
    <field name="id">1</field>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="myDateField">I_AM_A_BAD_DATE</field>
  </doc>
  <doc>
    <field name="id">3</field>
  </doc>
</add>

Right now solr adds the first doc and then aborts. 
It would seem like it should either fail the entire batch or log a message/return a code and then continue on to add doc 3. Option 1 would seem to be much harder to accomplish and possibly require more memory while Option 2 would require more information to come back from the API. I'm about to dig into this but I thought I'd ask to see if anyone had any suggestions, thoughts or comments. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
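Option 2 discussed above (continue past a bad document, recording failures, instead of aborting the batch) can be sketched as a control-flow skeleton. Document parsing is faked with a simple validity check, so this only illustrates the record-and-continue bookkeeping, not Solr's actual update handler:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of continue-on-error batch indexing: each bad document is recorded
// and skipped, and the rest of the batch is still added.
class BatchAdder {
    final List<String> added = new ArrayList<>();
    final List<String> failed = new ArrayList<>();

    void addBatch(List<String> docs) {
        for (String doc : docs) {
            try {
                if (doc.contains("BAD")) { // stand-in for a parse/validation error
                    throw new IllegalArgumentException("bad field value in " + doc);
                }
                added.add(doc);
            } catch (IllegalArgumentException e) {
                failed.add(doc); // record and keep going instead of aborting
            }
        }
    }
}
```

Reporting the `failed` list back in the response is the extra API surface the original report anticipated; abort-on-first-error would just be a `return` (or rethrow) in the catch block, which is why making it an option is cheap.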
Lucene-Solr-tests-only-trunk - Build # 3881 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3881/ No tests ran. Build Log (for compile errors): [...truncated 62 lines...] + JAVADOCS_ARTIFACTS=/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/javadocs + set +x Checking for files containing nocommit (exits build with failure if list is non-empty): + cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout + JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant clean Buildfile: build.xml clean: clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build clean: clean: [echo] Building analyzers-common... clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/common [echo] Building analyzers-icu... clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/icu [echo] Building analyzers-phonetic... clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/phonetic [echo] Building analyzers-smartcn... clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/smartcn [echo] Building analyzers-stempel... clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/stempel [echo] Building benchmark... 
clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/build clean-contrib: clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/analysis-extras/build [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/analysis-extras/lucene-libs clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/clustering/build clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/dataimporthandler/target clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/contrib/extraction/build clean: [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build BUILD SUCCESSFUL Total time: 3 seconds + cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene + JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib Buildfile: build.xml jflex-uptodate-check: jflex-notice: javacc-uptodate-check: javacc-notice: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. 
[echo] clover: common.compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java [javac] Compiling 507 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:73: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] public boolean onOrAfter(Version other) { [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/index/IndexWriter.java:985: method does not override a method from its superclass [javac] @Override [javac]^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] int getColumn(); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:41: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] int getLine(); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 1 error [...truncated 10 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2374) Add reflection API to AttributeSource/AttributeImpl
[ https://issues.apache.org/jira/browse/LUCENE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983224#action_12983224 ] Mark Miller commented on LUCENE-2374: - Agreed Uwe. Add reflection API to AttributeSource/AttributeImpl --- Key: LUCENE-2374 URL: https://issues.apache.org/jira/browse/LUCENE-2374 Project: Lucene - Java Issue Type: Improvement Components: contrib/analyzers Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.1, 4.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983225#action_12983225 ] Ahmet Arslan commented on SOLR-1604: When you add debugQuery=on to your search URL, you should see something like this in the debug section: spanNear([text:a, text:b], 5, false) - the false here means an un-ordered phrase query. Do you see it? I will look into CommonGrams this weekend. Wildcards, ORs etc inside Phrase Queries Key: SOLR-1604 URL: https://issues.apache.org/jira/browse/SOLR-1604 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Ahmet Arslan Priority: Minor Fix For: Next -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
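The third argument in spanNear([text:a, text:b], 5, false) is the inOrder flag. Its effect can be illustrated with a toy matcher over a token list; this is only a sketch of the semantics (it considers just the first occurrence of each term), not Lucene's span machinery:

```java
import java.util.List;

// Toy proximity matcher: the two terms must occur with at most `slop`
// intervening positions; with inOrder=true, `a` must also appear before `b`.
class NearMatcher {
    static boolean near(List<String> tokens, String a, String b,
                        int slop, boolean inOrder) {
        int ia = tokens.indexOf(a);
        int ib = tokens.indexOf(b);
        if (ia < 0 || ib < 0) {
            return false; // a term is missing entirely
        }
        int gap = Math.abs(ia - ib) - 1; // positions between the two terms
        if (gap > slop) {
            return false;
        }
        return !inOrder || ia < ib; // unordered matches either order
    }
}
```

With inOrder=false, a document containing "b ... a" still matches the query for "a b"~5, which is exactly why the ordered and unordered variants return different result sets.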
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983234#action_12983234 ] Ryan McKinley commented on LUCENE-2657: --- Steve, great work with this patch -- it takes care of all the previous concerns about our problematic maven support. With this patch, we now have: * testable maven artifacts * easy repo distribution * ant is still *the* build system The RM can choose to ignore the generate-maven-artifacts target and let someone else push the artifacts. As with most religious conflicts -- I hope the resolution is not conversion, but rather something that lets everyone live (work) in peace. Replace Maven POM templates with full POMs, and change documentation accordingly Key: LUCENE-2657 URL: https://issues.apache.org/jira/browse/LUCENE-2657 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Fix For: 3.1, 4.0 Attachments: LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch, LUCENE-2657.patch The current Maven POM templates only contain dependency information, the bare bones necessary for uploading artifacts to the Maven repository. The full Maven POMs in the attached patch include the information necessary to run a multi-module Maven build, in addition to serving the same purpose as the current POM templates. Several dependencies are not available through public maven repositories. A profile in the top-level POM can be activated to install these dependencies from the various {{lib/}} directories into your local repository. 
From the top-level directory: {code} mvn -N -Pbootstrap install {code} Once these non-Maven dependencies have been installed, to run all Lucene/Solr tests via Maven's surefire plugin, and populate your local repository with all artifacts, from the top level directory, run: {code} mvn install {code} When one Lucene/Solr module depends on another, the dependency is declared on the *artifact(s)* produced by the other module and deposited in your local repository, rather than on the other module's un-jarred compiler output in the {{build/}} directory, so you must run {{mvn install}} on the other module before its changes are visible to the module that depends on it. To create all the artifacts without running tests: {code} mvn -DskipTests install {code} I almost always include the {{clean}} phase when I do a build, e.g.: {code} mvn -DskipTests clean install {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Let's drop Maven Artifacts !
I still don't see why you care so much. You have people willing to maintain it and it is no sweat off your back and it is used by a pretty large chunk of downstream users. And don't tell me it is what holds up releases b/c it simply isn't true. On Jan 18, 2011, at 9:12 AM, Robert Muir wrote: On Tue, Jan 18, 2011 at 9:10 AM, Earwin Burrfoot ear...@gmail.com wrote: Latest code from LUCENE-2657 does not generate any new artifacts. It uploads those you already have (built via ant) to the repo. yep, thats releasing artifacts. thats the whole point of this email thread (read the title, thanks) the intellij/eclipse stuff is just unreleased stuff that sits in our SVN. it doesnt get uploaded anywhere. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 10:53 AM, Grant Ingersoll gsing...@apache.org wrote: I still don't see why you care so much. You have people willing to maintain it and it is no sweat off your back and it is used by a pretty large chunk of downstream users. And don't tell me it is what holds up releases b/c it simply isn't true. it is what holds up releases. the last time i brought up releasing, it was totally destroyed because of maven. the RM shouldn't have to deal with 2 build systems, packaging systems, and repository hell, and that's what maven artifacts require. If there is a large chunk of downstream users, then they can handle this downstream, it doesn't need to be in lucene, just like we don't deal with other packaging systems. Unfortunately there is a very loud minority that care about maven, most of us that think the situation is ridiculous have totally given up arguing about it, except me, i don't want to put out a shitty release with broken maven artifacts like in the past, i'd rather let some downstream project deal with maven instead. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-2755) Some improvements to CMS
[ https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2755. Resolution: Fixed Some improvements to CMS Key: LUCENE-2755 URL: https://issues.apache.org/jira/browse/LUCENE-2755 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 3.1, 4.0 Attachments: LUCENE-2755.patch While running optimize on a large index, I've noticed several things that got me to read CMS code more carefully, and find these issues: * CMS may hold onto a merge if maxMergeCount is hit. That results in the MergeThreads taking merges from the IndexWriter until they are exhausted, and only then that blocked merge will run. I think it's unnecessary that that merge will be blocked. * CMS sorts merges by segments size, doc-based and not bytes-based. Since the default MP is LogByteSizeMP, and I hardly believe people care about doc-based size segments anymore, I think we should switch the default impl. There are two ways to make it extensible, if we want: ** Have an overridable member/method in CMS that you can extend and override - easy. ** Have OneMerge be comparable and let the MP determine the order (e.g. by bytes, docs, calibrate deletes etc.). Better, but will need to tap into several places in the code, so more risky and complicated. On the go, I'd like to add some documentation to CMS - it's not very easy to read and follow. I'll work on a patch. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
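The bytes-based ordering suggested in the issue above (sort pending merges by byte size rather than doc count) can be sketched with a comparator over a minimal stand-in class. PendingMerge and its field names are illustrative stand-ins for Lucene's MergePolicy.OneMerge, not the real API:

```java
import java.util.Comparator;
import java.util.List;

// Sketch: order pending merges by their total byte size, largest first,
// instead of by document count (the doc-based default being questioned).
class PendingMerge {
    final String name;
    final long sizeInBytes;
    final int docCount; // kept only for contrast with the doc-based default

    PendingMerge(String name, long sizeInBytes, int docCount) {
        this.name = name;
        this.sizeInBytes = sizeInBytes;
        this.docCount = docCount;
    }

    static void sortBySizeDesc(List<PendingMerge> merges) {
        merges.sort(
            Comparator.comparingLong((PendingMerge m) -> m.sizeInBytes)
                      .reversed());
    }
}
```

This also shows why doc count is a poor proxy: a segment with many tiny documents can be byte-wise smaller than one with few large documents, so the two orderings can disagree completely. Making OneMerge comparable, as the issue suggests, would let the MergePolicy plug in exactly such a comparator.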
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983246#action_12983246 ] Michael Busch commented on LUCENE-2324: --- {quote} I ran a quick perf test here: I built the 10M Wikipedia index, Standard codec, using 6 threads. Trunk took 541.6 sec; RT took 518.2 sec (only a bit faster), but the test wasn't really fair because it flushed @ docCount=12870. {quote} Thanks for running the tests! Hmm that's a bit disappointing - we were hoping for more speedup. Flushing by docCount is currently per DWPT, so every initial segment in your test had 12870 docs. I guess there's a lot of merging happening. Maybe you could rerun with higher docCount? bq. But I can't test flush by RAM - that's not working yet on RT right? True. I'm going to add that soonish. There's one thread-safety bug related to deletes that needs to be fixed too. {quote} Then I ran a single-threaded test. Trunk took 1097.1 sec and RT took 1040.5 sec - a bit faster! Presumably in the noise (we don't expect a speedup?), but excellent that it's not slower... {quote} Yeah I didn't expect much speedup - cool! :) Maybe because some code is gone, like the WaitQueue, not sure how much overhead that added in the single-threaded case. {quote} I think we lost infoStream output on the details of flushing? I can't see when which DWPTs are flushing... {quote} Oh yeah, good point, I'll add some infoStream messages to DWPT! 
Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983247#action_12983247 ] Chris A. Mattmann commented on LUCENE-2657: --- +1 for Steve's patch, great work and you beat me to it. Replace Maven POM templates with full POMs, and change documentation accordingly Key: LUCENE-2657 URL: https://issues.apache.org/jira/browse/LUCENE-2657 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Fix For: 3.1, 4.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 11:12 AM, Robert Muir wrote: there is a very loud minority that care about maven, most of us that think the situation is ridiculous have totally given up arguing about it, except me, i don't want to put out a shitty release with broken maven artifacts like in the past, i'd rather let some downstream project deal with maven instead. +1. What a fantastic idea for an Apache Extras project :) I'll open my arms to first class maven the first time it sees the light of consensus ;) - Mark
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 11:12 AM, Robert Muir wrote: On Tue, Jan 18, 2011 at 10:53 AM, Grant Ingersoll gsing...@apache.org wrote: I still don't see why you care so much. You have people willing to maintain it and it is no sweat off your back and it is used by a pretty large chunk of downstream users. And don't tell me it is what holds up releases b/c it simply isn't true. it is what holds up releases. the last time i brought up releasing, it was totally destroyed because of maven. I'll grant you it held up the last release _ONCE WE DECIDED TO RELEASE_, but don't act like it is why we don't release very often, because it isn't. the RM shouldn't have to deal with 2 build systems, packaging systems, and repository hell, and that's what maven artifacts require. And Steve has said he would fix it and it won't require two build systems, so your main complaint is solved.
[jira] Commented: (LUCENE-2472) The terms index divisor in IW should be set via IWC not via getReader
[ https://issues.apache.org/jira/browse/LUCENE-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983252#action_12983252 ] Michael McCandless commented on LUCENE-2472: bq. So I guess all that's needed is to deprecate IW.getReader(int) on 3x and remove from trunk? +1 Though, it's already removed on trunk. So we just need to deprecate on 3.x... The terms index divisor in IW should be set via IWC not via getReader - Key: LUCENE-2472 URL: https://issues.apache.org/jira/browse/LUCENE-2472 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.1, 4.0 The getReader call gives a false sense of security... since if deletions have already been applied (and IW is pooling) the readers have already been loaded with a divisor of 1. Better to set the divisor up front in IWC.
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 11:27 AM, Mark Miller markrmil...@gmail.com wrote: I'll open my arms to first class maven the first time it sees the light of consensus ;) thats the main thing missing from releasing maven artifacts... looking at previous threads I don't really see consensus that we need to do this.
Re: Lucene-Solr-tests-only-trunk - Build # 3864 - Failure
This was caused by a latent bug in PrefixCodedTermsReader... But, I'm about to replace that w/ BlockTermsReader, so I'll leave this bug there... Mike On Mon, Jan 17, 2011 at 2:05 AM, Apache Hudson Server hud...@hudson.apache.org wrote: Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3864/ 1 tests failed. REGRESSION: org.apache.lucene.util.automaton.fst.TestFSTs.testRealTerms Error Message: null Stack Trace: junit.framework.AssertionFailedError at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1127) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1059) at org.apache.lucene.index.codecs.intblock.FixedIntBlockIndexInput$Index.read(FixedIntBlockIndexInput.java:167) at org.apache.lucene.index.codecs.sep.SepPostingsReaderImpl.readTerm(SepPostingsReaderImpl.java:167) at org.apache.lucene.index.codecs.pulsing.PulsingPostingsReaderImpl.readTerm(PulsingPostingsReaderImpl.java:135) at org.apache.lucene.index.codecs.PrefixCodedTermsReader$FieldReader$SegmentTermsEnum.next(PrefixCodedTermsReader.java:508) at org.apache.lucene.index.codecs.PrefixCodedTermsReader$FieldReader$SegmentTermsEnum.seek(PrefixCodedTermsReader.java:431) at org.apache.lucene.index.TermsEnum.seek(TermsEnum.java:68) at org.apache.lucene.util.automaton.fst.TestFSTs.testRealTerms(TestFSTs.java:1016) Build Log (for compile errors): [...truncated 2947 lines...]
[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983260#action_12983260 ] Erick Erickson commented on SOLR-2303: -- I am Officially Confused, but the culprit appears to be log4j-over-slf4j-1.5.5.jar
3_x has:
- log4j jars in solr/contrib/extraction and solr/contrib/clustering
- a bunch of slf4j jars in solr/lib (but NOT log4j-over-slf4j-1.5.5.jar, see below)
All tests succeed just fine.
Trunk has:
- no log4j jars in contrib
- the same slf4j jars as in 3_x BUT ALSO log4j-over-slf4j-1.5.5.jar
VelocityResponseWriterTest fails.
In trunk, removing log4j-over-slf4j-1.5.5.jar allows VelocityResponseWriterTest and all other tests to succeed. In 3_x, removing the log4j jars from solr/contrib makes no difference, all tests pass. So I propose that the fix for this is to remove the log4j files from 3_x and the log4j-over-slf4j-1.5.5.jar from trunk. Should I create a patch? And do patches actually remove jars like this? remove unnecessary (and problematic) log4j jars in contribs --- Key: SOLR-2303 URL: https://issues.apache.org/jira/browse/SOLR-2303 Project: Solr Issue Type: Improvement Components: Build Reporter: Robert Muir Fix For: 4.0 Attachments: SOLR-2303.patch In solr 4.0 there is log4j-over-slf4j. But if you have log4j jars also in the classpath (e.g. contrib/extraction, contrib/clustering) you can get strange errors such as: java.lang.NoSuchMethodError: org.apache.log4j.Logger.setAdditivity(Z)V So I think we should remove the log4j jars in these contribs, all tests pass with them removed.
[jira] Reopened: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reopened SOLR-2303: -- See previous comment, I believe that there are some jars in Solr that need to be removed.
[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983265#action_12983265 ] Robert Muir commented on SOLR-2303: --- Erick, actually i think the issue is that log4j-over-slf4j conflicts with log4j, if log4j is in the classpath. The problem is that currently, the solr build runs tests with whatever is in ant's classpath. This is why the tests pass for you, even if you remove all logging jars, but this is obviously bad as its not really a repeatable build. So to fix this, we need to use includeantruntime=no in the junit tasks, and also not include ${java.class.path} in the test classpath. instead, we explicitly include the ant libs we supply (especially since we extend some of them for testing). This might make some warnings or even errors for ant 1.8 users, but I think thats ok.
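Robert's suggestion would look roughly like the following in a build file. This is a hedged sketch only: the target structure, property names (`test.classpath`, `common.dir`, etc.), and jar versions here are illustrative, not copied from the actual Solr build.xml.

```xml
<!-- Sketch: run junit without ant's own runtime on the classpath,
     supplying the ant/junit jars we ship explicitly instead of
     inheriting whatever happens to be in ${java.class.path}. -->
<junit includeantruntime="no" fork="yes">
  <classpath>
    <!-- project classes and test dependencies (hypothetical refid) -->
    <path refid="test.classpath"/>
    <!-- the ant libs we supply ourselves -->
    <pathelement path="${common.dir}/lib/ant-1.7.1.jar"/>
    <pathelement path="${common.dir}/lib/ant-junit-1.7.1.jar"/>
    <pathelement path="${common.dir}/lib/junit-4.7.jar"/>
  </classpath>
  <batchtest todir="${junit.output.dir}">
    <fileset dir="${tests.src.dir}" includes="**/Test*.java"/>
  </batchtest>
</junit>
```

With `includeantruntime="no"`, the build is repeatable because the test classpath contains only jars checked into the tree, which is exactly why stray log4j jars stop leaking in.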
[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983266#action_12983266 ] Mark Miller commented on SOLR-2303: --- Hey Erick, If I remember right, log4j-over-slf4j is in there for proper zookeeper logging (hoping they switch to slf4j). Rather than dropping it, we should likely try and figure out how to keep it and fix the issue - as suggested by Robert.
RE: Let's drop Maven Artifacts !
On 1/18/2011 at 11:34 AM, Robert Muir wrote: On Tue, Jan 18, 2011 at 11:27 AM, Mark Miller markrmil...@gmail.com wrote: I'll open my arms to first class maven the first time it sees the light of consensus ;) thats the main thing missing from releasing maven artifacts... looking at previous threads I don't really see consensus that we need to do this. I think there is consensus that the RM does not have to release Maven artifacts. There clearly is no consensus for removing Maven support from Lucene. Unfortunately there is a very loud minority that care about maven I would wager that there is a sizable silent *majority* of users who literally depend on Lucene's Maven artifacts. Steve
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 12:03 PM, Steven A Rowe sar...@syr.edu wrote: There clearly is no consensus for removing Maven support from Lucene. and see there is my problem, there was no consensus to begin with, now suddenly its de-facto required. Maven is quite an insidious computer virus. Unfortunately there is a very loud minority that care about maven I would wager that there is a sizable silent *majority* of users who literally depend on Lucene's Maven artifacts. I can't help but remind myself, this is the same argument Oracle offered up for the whole hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus.
Re: Let's drop Maven Artifacts !
On 1/18/11 9:13 AM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. Well everyone using ant+ivy or maven as their build system likely consumes artifacts from maven repos. I'm surprised you're so much against keeping to publish. I too really really want to keep ant as Lucene's build tool. Maven has made me suicidal in the past. But I don't want to stop publishing artifacts to commonly used repos. I guess we could try to figure out how many people download the artifacts from m2 repos. Maybe they have download statistics? But then what? What number would justify stopping to publish? Michael
RE: Let's drop Maven Artifacts !
On 1/18/2011 at 12:14 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 12:03 PM, Steven A Rowe sar...@syr.edu wrote: There clearly is no consensus for removing Maven support from Lucene. and see there is my problem, there was no consensus to begin with, now suddenly its de-facto required. Maven is quite an insidious computer virus. So you think you personally have the power to remove functionality from Lucene that has the support of multiple committers? Unfortunately there is a very loud minority that care about maven I would wager that there is a sizable silent *majority* of users who literally depend on Lucene's Maven artifacts. I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. In summary: you claim a silent majority (of devs) in favor of your position, and I claim a silent majority (of users) in favor of mine. Your move: my majority, of which I have no proof, has no standing. Sweet. I dunno - why are we at war? Why is it so damn important that you *remove* functionality that devs care about and will support? Steve
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983316#action_12983316 ] Michael McCandless commented on LUCENE-2324: The branch is looking very nice!! Very clean :) Random comments: Why does DW.anyDeletions need to be sync'd? Missing headers on at least DocumentsWriterPerThreadPool, ThreadAffinityDWTP. IWC.setIndexerThreadPool's javadoc is stale. On ThreadAffinityDWTP... it may be better if we had a single queue, where threads wait in line, if no DWPT is available? And when a DWPT finishes it then notifies any waiting threads? (Ie, instead of queue-per-DWPT). I see the fieldInfos.update(dwpt.getFieldInfos()) (in DW.updateDocument) -- is there a risk that two threads bring a new field into existence at the same time, but w/ different config? Eg one doc omitsTFAP and the other doesn't? Or, on flush, does each DWPT use its private FieldInfos to correctly flush the segment? (Hmm: do we seed each DWPT w/ the original FieldInfos created by IW on init?). How are we handling the case of open IW, do delete-by-term but no added docs? Does DW.pushDeletes really need to sync on IW? BufferedDeletes is sync'd already. DW.substractFlushedDocs is mis-spelled (not sure it's used though). In DW.deleteTerms... shouldn't we skip a DWPT if it has no buffered docs? Per thread DocumentsWriters that write their own private segments - Key: LUCENE-2324 URL: https://issues.apache.org/jira/browse/LUCENE-2324 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324.patch, LUCENE-2324.patch, LUCENE-2324.patch, lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out, test.out, test.out See LUCENE-2293 for motivation and more details.
I'm copying here Mike's summary he posted on 2293: Change the approach for how we buffer in RAM to a more isolated approach, whereby IW has N fully independent RAM segments in-process and when a doc needs to be indexed it's added to one of them. Each segment would also write its own doc stores and normal segment merging (not the inefficient merge we now do on flush) would merge them. This should be a good simplification in the chain (eg maybe we can remove the *PerThread classes). The segments can flush independently, letting us make much better concurrent use of IO & CPU.
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 12:30 PM, Michael Busch wrote: I guess we could try to figure out how many people download the artifacts from m2 repos. Maybe they have download statistics? But then what? What number would justify stopping to publish? Michael Realistically, I would expect that Maven artifacts would still be published, even if we kick them out of the Lucene project to Apache extras. If some of the people care as much as they say they do, they will figure out how to make poms and whatever downstream, and a Committer into Maven will put them on the official Apache repo. It will just more truly not be a concern to the rest of us. - Mark
[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983317#action_12983317 ] Erick Erickson commented on SOLR-2303: -- Ah, I think the light finally dawns. And helps explain why I'm getting different results on different machines/environments. There's a reason they don't often let me near build systems. Ok, splendid. I suggested removing things to see if it was a bad idea. It is. Almost. So does it still make sense to remove the log4j jars in contrib in the 3_x branch? Robert: I did as you suggested, and of course started getting ClassNotFound errors for JUnitTestRunner and so on. So I included these lines in Solr's build.xml:
<pathelement path="${common-solr.dir}/../lucene/lib/ant-junit-1.7.1.jar" />
<pathelement path="${common-solr.dir}/../lucene/lib/ant-1.7.1.jar" />
<pathelement path="${common-solr.dir}/../lucene/lib/junit-4.7.jar" />
in place of java.class.path, and all is well. Is this the path you'd go down? I'm not very comfortable having Solr reach over into Lucene, but what do I know? It should be fairly obvious by now that I'm not very ant-sophisticated; is there a preferred way of doing this? Because if this is OK, it seems we should also remove junit-4.7.jar from ../solr/lib and point anything that needs it to ../lucene/lib as well. I'm currently testing similar changes on the 3_x build with log4j files removed. But that worked before as well. Let me know
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 12:28 PM, Steven A Rowe wrote: On 1/18/2011 at 12:14 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 12:03 PM, Steven A Rowe sar...@syr.edu wrote: There clearly is no consensus for removing Maven support from Lucene. and see there is my problem, there was no consensus to begin with, now suddenly its de-facto required. Maven is quite an insidious computer virus. So you think you personally have the power to remove functionality from Lucene that has the support of multiple committers? If he thought that, he would have removed maven from svn by now! From my point of view, but perhaps I misremember: At some point, Grant or someone put in some Maven poms. I don't think anyone else really paid attention. Later, as we did releases, and saw and dealt with these poms, most of us commented against Maven support. It just feels to me like it slipped in - and really its the type of thing that should have been more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene IMO. To my knowledge, the majority of core developers do not want maven in the build and/or frown on dealing with Maven. We could always have a little vote to gauge numbers - I just have not wanted to rush to another vote thread myself ;) Users are important too - but they don't get official votes - it's up to each of us to consider the User feelings/vote in our opinions/votes as we see fit IMO. - Mark
[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983346#action_12983346 ] Michael Busch commented on LUCENE-2324: --- bq. Why does DW.anyDeletions need to be sync'd? Hmm good point. Actually only the call to DW.pendingDeletes.any() needs to be synced, but not the loop that calls the DWPTs. {quote} In ThreadAffinityDWTP... it may be better if we had a single queue, where threads wait in line, if no DWPT is available? And when a DWPT finishes it then notifies any waiting threads? (Ie, instead of queue-per-DWPT). {quote} Whole foods instead of safeway? :) Yeah that would be fairer. A large doc (= a full cart) wouldn't block unlucky other docs. I'll make that change, good idea! {quote} I see the fieldInfos.update(dwpt.getFieldInfos()) (in DW.updateDocument) - is there a risk that two threads bring a new field into existence at the same time, but w/ different config? Eg one doc omitsTFAP and the other doesn't? Or, on flush, does each DWPT use its private FieldInfos to correctly flush the segment? (Hmm: do we seed each DWPT w/ the original FieldInfos created by IW on init?). {quote} Every DWPT has its own private FieldInfos. When a segment is flushed the DWPT uses its private FI and then it updates the original DW.fieldInfos (from IW), which is a synchronized call. The only consumer of DW.getFieldInfos() is SegmentMerger in IW. Hmm, given that IW.flush() isn't synchronized anymore I assume this can lead into a problem? E.g. the SegmentMerger gets a FieldInfos that's newer than the list of segments it's trying to flush? bq. How are we handling the case of open IW, do delete-by-term but no added docs? DW has a SegmentDeletes (pendingDeletes) which gets pushed to the last segment. We only add delTerms to DW.pendingDeletes if we couldn't push it to any DWPT. Btw. I think the whole pushDeletes business isn't working correctly yet, I'm looking into it.
I need to understand the code that coalesces the deletes better. bq. In DW.deleteTerms... shouldn't we skip a DWPT if it has no buffered docs? Yeah, I did that already, but not committed yet.
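Mike's single-queue suggestion above could be sketched like this. This is a hypothetical illustration with made-up names, not the branch's actual classes: idle writers sit in one shared queue, and indexing threads block until any writer becomes free, instead of waiting on a per-DWPT queue.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch (not Lucene code) of a single-queue writer pool.
class WriterPool<W> {
    private final BlockingQueue<W> idle = new LinkedBlockingQueue<>();

    WriterPool(List<W> writers) {
        idle.addAll(writers);
    }

    // Blocks until some writer is checked back in; the FIFO hand-off
    // means one thread stuck on a huge document ("a full cart") cannot
    // starve threads that only need a writer briefly.
    W checkOut() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    // Returning the writer wakes exactly one waiting thread, if any.
    void checkIn(W writer) {
        idle.add(writer);
    }
}
```

The trade-off versus the queue-per-DWPT ThreadAffinityDWTP design is that thread-to-writer affinity is lost, so any locality benefit of a thread repeatedly hitting the same DWPT goes away in exchange for fairness.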
[jira] Resolved: (LUCENE-2472) The terms index divisor in IW should be set via IWC not via getReader
[ https://issues.apache.org/jira/browse/LUCENE-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-2472. Resolution: Fixed Fix Version/s: (was: 4.0) You're right Mike. I committed the deprecation note in revision 1060545. The terms index divisor in IW should be set via IWC not via getReader - Key: LUCENE-2472 URL: https://issues.apache.org/jira/browse/LUCENE-2472 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.1 The getReader call gives a false sense of security... since if deletions have already been applied (and IW is pooling) the readers have already been loaded with a divisor of 1. Better to set the divisor up front in IWC.
Re: Let's drop Maven Artifacts !
On 1/18/11 10:44 AM, Mark Miller wrote: From my point of view, but perhaps I misremember: At some point, Grant or someone put in some Maven poms. I did. :) It was a ton of work and especially getting the maven-ant-tasks to work was a nightmare! I don't think anyone else really paid attention. All those patches were attached to a jira issue, and the issue was open for a while, with people asking for published maven artifacts. Later, as we did releases, and saw and dealt with these poms, most of us commented against Maven support. So can you explain what the problem with the maven support is? Isn't it enough to just call the ant target and copying the generated files somewhere? When I did releases I never thought it made the release any harder. Just two additional easy steps. It just feels to me like it slipped in - and really its the type of thing that should have been more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene IMO. To my knowledge, the majority of core developers do not want maven in the build and/or frown on dealing with Maven. We could always have a little vote to gauge numbers - I just have not wanted to rush to another vote thread myself ;) Users are important too - but they don't get official votes - it's up to each of us to consider the User feelings/vote in our opinions/votes as we see fit IMO. - Mark
RE: Let's drop Maven Artifacts !
On 1/18/2011 at 1:45 PM, Mark Miller wrote: At some point, Grant or someone put in some Maven poms. I don't think anyone else really paid attention. Later, as we did releases, and saw and dealt with these poms, most of us commented against Maven support. It just feels to me like it slipped in - and really its the type of thing that should have been more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene IMO. Lucene's policy is commit-then-review, and lazy consensus is the rule, right?
[jira] Created: (LUCENE-2873) TestIndexWriterReader fails: too many open files
TestIndexWriterReader fails: too many open files Key: LUCENE-2873 URL: https://issues.apache.org/jira/browse/LUCENE-2873 Project: Lucene - Java Issue Type: Bug Affects Versions: 3.1 Environment: java version "1.6.0" Java(TM) SE Runtime Environment (build pxi3260sr9-20101125_01(SR9)) IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux x86-32 jvmxi3260sr9-20101124_69295 (JIT enabled, AOT enabled) J9VM - 20101124_069295 JIT - r9_20101028_17488ifx2 GC - 20101027_AA) JCL - 20101119_01 Reporter: Robert Muir {noformat} [junit] Testsuite: org.apache.lucene.index.TestIndexWriterReader [junit] Testcase: testAddIndexesAndDoDeletesThreads(org.apache.lucene.index.TestIndexWriterReader): Caused an ERROR [junit] /home/cron/branch_3x/lucene/build/test/6/test7430286492423218781tmp/_90.prx (Too many open files) [junit] java.io.FileNotFoundException: /home/cron/branch_3x/lucene/build/test/6/test7430286492423218781tmp/_90.prx (Too many open files) [junit] at java.io.RandomAccessFile.open(Native Method) [junit] at java.io.RandomAccessFile.<init>(RandomAccessFile.java:229) [junit] at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:69) [junit] at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:90) [junit] at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:91) [junit] at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:78) [junit] at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:353) [junit] at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:358) [junit] at org.apache.lucene.store.Directory.openInput(Directory.java:139) [junit] at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:135) [junit] at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:583) [junit] at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:561) [junit] at
org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:101) [junit] at org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:27) [junit] at org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:78) [junit] at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:697) [junit] at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:72) [junit] at org.apache.lucene.index.IndexReader.open(IndexReader.java:344) [junit] at org.apache.lucene.index.IndexReader.open(IndexReader.java:230) [junit] at org.apache.lucene.index.TestIndexWriterReader.testAddIndexesAndDoDeletesThreads(TestIndexWriterReader.java:381) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1007) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:939) [junit] [junit] [junit] Tests run: 18, Failures: 0, Errors: 1, Time elapsed: 13.56 sec [junit] [junit] - Standard Error - [junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterReader -Dtestmethod=testAddIndexesAndDoDeletesThreads -Dtests.seed=-7781539944268912038:-6865031686554264582 [junit] NOTE: test params are: locale=ar_SD, timezone=Asia/Almaty [junit] NOTE: all tests run in this JVM: [junit] [TestMergeSchedulerExternal, TestCharFilter, TestISOLatin1AccentFilter, TestCharTermAttributeImpl, TestDoc, TestFieldsReader, TestFilterIndexReader, TestIndexWriterReader] [junit] NOTE: Linux 2.6.32-24-generic x86/IBM Corporation 1.6.0 (32-bit)/cpus=1,threads=3,free=6495816,total=11920384 [junit] - --- [junit] TEST org.apache.lucene.index.TestIndexWriterReader FAILED {noformat}
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 2:37 PM, Steven A Rowe wrote: On 1/18/2011 at 1:45 PM, Mark Miller wrote: At some point, Grant or someone put in some Maven poms. I don't think anyone else really paid attention. Later, as we did releases, and saw and dealt with these poms, most of us commented against Maven support. It just feels to me like it slipped in - and really it's the type of thing that should have been more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene IMO. Lucene's policy is commit-then-review, and lazy consensus is the rule, right? Right - clearly this is not some sneaky or underhanded thing that happened. Certainly this is how a lot of legit things happen. The only reason I feel it was more of a Maven sneaking in thing is that in IRC I have learned how many active core devs really didn't want Maven in the build at a later time. I think we just didn't really know what was happening / paid attention. I don't mean to characterize incorrectly. If you asked me back then, I probably would not have understood the consequences whatsoever and said, please go ahead! Patches welcome. People's opinions have shifted though - we have more committers now - perhaps the pro-Maven side is larger than the against side now. Just stating things as I roughly knew them - happy to see things cleared up, fine-tuned. - Mark
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 2:41 PM, Michael Busch wrote: So can you explain what the problem with the maven support is? Isn't it enough to just call the ant target and copy the generated files somewhere? When I did releases I never thought it made the release any harder. Just two additional easy steps. Robert and I have gone over this a fair amount in previous exchanges I think, if you really want to know the particulars. Suffice it to say: the problems so far have not been large, but it feels like the likelihood of future larger problems is growing; if you ask people that seem to like/care about Maven support, the problems are probably not really a problem or are easily addressable; if you ask people that dislike/don't want Maven, the problems are probably just not worth ever having to run into when we are still convinced this could be handled downstream. If I remember right, a large reason Robert is against is that he doesn't want to sign/support/endorse something he doesn't understand or care about as a Release Manager? But that's probably a major simplification of his previous arguments. And the pro-Maven team has offered their counters to that. - Mark
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 2:57 PM, Mark Miller markrmil...@gmail.com wrote: If I remember right, a large reason Robert is against is that he doesn't want to sign/support/endorse something he doesn't understand or care about as a Release Manager? But that's probably a major simplification of his previous arguments. And the pro Maven team has offered their counters to that. Well, I definitely don't want to produce a jacked-up release. And in the last 99-email maven thread I listed a reference to how many of the previous releases have had various bugs/problems with maven. The problem is, as it is in our code now, there is no way to verify these magical files will actually work. And yet we all just ignore the fact we are probably shipping broken artifacts and go with the release anyway? (separately, for reference I know that Uwe has the releasing down to an art and is probably the sole person here that could actually do a release without having maven jacked up, so he isn't included) But for the rest of us, we don't understand maven. Why can't it be handled downstream? And it sets a tone for future things, for instance *the most popular issue* in lucene: it's not flexible indexing, it's not realtime search, it's not column stride fields, it's... make Lucene an OSGI bundle? https://issues.apache.org/jira/browse/LUCENE?report=com.atlassian.jira.plugin.system.project:popularissues-panel Anyway I think we are making a search engine library, and if someone else can deal with these hassles, they should. We should focus on search engine stuff and getting out solid releases.
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 3:11 PM, Michael Busch busch...@gmail.com wrote: I'm not sure what's so complicated or mysterious about maven artifacts. A maven artifact consists of normal jar file(s) plus a POM file containing some metadata, like the artifact name and group. It's the POM files that cause problems and reported bugs. I don't think they are simple at all; in fact I think they are more complicated than ant build.xml files!
[jira] Commented: (LUCENE-2755) Some improvements to CMS
[ https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983359#action_12983359 ] Robert Muir commented on LUCENE-2755: - Mike, fyi it looks like we are hung again in hudson: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3866/ Not sure if it's the same deadlock you found. Some improvements to CMS Key: LUCENE-2755 URL: https://issues.apache.org/jira/browse/LUCENE-2755 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 3.1, 4.0 Attachments: LUCENE-2755.patch While running optimize on a large index, I've noticed several things that got me to read CMS code more carefully, and find these issues: * CMS may hold onto a merge if maxMergeCount is hit. That results in the MergeThreads taking merges from the IndexWriter until they are exhausted, and only then that blocked merge will run. I think it's unnecessary that that merge will be blocked. * CMS sorts merges by segments size, doc-based and not bytes-based. Since the default MP is LogByteSizeMP, and I hardly believe people care about doc-based size segments anymore, I think we should switch the default impl. There are two ways to make it extensible, if we want: ** Have an overridable member/method in CMS that you can extend and override - easy. ** Have OneMerge be comparable and let the MP determine the order (e.g. by bytes, docs, calibrate deletes etc.). Better, but will need to tap into several places in the code, so more risky and complicated. On the go, I'd like to add some documentation to CMS - it's not very easy to read and follow. I'll work on a patch.
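The second bullet above (byte-based vs. doc-based merge ordering) can be illustrated with a stdlib-only sketch; the Merge class below is a hypothetical stand-in for MergePolicy.OneMerge, not the real Lucene API. The two orderings disagree whenever segments mix many small documents with few large ones:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a pending merge: it carries both a document
// count and an on-disk size in bytes, so we can compare the two orderings.
class Merge {
    final String name;
    final int docCount;
    final long sizeInBytes;

    Merge(String name, int docCount, long sizeInBytes) {
        this.name = name;
        this.docCount = docCount;
        this.sizeInBytes = sizeInBytes;
    }
}

public class MergeOrderDemo {
    public static void main(String[] args) {
        // A million tiny docs occupying 50 MB vs. ten thousand big docs
        // occupying 500 MB: doc-based and byte-based orders flip.
        List<Merge> merges = Arrays.asList(
            new Merge("tiny-docs", 1_000_000, 50L * 1024 * 1024),
            new Merge("big-docs", 10_000, 500L * 1024 * 1024));

        merges.sort(Comparator.comparingInt((Merge m) -> m.docCount));
        System.out.println("by docs: " + merges.get(0).name);

        merges.sort(Comparator.comparingLong((Merge m) -> m.sizeInBytes));
        System.out.println("by bytes: " + merges.get(0).name);
    }
}
```

Since the default merge policy (LogByteSizeMergePolicy) already reasons in bytes, sorting pending merges by bytes keeps the scheduler's ordering consistent with the policy's notion of segment size.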
[jira] Reopened: (LUCENE-2755) Some improvements to CMS
[ https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reopened LUCENE-2755: - I am reopening just so we don't miss fixing the deadlock... it's hung in the same exact part of the tests as earlier today so I think it's somehow related...
Re: Let's drop Maven Artifacts !
To follow up Steven: Yes - Maven is part of Lucene now - it got in with lazy consensus or whatever method - and now it's basically a first class citizen. I would have to get consensus to drop it much more than you would have to get consensus to keep it. This is exactly why I don't want it to stick around or grow when it could be a downstream project. All of this continued Maven work just looks like more stuff we will have to maintain/support in the future, it seems to me. Honestly though - if it looks like the majority are for Maven - I drop my objection. - Mark On Jan 18, 2011, at 2:45 PM, Mark Miller wrote: On Jan 18, 2011, at 2:37 PM, Steven A Rowe wrote: On 1/18/2011 at 1:45 PM, Mark Miller wrote: At some point, Grant or someone put in some Maven poms. I don't think anyone else really paid attention. Later, as we did releases, and saw and dealt with these poms, most of us commented against Maven support. It just feels to me like it slipped in - and really it's the type of thing that should have been more discussed and thought out, and perhaps voted upon. Maven snuck into Lucene IMO. Lucene's policy is commit-then-review, and lazy consensus is the rule, right? Right - clearly this is not some sneaky or underhanded thing that happened. Certainly this is how a lot of legit things happen. The only reason I feel it was more of a Maven sneaking in thing is that in IRC I have learned how many active core devs really didn't want Maven in the build at a later time. I think we just didn't really know what was happening / paid attention. I don't mean to characterize incorrectly. If you asked me back then, I probably would not have understood the consequences whatsoever and said, please go ahead! Patches welcome. People's opinions have shifted though - we have more committers now - perhaps the pro-Maven side is larger than the against side now. Just stating things as I roughly knew them - happy to see things cleared up, fine-tuned.
- Mark
[jira] Commented: (SOLR-2303) remove unnecessary (and problematic) log4j jars in contribs
[ https://issues.apache.org/jira/browse/SOLR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983367#action_12983367 ] Robert Muir commented on SOLR-2303: --- bq. OK, scratch the notion of removing the junit-4.7.jar file from Solr, the test cases...er...stop compiling. But the rest still stands. {quote} <pathelement path="${common-solr.dir}/../lucene/lib/ant-junit-1.7.1.jar"/> <pathelement path="${common-solr.dir}/../lucene/lib/ant-1.7.1.jar"/> <pathelement path="${common-solr.dir}/../lucene/lib/junit-4.7.jar"/> in place of java.class.path and all is well. Is this the path you'd go down? I'm not very comfortable having Solr reach over into Lucene, but what do I know? {quote} Yeah, in general it would be good to explicitly include ant, ant-junit, and junit in our classpath for tests. I know I fooled with trying to do this across all of lucene and solr; there are some twists: * when the clover build is enabled, we have to actually use the ant runtime/java.class.path, because clover injects itself via ant's classpath via -lib. There might be a better way to configure clover to avoid this, but failing that we have to sometimes support throwing ant's classpath into the classpath like we do now. * the contrib/ant gets tricky (I don't remember why) especially with clover enabled :) * finally, ant 1.8 support might break, since we specifically include ant 1.7 stuff in our lib. But it's generally what we want: better to have a reliable classpath in our build/tests than to compile/test with whatever version of ant the person happens to be using. Ant gets angry if you try to put the ant 1.7 jar into an ant 1.8 runtime... the same situation exists for compilation actually, but I *think* I fixed that one...
you would have to re-check :) remove unnecessary (and problematic) log4j jars in contribs --- Key: SOLR-2303 URL: https://issues.apache.org/jira/browse/SOLR-2303 Project: Solr Issue Type: Improvement Components: Build Reporter: Robert Muir Assignee: Erick Erickson Fix For: 4.0 Attachments: SOLR-2303.patch In solr 4.0 there is log4j-over-slf4j. But if you have log4j jars also in the classpath (e.g. contrib/extraction, contrib/clustering) you can get strange errors such as: java.lang.NoSuchMethodError: org.apache.log4j.Logger.setAdditivity(Z)V So I think we should remove the log4j jars in these contribs, all tests pass with them removed.
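A build.xml fragment along the lines Robert describes might look like this. This is a sketch only: the path id is an invented name, the jar locations are taken from the pathelements quoted above, and the clover/ant-1.8 caveats he raises would still apply:

```xml
<!-- Hypothetical sketch: pin the test classpath to the jars shipped in
     lucene/lib instead of inheriting java.class.path from the running ant.
     The id "test.classpath" is an assumption, not the project's real name. -->
<path id="test.classpath">
  <pathelement path="${common-solr.dir}/../lucene/lib/ant-1.7.1.jar"/>
  <pathelement path="${common-solr.dir}/../lucene/lib/ant-junit-1.7.1.jar"/>
  <pathelement path="${common-solr.dir}/../lucene/lib/junit-4.7.jar"/>
</path>
```

The trade-off is the one named in the comment: a pinned classpath is reproducible across developer machines, but clover-enabled builds still need ant's own runtime classpath injected.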
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 12:13 PM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal?
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 18, 2011, at 12:13 PM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal? I don't think they are that secret; you can look at the last maven discussion and see several other committers who spoke up against it. They are just sick of the discussion, I gather, and have given up fighting it. The problem, again, is the magical special artifacts. I don't see consensus here for maven... when you have it, get back to me.
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 18, 2011, at 12:13 PM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal? http://www.lucidimagination.com/search/document/474564645f673fbb/discussion_about_release_frequency You can look there, and see the responses of several other committers about maven. I think I like Yonik's comment best: Maven is not a part of the release process, if you think it should be, maybe you should call a vote?
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 3:55 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 18, 2011, at 12:13 PM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal? I don't think they are that secret, you can look at the last maven discussion and see several other committers who spoke up against it. they are just sick of the discussion i gather and have given up fighting it. Wow, so who is the vocal minority now? The problem again, is the magical special artifacts. I dont see consensus here for maven... when you have it, get back to me. As I see it, you have you, Shai and Miller (and Yonik, likely from the last go around). On the Maven side, you have me, Steve, McKinley and Busch, plus some users/contributors. In other words, I don't see consensus for dropping it. When you have it, get back to me.
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 4:06 PM, Grant Ingersoll gsing...@apache.org wrote: In other words, I don't see consensus for dropping it. When you have it, get back to me. That's not how things are added to the release process. So currently, maven is not included in the release process. I don't care if your poll on the users list has 100% of users checking maven; you biased your poll already by mentioning that it's because we are considering dropping maven support at the start of the email, so it's total garbage. There's a lot of totally insane things I could poll the user list and get lots of responses for, that I think the devs would disagree with.
Re: Let's drop Maven Artifacts !
It's sad how aggressive these discussions get. There's really no reason. On 1/18/11 1:10 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 4:06 PM, Grant Ingersoll gsing...@apache.org wrote: In other words, I don't see consensus for dropping it. When you have it, get back to me. That's not how things are added to the release process. So currently, maven is not included in the release process. I don't care if your poll on the users list has 100% of users checking maven, you biased your poll already by mentioning that it's because we are considering dropping maven support at the start of the email, so it's total garbage. There's a lot of totally insane things I could poll the user list and get lots of responses for, that I think the devs would disagree with.
[jira] Assigned: (SOLR-2307) PHPSerialized fails with sharded queries
[ https://issues.apache.org/jira/browse/SOLR-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man reassigned SOLR-2307: -- Assignee: Hoss Man PHPSerialized fails with sharded queries Key: SOLR-2307 URL: https://issues.apache.org/jira/browse/SOLR-2307 Project: Solr Issue Type: Bug Components: Response Writers Affects Versions: 1.3, 1.4.1 Reporter: Antonio Verni Assignee: Hoss Man Priority: Minor Attachments: PHPSerializedResponseWriter.java.patch, PHPSerializedResponseWriter.java.patch, PHPSerializedResponseWriter.java.patch, TestPHPSerializedResponseWriter.java, TestPHPSerializedResponseWriter.java Solr throws a java.lang.IllegalArgumentException: Map size must not be negative exception when using the PHP Serialized response writer with sharded queries. To reproduce the issue, start your preferred example and try the following query: http://localhost:8983/solr/select/?q=*:*&wt=phps&shards=localhost:8983/solr,localhost:8983/solr It is caused by the JSONWriter implementation of writeSolrDocumentList and writeSolrDocument. Overriding these two methods in the PHPSerializedResponseWriter to handle the SolrDocument size seems to solve the issue. Attached my patch made against trunk rev 1055588. cheers, Antonio
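The underlying constraint is that PHP's serialization format writes the entry count up front (a:N:{...}), so a writer cannot emit a map whose size is unknown or reported as negative. Below is a minimal stdlib-only sketch, not the actual Solr patch; phpStr and phpMap are invented helpers that derive the count from the actual entries rather than trusting a size hint:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PhpMapDemo {
    // PHP-serialize a string: s:LEN:"str";
    static String phpStr(String s) {
        return "s:" + s.getBytes().length + ":\"" + s + "\";";
    }

    // PHP-serialize a string->string map: a:N:{key;value;...}.
    // N comes from map.size(), i.e. the real entry count, so it can never
    // be negative regardless of what any upstream size hint says.
    static String phpMap(Map<String, String> m) {
        StringBuilder sb = new StringBuilder("a:").append(m.size()).append(":{");
        for (Map.Entry<String, String> e : m.entrySet()) {
            sb.append(phpStr(e.getKey())).append(phpStr(e.getValue()));
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("name", "foo");
        System.out.println(phpMap(doc));
    }
}
```

This is why a writer for this format has to buffer or count the entries before emitting the header, unlike JSON where entries can be streamed without a length prefix.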
Re: Let's drop Maven Artifacts !
Why not vote for or against 'maven artifacts'? http://www.doodle.com/2qp35b42vstivhvx I'm using lucene+solr a lot of times via maven. Elasticsearch uses lucene via gradle. Solandra uses lucene via ivy and so on ;) So maven artifacts are not only very handy for maven folks. But I think no artifacts would be better than broken ones. Why not try to 'switch' to the ivy build system? It's ant but handles dependencies better IMO. Regards, Peter. On Tue, Jan 18, 2011 at 3:49 PM, Grant Ingersoll gsing...@apache.org wrote: On Jan 18, 2011, at 12:13 PM, Robert Muir wrote: I can't help but remind myself, this is the same argument Oracle offered up for the whole reason hudson debacle (http://hudson-labs.org/content/whos-driving-thing) Declaring that I have a secret pocket of users that want XYZ isn't open source consensus. You were very quick to cite your own secret pocket of users when you called those who support it the vocal minority. So, if you want to continue baiting the discussion we can, but as I see it, we have committers willing to support it, so what's the big deal? I don't think they are that secret, you can look at the last maven discussion and see several other committers who spoke up against it. they are just sick of the discussion i gather and have given up fighting it. The problem again, is the magical special artifacts. I dont see consensus here for maven... when you have it, get back to me.
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 4:10 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 4:06 PM, Grant Ingersoll gsing...@apache.org wrote: In other words, I don't see consensus for dropping it. When you have it, get back to me. Thats not how things are added to the release process. So currently, maven is not included in the release process. I don't care if your poll on the users list has 100% of users checking maven, you biased your poll already by mentioning that its because we are considering dropping maven support at the start of the email, so its total garbage. Sorry, I'm not a professional poll writer. Even if I didn't include it, it would take all of a half of a second for someone to figure it out. As you can see by the responses, though, I think people are simply answering it. It's just software and we have people willing to maintain the Maven stuff. I simply don't get what the big deal is in keeping something that people find useful and has (enough) committer support.
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 4:38 PM, Grant Ingersoll gsing...@apache.org wrote: It's just software and we have people willing to maintain the Maven stuff. I simply don't get what the big deal is in keeping something that people find useful and has (enough) committer support. Why not call a committer vote then? [] -- maintain maven ourselves instead of working on search features, and slower releases. [] -- let others maintain maven downstream, instead we work on search features, and faster releases.
[jira] Resolved: (SOLR-2307) PHPSerialized fails with sharded queries
[ https://issues.apache.org/jira/browse/SOLR-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-2307. Resolution: Fixed Fix Version/s: 4.0 3.1 Committed revision 1060585. -- trunk Committed revision 1060589. - 3x. thanks again for the great patch Antonio
Re: Let's drop Maven Artifacts !
On Jan 18, 2011, at 4:41 PM, Robert Muir wrote: On Tue, Jan 18, 2011 at 4:38 PM, Grant Ingersoll gsing...@apache.org wrote: It's just software and we have people willing to maintain the Maven stuff. I simply don't get what the big deal is in keeping something that people find useful and has (enough) committer support. Why not call a committer vote then? [] -- maintain maven ourselves instead of working on search features, and slower releases. Wow, so having Maven releases is why we take 6-10 months to release? Give me a break. The only thing that is slower (arguably) is the building of the release itself. We have had Maven support for a long time and it has never been brought up until you did that it was the cause. The cause is, was and always will be that we innovate at a pretty rapid pace and always have the mindset to get just one more set of features/fixes into the next release. [] -- let others maintain maven downstream, instead we work on search features, and faster releases.
[jira] Updated: (LUCENE-2844) benchmark geospatial performance based on geonames.org
[ https://issues.apache.org/jira/browse/LUCENE-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated LUCENE-2844: - Attachment: benchmark-geo.patch This is an update to the patch which considers the move of the benchmark contrib to /modules/benchmark. It also includes GeoNamesSetSolrAnalyzerTask which will use Solr's field-specific analyzer. It's very much tied to these set of classes in the patch. There are ASF headers now too. benchmark geospatial performance based on geonames.org -- Key: LUCENE-2844 URL: https://issues.apache.org/jira/browse/LUCENE-2844 Project: Lucene - Java Issue Type: New Feature Components: contrib/benchmark Reporter: David Smiley Priority: Minor Fix For: 4.0 Attachments: benchmark-geo.patch, benchmark-geo.patch Until now (with this patch), the benchmark contrib module did not include a means to test geospatial data. This patch includes some new files and changes to existing ones. Here is a summary of what is being added in this patch per file (all files below are within the benchmark contrib module) along with my notes: Changes: * build.xml -- Add dependency on Lucene's spatial module and Solr. ** It was a real pain to figure out the convoluted ant build system to make this work, and I doubt I did it the proper way. ** Rob Muir thought it would be a good idea to make the benchmark contrib module be top level module (i.e. be alongside analysis) so that it can depend on everything. http://lucene.472066.n3.nabble.com/Re-Geospatial-search-in-Lucene-Solr-tp2157146p2157824.html I agree * ReadTask.java -- Added a search.useHitTotal boolean option that will use the total hits number for reporting purposes, instead of the existing behavior. ** The existing behavior (i.e. when search.useHitTotal=false) doesn't look very useful since the response integer is the sum of several things instead of just one thing. I don't see how anyone makes use of it. 
Note that on my local system, I also changed ReportTask RepSelectByPrefTask to not include the '-' every other line, and also changed Format.java to not use commas in the numbers. These changes are to make copy-pasting into excel more streamlined. New Files: * geoname-spatial.alg -- my algorithm file. ** Note the :0 trailing the Populate sequence. This is a trick I use to skip building the index, since it takes a while to build and I'm not interested in benchmarking index construction. You'll want to set this to :1 and then subsequently put it back for further runs as long as you keep the doc.geo.schemaField or any other configuration elements affecting index the same. ** In the patch, doc.geo.schemaField=geohash but unless you're tinkering with SOLR-2155, you'll probably want to set this to latlon * GeoNamesContentSource.java -- a ContentSource for a geonames.org data file (either a single country like US.txt or allCountries.txt). ** Uses a subclass of DocData to store all the fields. The existing DocData wasn't very applicable to data that is not composed of a title and body. ** Doesn't reuse the docdata parameter to getNextDocData(); a new one is created every time. ** Only supports content.source.forever=false * GeoNamesDocMaker.java -- a subclass of DocMaker that works very differently than the existing DocMaker. ** Instead of assuming that each line from geonames.org will correspond to one Lucene document, this implementation supports, via configuration, creating a variable number of documents, each with a variable number of points taken randomly from a GeoNamesContentSource. ** doc.geo.docsToGenerate: The number of documents to generate. If blank it defaults to the number of rows in GeoNamesContentSource. ** doc.geo.avgPlacesPerDoc: The average number of places to be added to a document. A random number between 0 and one less than twice this amount is chosen on a per document basis. If this is set to 1, then exactly one is always used. 
In order to support a value greater than 1, use the geohash field type and incorporate SOLR-2155 (the geohash prefix technique).
** doc.geo.oneDocPerPlace: Whether at most one document should use the same place. In other words: can more than one document have the same place? If so, set this to false.
** doc.geo.schemaField: References a field name in schema.xml. The field should implement SpatialQueryable.
* GeoPerfData.java -- a singleton storing data in memory that is shared by GeoNamesDocMaker.java and GeoQueryMaker.java.
** content.geo.zeroPopSubst: If a population <= 0 is encountered, use this population value instead. Default is 100.
** content.geo.maxPlaces: A limit on the number of rows read in.
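To make the knobs above concrete, a geoname-spatial.alg could wire them together roughly like this. This is a hedged sketch, not the file from the patch: the class package names, the property values, and the task sequence are illustrative assumptions; only the doc.geo.* and content.geo.* property names come from the description above.

{noformat}
# Content source: a geonames.org data file (settings are illustrative)
content.source=org.apache.lucene.benchmark.byTask.feeds.GeoNamesContentSource
content.source.forever=false
content.geo.zeroPopSubst=100
content.geo.maxPlaces=100000

# Document construction
doc.maker=org.apache.lucene.benchmark.byTask.feeds.GeoNamesDocMaker
doc.geo.docsToGenerate=10000
doc.geo.avgPlacesPerDoc=1
doc.geo.oneDocPerPlace=true
doc.geo.schemaField=latlon

# The trailing ":0" skips the slow index build on repeat runs;
# set it to ":1" whenever doc.geo.* (or anything else affecting the index) changes.
{ "Populate" CreateIndex { AddDoc } : * CloseIndex } : 0
{noformat}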
Re: Let's drop Maven Artifacts !
On Tue, Jan 18, 2011 at 4:50 PM, Grant Ingersoll gsing...@apache.org wrote:

> On Jan 18, 2011, at 4:41 PM, Robert Muir wrote:
>> On Tue, Jan 18, 2011 at 4:38 PM, Grant Ingersoll gsing...@apache.org wrote:
>>> It's just software, and we have people willing to maintain the Maven stuff. I simply don't get what the big deal is in keeping something that people find useful and has (enough) committer support.
>>
>> Why not call a committer vote then?
>> [ ] -- maintain maven ourselves instead of working on search features, and slower releases.
>
> Wow, so having Maven releases is why we take 6-10 months to release? Give me a break. The only thing that is slower (arguably) is the building of the release itself. We have had Maven support for a long time, and it was never brought up as the cause until you did so. The cause is, was, and always will be that we innovate at a pretty rapid pace and always have the mindset to get just one more set of features/fixes into the next release.

In my opinion it is just a part of it; I think I detailed this here:
http://www.lucidimagination.com/search/document/474564645f673fbb/discussion_about_release_frequency
(That discussion was subsequently sidetracked and completely dominated by Maven, so I gave up, until Shai recently brought up the idea of trying to do a release again.)

I think that the release process is too complicated, and doing things to simplify it, such as pushing Maven downstream, would help a lot. Furthermore, I had this to say about Maven once it completely took over the discussion: since I have been around, the Maven artifacts seem to have been wrong in nearly every release [1], including even bugfix releases. If I am going to be the one making artifacts, I want them to be right.
[1]:
Lucene/Solr 3.x, 4.0: SOLR-2041, SOLR-2055
Solr 1.4.1: SOLR-1977
Solr 1.4: SOLR-981
Lucene 2.9.1, 3.0: LUCENE-2107
Lucene 2.9.0: LUCENE-1927
Lucene 2.4: LUCENE-1525