[JENKINS] Lucene-Solr-trunk-Windows ([[ Exception while replacing ENV. Please report this as a bug. ]] {{ java.lang.NullPointerException }}) - Build # 2421 - Failure!

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/2421/
Java: [[ Exception while replacing ENV. Please report this as a bug. ]] {{ java.lang.NullPointerException }}

No tests ran.

Build Log:
[...truncated 10 lines...]
FATAL: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
	at hudson.remoting.Request.call(Request.java:174)
	at hudson.remoting.Channel.call(Channel.java:672)
	at hudson.FilePath.act(FilePath.java:841)
	at hudson.FilePath.act(FilePath.java:825)
	at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:771)
	at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:713)
	at hudson.model.AbstractProject.checkout(AbstractProject.java:1325)
	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:682)
	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:587)
	at hudson.model.Run.execute(Run.java:1543)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:236)
Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
	at hudson.remoting.Request.abort(Request.java:299)
	at hudson.remoting.Channel.terminate(Channel.java:732)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:168)
	at java.io.FilterInputStream.read(FilterInputStream.java:116)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
	at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2248)
	at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2541)
	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2551)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
	at hudson.remoting.Command.readFrom(Command.java:92)
	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [DISCUSS] Enable javadoc check on Solr too
...and surely, working in a branch as suggested by Alan would be a good idea :-) Thanks Alan,

Tommaso

2013/1/18 Tommaso Teofili tommaso.teof...@gmail.com

I see Yonik's and Jack's points, which look reasonable, but, at least in my experience, even though Solr is meant to be a server, developers (not necessarily plugin developers) often have to go deep into the code to understand how things actually work under the hood, fix bugs, etc., and I think proper javadocs would really help with that. It should also help our users feel more comfortable while browsing the Solr code, which I think is important. Wrapping up, I think introducing such a check couldn't hurt and would only improve the overall quality of the project, so it'd be worth the effort.

My 2 cents,
Tommaso

2013/1/18 Jack Krupansky j...@basetechnology.com

To the degree that people are using Solr merely as a server, that's fine. I think the main issue is the touch points of Solr that relate to user-developed plugins. The parts of Solr that invoke user plugins, and that user plugins invoke, should have Grade A Prime javadoc, if for no other reason than that Eclipse is a friendly environment for developing and testing plugins.

-- Jack Krupansky

-----Original Message-----
From: Yonik Seeley
Sent: Thursday, January 17, 2013 12:42 PM
To: dev@lucene.apache.org
Subject: Re: [DISCUSS] Enable javadoc check on Solr too

Solr is in a different scenario though - the primary use case is to run as a server. The majority of the Java code is implementation to support that. I personally don't refer to javadoc (by itself) during development, so normal comments work just as well. Documentation of methods should be on an as-needed basis, not mandated everywhere.

-Yonik
http://lucidworks.com

On Thu, Jan 17, 2013 at 11:44 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote:

Hi all,

What do you think about (re)enabling the javadoc check for the Solr build too? At the start it may be a little annoying (since a lot of Solr code is missing proper javadoc, so we may have lots of failing builds), but that should turn out to be a very useful thing for devs once that's fixed and we keep adding javadocs along with checked-in code. Basically this would just use the current Lucene task for checking javadoc and make the build fail if there's any missing javadoc. We could add that as soon as 4.1 is out.

What do you think?
Regards,
Tommaso
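[Editor's note] For concreteness, the kind of check discussed in this thread could be wired into an Ant build roughly as follows. This is an editorial sketch only: the target name, paths, and properties are assumptions, not the actual Lucene/Solr common-build.xml (which uses its own javadoc-lint machinery). The point is simply that javadoc problems fail the build instead of being ignored.

```xml
<!-- Editorial sketch, not the real Lucene/Solr build file: target name,
     destdir, and source paths are assumptions. -->
<target name="javadocs-lint" description="Fail the build on javadoc errors">
  <javadoc destdir="${build.dir}/docs"
           access="protected"
           failonerror="true">
    <fileset dir="solr/core/src/java" includes="**/*.java"/>
  </javadoc>
</target>
```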
[jira] [Created] (LUCENE-4696) Allow SpanNearQuery to take a BooleanQuery.
Michel Conrad created LUCENE-4696:
-------------------------------------

             Summary: Allow SpanNearQuery to take a BooleanQuery.
                 Key: LUCENE-4696
                 URL: https://issues.apache.org/jira/browse/LUCENE-4696
             Project: Lucene - Core
          Issue Type: New Feature
          Components: core/search
    Affects Versions: 4.0
            Reporter: Michel Conrad

Currently SpanNearQuery can only take other SpanQuery objects, which include span, span term, and span-wrapped multi-term queries, but not Boolean queries. By allowing a Boolean query to be added to a SpanNearQuery, we could add, for instance, queries that come from a QueryParser and which cannot easily be transformed into the corresponding span objects. The main use case here is to find the intersection between two sets of results with the additional restriction that the matched terms from the different queries should be near one another.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
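[Editor's note] The use case described above - intersecting two queries' results under a proximity constraint - can be illustrated with a small, self-contained sketch. This is plain Java, not the Lucene span API; the class and method names are invented for illustration. Given the token positions at which each sub-query matched within one document, the check is whether some pair of matches lies within a slop window.

```java
import java.util.Arrays;
import java.util.List;

// Conceptual sketch only (NOT the Lucene span API; names are invented):
// given the token positions at which two sub-queries matched within one
// document, test whether some pair of matches lies within a slop window.
public class NearIntersection {

    static boolean matchesWithin(List<Integer> aPositions,
                                 List<Integer> bPositions,
                                 int slop) {
        for (int a : aPositions) {
            for (int b : bPositions) {
                if (Math.abs(a - b) <= slop) {
                    return true; // one pair of matches is close enough
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Positions of "red" and "hood" in a hypothetical document.
        List<Integer> red = Arrays.asList(3, 17);
        List<Integer> hood = Arrays.asList(5, 40);
        System.out.println(matchesWithin(red, hood, 3)); // true: |3 - 5| <= 3
        System.out.println(matchesWithin(red, hood, 1)); // false: no pair within 1
    }
}
```

A SpanNearQuery does this position intersection natively for SpanQuery clauses; the feature request is to allow the positions on one side to come from an arbitrary BooleanQuery.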
[jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557053#comment-13557053 ]

Dmitry Kan commented on SOLR-1604:
----------------------------------

Hello! Great work! I have two questions:

1) What would it take to incorporate phrase searches into this extended query parser? "a b" c~100 - that is, "a b" (a phrase search: found in that order and exactly side by side) no more than 100 tokens away from c.

2) Does this implementation support Boolean operators like AND, OR, NOT (at least OR and NOT are supported as far as I can see)? Can they be nested?

Wildcards, ORs etc inside Phrase Queries
----------------------------------------

                Key: SOLR-1604
                URL: https://issues.apache.org/jira/browse/SOLR-1604
            Project: Solr
         Issue Type: Improvement
         Components: query parsers, search
   Affects Versions: 1.4
           Reporter: Ahmet Arslan
           Priority: Minor
        Attachments: ASF.LICENSE.NOT.GRANTED--ComplexPhrase.zip, ComplexPhraseQueryParser.java, ComplexPhrase_solr_3.4.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch

Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports wildcards, ORs, ranges, fuzzies inside phrase queries.
[jira] [Comment Edited] (LUCENE-4043) Add scoring support for query time join
[ https://issues.apache.org/jira/browse/LUCENE-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556033#comment-13556033 ]

David vandendriessche edited comment on LUCENE-4043 at 1/18/13 9:47 AM:
------------------------------------------------------------------------

Hi, is it possible to use this in Solr? I tried setting {!scoreMode=Avg}, but it doesn't seem to have any effect.

was (Author: davidvdd):
Hi, is there any chance that this might work with multiple cores using the fromIndex?

Add scoring support for query time join
---------------------------------------

                Key: LUCENE-4043
                URL: https://issues.apache.org/jira/browse/LUCENE-4043
            Project: Lucene - Core
         Issue Type: Improvement
         Components: modules/join
           Reporter: Martijn van Groningen
            Fix For: 4.0-ALPHA
        Attachments: LUCENE-4043.patch, LUCENE-4043.patch, LUCENE-4043.patch, LUCENE-4043.patch

Have similar scoring for query time joining just like the index time block join (with the score mode).
[jira] [Updated] (SOLR-4307) Solr join scoring
[ https://issues.apache.org/jira/browse/SOLR-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David vandendriessche updated SOLR-4307:
----------------------------------------

    Summary: Solr join scoring (was: eDismax cross-core query support (and scoring))

Solr join scoring
-----------------

                Key: SOLR-4307
                URL: https://issues.apache.org/jira/browse/SOLR-4307
            Project: Solr
         Issue Type: Wish
         Components: multicore, query parsers
        Environment: I'm using Solr 4.0.0
           Reporter: David vandendriessche
             Labels: java, solr

I would like to have cross-core eDismax query support (for the fromIndex query). Example:

q={!join fromIndex=PageCore from=docId to=fileId}pageTxt:little red riding hood
defType=edismax
qf=pageTxt

When this query is entered it only queries pageTxt:little, even when I set the defType to edismax. I know I could change the query to:

(pageTxt:little) AND (pageTxt:red) AND (pageTxt:riding) AND (pageTxt:hood)

But as far as I know this doesn't score documents, etc.
[jira] [Updated] (SOLR-4307) Solr join scoring
[ https://issues.apache.org/jira/browse/SOLR-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David vandendriessche updated SOLR-4307:
----------------------------------------

    Description:
I would like to have cross-core eDismax query support (for the fromIndex query). Example:

q={!join from=docId to=fileId}pageTxt:test123
defType=edismax
qf=pageTxt

  was:
I would like to have cross-core eDismax query support (for the fromIndex query). Example:

q={!join fromIndex=PageCore from=docId to=fileId}pageTxt:little red riding hood
defType=edismax
qf=pageTxt

When this query is entered it only queries pageTxt:little, even when I set the defType to edismax. I know I could change the query to:

(pageTxt:little) AND (pageTxt:red) AND (pageTxt:riding) AND (pageTxt:hood)

But as far as I know this doesn't score documents, etc.

Solr join scoring
-----------------

                Key: SOLR-4307
                URL: https://issues.apache.org/jira/browse/SOLR-4307
            Project: Solr
         Issue Type: Wish
         Components: multicore, query parsers
        Environment: I'm using Solr 4.0.0
           Reporter: David vandendriessche
             Labels: java, solr

I would like to have cross-core eDismax query support (for the fromIndex query). Example:

q={!join from=docId to=fileId}pageTxt:test123
defType=edismax
qf=pageTxt
[jira] [Updated] (SOLR-4307) Solr join scoring
[ https://issues.apache.org/jira/browse/SOLR-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David vandendriessche updated SOLR-4307:
----------------------------------------

    Description:
I would like to have cross-core eDismax query support (for the fromIndex query). Example:

q={!join from=docId to=fileId}pageTxt:test-123

Add queryTimeJoining to solr.

  was:
I would like to have cross-core eDismax query support (for the fromIndex query). Example:

q={!join from=docId to=fileId}pageTxt:test123
defType=edismax
qf=pageTxt

Solr join scoring
-----------------

                Key: SOLR-4307
                URL: https://issues.apache.org/jira/browse/SOLR-4307
            Project: Solr
         Issue Type: Wish
         Components: multicore, query parsers
        Environment: I'm using Solr 4.0.0
           Reporter: David vandendriessche
             Labels: java, solr

I would like to have cross-core eDismax query support (for the fromIndex query). Example:

q={!join from=docId to=fileId}pageTxt:test-123

Add queryTimeJoining to solr.
[jira] [Updated] (SOLR-4307) Solr join scoring
[ https://issues.apache.org/jira/browse/SOLR-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David vandendriessche updated SOLR-4307:
----------------------------------------

    Component/s: (was: multicore)

Solr join scoring
-----------------

                Key: SOLR-4307
                URL: https://issues.apache.org/jira/browse/SOLR-4307
            Project: Solr
         Issue Type: Wish
         Components: query parsers
        Environment: I'm using Solr 4.0.0
           Reporter: David vandendriessche
             Labels: java, solr

I would like to have cross-core eDismax query support (for the fromIndex query). Example:

q={!join from=docId to=fileId}pageTxt:test-123

Add queryTimeJoining to solr.
[jira] [Updated] (SOLR-4307) Solr join scoring
[ https://issues.apache.org/jira/browse/SOLR-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David vandendriessche updated SOLR-4307:
----------------------------------------

    Description:
Add queryTimeJoining to solr. Example:

q={!join from=docId to=fileId}pageTxt:test-123

  was:
I would like to have cross-core eDismax query support (for the fromIndex query). Example:

q={!join from=docId to=fileId}pageTxt:test-123

Add queryTimeJoining to solr.

Solr join scoring
-----------------

                Key: SOLR-4307
                URL: https://issues.apache.org/jira/browse/SOLR-4307
            Project: Solr
         Issue Type: Wish
         Components: query parsers
        Environment: I'm using Solr 4.0.0
           Reporter: David vandendriessche
             Labels: java, solr

Add queryTimeJoining to solr. Example:

q={!join from=docId to=fileId}pageTxt:test-123
[jira] [Issue Comment Deleted] (SOLR-4307) Solr join scoring
[ https://issues.apache.org/jira/browse/SOLR-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David vandendriessche updated SOLR-4307:
----------------------------------------

    Comment: was deleted

(was:
{!join fromIndex=PageCore from=docId to=fileId}{!edismax qf=pageTxt}little red

Seems to get me better results. Is this the correct way to query with a join and use edismax?)

Solr join scoring
-----------------

                Key: SOLR-4307
                URL: https://issues.apache.org/jira/browse/SOLR-4307
            Project: Solr
         Issue Type: Wish
         Components: query parsers
        Environment: I'm using Solr 4.0.0
           Reporter: David vandendriessche
             Labels: java, solr

Add queryTimeJoining to solr. Example:

q={!join from=docId to=fileId}pageTxt:test-123
[jira] [Updated] (SOLR-4307) Solr join scoring
[ https://issues.apache.org/jira/browse/SOLR-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David vandendriessche updated SOLR-4307:
----------------------------------------

    Description:
Add queryTimeJoining to solr. Example:

q={!join from=docId to=fileId}pageTxt:test-123

No scoring on the result, just a list of documents that have a match.

  was:
Add queryTimeJoining to solr. Example:

q={!join from=docId to=fileId}pageTxt:test-123

Solr join scoring
-----------------

                Key: SOLR-4307
                URL: https://issues.apache.org/jira/browse/SOLR-4307
            Project: Solr
         Issue Type: Wish
         Components: query parsers
        Environment: I'm using Solr 4.0.0
           Reporter: David vandendriessche
             Labels: java, solr

Add queryTimeJoining to solr. Example:

q={!join from=docId to=fileId}pageTxt:test-123

No scoring on the result, just a list of documents that have a match.
[jira] [Issue Comment Deleted] (SOLR-4307) Solr join scoring
[ https://issues.apache.org/jira/browse/SOLR-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David vandendriessche updated SOLR-4307:
----------------------------------------

    Comment: was deleted

(was: This is what I want in Solr.)

Solr join scoring
-----------------

                Key: SOLR-4307
                URL: https://issues.apache.org/jira/browse/SOLR-4307
            Project: Solr
         Issue Type: Wish
         Components: query parsers
        Environment: I'm using Solr 4.0.0
           Reporter: David vandendriessche
             Labels: java, solr

Add queryTimeJoining to solr. Example:

q={!join from=docId to=fileId}pageTxt:test-123

No scoring on the result, just a list of documents that have a match.
[jira] [Commented] (LUCENE-4043) Add scoring support for query time join
[ https://issues.apache.org/jira/browse/LUCENE-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557151#comment-13557151 ]

Martijn van Groningen commented on LUCENE-4043:
-----------------------------------------------

Solr uses a different joining implementation, which doesn't support mapping the scores from the `from` side to the `to` side. If you want to use the Lucene joining implementation, you could wrap it in a Solr QParserPlugin extension.

Add scoring support for query time join
---------------------------------------

                Key: LUCENE-4043
                URL: https://issues.apache.org/jira/browse/LUCENE-4043
            Project: Lucene - Core
         Issue Type: Improvement
         Components: modules/join
           Reporter: Martijn van Groningen
            Fix For: 4.0-ALPHA
        Attachments: LUCENE-4043.patch, LUCENE-4043.patch, LUCENE-4043.patch, LUCENE-4043.patch

Have similar scoring for query time joining just like the index time block join (with the score mode).
[jira] [Created] (SOLR-4315) Remove useless shardId param in DistributedUpdateProcessor#defensiveChecks
Tommaso Teofili created SOLR-4315:
-------------------------------------

             Summary: Remove useless shardId param in DistributedUpdateProcessor#defensiveChecks
                 Key: SOLR-4315
                 URL: https://issues.apache.org/jira/browse/SOLR-4315
             Project: Solr
          Issue Type: Bug
          Components: SolrCloud
    Affects Versions: 4.0, 4.1
            Reporter: Tommaso Teofili
            Assignee: Tommaso Teofili
            Priority: Trivial
             Fix For: 4.2, 5.0

DistributedUpdateProcessor#doDefensiveChecks takes the shardId parameter but does not use it, so it should be removed.
[jira] [Commented] (SOLR-4315) Remove useless shardId param in DistributedUpdateProcessor#defensiveChecks
[ https://issues.apache.org/jira/browse/SOLR-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557157#comment-13557157 ]

Commit Tag Bot commented on SOLR-4315:
--------------------------------------

[trunk commit] Tommaso Teofili
http://svn.apache.org/viewvc?view=revision&revision=1435097

[SOLR-4315] - removed useless shardId param from doDefensiveChecks method

Remove useless shardId param in DistributedUpdateProcessor#defensiveChecks
--------------------------------------------------------------------------

                Key: SOLR-4315
                URL: https://issues.apache.org/jira/browse/SOLR-4315
            Project: Solr
         Issue Type: Bug
         Components: SolrCloud
   Affects Versions: 4.0, 4.1
           Reporter: Tommaso Teofili
           Assignee: Tommaso Teofili
           Priority: Trivial
            Fix For: 4.2, 5.0

DistributedUpdateProcessor#doDefensiveChecks takes the shardId parameter but does not use it, so it should be removed.
[jira] [Updated] (LUCENE-4570) release policeman tools?
[ https://issues.apache.org/jira/browse/LUCENE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4570:
----------------------------------

    Attachment: LUCENE-4570.patch

The forbidden-API checker is now available on sonatype-snapshots with the Maven coordinates:

{noformat}
groupId=de.thetaphi
artifactId=forbiddenapis
version=1.0-SNAPSHOT
{noformat}

Attached is a patch for Lucene trunk, removing the forbidden-API checker from the checkout and using the snapshot version. To enable the download of snapshots, I added the sonatype-snapshots repo to ivy-settings.xml for now (until it is released). Some cleanup is still needed in the patch:
- It somehow relies on tools being compiled, otherwise some properties are not defined to locate the txt files. This can be solved by placing the non-bundled, Lucene-specific signature files outside tools (where they no longer need to be): just place the Solr ones in solr and the Lucene ones in lucene.
- I have to review the API files and also move e.g. commons-io.txt into the checker JAR file, so we have more bundled signatures and don't need to maintain them inside Lucene. This of course does not apply to Solr/Lucene-specific ones that prevent specific test patterns.

release policeman tools?
------------------------

                Key: LUCENE-4570
                URL: https://issues.apache.org/jira/browse/LUCENE-4570
            Project: Lucene - Core
         Issue Type: New Feature
           Reporter: Robert Muir
           Assignee: Uwe Schindler
        Attachments: LUCENE-4570.patch

Currently there is source code in lucene/tools/src (e.g. the Forbidden APIs checker ant task). It would be convenient if you could download this thing in your ant build from ivy (especially if it included our definition .txt files as resources). In general, checking for locale/charset violations in this way is a pretty generally useful thing for a server-side app. Can we either release lucene-tools.jar as an artifact, or maybe alternatively move this somewhere else as a standalone project and suck it in ourselves?
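[Editor's note] To make the Maven coordinates above resolvable before a release, an Ivy settings file needs a resolver pointing at the Sonatype snapshot repository, as Uwe describes. A hypothetical fragment (resolver names and chain layout are assumptions, not the actual Lucene ivy-settings.xml):

```xml
<!-- Hypothetical ivysettings.xml fragment; resolver names and the chain
     layout are assumptions, not the actual Lucene ivy-settings.xml. -->
<ivysettings>
  <settings defaultResolver="chain"/>
  <resolvers>
    <chain name="chain">
      <ibiblio name="maven-central" m2compatible="true"/>
      <!-- Snapshot repo needed to resolve de.thetaphi:forbiddenapis:1.0-SNAPSHOT -->
      <ibiblio name="sonatype-snapshots" m2compatible="true"
               root="https://oss.sonatype.org/content/repositories/snapshots"/>
    </chain>
  </resolvers>
</ivysettings>
```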
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #219: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/219/

1 tests failed.

REGRESSION: org.apache.solr.cloud.SyncSliceTest.testDistribSearch

Error Message:
shard1 should have just been set up to be inconsistent - but it's still consistent

Stack Trace:
java.lang.AssertionError: shard1 should have just been set up to be inconsistent - but it's still consistent
	at __randomizedtesting.SeedInfo.seed([B854E2C42793B547:39B26CDC50CCD57B]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.assertTrue(Assert.java:43)
	at org.junit.Assert.assertNotNull(Assert.java:526)
	at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:214)
	at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:794)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
[jira] [Commented] (LUCENE-4570) release policeman tools?
[ https://issues.apache.org/jira/browse/LUCENE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557210#comment-13557210 ]

Uwe Schindler commented on LUCENE-4570:
---------------------------------------

By the way: the new checker finds use of a deprecated API that was missing from the hand-made jdk-deprecated.txt: File.toURL(). It's used in three places in analyzers - which is a bummer, because it will prevent using those analyzers on configs where the Lucene files are in a directory with e.g. umlauts or other special symbols (see the deprecation message).

release policeman tools?
------------------------

                Key: LUCENE-4570
                URL: https://issues.apache.org/jira/browse/LUCENE-4570
            Project: Lucene - Core
         Issue Type: New Feature
           Reporter: Robert Muir
           Assignee: Uwe Schindler
        Attachments: LUCENE-4570.patch

Currently there is source code in lucene/tools/src (e.g. the Forbidden APIs checker ant task). It would be convenient if you could download this thing in your ant build from ivy (especially if it included our definition .txt files as resources). In general, checking for locale/charset violations in this way is a pretty generally useful thing for a server-side app. Can we either release lucene-tools.jar as an artifact, or maybe alternatively move this somewhere else as a standalone project and suck it in ourselves?
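[Editor's note] The deprecated call Uwe mentions and its usual replacement can be shown with a small self-contained example (plain JDK; the class and method names are invented for illustration). File.toURL() copies the path into a URL verbatim, while File.toURI().toURL() percent-encodes special characters such as spaces, which is exactly why paths with umlauts or spaces break the former:

```java
import java.io.File;
import java.net.MalformedURLException;

// Editorial illustration (class/method names invented) of the deprecation:
// File.toURL() does not escape special characters in the path, while
// File.toURI().toURL() percent-encodes them.
public class ToUrlDemo {

    // The deprecated route: spaces/umlauts end up literally in the URL.
    @SuppressWarnings("deprecation")
    static String unsafe(File f) {
        try {
            return f.toURL().toString();
        } catch (MalformedURLException e) {
            throw new RuntimeException(e);
        }
    }

    // The documented replacement: go through the URI, which escapes.
    static String safe(File f) {
        try {
            return f.toURI().toURL().toString();
        } catch (MalformedURLException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        File f = new File("lucene dir", "stopwords.txt"); // path with a space
        System.out.println("deprecated File.toURL():      " + unsafe(f));
        System.out.println("File.toURI().toURL() instead: " + safe(f));
    }
}
```

With a path containing a space, the safe variant yields `%20` in the URL while the deprecated one leaves the space as-is, producing an invalid URL.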
[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader/replica
[ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557211#comment-13557211 ]

Markus Jelsma commented on SOLR-4260:
-------------------------------------

I've removed domain_b from the index and, as I expected, numDocs is now inconsistent indeed. By coincidence, what was missing in one replica from domain_a was replaced by an extra doc from domain_b, and vice versa. The collection of a couple of million records has one replica that's missing one document.

Inconsistent numDocs between leader/replica
-------------------------------------------

                Key: SOLR-4260
                URL: https://issues.apache.org/jira/browse/SOLR-4260
            Project: Solr
         Issue Type: Bug
         Components: SolrCloud
   Affects Versions: 5.0
        Environment: 5.0.0.2013.01.04.15.31.51
           Reporter: Markus Jelsma
           Priority: Critical
            Fix For: 5.0

After wiping all cores and reindexing some 3.3 million docs from Nutch using CloudSolrServer, we see inconsistencies between the leader and replica for some shards. Each core holds about 3.3k documents. For some reason 5 out of 10 shards have a small deviation in the number of documents. The leader and slave deviate by roughly 10-20 documents, not more. Results hopping ranks in the result set for identical queries got my attention; there were small IDF differences for exactly the same record, causing a record to shift positions in the result set. During those tests no records were indexed. Consecutive catch-all queries also return different numbers for numDocs. We're running a 10-node test cluster with 10 shards and a replication factor of two, and frequently reindex using a fresh build from trunk. I've not seen this issue for quite some time until a few days ago.
[jira] [Comment Edited] (LUCENE-4570) release policeman tools?
[ https://issues.apache.org/jira/browse/LUCENE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557210#comment-13557210 ] Uwe Schindler edited comment on LUCENE-4570 at 1/18/13 1:54 PM: By the way: the new checker finds use of a deprecated API that was missing from the hand-made jdk-deprecated.txt: File.toURL(). It's used in three places in the analyzers - which is a bummer, because it will prevent using those analyzers in configurations where the Lucene files are in a directory with e.g. umlauts or other special symbols (see the deprecation message). Here is the message:
{noformat}
-check-forbidden-jdk-apis:
[forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.6
[forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.6
[forbidden-apis] Reading API signatures: C:\Users\Uwe Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\executors.txt
[forbidden-apis] Loading classes to check...
[forbidden-apis] Scanning for API signatures and dependencies...
[forbidden-apis] Forbidden method invocation: java.io.File#toURL() [Deprecated in Java 1.6]
[forbidden-apis]   in org.apache.lucene.analysis.compound.hyphenation.PatternParser (PatternParser.java:101)
[forbidden-apis] Forbidden method invocation: java.io.File#toURL() [Deprecated in Java 1.6]
[forbidden-apis]   in org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilter (HyphenationCompoundWordTokenFilter.java:151)
[forbidden-apis] Forbidden method invocation: java.io.File#toURL() [Deprecated in Java 1.6]
[forbidden-apis]   in org.apache.lucene.analysis.compound.hyphenation.HyphenationTree (HyphenationTree.java:114)
[forbidden-apis] Scanned 5468 (and 432 related) class file(s) for forbidden API invocations (in 2.29s), 3 error(s).
{noformat}
was (Author: thetaphi): By the way: the new checker finds use of a deprecated API that was missing from the hand-made jdk-deprecated.txt: File.toURL().
It's used in three places in the analyzers - which is a bummer, because it will prevent using those analyzers in configurations where the Lucene files are in a directory with e.g. umlauts or other special symbols (see the deprecation message).

release policeman tools?
---
Key: LUCENE-4570
URL: https://issues.apache.org/jira/browse/LUCENE-4570
Project: Lucene - Core
Issue Type: New Feature
Reporter: Robert Muir
Assignee: Uwe Schindler
Attachments: LUCENE-4570.patch

Currently there is source code in lucene/tools/src (e.g. the Forbidden APIs checker ant task). It would be convenient if you could download this thing in your ant build from ivy (especially if maybe it included our definitions .txt files as resources). In general, checking for locale/charset violations in this way is a pretty generally useful thing for a server-side app. Can we either release lucene-tools.jar as an artifact, or maybe alternatively move this somewhere else as a standalone project and suck it in ourselves?
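The difference behind the deprecation can be shown with a small stand-alone sketch (the class name here is made up for illustration): the deprecated File#toURL() does not escape spaces or non-ASCII characters in the path, whereas going through File#toURI() first performs the escaping, which is why paths with umlauts break the former but not the latter.

```java
import java.io.File;
import java.net.URL;

// Illustrative helper, not Lucene code: convert a File to a URL safely.
public class FileToUrl {
    // The deprecated File#toURL() copies the raw path into the URL, so
    // spaces and special characters produce an invalid URL. Converting
    // via toURI() first percent-escapes the path correctly.
    public static URL safeUrl(File f) throws Exception {
        return f.toURI().toURL();
    }
}
```

Usage: `safeUrl(new File("/tmp/my dir/pattern.xml"))` yields a URL whose path contains `%20` instead of a raw space.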
[jira] [Commented] (SOLR-4315) Remove useless shardId param in DistributedUpdateProcessor#defensiveChecks
[ https://issues.apache.org/jira/browse/SOLR-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557215#comment-13557215 ] Commit Tag Bot commented on SOLR-4315: -- [branch_4x commit] Tommaso Teofili http://svn.apache.org/viewvc?view=revision&revision=1435137 [SOLR-4315] - merged back to branch_4x

Remove useless shardId param in DistributedUpdateProcessor#defensiveChecks
---
Key: SOLR-4315
URL: https://issues.apache.org/jira/browse/SOLR-4315
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.0, 4.1
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
Priority: Trivial
Fix For: 4.2, 5.0

DistributedUpdateProcessor#doDefensiveChecks takes a shardId parameter while not using it, so it should be removed.
[jira] [Resolved] (SOLR-4315) Remove useless shardId param in DistributedUpdateProcessor#defensiveChecks
[ https://issues.apache.org/jira/browse/SOLR-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tommaso Teofili resolved SOLR-4315. --- Resolution: Fixed
[jira] [Commented] (LUCENE-4570) release policeman tools?
[ https://issues.apache.org/jira/browse/LUCENE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557227#comment-13557227 ] Uwe Schindler commented on LUCENE-4570: --- I fixed the violations for now...
[jira] [Commented] (LUCENE-4677) Use vInt to encode node addresses inside FST
[ https://issues.apache.org/jira/browse/LUCENE-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557228#comment-13557228 ] Commit Tag Bot commented on LUCENE-4677: [branch_4x commit] Robert Muir http://svn.apache.org/viewvc?view=revision&revision=1435141 LUCENE-4677, LUCENE-4682, LUCENE-4678, LUCENE-3298: Merged /lucene/dev/trunk:r1432459,1432466,1432472,1432474,1432522,1432646,1433026,1433109

Use vInt to encode node addresses inside FST
---
Key: LUCENE-4677
URL: https://issues.apache.org/jira/browse/LUCENE-4677
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.2, 5.0
Attachments: LUCENE-4677.patch, LUCENE-4677.patch, LUCENE-4677.patch

Today we use int, but towards enabling > 2.1 GB sized FSTs, I'd like to make this vInt instead.
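For reference, a minimal sketch of the vInt idea the issue refers to (this is a generic 7-bits-per-byte varint, not Lucene's actual DataOutput/DataInput code): each byte carries 7 payload bits, and the high bit flags that another byte follows, so small node addresses take 1-2 bytes instead of a fixed 4.

```java
import java.io.ByteArrayOutputStream;

// Illustrative varint codec, not Lucene's implementation.
public class VInt {
    // Write the low 7 bits per byte; set the high bit while more remain.
    public static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);  // payload + continuation flag
            i >>>= 7;
        }
        out.write(i);                       // final byte, high bit clear
    }

    // Read bytes starting at pos until one with the high bit clear.
    public static int readVInt(byte[] in, int pos) {
        int value = 0, shift = 0;
        byte b;
        do {
            b = in[pos++];
            value |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }
}
```

For example, 127 encodes to one byte while 300 takes two, which is where the size win over fixed-width int addresses comes from.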
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557229#comment-13557229 ] Commit Tag Bot commented on LUCENE-4682: [branch_4x commit] Robert Muir http://svn.apache.org/viewvc?view=revision&revision=1435141 LUCENE-4677, LUCENE-4682, LUCENE-4678, LUCENE-3298: Merged /lucene/dev/trunk:r1432459,1432466,1432472,1432474,1432522,1432646,1433026,1433109

Reduce wasted bytes in FST due to array arcs
---
Key: LUCENE-4682
URL: https://issues.apache.org/jira/browse/LUCENE-4682
Project: Lucene - Core
Issue Type: Improvement
Components: core/FSTs
Reporter: Michael McCandless
Priority: Minor
Attachments: kuromoji.wasted.bytes.txt, LUCENE-4682.patch

When a node is close to the root, or it has many outgoing arcs, the FST writes the arcs as an array (each arc gets N bytes), so we can e.g. binary-search on lookup. The problem is that N is set to max(numBytesPerArc), so if you have an outlier arc, e.g. with a big output, you can waste many bytes for all the other arcs that didn't need so many bytes. I generated Kuromoji's FST and found it has 271187 wasted bytes vs a total size of 1535612, i.e. ~18% wasted. It would be nice to reduce this. One thing we could do without packing is: in addNode, if we detect that the number of wasted bytes is above some threshold, then don't do the expansion. Another thing, if we are packing: we could record stats in the first pass about which nodes wasted the most, and then in the second pass (pack) we could set the threshold based on the top X% nodes that waste ... Another idea is maybe to deref large outputs, so that numBytesPerArc is more uniform ...
[jira] [Commented] (LUCENE-4678) FST should use paged byte[] instead of single contiguous byte[]
[ https://issues.apache.org/jira/browse/LUCENE-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557230#comment-13557230 ] Commit Tag Bot commented on LUCENE-4678: [branch_4x commit] Robert Muir http://svn.apache.org/viewvc?view=revision&revision=1435141 LUCENE-4677, LUCENE-4682, LUCENE-4678, LUCENE-3298: Merged /lucene/dev/trunk:r1432459,1432466,1432472,1432474,1432522,1432646,1433026,1433109

FST should use paged byte[] instead of single contiguous byte[]
---
Key: LUCENE-4678
URL: https://issues.apache.org/jira/browse/LUCENE-4678
Project: Lucene - Core
Issue Type: Improvement
Components: core/FSTs
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.2, 5.0
Attachments: LUCENE-4678.patch, LUCENE-4678.patch, LUCENE-4678.patch, LUCENE-4678.patch, LUCENE-4678.patch

The single byte[] we use today has several limitations, e.g. it limits us to < 2.1 GB FSTs (and suggesters in the wild are getting close to this limit), and it causes big RAM spikes during building when the array has to grow. I took basically the same approach as LUCENE-3298, but I want to break out this patch separately from changing all int -> long for > 2.1 GB support.
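A minimal sketch of the paged-byte[] idea (class name and block size are illustrative, not Lucene's actual BytesStore API): bytes live in fixed-size blocks, a long address selects block and offset, so growth allocates one new block instead of copying one giant array, and the total size is no longer capped by a single array's Integer.MAX_VALUE limit.

```java
// Illustrative paged byte storage, not Lucene's implementation.
public class PagedBytes {
    private static final int BLOCK_BITS = 15;            // 32 KB blocks
    private static final int BLOCK_SIZE = 1 << BLOCK_BITS;
    private static final int BLOCK_MASK = BLOCK_SIZE - 1;

    private byte[][] blocks = new byte[0][];

    public void writeByte(long address, byte b) {
        int block = (int) (address >>> BLOCK_BITS);      // which block
        ensureBlock(block);
        blocks[block][(int) (address & BLOCK_MASK)] = b; // offset in block
    }

    public byte readByte(long address) {
        return blocks[(int) (address >>> BLOCK_BITS)][(int) (address & BLOCK_MASK)];
    }

    // Grow the block table lazily; only the table is copied, never the data.
    private void ensureBlock(int block) {
        if (block >= blocks.length) {
            byte[][] grown = new byte[block + 1][];
            System.arraycopy(blocks, 0, grown, 0, blocks.length);
            blocks = grown;
        }
        if (blocks[block] == null) blocks[block] = new byte[BLOCK_SIZE];
    }
}
```

Because addresses are long and decomposed into (block, offset), the structure can exceed 2.1 GB, which is the point of the switch.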
[jira] [Commented] (LUCENE-3298) FST has hard limit max size of 2.1 GB
[ https://issues.apache.org/jira/browse/LUCENE-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557231#comment-13557231 ] Commit Tag Bot commented on LUCENE-3298: [branch_4x commit] Robert Muir http://svn.apache.org/viewvc?view=revision&revision=1435141 LUCENE-4677, LUCENE-4682, LUCENE-4678, LUCENE-3298: Merged /lucene/dev/trunk:r1432459,1432466,1432472,1432474,1432522,1432646,1433026,1433109

FST has hard limit max size of 2.1 GB
---
Key: LUCENE-3298
URL: https://issues.apache.org/jira/browse/LUCENE-3298
Project: Lucene - Core
Issue Type: Improvement
Components: core/FSTs
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
Attachments: LUCENE-3298.patch, LUCENE-3298.patch, LUCENE-3298.patch, LUCENE-3298.patch

The FST uses a single contiguous byte[] under the hood, which in Java is indexed by int, so we cannot grow this over Integer.MAX_VALUE. It also internally encodes references to this array as vInt. We could switch this to a paged byte[] and make the max size far larger. But I think this is low priority... I'm not going to work on it any time soon.
[jira] [Commented] (LUCENE-4570) release policeman tools?
[ https://issues.apache.org/jira/browse/LUCENE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557232#comment-13557232 ] Commit Tag Bot commented on LUCENE-4570: [trunk commit] Uwe Schindler http://svn.apache.org/viewvc?view=revision&revision=1435146 LUCENE-4570: Fix deprecated API usage (otherwise may lead to bugs if Hyphenation filters load files from directories with non-ASCII path names)
[jira] [Commented] (LUCENE-4570) release policeman tools?
[ https://issues.apache.org/jira/browse/LUCENE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557234#comment-13557234 ] Commit Tag Bot commented on LUCENE-4570: [branch_4x commit] Uwe Schindler http://svn.apache.org/viewvc?view=revision&revision=1435148 Merged revision(s) 1435146 from lucene/dev/trunk: LUCENE-4570: Fix deprecated API usage (otherwise may lead to bugs if Hyphenation filters load files from directories with non-ASCII path names)
[jira] [Commented] (LUCENE-4693) FixedBitset might return wrong results if words.length > actual words in the bitset
[ https://issues.apache.org/jira/browse/LUCENE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557242#comment-13557242 ] Simon Willnauer commented on LUCENE-4693: - I will commit this patch if nobody objects.

FixedBitset might return wrong results if words.length > actual words in the bitset
---
Key: LUCENE-4693
URL: https://issues.apache.org/jira/browse/LUCENE-4693
Project: Lucene - Core
Issue Type: Bug
Components: core/other
Affects Versions: 4.0, 4.1
Reporter: Simon Willnauer
Fix For: 4.2, 5.0
Attachments: LUCENE-4693.patch

Currently we allow passing in the actual words as a long[] to the FixedBitSet, yet if this array is oversized with respect to the actual words needed to hold the bits, the FixedBitSet can return wrong results, since we use words.length (bits.length) as the bound when we iterate over the bits, i.e. if we need to find the next set bit. We should use the actual bound rather than the size of the array. As a side note, I think it would be interesting to explore passing an offset to this too, to enable creating bitsets from slices.
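The bound bug can be illustrated with a stand-alone sketch (a simplified stand-in, not the real FixedBitSet): iteration must stop at the number of words actually backing numBits, not at the length of the caller-supplied array, or stray bits in the oversized tail leak into results.

```java
// Illustrative simplified bitset, not Lucene's FixedBitSet.
public class FixedBits {
    private final long[] bits;
    private final int numWords;   // words actually in use for numBits

    public FixedBits(long[] storedBits, int numBits) {
        this.bits = storedBits;
        this.numWords = (numBits + 63) >>> 6;   // ceil(numBits / 64)
    }

    public void set(int index) {
        bits[index >> 6] |= 1L << (index & 63);
    }

    // First set bit at or after index, or -1 if none.
    public int nextSetBit(int index) {
        int i = index >> 6;
        long word = bits[i] >>> (index & 63);   // drop bits below index
        if (word != 0) return index + Long.numberOfTrailingZeros(word);
        while (++i < numWords) {                // bound by numWords, NOT bits.length
            if (bits[i] != 0) return (i << 6) + Long.numberOfTrailingZeros(bits[i]);
        }
        return -1;
    }
}
```

With the buggy `bits.length` bound, a garbage word beyond numBits would be reported as set bits; bounding by numWords ignores it.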
[jira] [Commented] (LUCENE-4693) FixedBitset might return wrong results if words.length > actual words in the bitset
[ https://issues.apache.org/jira/browse/LUCENE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557244#comment-13557244 ] Adrien Grand commented on LUCENE-4693: -- +1
[jira] [Updated] (LUCENE-4600) Explore facets aggregation during documents collection
[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4600: --- Attachment: LUCENE-4600.patch

Patch introduces CountingFacetsCollector, very similar to Mike's version, only productized. Made FacetsCollector abstract with a utility create() method which returns either CountingFacetsCollector or StandardFacetsCollector (previously, FC), given the parameters. All tests were migrated to use FC.create and all pass (utilizing the new collector). Still, I wrote a dedicated test for the new collector too. Preliminary results that we have show nice improvements with this collector. Mike, can you paste them here? There are some nocommits, which I will resolve before committing. But before that, I'd like to compare this collector to ones that use different abstractions from the code, e.g. IntDecoder (vs hard-wiring to dgap+vint), CategoryListIterator etc. Also, I want to compare this collector to one that in collect() marks a bitset and does all the work in getFacetResults.

Explore facets aggregation during documents collection
---
Key: LUCENE-4600
URL: https://issues.apache.org/jira/browse/LUCENE-4600
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Reporter: Michael McCandless
Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch

Today the facet module simply gathers all hits (as a bitset, optionally with a float[] to hold scores as well, if you will aggregate them) during collection, and then at the end, when you call getFacetsResults(), it makes a second pass over all those hits doing the actual aggregation. We should investigate aggregating as we collect instead, so we don't have to tie up transient RAM (fairly small for the bitset but possibly big for the float[]).
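The aggregate-during-collect idea can be sketched stand-alone (a toy stand-in, not the actual CountingFacetsCollector; the per-doc ordinal table here stands in for decoding the dgap+vint category list): instead of storing all hit docs in a bitset and aggregating in a second pass, the count for each facet ordinal is bumped as the doc is collected, so no transient hit storage is needed.

```java
// Illustrative counting collector, not Lucene's facet module code.
public class CountingCollector {
    private final int[] counts;                 // one slot per facet ordinal

    public CountingCollector(int numOrds) {
        counts = new int[numOrds];
    }

    // ordsPerDoc[doc] lists the facet ordinals attached to that doc
    // (stand-in for decoding the stored category list at collect time).
    public void collect(int doc, int[][] ordsPerDoc) {
        for (int ord : ordsPerDoc[doc]) counts[ord]++;
    }

    public int count(int ord) {
        return counts[ord];
    }
}
```

The trade-off discussed in the issue is exactly this: counting work moves into the hot collect() path, in exchange for dropping the second pass and the per-hit bitset/float[] storage.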
[jira] [Commented] (LUCENE-4693) FixedBitset might return wrong results if words.length > actual words in the bitset
[ https://issues.apache.org/jira/browse/LUCENE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557257#comment-13557257 ] Michael McCandless commented on LUCENE-4693: +1, but could we change the {{assert wordLength <= bits.length;}} to a real if instead?
[jira] [Commented] (LUCENE-4600) Explore facets aggregation during documents collection
[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557263#comment-13557263 ] Michael McCandless commented on LUCENE-4600: Patch looks great: +1. And this is a healthy speedup, on the Wikipedia 1M / 25 ords per doc test:
{noformat}
Task       QPS base  StdDev   QPS comp  StdDev   Pct diff
PKLookup     239.18  (1.5%)    238.87  (1.1%)   -0.1% ( -2% - 2%)
LowTerm       98.99  (3.1%)    135.95  (1.8%)   37.3% ( 31% - 43%)
HighTerm      20.95  (1.2%)     29.08  (2.4%)   38.8% ( 34% - 42%)
MedTerm       34.55  (1.5%)     48.31  (2.0%)   39.8% ( 35% - 43%)
{noformat}
[jira] [Commented] (SOLR-4039) MergeIndex on multiple cores impossible with SolrJ
[ https://issues.apache.org/jira/browse/SOLR-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557266#comment-13557266 ] Olof Jonasson commented on SOLR-4039: - This happens in Solr 4, and we believe this is the problem. In org.apache.solr.client.solrj.request.CoreAdminRequest.MergeIndexes.getParams():
{code}
if (srcCores != null) {
  for (String srcCore : srcCores) {
    params.set(CoreAdminParams.SRC_CORE, srcCore);
  }
}
{code}
params.set overwrites the previously set cores.

MergeIndex on multiple cores impossible with SolrJ
---
Key: SOLR-4039
URL: https://issues.apache.org/jira/browse/SOLR-4039
Project: Solr
Issue Type: Bug
Affects Versions: 3.6.1
Environment: Windows
Reporter: Mathieu Gond

It is not possible to do a mergeIndexes action on multiple cores at the same time with SolrJ. Only the last core set in the srcCores parameter is used.
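The set-vs-add semantics behind the report can be reproduced with a tiny stand-in for SolrJ's multi-valued params (this class is illustrative, not SolrJ code): calling set() in a loop keeps only the last srcCore, while an add()-style append keeps all of them, which is presumably the shape of the fix.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for multi-valued request params, not SolrJ's
// ModifiableSolrParams.
public class MultiParams {
    private final Map<String, List<String>> map = new LinkedHashMap<>();

    // Replaces any previous values for the name (what the buggy loop does).
    public void set(String name, String value) {
        map.put(name, new ArrayList<>(List.of(value)));
    }

    // Appends to previous values (what a multi-core merge needs).
    public void add(String name, String value) {
        map.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    public List<String> get(String name) {
        return map.getOrDefault(name, List.of());
    }
}
```

Looping set("srcCore", ...) over two cores leaves only the second value; looping add("srcCore", ...) preserves both.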
[jira] [Commented] (LUCENE-4693) FixedBitset might return wrong results if words.length > actual words in the bitset
[ https://issues.apache.org/jira/browse/LUCENE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557267#comment-13557267 ] Simon Willnauer commented on LUCENE-4693: - bq. +1, but could we change the {{assert wordLength <= bits.length;}} to a real if instead? Yeah, I will throw an IAE instead! Good point.
[jira] [Updated] (SOLR-4039) MergeIndex on multiple cores impossible with SolrJ
[ https://issues.apache.org/jira/browse/SOLR-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4039: -- Fix Version/s: 5.0, 4.2
[jira] [Commented] (LUCENE-4693) FixedBitset might return wrong results if words.length > actual words in the bitset
[ https://issues.apache.org/jira/browse/LUCENE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557275#comment-13557275 ] Commit Tag Bot commented on LUCENE-4693: [trunk commit] Simon Willnauer http://svn.apache.org/viewvc?view=revision&revision=1435191 LUCENE-4693: FixedBitset might return wrong results if words.length > actual words in the bitset
[jira] [Commented] (LUCENE-4693) FixedBitset might return wrong results if words.length > actual words in the bitset
[ https://issues.apache.org/jira/browse/LUCENE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557282#comment-13557282 ] Commit Tag Bot commented on LUCENE-4693: [branch_4x commit] Simon Willnauer http://svn.apache.org/viewvc?view=revision&revision=1435197 LUCENE-4693: FixedBitset might return wrong results if words.length > actual words in the bitset
[jira] [Assigned] (SOLR-4039) MergeIndex on multiple cores impossible with SolrJ
[ https://issues.apache.org/jira/browse/SOLR-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-4039: - Assignee: Mark Miller MergeIndex on multiple cores impossible with SolrJ -- Key: SOLR-4039 URL: https://issues.apache.org/jira/browse/SOLR-4039 Project: Solr Issue Type: Bug Affects Versions: 3.6.1 Environment: Windows Reporter: Mathieu Gond Assignee: Mark Miller Fix For: 4.2, 5.0 It is not possible to do a mergeIndexes action on multiple cores at the same time with SolrJ. Only the last core set in the srcCores parameter is used. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4693) FixedBitSet might return wrong results if words.length > actual words in the bitset
[ https://issues.apache.org/jira/browse/LUCENE-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-4693. - Resolution: Fixed FixedBitSet might return wrong results if words.length > actual words in the bitset --- Key: LUCENE-4693 URL: https://issues.apache.org/jira/browse/LUCENE-4693 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 4.0, 4.1 Reporter: Simon Willnauer Fix For: 4.2, 5.0 Attachments: LUCENE-4693.patch Currently we allow passing in the actual words as a long[] to the FixedBitSet, yet if this array is oversized with respect to the actual words it needs to hold the bits, the FixedBitSet can return wrong results, since we use words.length (bits.length) as the bound when we iterate over the bits, i.e. if we need to find the next set bit. We should use the actual bound rather than the size of the array. As a side note, I think it would be interesting to explore passing an offset to this too, to enable creating bitsets from slices -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
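The bug in the issue above can be illustrated with a minimal sketch. This is a hypothetical simplified class, not the actual Lucene source: the point is only that when the backing long[] is longer than the number of words actually in use, iterating up to words.length can pick up garbage bits in the oversized tail, while bounding the loop by the computed word count does not.

```java
// Hypothetical illustration of the LUCENE-4693 bug; not the real FixedBitSet.
// The backing array may be oversized, so words.length is an unsafe bound.
class TinyBitSet {
    private final long[] words;   // backing array, possibly oversized
    private final int numWords;   // words actually holding valid bits

    TinyBitSet(long[] words, int numBits) {
        this.words = words;
        this.numWords = (numBits + 63) >>> 6; // ceil(numBits / 64)
    }

    /** Index of the first set bit at or after {@code index}, or -1 if none. */
    int nextSetBit(int index) {
        int i = index >> 6;
        if (i >= numWords) {
            return -1;
        }
        long word = words[i] >> index; // Java shifts mask the count to the low 6 bits
        if (word != 0) {
            return index + Long.numberOfTrailingZeros(word);
        }
        // The fix: bound the scan by numWords, not words.length, so garbage
        // in the oversized tail of the array is never treated as set bits.
        while (++i < numWords) {
            word = words[i];
            if (word != 0) {
                return (i << 6) + Long.numberOfTrailingZeros(word);
            }
        }
        return -1;
    }
}
```

With the buggy words.length bound, a bitset of 64 bits backed by a four-word array with junk in words[2] would wrongly report a set bit near index 192; with the numWords bound it correctly reports none.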
[jira] [Commented] (SOLR-4312) SolrCloud upgrade path
[ https://issues.apache.org/jira/browse/SOLR-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557348#comment-13557348 ] Yonik Seeley commented on SOLR-4312: Looking into this a little more, I guess the node naming change was needed for SOLR-4088 SolrCloud upgrade path -- Key: SOLR-4312 URL: https://issues.apache.org/jira/browse/SOLR-4312 Project: Solr Issue Type: Task Components: SolrCloud Affects Versions: 4.0, 4.1 Reporter: Steve Rowe Upgrading from one SolrCloud version to another needs to be figured out and documented. Mark Miller wrote on the 4.1 VOTE email on dev@l.a.o: {quote} One issue that is probably still a problem is that you can't easily upgrade from a 4.0 to 4.1 SolrCloud setup in some cases - at least to my knowledge. I don't know all the details, but at a minimum, we should probably add an entry to changes about what you should do. It may require blowing away your own clusterstate.json and redoing your numShards settings, or starting over, or…I don't really know. I don't think anyone has tested. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4312) SolrCloud upgrade path
[ https://issues.apache.org/jira/browse/SOLR-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557361#comment-13557361 ] Mark Miller commented on SOLR-4312: --- I think that points out a change we should probably make - we should not re-guess the address on every startup - we should guess it once and keep using that unless someone then overrides it? Or sets a flag to force a re-guess? SolrCloud upgrade path -- Key: SOLR-4312 URL: https://issues.apache.org/jira/browse/SOLR-4312 Project: Solr Issue Type: Task Components: SolrCloud Affects Versions: 4.0, 4.1 Reporter: Steve Rowe Upgrading from one SolrCloud version to another needs to be figured out and documented. Mark Miller wrote on the 4.1 VOTE email on dev@l.a.o: {quote} One issue that is probably still a problem is that you can't easily upgrade from a 4.0 to 4.1 SolrCloud setup in some cases - at least to my knowledge. I don't know all the details, but at a minimum, we should probably add an entry to changes about what you should do. It may require blowing away your own clusterstate.json and redoing your numShards settings, or starting over, or…I don't really know. I don't think anyone has tested. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #742: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/742/ 1 tests failed. REGRESSION: org.apache.solr.cloud.SyncSliceTest.testDistribSearch Error Message: shard1 should have just been set up to be inconsistent - but it's still consistent Stack Trace: java.lang.AssertionError: shard1 should have just been set up to be inconsistent - but it's still consistent at __randomizedtesting.SeedInfo.seed([8D8CF1695801F063:C6A7F712F5E905F]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertNotNull(Assert.java:526) at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:214) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:794) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
[jira] [Closed] (SOLR-4307) Solr join scoring
[ https://issues.apache.org/jira/browse/SOLR-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David vandendriessche closed SOLR-4307. --- Resolution: Invalid As stated here: https://issues.apache.org/jira/browse/LUCENE-4043?focusedCommentId=13557151&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13557151 It's possible to write a custom query parser. Solr join scoring - Key: SOLR-4307 URL: https://issues.apache.org/jira/browse/SOLR-4307 Project: Solr Issue Type: Wish Components: query parsers Environment: I'm using Solr 4.0.0 Reporter: David vandendriessche Labels: java, solr Add queryTimeJoining to solr. Example: q={!join from=docId to=fileId}pageTxt:test-123 No scoring on the result just a list of documents that have a match. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-4316) Admin UI - SolrCloud - extend core options to collections
Shawn Heisey created SOLR-4316: -- Summary: Admin UI - SolrCloud - extend core options to collections Key: SOLR-4316 URL: https://issues.apache.org/jira/browse/SOLR-4316 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.1 Reporter: Shawn Heisey Fix For: 4.2, 5.0 There are a number of sections available when you are looking at a core in the UI - Ping, Query, Schema, Config, Replication, Analysis, Schema Browser, Plugins / Stats, and Dataimport are the ones that I can see. A list of collections should be available, with as many of those options as can apply to a collection. If options specific to collections/SolrCloud can be implemented, those should be there too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4600) Explore facets aggregation during documents collection
[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4600: --- Attachment: LUCENE-4600.patch handle some nocommits. Now there's no translation from OrdinalValue to FRNImpl in getFacetResults (the latter is used directly in the queue). I wonder if this buys us anything. Explore facets aggregation during documents collection -- Key: LUCENE-4600 URL: https://issues.apache.org/jira/browse/LUCENE-4600 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch Today the facet module simply gathers all hits (as a bitset, optionally with a float[] to hold scores as well, if you will aggregate them) during collection, and then at the end when you call getFacetsResults(), it makes a 2nd pass over all those hits doing the actual aggregation. We should investigate just aggregating as we collect instead, so we don't have to tie up transient RAM (fairly small for the bit set but possibly big for the float[]). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4316) Admin UI - SolrCloud - extend core options to collections
[ https://issues.apache.org/jira/browse/SOLR-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557571#comment-13557571 ] Shawn Heisey commented on SOLR-4316: If you have SolrCloud enabled, IMHO the list should have collapsible sections for collections and cores, with collections open and cores collapsed. Admin UI - SolrCloud - extend core options to collections - Key: SOLR-4316 URL: https://issues.apache.org/jira/browse/SOLR-4316 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.1 Reporter: Shawn Heisey Fix For: 4.2, 5.0 There are a number of sections available when you are looking at a core in the UI - Ping, Query, Schema, Config, Replication, Analysis, Schema Browser, Plugins / Stats, and Dataimport are the ones that I can see. A list of collections should be available, with as many of those options as can apply to a collection. If options specific to collections/SolrCloud can be implemented, those should be there too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4600) Explore facets aggregation during documents collection
[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557581#comment-13557581 ] Michael McCandless commented on LUCENE-4600: It's faster!

{noformat}
    Task    QPS base  StdDev    QPS comp  StdDev    Pct diff
PKLookup      239.75  (1.2%)     237.59   (1.0%)   -0.9% ( -3% -  1%)
HighTerm       21.21  (1.5%)      29.80   (2.6%)   40.5% ( 35% - 45%)
 MedTerm       34.90  (1.9%)      50.24   (1.9%)   44.0% ( 39% - 48%)
 LowTerm       99.85  (3.7%)     152.40   (1.1%)   52.6% ( 46% - 59%)
{noformat}

Explore facets aggregation during documents collection -- Key: LUCENE-4600 URL: https://issues.apache.org/jira/browse/LUCENE-4600 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch Today the facet module simply gathers all hits (as a bitset, optionally with a float[] to hold scores as well, if you will aggregate them) during collection, and then at the end when you call getFacetsResults(), it makes a 2nd pass over all those hits doing the actual aggregation. We should investigate just aggregating as we collect instead, so we don't have to tie up transient RAM (fairly small for the bit set but possibly big for the float[]). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
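The speedup reported above comes from folding aggregation into hit collection. A minimal sketch of that idea follows; the class and field names are hypothetical illustrations, not the Lucene facet-module API:

```java
// Hypothetical sketch of "aggregate during collection": instead of buffering
// all hits in a bitset (plus optionally a float[] of scores) and aggregating
// in a second pass, bump the per-ordinal count as each hit arrives.
class CountingFacetCollector {
    private final int[] counts;          // one counter per facet ordinal
    private final int[][] docToOrdinals; // precomputed facet ordinals per doc

    CountingFacetCollector(int numOrdinals, int[][] docToOrdinals) {
        this.counts = new int[numOrdinals];
        this.docToOrdinals = docToOrdinals;
    }

    // Called once per matching document: counts are updated immediately,
    // so no transient bitset or score buffer has to be retained until the end.
    void collect(int doc) {
        for (int ord : docToOrdinals[doc]) {
            counts[ord]++;
        }
    }

    int count(int ord) {
        return counts[ord];
    }
}
```

The trade-off the issue explores is exactly this: the single-pass collector touches each hit's ordinals while the hit is hot, at the cost of doing aggregation work even for hits that might later be discarded.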
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557667#comment-13557667 ] Alan Woodward commented on LUCENE-2878: --- Since the last patch went up I've fixed a bunch of bugs (BrouwerianQuery works properly now, as do various nested IntervalQuery subtypes that were throwing NPEs), as well as adding Span-type scoring and fleshing out the explain() methods. The only Span functionality that's missing, I think, is payload queries. If we want to have *all* the span functionality in here before it can land on trunk I can work on that next. It would also be good to do some proper benchmarking. Do we already have something that can compare sets of queries? Allow Scorer to expose positions and payloads aka. nuke spans -- Key: LUCENE-2878 URL: https://issues.apache.org/jira/browse/LUCENE-2878 Project: Lucene - Core Issue Type: Improvement Components: core/search Affects Versions: Positions Branch Reporter: Simon Willnauer Assignee: Simon Willnauer Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, mentor Fix For: Positions Branch Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, LUCENE-2878-vs-trunk.patch, PosHighlighter.patch, PosHighlighter.patch Currently we have two somewhat separate types of queries, the ones which can make use of positions (mainly spans) and payloads (spans).
Yet Span*Query doesn't really do scoring comparable to what other queries do, and at the end of the day they duplicate a lot of code all over Lucene. Span*Queries are also limited to other Span*Query instances, such that you can not use a TermQuery or a BooleanQuery with SpanNear or anything like that. Besides the Span*Query limitation, other queries lack a quite interesting feature, since they can not score based on term proximity: scorers don't expose any positional information. All those problems bugged me for a while now, so I started working on that using the bulkpostings API. I would have done that first cut on trunk, but TermScorer is working on a BlockReader that does not expose positions while the one in this branch does. I started adding a new Positions class which users can pull from a scorer; to prevent unnecessary positions enums I added ScorerContext#needsPositions and eventually Scorer#needsPayloads to create the corresponding enum on demand. Yet, currently only TermQuery / TermScorer implements this API and others simply return null instead. To show that the API really works and our BulkPostings work fine too with positions, I cut over TermSpanQuery to use a TermScorer under the hood and nuked TermSpans entirely. A nice side effect of this was that the Position BulkReading implementation got some exercise, which now :) works with positions throughout, while Payloads for bulk reading are kind of experimental in the patch and only work with the Standard codec. So all spans now work on top of TermScorer (I truly hate spans since today), including the ones that need Payloads (StandardCodec ONLY)!! I didn't bother to implement the other codecs yet since I want to get feedback on the API and on this first cut before I go on with it. I will upload the corresponding patch in a minute. I also had to cut over SpanQuery.getSpans(IR) to SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk first, but after that pain today I need a break first :).
The patch passes all core tests (org.apache.lucene.search.highlight.HighlighterTest still fails but I didn't look into the MemoryIndex BulkPostings API yet) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: SolrTestCaseJ4: Can't avoid collection1 convention
Hi folks, I think that there is still an issue after the SOLR-3826 patch was applied for 4.0 [https://issues.apache.org/jira/browse/SOLR-3826] in September 2012. This line is missing:

Index: solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java
===================================================================
--- solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java (revision 1435375)
+++ solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java (working copy)
@@ -384,9 +384,9 @@
   public static void createCore() {
     assertNotNull(testSolrHome);
     solrConfig = TestHarness.createConfig(testSolrHome, coreName, getSolrConfigFile());
-    h = new TestHarness( dataDir.getAbsolutePath(),
+    h = new TestHarness( coreName, new Initializer( coreName, dataDir.getAbsolutePath(),
         solrConfig,
-        getSchemaFile());
+        getSchemaFile() ) );
     lrf = h.getRequestFactory("standard", 0, 20, CommonParams.VERSION, "2.2");
   }

TestHarness( String dataDirectory, SolrConfig solrConfig, IndexSchema indexSchema ) sets coreName to null and opens the default core: collection1. I would expect that coreName is carried all the way through the test. What's the best course of action for getting this fixed? Should I re-open SOLR-3826 or create a new issue? Thanks, Tricia

On Tue, Aug 14, 2012 at 12:32 PM, Smiley, David W. dsmi...@mitre.org wrote: I've got some code that extends Solr and I use the Solr test framework for my tests. I upgraded from Solr 4 alpha to Solr 4 beta today, and it appears I am forced to put my test solr home directory in solr/collection1 rather than just plain solr/ (relative to my test classpath). I looked through the code and found that SolrTestCaseJ4.initCore() calls createCore() which calls TestHarness.createConfig(solrHome, confFile) which adds the collection1 to solr home. This is a minor issue, but it annoys me and I see it as a needless change. If it isn't fixed, we'll have to at least put that in the release notes and definitely the javadoc so that it is clear you *have* to use collection1.
~ David - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: SolrTestCaseJ4: Can't avoid collection1 convention
I'd suggest creating a new issue and referencing the old issue in it. - Mark On Jan 18, 2013, at 5:48 PM, P Williams williams.tricia.l...@gmail.com wrote: Hi folks, I think that there is still an issue after the SOLR-3826 patch was applied for 4.0 [https://issues.apache.org/jira/browse/SOLR-3826] in September 2012. This line is missing: Index: solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java === --- solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java (revision 1435375) +++ solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java (working copy) @@ -384,9 +384,9 @@ public static void createCore() { assertNotNull(testSolrHome); solrConfig = TestHarness.createConfig(testSolrHome, coreName, getSolrConfigFile()); -h = new TestHarness( dataDir.getAbsolutePath(), +h = new TestHarness( coreName, new Initializer( coreName, dataDir.getAbsolutePath(), solrConfig, -getSchemaFile()); +getSchemaFile() ) ); lrf = h.getRequestFactory (standard,0,20,CommonParams.VERSION,2.2); } TestHarness( String dataDirectory,SolrConfig solrConfig, IndexSchema indexSchema) sets coreName to null and opens the default core: collection1. I would expect that coreName is carried all the way through the test. What's the best course of action for getting this fixed? Should I re-open SOLR-3826 or create a new issue? Thanks, Tricia On Tue, Aug 14, 2012 at 12:32 PM, Smiley, David W. dsmi...@mitre.org wrote: I've got some code that extends Solr and I use the Solr test framework for my tests. I upgraded from Solr 4 alpha to Solr 4 beta today, and it appears I am forced to put my test solr home directory in solr/collection1 rather than just plain solr/ (relative to my test classpath). I looked through the code and found that SolrTestCaseJ4.initCore() calls createCore() which calls TestHarness.createConfig(solrHome,confFile) which adds the collection1 to solr home. This is a minor issue, but it annoys me and I see it as a needless change. 
If it isn't fixed, we'll have to at least put that in the release notes and definitely the javadoc so that it is clear you *have* to use collection1. ~ David - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-4317) SolrTestCaseJ4: Can't avoid collection1 convention
Tricia Jenkins created SOLR-4317: Summary: SolrTestCaseJ4: Can't avoid collection1 convention Key: SOLR-4317 URL: https://issues.apache.org/jira/browse/SOLR-4317 Project: Solr Issue Type: Improvement Components: Tests Affects Versions: 4.0 Reporter: Tricia Jenkins Priority: Minor Fix For: 4.1 I think that there is still an issue after the SOLR-3826 patch was applied for 4.0 [https://issues.apache.org/jira/browse/SOLR-3826] in September 2012. This line is missing:

Index: solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java
===================================================================
--- solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java (revision 1435375)
+++ solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java (working copy)
@@ -384,9 +384,9 @@
   public static void createCore() {
     assertNotNull(testSolrHome);
     solrConfig = TestHarness.createConfig(testSolrHome, coreName, getSolrConfigFile());
-    h = new TestHarness( dataDir.getAbsolutePath(),
+    h = new TestHarness( coreName, new Initializer( coreName, dataDir.getAbsolutePath(),
         solrConfig,
-        getSchemaFile());
+        getSchemaFile() ) );
     lrf = h.getRequestFactory("standard", 0, 20, CommonParams.VERSION, "2.2");
   }

TestHarness( String dataDirectory, SolrConfig solrConfig, IndexSchema indexSchema ) sets coreName to null and opens the default core: collection1. I would expect that coreName is carried all the way through the test. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4317) SolrTestCaseJ4: Can't avoid collection1 convention
[ https://issues.apache.org/jira/browse/SOLR-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tricia Jenkins updated SOLR-4317: - Attachment: SOLR-4317.patch This is the patch from my description. SolrTestCaseJ4: Can't avoid collection1 convention Key: SOLR-4317 URL: https://issues.apache.org/jira/browse/SOLR-4317 Project: Solr Issue Type: Improvement Components: Tests Affects Versions: 4.0 Reporter: Tricia Jenkins Priority: Minor Fix For: 4.1 Attachments: SOLR-4317.patch I think that there is still an issue after the SOLR-3826 patch was applied for 4.0 [https://issues.apache.org/jira/browse/SOLR-3826] in September 2012. This line is missing: Index: solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java === --- solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java (revision 1435375) +++ solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java (working copy) @@ -384,9 +384,9 @@ public static void createCore() { assertNotNull(testSolrHome); solrConfig = TestHarness.createConfig(testSolrHome, coreName, getSolrConfigFile()); -h = new TestHarness( dataDir.getAbsolutePath(), +h = new TestHarness( coreName, new Initializer( coreName, dataDir.getAbsolutePath(), solrConfig, -getSchemaFile()); +getSchemaFile() ) ); lrf = h.getRequestFactory (standard,0,20,CommonParams.VERSION,2.2); } TestHarness( String dataDirectory,SolrConfig solrConfig, IndexSchema indexSchema) sets coreName to null and opens the default core: collection1. I would expect that coreName is carried all the way through the test. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: SolrTestCaseJ4: Can't avoid collection1 convention
Done. You can find it here: https://issues.apache.org/jira/browse/SOLR-4317 On Fri, Jan 18, 2013 at 4:01 PM, Mark Miller markrmil...@gmail.com wrote: I'd suggest creating a new issue and referencing the old issue in it. - Mark On Jan 18, 2013, at 5:48 PM, P Williams williams.tricia.l...@gmail.com wrote: Hi folks, I think that there is still an issue after the SOLR-3826 patch was applied for 4.0 [https://issues.apache.org/jira/browse/SOLR-3826] in September 2012. This line is missing: Index: solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java === --- solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java (revision 1435375) +++ solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java (working copy) @@ -384,9 +384,9 @@ public static void createCore() { assertNotNull(testSolrHome); solrConfig = TestHarness.createConfig(testSolrHome, coreName, getSolrConfigFile()); -h = new TestHarness( dataDir.getAbsolutePath(), +h = new TestHarness( coreName, new Initializer( coreName, dataDir.getAbsolutePath(), solrConfig, -getSchemaFile()); +getSchemaFile() ) ); lrf = h.getRequestFactory (standard,0,20,CommonParams.VERSION,2.2); } TestHarness( String dataDirectory,SolrConfig solrConfig, IndexSchema indexSchema) sets coreName to null and opens the default core: collection1. I would expect that coreName is carried all the way through the test. What's the best course of action for getting this fixed? Should I re-open SOLR-3826 or create a new issue? Thanks, Tricia On Tue, Aug 14, 2012 at 12:32 PM, Smiley, David W. dsmi...@mitre.org wrote: I've got some code that extends Solr and I use the Solr test framework for my tests. I upgraded from Solr 4 alpha to Solr 4 beta today, and it appears I am forced to put my test solr home directory in solr/collection1 rather than just plain solr/ (relative to my test classpath). 
I looked through the code and found that SolrTestCaseJ4.initCore() calls createCore() which calls TestHarness.createConfig(solrHome,confFile) which adds the collection1 to solr home. This is a minor issue, but it annoys me and I see it as a needless change. If it isn't fixed, we'll have to at least put that in the release notes and definitely the javadoc so that it is clear you *have* to use collection1. ~ David - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4317) SolrTestCaseJ4: Can't avoid collection1 convention
[ https://issues.apache.org/jira/browse/SOLR-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tricia Jenkins updated SOLR-4317:
---------------------------------
    Fix Version/s:     (was: 4.1)
                   4.2

> SolrTestCaseJ4: Can't avoid collection1 convention
> --------------------------------------------------
>
>                 Key: SOLR-4317
>                 URL: https://issues.apache.org/jira/browse/SOLR-4317
>             Project: Solr
>          Issue Type: Improvement
>          Components: Tests
>    Affects Versions: 4.0
>            Reporter: Tricia Jenkins
>            Priority: Minor
>             Fix For: 4.2
>
>         Attachments: SOLR-4317.patch

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (SOLR-4317) SolrTestCaseJ4: Can't avoid collection1 convention
[ https://issues.apache.org/jira/browse/SOLR-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tricia Jenkins updated SOLR-4317:
---------------------------------
    Fix Version/s: 5.0

> SolrTestCaseJ4: Can't avoid collection1 convention
> --------------------------------------------------
>
>                 Key: SOLR-4317
>                 URL: https://issues.apache.org/jira/browse/SOLR-4317
>             Project: Solr
>          Issue Type: Improvement
>          Components: Tests
>    Affects Versions: 4.0
>            Reporter: Tricia Jenkins
>            Priority: Minor
>             Fix For: 4.2, 5.0
>
>         Attachments: SOLR-4317.patch
[jira] [Updated] (LUCENE-4599) Compressed term vectors
[ https://issues.apache.org/jira/browse/LUCENE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-4599:
---------------------------------
    Attachment: LUCENE-4599.patch

New patch with tests, addProx and specialized merging. I think it is ready.

This patch is similar to the previous ones, except that it applies LZ4 compression on top of prefix compression (similarly to Lucene40TermVectorsFormat, which writes the length of the prefix shared with the previous term as a VInt before each term) rather than on the raw term bytes, to improve the compression ratio, and relies on LUCENE-4643 for most integer encoding instead of raw packed ints. Otherwise:
- vectors are still compressed into blocks of 16 KB,
- looking up term vectors requires at most 1 disk seek.

Here are the size reductions of the term vector files depending on the size of the input docs:

|| Field options / Document size || 1 KB (a few tens of docs per chunk) || 750 KB (one doc per chunk) ||
| none              | 37% | 32% |
| positions         | 32% | 10% |
| offsets           | 41% | 31% |
| positions+offsets | 40% | 35% |

Regarding speed, indexing seems to be slightly slower, but maybe the reduction in the size of the vector files would make merging faster when not everything fits in the I/O cache. I also ran a simple benchmark that loads term vectors for every doc of the index and iterates over all terms and positions. This new format was ~5x slower for small docs (likely because it has to decode the whole chunk even to read a single doc) and between 1.5x and 2x faster for large docs that are alone in their chunk (again, results would very likely be better on a large index which wouldn't fully fit in the OS cache). If someone with very large term vector files wanted to test this new format, that would be great! I'll try on my side to perform more indexing/highlighting benchmarks.
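The prefix compression described above (each term stores only the length of the prefix it shares with the previous term, plus the remaining suffix) can be sketched as a small front-coding toy. This is illustrative only, not Lucene's actual on-disk encoding; the `prefixLen|suffix` string form is chosen purely for readability where the real format writes a VInt length followed by suffix bytes.

```java
import java.util.ArrayList;
import java.util.List;

// Front coding over a sorted term list: encode each term as the length of
// the prefix shared with the previous term plus its remaining suffix.
public class FrontCoding {
    static List<String> encode(List<String> sortedTerms) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String term : sortedTerms) {
            int p = 0;
            int max = Math.min(prev.length(), term.length());
            while (p < max && prev.charAt(p) == term.charAt(p)) p++;
            out.add(p + "|" + term.substring(p)); // shared-prefix length + suffix
            prev = term;
        }
        return out;
    }

    static List<String> decode(List<String> encoded) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String e : encoded) {
            int bar = e.indexOf('|');
            int p = Integer.parseInt(e.substring(0, bar));
            // Rebuild the term from the previous term's prefix and the suffix.
            String term = prev.substring(0, p) + e.substring(bar + 1);
            out.add(term);
            prev = term;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = List.of("index", "indexes", "indexing", "lucene");
        List<String> enc = encode(terms);
        System.out.println(enc);                       // [0|index, 5|es, 5|ing, 0|lucene]
        System.out.println(decode(enc).equals(terms)); // true
    }
}
```

Because sorted term dictionaries share long prefixes, the suffixes that remain are short and repetitive, which is exactly what makes a general-purpose compressor like LZ4 effective on top of this layer.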
> Compressed term vectors
> -----------------------
>
>                 Key: LUCENE-4599
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4599
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: core/codecs, core/termvectors
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.2
>
>         Attachments: LUCENE-4599.patch, LUCENE-4599.patch, LUCENE-4599.patch
>
> We should have codec-compressed term vectors similarly to what we have with stored fields.
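The "at most 1 disk seek" lookup mentioned in the patch comment can also be sketched. Assuming a hypothetical layout (not the real codec's data structures) where the first doc ID of each compressed chunk is kept in memory, locating a document's term vectors is a binary search followed by a single seek to the matching chunk:

```java
import java.util.Arrays;

// Locate the compressed chunk containing a given doc ID, assuming an
// in-memory, sorted array of each chunk's first doc ID. The array values
// below are made up for illustration.
public class ChunkIndex {
    static int chunkFor(int[] chunkStartDocs, int docId) {
        int idx = Arrays.binarySearch(chunkStartDocs, docId);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the
        // containing chunk is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }

    public static void main(String[] args) {
        int[] starts = {0, 40, 95, 130};           // first doc ID per chunk
        System.out.println(chunkFor(starts, 0));   // 0
        System.out.println(chunkFor(starts, 39));  // 0
        System.out.println(chunkFor(starts, 95));  // 2
        System.out.println(chunkFor(starts, 200)); // 3
        // One seek to chunk N's file offset, then decompress the whole
        // 16 KB block -- which is also why reading a single small doc
        // pays the cost of decoding its neighbors.
    }
}
```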
[jira] [Commented] (LUCENE-4599) Compressed term vectors
[ https://issues.apache.org/jira/browse/LUCENE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557849#comment-13557849 ]

Shawn Heisey commented on LUCENE-4599:
--------------------------------------

bq. If someone with very large term vector files wanted to test this new format, this would be great! I'll try on my side to perform more indexing/highlighting benchmarks.

My indexes are pretty big, with term vectors taking up a lot of that. The 3.5.0 version of each of my shards is about 21 GB. The same index in 4.1 with compressed stored fields is a little less than 17 GB. I will give this patch a try on branch_4x. The full import will take 7-8 hours.
[jira] [Commented] (LUCENE-4599) Compressed term vectors
[ https://issues.apache.org/jira/browse/LUCENE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557889#comment-13557889 ]

Shawn Heisey commented on LUCENE-4599:
--------------------------------------

I should ask - will this be on by default in Solr with the patch? I just got the patch applied to 4.1 because I already had it, and decided to try it before branch_4x. It has occurred to me that, as a LUCENE issue, it might not be turned on for Solr.