[jira] [Commented] (LUCENE-3977) generated/duplicated javadocs are wasteful and bloat the release
[ https://issues.apache.org/jira/browse/LUCENE-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258443#comment-13258443 ]

Dawid Weiss commented on LUCENE-3977:

It's funny -- I feel the same way Uwe does but at the same time I absolutely never looked into off-line javadocs that I downloaded with distributions of open source projects. It's usually faster to just find these online.

generated/duplicated javadocs are wasteful and bloat the release

Key: LUCENE-3977
URL: https://issues.apache.org/jira/browse/LUCENE-3977
Project: Lucene - Java
Issue Type: Bug
Components: general/javadocs
Reporter: Robert Muir
Priority: Blocker
Fix For: 4.0
Attachments: LUCENE-3977-triplication.patch, LUCENE-3977.patch, LUCENE-3977.patch, LUCENE-3977.patch

Some stats for the generated javadocs of 3.6:
* 9,146 files
* 161,872 KB uncompressed
* 25MB compressed (this is responsible for nearly half of our binary release)

The fact we intentionally double our javadocs size with the 'javadocs-all' thing is truly wasteful and compression doesn't help at all. Just testing, i nuked 'all' and found:
* 4,944 files
* 81,084 KB uncompressed
* 12.8MB compressed

We need to clean this up for 4.0. We only need to ship javadocs 'one way'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4000) Non-redirected JVM output causes build errors
[ https://issues.apache.org/jira/browse/LUCENE-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257884#comment-13257884 ]

Dawid Weiss commented on LUCENE-4000:

Not so harmless after all. Code cache exhaustion seems to trigger a fallback to interpreted mode and this makes tests run forever.

Non-redirected JVM output causes build errors

Key: LUCENE-4000
URL: https://issues.apache.org/jira/browse/LUCENE-4000
Project: Lucene - Java
Issue Type: Sub-task
Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Fix For: 4.0

https://builds.apache.org/job/Lucene-Trunk/1899/consoleText

Code cache JVM warning. Harmless but causes build errors.
[jira] [Commented] (LUCENE-3994) some nightly tests take hours
[ https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256550#comment-13256550 ]

Dawid Weiss commented on LUCENE-3994:

I've already fixed that per-suite constant suite randomization in github but I'll need some time to push to maven central, etc.

some nightly tests take hours

Key: LUCENE-3994
URL: https://issues.apache.org/jira/browse/LUCENE-3994
Project: Lucene - Java
Issue Type: Bug
Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
Fix For: 4.0
Attachments: LUCENE-3994.patch

The nightly builds are taking 4-7 hours. This is caused by a few bad apples (can be seen at https://builds.apache.org/job/Lucene-trunk/1896/testReport/).

The top 5 are (all in analysis):
* TestSynonymMapFilter: 1 hr 54 min
* TestRandomChains: 1 hr 22 min
* TestRemoveDuplicatesTokenFilter: 32 min
* TestMappingCharFilter: 28 min
* TestWordDelimiterFilter: 22 min

so that's 4.5 hours right there for that run
[jira] [Commented] (LUCENE-3995) In LuceneTestCase.beforeClass, make a new random (also using the class hashcode) to vary defaults
[ https://issues.apache.org/jira/browse/LUCENE-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256650#comment-13256650 ]

Dawid Weiss commented on LUCENE-3995:

Robert, this would mean it works fine, right (note dumped randomVal for each suite)?

{noformat}
Executing 296 suites with 4 JVMs.
Suite: org.apache.lucene.util.TestCloseableThreadLocal
(@BeforeClass output)
  1> randomVal: 9
  1>
OK 0.05s J1 | TestCloseableThreadLocal.testDefaultValueWithoutSetting
OK 0.01s J1 | TestCloseableThreadLocal.testInitValue
OK 0.01s J1 | TestCloseableThreadLocal.testNullValue
Completed on J1 in 0.27s, 3 tests
Suite: org.apache.lucene.util.TestTwoPhaseCommitTool
(@BeforeClass output)
  1> randomVal: 6
  1>
OK 0.04s J2 | TestTwoPhaseCommitTool.testRollback
OK 0.01s J2 | TestTwoPhaseCommitTool.testNullTPCs
OK 0.01s J2 | TestTwoPhaseCommitTool.testWrapper
OK 0.01s J2 | TestTwoPhaseCommitTool.testPrepareThenCommit
Completed on J2 in 0.37s, 4 tests
Suite: org.apache.lucene.util.TestNamedSPILoader
(@BeforeClass output)
  1> randomVal: 7
  1>
OK 0.04s J0 | TestNamedSPILoader.testAvailableServices
OK 0.01s J0 | TestNamedSPILoader.testBogusLookup
OK 0.01s J0 | TestNamedSPILoader.testLookup
Completed on J0 in 0.34s, 3 tests
Suite: org.apache.lucene.util.TestSmallFloat
(@BeforeClass output)
  1> randomVal: 2
  1>
OK 0.20s J3 | TestSmallFloat.testFloatToByte
OK 0.01s J3 | TestSmallFloat.testByteToFloat
Completed on J3 in 0.48s, 2 tests
Suite: org.apache.lucene.index.TestTerm
(@BeforeClass output)
  1> randomVal: 0
  1>
{noformat}

In LuceneTestCase.beforeClass, make a new random (also using the class hashcode) to vary defaults

Key: LUCENE-3995
URL: https://issues.apache.org/jira/browse/LUCENE-3995
Project: Lucene - Java
Issue Type: Improvement
Components: general/test
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Dawid Weiss

In LuceneTestCase, we set many static defaults like:
* default codec
* default infostream impl
* default locale
* default timezone
* default similarity

Currently each test run gets a single seed for the run, which means for example across one test run every single test will have say, SimpleText + infostream=off + Locale=german + timezone=EDT + similarity=BM25

Because of that, we lose lots of basic mixed coverage across tests, and it also means the unfortunate individual who gets SimpleText or other slow options gets a REALLY SLOW test run, rather than amortizing this across all test runs.

We should at least make a new random (getRandom() ^ className.hashCode()) to fix this so it works like before, but unfortunately that only fixes it for LuceneTestCase.

Won't any subclasses that make random decisions in @BeforeClass (and we have many) still have the same problem? Maybe RandomizedRunner can instead be improved here?
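A minimal sketch of the seed mixing described in the issue, assuming the run-wide seed is available as a long. The class PerClassRandom and its methods are hypothetical names used only for illustration; they are not part of LuceneTestCase or RandomizedRunner, and the issue does not say how the actual fix was implemented.

```java
import java.util.Random;

/** Hypothetical helper, for illustration only. */
final class PerClassRandom {
  private PerClassRandom() {}

  /** Mixes the run-wide seed with the hash code of the suite class name. */
  static long seedFor(long masterSeed, String className) {
    // String.hashCode() is specified by the Java API, so this is stable across JVMs.
    return masterSeed ^ className.hashCode();
  }

  /** Picks one option for a suite; the choice is reproducible for a given master seed. */
  static <T> T pick(long masterSeed, String className, T[] options) {
    Random random = new Random(seedFor(masterSeed, className));
    return options[random.nextInt(options.length)];
  }
}
```

With this, two suites in the same run usually draw different defaults, while re-running with the same master seed reproduces each suite's choice. As the description notes, a helper like this only covers code that goes through it; @BeforeClass hooks that draw from the run-wide random directly would still see the same values.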
[jira] [Commented] (LUCENE-3987) Ivy/maven config to pull from sonatype releases
[ https://issues.apache.org/jira/browse/LUCENE-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256657#comment-13256657 ]

Dawid Weiss commented on LUCENE-3987:

After some deliberation I would like to add ivysettings.xml to test-framework module which would allow (this module) to fetch dependencies from an additional repository (sonatype releases). I will also add this to corresponding maven descriptor so these would be in sync.

Maintenance-wise this is not an issue -- sonatype is mirroring to central so effectively they're the same but there is no lag between releases and syncs.

Ivy/maven config to pull from sonatype releases

Key: LUCENE-3987
URL: https://issues.apache.org/jira/browse/LUCENE-3987
Project: Lucene - Java
Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
Attachments: ivy-sonatype.patch
[jira] [Commented] (LUCENE-3994) some nightly tests take hours
[ https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255790#comment-13255790 ]

Dawid Weiss commented on LUCENE-3994:

You could also update statistics -- remove the previous ones and run two or three times, then update. Alternatively, we could have jenkins update stats and fetch these from time to time.

some nightly tests take hours

Key: LUCENE-3994
URL: https://issues.apache.org/jira/browse/LUCENE-3994
Project: Lucene - Java
Issue Type: Bug
Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
Fix For: 4.0
Attachments: LUCENE-3994.patch
[jira] [Commented] (LUCENE-3994) some nightly tests take hours
[ https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255803#comment-13255803 ]

Dawid Weiss commented on LUCENE-3994:

Ok. I'll recalculate them from time to time. There is a large variance in tests anyway (this can also be computed from log stats because we can keep a history of N runs... it'd be interesting to see which tests have the largest variance).

some nightly tests take hours

Key: LUCENE-3994
URL: https://issues.apache.org/jira/browse/LUCENE-3994
Project: Lucene - Java
Issue Type: Bug
Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
Fix For: 4.0
Attachments: LUCENE-3994.patch
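The variance idea in the comment above could be sketched roughly as follows, assuming the history of N runs has already been parsed into per-suite arrays of durations in seconds. SuiteTimeStats and the map layout are assumptions made for this example; the actual log statistics format is not described in the message.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, for illustration only. */
final class SuiteTimeStats {
  private SuiteTimeStats() {}

  /** Arithmetic mean of the recorded durations. */
  static double mean(double[] seconds) {
    if (seconds.length == 0) {
      throw new IllegalArgumentException("no recorded runs");
    }
    double sum = 0;
    for (double s : seconds) {
      sum += s;
    }
    return sum / seconds.length;
  }

  /** Population variance of the recorded durations. */
  static double variance(double[] seconds) {
    double m = mean(seconds);
    double sumSquares = 0;
    for (double s : seconds) {
      sumSquares += (s - m) * (s - m);
    }
    return sumSquares / seconds.length;
  }

  /** Suite names ordered from the largest to the smallest variance. */
  static List<String> byVarianceDescending(Map<String, double[]> history) {
    List<String> names = new ArrayList<>(history.keySet());
    names.sort(Comparator.comparingDouble((String n) -> variance(history.get(n))).reversed());
    return names;
  }
}
```

Population variance is used here because the N stored runs are treated as the whole history; a sample variance (dividing by N - 1) would be an equally reasonable choice.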
[jira] [Commented] (LUCENE-3995) In LuceneTestCase.beforeClass, make a new random (also using the class hashcode) to vary defaults
[ https://issues.apache.org/jira/browse/LUCENE-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255961#comment-13255961 ]

Dawid Weiss commented on LUCENE-3995:

Note to myself - this also affects test coverage because it reduces static context entropy (as pointed out by Robert, Uwe).

In LuceneTestCase.beforeClass, make a new random (also using the class hashcode) to vary defaults

Key: LUCENE-3995
URL: https://issues.apache.org/jira/browse/LUCENE-3995
Project: Lucene - Java
Issue Type: Improvement
Components: general/test
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Dawid Weiss
[jira] [Commented] (LUCENE-3988) improve test output to be nicer to 80chars long terminals
[ https://issues.apache.org/jira/browse/LUCENE-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254734#comment-13254734 ]

Dawid Weiss commented on LUCENE-3988:

So change it like I suggested -- I can't please everybody. If it bothers you, change it:

{noformat}
useSimpleNames=false
maxClassNameColumns=100
{noformat}

or remove maxClassNameColumns entirely.

improve test output to be nicer to 80chars long terminals

Key: LUCENE-3988
URL: https://issues.apache.org/jira/browse/LUCENE-3988
Project: Lucene - Java
Issue Type: Improvement
Components: general/test
Reporter: Robert Muir
Fix For: 4.0

these lines tend to always use 82 chars:

{noformat}
[junit4] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time: 3.97s
{noformat}

Can we remove some of the spaces so it fits? Maybe remove the word 'run' from Tests run.

occasionally (not always) long classnames wrap too 'Running org.apache.lucene.this.that.TestFoo' ... maybe just print the short classname?
[jira] [Commented] (LUCENE-3992) TestIndexWriterOnJRECrash failure
[ https://issues.apache.org/jira/browse/LUCENE-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254908#comment-13254908 ]

Dawid Weiss commented on LUCENE-3992:

I see why it's slipped through -- I ran @Nightly only one or two times, the build server was running regular daily tests... Thanks for fixing.

TestIndexWriterOnJRECrash failure

Key: LUCENE-3992
URL: https://issues.apache.org/jira/browse/LUCENE-3992
Project: Lucene - Java
Issue Type: Bug
Reporter: Robert Muir
Attachments: LUCENE-3992.patch

triggered this beasting a bunch of tests... gonna probably be hard to reproduce...
[jira] [Commented] (LUCENE-3987) Ivy/maven config to pull from sonatype releases
[ https://issues.apache.org/jira/browse/LUCENE-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254325#comment-13254325 ]

Dawid Weiss commented on LUCENE-3987:

I don't want to merge this in (note no fix version). I just filed it for reference in case somebody needs it.

Ivy/maven config to pull from sonatype releases

Key: LUCENE-3987
URL: https://issues.apache.org/jira/browse/LUCENE-3987
Project: Lucene - Java
Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
Attachments: ivy-sonatype.patch
[jira] [Commented] (SOLR-2161) BasicDistributedZkTest.testDistribSearch test failure
[ https://issues.apache.org/jira/browse/SOLR-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254416#comment-13254416 ] Dawid Weiss commented on SOLR-2161: --- This test fails very frequently. The most recent failure here: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2235/ I am putting it in a @AwaitsFix group. BasicDistributedZkTest.testDistribSearch test failure - Key: SOLR-2161 URL: https://issues.apache.org/jira/browse/SOLR-2161 Project: Solr Issue Type: Bug Components: Build Affects Versions: 4.0 Environment: Hudson Reporter: Robert Muir Fix For: 4.0 BasicDistributedZkTest.testDistribSearch failed in Hudson. Here is the stacktrace: {noformat} [junit] Testsuite: org.apache.solr.cloud.BasicDistributedZkTest [junit] Testcase: testDistribSearch(org.apache.solr.cloud.BasicDistributedZkTest): Caused an ERROR [junit] Error executing query [junit] org.apache.solr.client.solrj.SolrServerException: Error executing query [junit] at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95) [junit] at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:119) [junit] at org.apache.solr.BaseDistributedSearchTestCase.queryServer(BaseDistributedSearchTestCase.java:290) [junit] at org.apache.solr.cloud.BasicDistributedZkTest.queryServer(BasicDistributedZkTest.java:256) [junit] at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:305) [junit] at org.apache.solr.cloud.BasicDistributedZkTest.doTest(BasicDistributedZkTest.java:227) [junit] at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:562) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:795) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:768) [junit] Caused by: org.apache.solr.common.SolrException: 
org.apache.solr.client.solrj.SolrServerException: org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 failed to respond org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 failed to respond at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:318) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1325)at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 failed to respond at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483) at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.reque [junit] [junit] org.apache.solr.client.solrj.SolrServerException: org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 failed to respond org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 failed to respondat org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:318) at
[jira] [Commented] (LUCENE-3988) improve test output to be nicer to 80chars long terminals
[ https://issues.apache.org/jira/browse/LUCENE-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254422#comment-13254422 ]

Dawid Weiss commented on LUCENE-3988:

I thought about this a bit. The previous output was a mirror of surefire. After some deliberation I don't think it makes sense to present the information so verbosely (0 errors, 0 failures, etc.). How about this:

{noformat}
[junit4] Suite: TestReversedWildcardFilterFactory
[junit4] Time: 3.00s, 4 tests
[junit4]
[junit4] Suite: [...]r.update.processor.UniqFieldsUpdateProcessorFactoryTest
[junit4] Time: 3.00s, 4 tests, 1 skipped
[junit4]
[junit4] Running org.apache.solr.spelling.SpellPossibilityIteratorTest
[junit4] Time: 3.00s, 4 tests, 1 error FAILURES!
[junit4]
[junit4] Suite: org.buhu.update.processor.BlahBlag
[junit4] Time: 3.00s, 4 tests, 1 error, 2 failures, 1 skipped
{noformat}

Test name will be displayed in full or truncated (with an ellipsis) to fit into the desired number of columns (80 by default)?

improve test output to be nicer to 80chars long terminals

Key: LUCENE-3988
URL: https://issues.apache.org/jira/browse/LUCENE-3988
Project: Lucene - Java
Issue Type: Improvement
Components: general/test
Reporter: Robert Muir
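The truncation proposed in the comment above might look something like this. It is a sketch under the assumption that the name is cut on the left and prefixed with the "[...]" marker seen in the sample output; NameTruncation is a hypothetical class, not the reporter's actual code.

```java
/** Hypothetical helper, for illustration only. */
final class NameTruncation {
  /** Marker copied from the sample output in the comment. */
  static final String ELLIPSIS = "[...]";

  private NameTruncation() {}

  /** Returns the name unchanged if it fits, else its tail prefixed with the marker. */
  static String fit(String name, int maxColumns) {
    if (maxColumns <= ELLIPSIS.length()) {
      throw new IllegalArgumentException("maxColumns must exceed " + ELLIPSIS.length());
    }
    if (name.length() <= maxColumns) {
      return name;
    }
    // Keep the rightmost characters: the simple class name is the most informative part.
    int keep = maxColumns - ELLIPSIS.length();
    return ELLIPSIS + name.substring(name.length() - keep);
  }
}
```

Truncating from the left keeps the simple class name visible, which matches the sample line for UniqFieldsUpdateProcessorFactoryTest. The column budget passed in would need to account for the "[junit4] Suite: " prefix.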
[jira] [Commented] (LUCENE-3971) MappingCharFilter rarely has wrong correctOffset (for finalOffset)
[ https://issues.apache.org/jira/browse/LUCENE-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254017#comment-13254017 ]

Dawid Weiss commented on LUCENE-3971:

Passes for me with multiple runs. I'll commit it in.

MappingCharFilter rarely has wrong correctOffset (for finalOffset)

Key: LUCENE-3971
URL: https://issues.apache.org/jira/browse/LUCENE-3971
Project: Lucene - Java
Issue Type: Bug
Components: modules/analysis
Reporter: Robert Muir
Fix For: 4.0
Attachments: LUCENE-3971.patch, LUCENE-3971_test.patch

Found this bug over on LUCENE-3969, but I'm currently tracking a ton of bugs, so I figure I would open an issue and see if this one is obvious to anyone:

Consider this input string: gzw f quaxot (length = 12) with a WhitespaceTokenizer.

If i have mapping rules like this, then it works!:

{noformat}
t =
{noformat}

But if I have mapping rules like this:

{noformat}
t =
tmakdbl = c
{noformat}

Then it will compute final offset wrong:

{noformat}
[junit] junit.framework.AssertionFailedError: finalOffset expected:<12> but was:<11>
{noformat}

Looks like some logic/recursion bug in the correctOffset method? The second rule is not even used for this string, it just happens to also start with 't'
[jira] [Commented] (LUCENE-3808) Switch LuceneTestCaseRunner to RandomizedRunner. Enforce Random sharing contracts. Enforce thread leaks.
[ https://issues.apache.org/jira/browse/LUCENE-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254054#comment-13254054 ]

Dawid Weiss commented on LUCENE-3808:

I'm planning to merge github branched code into trunk this weekend. It's been running in parallel for some time now on my build server and it seems to have the same failure coverage and at the same time is a start to clean up LuceneTestCase and associated test code.

I hope you'll also like the new infrastructure -- will elaborate about this a bit once merged.

Switch LuceneTestCaseRunner to RandomizedRunner. Enforce Random sharing contracts. Enforce thread leaks.

Key: LUCENE-3808
URL: https://issues.apache.org/jira/browse/LUCENE-3808
Project: Lucene - Java
Issue Type: Sub-task
Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
Fix For: 4.0

Dev. branch at: https://github.com/dweiss/lucene_solr/tree/rr

Switch the runner to RandomizedRunner. Enforce the following:
- (/) Random sharing will result in a failure/ exception.
- (/) -Add a validator for testXXX without @Test annotation.- (custom test provider added).
- (/) Make sure tests are executed with assertions enabled (at least for solr/lucene packages).
- (/) Add a validator for static hook shadowing (no-no).
- (/) Modify custom execution groups in LTC to be real @Groups.
- Thread leaks will result in a failure (add lingering if needed, but no ignores). [this is done, but disabled]
- Add a validator for @Test method overrides (check how many of these we already have first).
- What to do with thread-shared Random instances copies in MockIndexWriter and MockAnalyzer?
[jira] [Commented] (LUCENE-3984) Add a target to recalculate SHA1 checksums for JAR
[ https://issues.apache.org/jira/browse/LUCENE-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254170#comment-13254170 ]

Dawid Weiss commented on LUCENE-3984:

Can I commit this in as a top-level target? It shouldn't matter for svn/git files that don't change (their timestamps will but contents will not) and it helps folks on Windows who can't use Hoss's magic bash pipe (doesn't this sound wrong somehow?).

Add a target to recalculate SHA1 checksums for JAR

Key: LUCENE-3984
URL: https://issues.apache.org/jira/browse/LUCENE-3984
Project: Lucene - Java
Issue Type: Task
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
Fix For: 4.0

Something like this. Either top-level or common-build.xml?

{noformat}
<target name="refresh-checksums">
  <checksum algorithm="SHA1">
    <fileset dir="${basedir}">
      <include name="**/*.jar"/>
    </fileset>
  </checksum>
</target>
{noformat}
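For readers unfamiliar with what such a target writes next to each JAR, here is a rough Java sketch of computing a hex-encoded SHA-1 digest with the JDK's MessageDigest. Sha1Hex is a hypothetical helper for illustration; it is not how Ant implements its checksum task, and the exact contents of the generated checksum files are not shown in the message.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper, for illustration only. */
final class Sha1Hex {
  private Sha1Hex() {}

  /** Lower-case hex SHA-1 digest of the given bytes. */
  static String sha1Hex(byte[] data) {
    try {
      byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        // Mask to an unsigned value so negative bytes print as two hex digits.
        hex.append(String.format("%02x", b & 0xff));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-1 is one of the algorithms every Java platform must provide.
      throw new IllegalStateException(e);
    }
  }

  /** Digest of a whole file; reads it into memory, which is fine for typical JARs. */
  static String sha1Hex(Path file) throws IOException {
    return sha1Hex(Files.readAllBytes(file));
  }
}
```

Because the digest depends only on file contents, re-running the computation on an unchanged JAR yields the same value, which is why only timestamps of the checksum files would change.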
[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs
[ https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253180#comment-13253180 ]

Dawid Weiss commented on LUCENE-3973:

bq. Unless you run into the same taskdef/classloader/sub-build/permgen-OOM

I was just saying to fetch them via ivy and then spawn a separate jvm to run them, much like you'd do anyway if they are separate installations.

Besides -- we already have an 'ivy warning with instructions', the same can be done with permgen/OOM problems -- detect the current (ANT's) VM's settings (can be done via mx bean) and warn/ fail the build if the defaults are too low, instructing the user to set up ANT_OPTS properly...

I'm not pressing on this, this is a non-issue.

Incorporate PMD / FindBugs

Key: LUCENE-3973
URL: https://issues.apache.org/jira/browse/LUCENE-3973
Project: Lucene - Java
Issue Type: Improvement
Components: general/build
Reporter: Chris Male

This has been touched on a few times over the years. Having static analysis as part of our build seems like a big win. For example, we could use PMD to look at {{System.out.println}} statements like discussed in LUCENE-3877 and we could possibly incorporate the nocommit / @author checks as well.

There are a few things to work out as part of this:
- Should we use both PMD and FindBugs or just one of them? They look at code from different perspectives (bytecode vs source code) and target different issues. At the moment I'm in favour of trying both but that might be too heavy handed for our needs.
- What checks should we use? There's no point having the analysis if it's going to raise too many false-positives or problems we don't deem problematic.
- How should the analysis be integrated in our build? Need to work out when the analysis should run, how it should be incorporated in Ant and/or Maven, what impact errors should have.
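The "detect via mx bean" idea from the comment above could be sketched with the standard java.lang.management API as below. MemoryPoolCheck is a hypothetical class, and the pool name fragment to look for (for example "Perm Gen" on older HotSpot JVMs) is an assumption; pool names differ between JVM vendors and versions, and how such a check would be wired into the Ant build is not specified in the message.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper, for illustration only. */
final class MemoryPoolCheck {
  private MemoryPoolCheck() {}

  /**
   * Maximum size in bytes of the first memory pool whose name contains the
   * fragment, or -1 if there is no such pool or its maximum is undefined.
   */
  static long maxBytes(String nameFragment) {
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      if (pool.getName().contains(nameFragment)) {
        MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
        return usage == null ? -1L : usage.getMax();
      }
    }
    return -1L;
  }

  /** True only when the limit is known and below the required size. */
  static boolean isTooLow(long maxBytes, long requiredBytes) {
    return maxBytes >= 0 && maxBytes < requiredBytes;
  }
}
```

To report on Ant's own settings, the check would have to run inside the Ant JVM rather than in a forked one; when isTooLow returns true the build could print instructions for setting ANT_OPTS, in the same spirit as the existing ivy warning.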
[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs
[ https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252245#comment-13252245 ]

Dawid Weiss commented on LUCENE-3973:

There is also this interesting tool: http://babelfish.arc.nasa.gov/trac/jpf

I haven't used it and I don't know if it can handle a codebase the size of Lucene (the number of execution paths will be astronomical), but if somebody has some time to play with it, it'd be interesting to hear what it can do.
[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs
[ https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252244#comment-13252244 ]

Dawid Weiss commented on LUCENE-3973:

Both are helpful. We use both and I think FindBugs is slightly more useful than PMD, but that's just a subjective opinion, not anything I measured. Also, both can be verbose and a pain in the ass at times when you know the code is right and they still complain... And they take a long time to run, so I think they should be part of the Jenkins nightly/smoke tests, not regular builds (and definitely not ant test...).
[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations
[ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252463#comment-13252463 ]

Dawid Weiss commented on LUCENE-3972:

Yes, sorry -- hash of course. The hash method that should redistribute the key space into buckets (but currently doesn't). As for BytesRefHash vs. BytesRef instances -- maybe it's the source of the speedup, who knows. I would try the hash method though, if nothing else just out of curiosity. I would also patch it for the future in either case. Not rehashing input keys is a flaw in my opinion (again -- backed by real-life experience from HPPC).

Improve AllGroupsCollector implementations
Key: LUCENE-3972
URL: https://issues.apache.org/jira/browse/LUCENE-3972
Project: Lucene - Java
Issue Type: Improvement
Components: modules/grouping
Reporter: Martijn van Groningen
Attachments: LUCENE-3972.patch, LUCENE-3972.patch

I think that the performance of TermAllGroupsCollector, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.
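To make the rehashing point in the comment above concrete, here is a small sketch of why a table that masks keys directly into buckets can degrade when the keys share a pattern. It is not taken from Lucene or HPPC: the class is hypothetical, and the mixing step is the MurmurHash3 32-bit finalizer, picked as one well-known bit mixer rather than as the function either library uses.

```java
import java.util.HashSet;
import java.util.Set;

final class HashMixSketch {
    /** MurmurHash3 32-bit finalizer: spreads differences in any input bit over the output bits. */
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Counts the distinct buckets occupied by the keys 0, stride, 2*stride, ... */
    static int bucketsUsed(int keyCount, int stride, int bucketCount, boolean rehash) {
        Set<Integer> used = new HashSet<>();
        for (int i = 0; i < keyCount; i++) {
            int key = i * stride;
            int h = rehash ? mix(key) : key;
            used.add(h & (bucketCount - 1)); // bucketCount must be a power of two
        }
        return used.size();
    }

    public static void main(String[] args) {
        // Keys that are all multiples of the table size land in one bucket without mixing.
        System.out.println("no rehash: " + bucketsUsed(64, 16, 16, false) + " bucket(s) used");
        System.out.println("rehash:    " + bucketsUsed(64, 16, 16, true) + " bucket(s) used");
    }
}
```

The flip side, raised later in this thread, is that the extra mixing costs a few instructions per lookup, so it only pays off when the incoming keys are not already well distributed.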
[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations
[ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252486#comment-13252486 ]

Dawid Weiss commented on LUCENE-3972:

Hmmm... it's not collisions then, it was worth a try. I still find the difference puzzling -- I can't justify your version being 3x faster. Curious what it might be.

bq. But we know a lot about docids, and extra hashing should just lead to an average-case slowdown.

Ok.
[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs
[ https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252758#comment-13252758 ]

Dawid Weiss commented on LUCENE-3973:

I believe both PMD and FindBugs are in Maven repositories, so one could use Ivy to fetch them automatically. One thing less to think about.
[jira] [Commented] (LUCENE-3971) MappingCharFilter rarely has wrong correctOffset (for finalOffset)
[ https://issues.apache.org/jira/browse/LUCENE-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251633#comment-13251633 ]

Dawid Weiss commented on LUCENE-3971:

I think this bug is similar (if not identical) to what I fixed a while ago in PatternReplaceCharFilter -- I remember it suffered from an off-by-one error as well, and looking at the code it may be similar in structure (linked list and all). There is also the question of how this filter _should_ work -- should it be greedy or reluctant (match the first pattern or the longest pattern)?

MappingCharFilter rarely has wrong correctOffset (for finalOffset)
Key: LUCENE-3971
URL: https://issues.apache.org/jira/browse/LUCENE-3971
Project: Lucene - Java
Issue Type: Bug
Components: modules/analysis
Reporter: Robert Muir
Fix For: 4.0
Attachments: LUCENE-3971_test.patch

Found this bug over on LUCENE-3969, but I'm currently tracking a ton of bugs, so I figure I would open an issue and see if this one is obvious to anyone:

Consider this input string: gzw f quaxot (length = 12) with a WhitespaceTokenizer. If I have mapping rules like this, then it works!:
{noformat}
t =
{noformat}
But if I have mapping rules like this:
{noformat}
t =
tmakdbl = c
{noformat}
Then it will compute final offset wrong:
{noformat}
[junit] junit.framework.AssertionFailedError: finalOffset expected:12 but was:11
{noformat}
Looks like some logic/recursion bug in the correctOffset method? The second rule is not even used for this string, it just happens to also start with 't'.
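The greedy-versus-reluctant question in the comment above can be illustrated with a toy rule applier. This is a sketch under assumptions, not MappingCharFilter's code: the class and the two rules are hypothetical, "reluctant" is read here as "shortest matching key wins", and offset correction is left out entirely.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MappingOrderSketch {
    /** Scans left to right; at each position applies the longest or the shortest matching key. */
    static String apply(String input, Map<String, String> rules, boolean longest) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < input.length()) {
            String best = null;
            for (String key : rules.keySet()) {
                if (key.isEmpty() || !input.startsWith(key, pos)) {
                    continue;
                }
                if (best == null
                        || (longest && key.length() > best.length())
                        || (!longest && key.length() < best.length())) {
                    best = key;
                }
            }
            if (best == null) {
                out.append(input.charAt(pos++)); // no rule applies, copy the character
            } else {
                out.append(rules.get(best));
                pos += best.length();
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("a", "1");   // hypothetical rules sharing a prefix
        rules.put("ab", "2");
        System.out.println(apply("xabx", rules, true));  // prints x2x
        System.out.println(apply("xabx", rules, false)); // prints x1bx
    }
}
```

Two rules that share a prefix give different output under the two policies, so the expected offsets depend on which policy the filter is meant to follow.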
[jira] [Commented] (SOLR-3335) testDistribSearch failure
[ https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250518#comment-13250518 ]

Dawid Weiss commented on SOLR-3335:

@Yonik: I run trunk tests in non-nightly mode and I see at least 1-2 failures a day (the job runs every two hours). This does change over time though, as I merge with new commits. Some tests are frequent offenders, like the latest one:

{noformat}
build 10-Apr-2012 00:25:25[junit] Testsuite: org.apache.solr.cloud.OverseerTest
build 10-Apr-2012 00:25:25[junit] Testcase: testShardLeaderChange(org.apache.solr.cloud.OverseerTest):FAILED
build 10-Apr-2012 00:25:25[junit] Unexpected shard leader coll:collection1 shard:shard1 expected:core4 but was:null
build 10-Apr-2012 00:25:25[junit] junit.framework.AssertionFailedError: Unexpected shard leader coll:collection1 shard:shard1 expected:core4 but was:null
build 10-Apr-2012 00:25:25[junit] at org.junit.Assert.fail(Assert.java:93)
build 10-Apr-2012 00:25:25[junit] at org.junit.Assert.failNotEquals(Assert.java:647)
build 10-Apr-2012 00:25:25[junit] at org.junit.Assert.assertEquals(Assert.java:128)
build 10-Apr-2012 00:25:25[junit] at org.apache.solr.cloud.OverseerTest.verifyShardLeader(OverseerTest.java:549)
build 10-Apr-2012 00:25:25[junit] at org.apache.solr.cloud.OverseerTest.testShardLeaderChange(OverseerTest.java:711)
build 10-Apr-2012 00:25:25[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
build 10-Apr-2012 00:25:25[junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
build 10-Apr-2012 00:25:25[junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
build 10-Apr-2012 00:25:25[junit] at java.lang.reflect.Method.invoke(Method.java:597)
build 10-Apr-2012 00:25:25[junit] at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
build 10-Apr-2012 00:25:25[junit] at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
build 10-Apr-2012 00:25:25[junit] at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
build 10-Apr-2012 00:25:25[junit] at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
build 10-Apr-2012 00:25:25[junit] at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
build 10-Apr-2012 00:25:25[junit] at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
build 10-Apr-2012 00:25:25[junit] at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
build 10-Apr-2012 00:25:25[junit] at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754)
build 10-Apr-2012 00:25:25[junit] at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670)
build 10-Apr-2012 00:25:25[junit] at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
build 10-Apr-2012 00:25:25[junit] at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:591)
build 10-Apr-2012 00:25:25[junit] at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
build 10-Apr-2012 00:25:25[junit] at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642)
build 10-Apr-2012 00:25:25[junit] at org.junit.rules.RunRules.evaluate(RunRules.java:18)
build 10-Apr-2012 00:25:25[junit] at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
build 10-Apr-2012 00:25:25[junit] at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
build 10-Apr-2012 00:25:25[junit] at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
build 10-Apr-2012 00:25:25[junit] at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
build 10-Apr-2012 00:25:25[junit] at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
build 10-Apr-2012 00:25:25[junit] at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
build 10-Apr-2012 00:25:25[junit] at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
build 10-Apr-2012 00:25:25[junit] at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
build 10-Apr-2012
[jira] [Commented] (SOLR-3237) OverseerTest failure (non-reproducible)
[ https://issues.apache.org/jira/browse/SOLR-3237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250564#comment-13250564 ]

Dawid Weiss commented on SOLR-3237:

I have more if you need logs, Sami. Thanks for taking care of this one!

OverseerTest failure (non-reproducible)
Key: SOLR-3237
URL: https://issues.apache.org/jira/browse/SOLR-3237
Project: Solr
Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Sami Siren
Priority: Minor
Fix For: 4.0

Nightly log harvest. Couldn't reproduce, unfortunately.

{noformat}
build 13-Mar-2012 06:08:43[junit] Testsuite: org.apache.solr.cloud.OverseerTest
build 13-Mar-2012 06:08:43[junit] Testcase: testShardLeaderChange(org.apache.solr.cloud.OverseerTest):FAILED
build 13-Mar-2012 06:08:43[junit] Unexpected shard leader coll:collection1 shard:shard1 expected:core4 but was:null
build 13-Mar-2012 06:08:43[junit] junit.framework.AssertionFailedError: Unexpected shard leader coll:collection1 shard:shard1 expected:core4 but was:null
build 13-Mar-2012 06:08:43[junit] at org.apache.solr.cloud.OverseerTest.verifyShardLeader(OverseerTest.java:549)
build 13-Mar-2012 06:08:43[junit] at org.apache.solr.cloud.OverseerTest.testShardLeaderChange(OverseerTest.java:711)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:20)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:729)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:645)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:556)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:20)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21)
build 13-Mar-2012 06:08:43[junit] at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
build 13-Mar-2012 06:08:43[junit]
build 13-Mar-2012 06:08:43[junit]
build 13-Mar-2012 06:08:43[junit] Tests run: 7, Failures: 1, Errors: 0, Time elapsed: 74.666 sec
build 13-Mar-2012 06:08:43[junit]
build 13-Mar-2012 06:08:43[junit] - Standard Error -
build 13-Mar-2012 06:08:43[junit] NOTE: reproduce with: ant test -Dtestcase=OverseerTest -Dtestmethod=testShardLeaderChange -Dtests.seed=48c9960216b3d5d:6c1600de0df53cdd:69c37083161d807d -Dargs=-Dfile.encoding=UTF-8
build 13-Mar-2012 06:08:43[junit] WARNING: test class left thread running: Session Sets (4):
build 13-Mar-2012 06:08:43[junit] 0 expire at Mon Mar 12 22:08:45 MST 2012:
build 13-Mar-2012 06:08:43[junit] 0 expire at Mon Mar 12 22:08:48 MST 2012:
build 13-Mar-2012 06:08:43[junit] 0 expire at Mon Mar 12 22:08:51 MST 2012:
build 13-Mar-2012 06:08:43[junit] 0 expire at Mon Mar 12 22:08:54 MST 2012:
build 13-Mar-2012 06:08:43[junit]
build 13-Mar-2012 06:08:43[junit] RESOURCE LEAK: test class left 1 thread(s) running
build 13-Mar-2012 06:08:43[junit] NOTE: test params are: codec=Lucene40: {}, sim=DefaultSimilarity, locale=zh_TW, timezone=Mexico/BajaSur
build 13-Mar-2012 06:08:43[junit] NOTE: all tests run in this JVM:
build 13-Mar-2012 06:08:43[junit] [BasicFunctionalityTest, SolrInfoMBeanTest, SnowballPorterFilterFactoryTest, TestCJKTokenizerFactory, TestCJKWidthFilterFactory,
[jira] [Commented] (SOLR-3335) testDistribSearch failure
[ https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249492#comment-13249492 ]

Dawid Weiss commented on SOLR-3335:

This is weird. I've had something like this before on the branch -- see SOLR-3233. If you go back to that particular revision it was reproducible (but no longer is with that seed). I didn't investigate further.

testDistribSearch failure
Key: SOLR-3335
URL: https://issues.apache.org/jira/browse/SOLR-3335
Project: Solr
Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
Fix For: 4.0

Happened on my test machine. Is there a way to disable these tests if we cannot fix them? There are two or three tests that fail most of the time and that apparently nobody knows how to fix (including me). There is also a typo in the error message (I'm away from home for Easter, can't do it now).

{noformat}
build 06-Apr-2012 16:11:54[junit] Testsuite: org.apache.solr.cloud.RecoveryZkTest
build 06-Apr-2012 16:11:54[junit] Testcase: testDistribSearch(org.apache.solr.cloud.RecoveryZkTest): FAILED
build 06-Apr-2012 16:11:54[junit] There are still nodes recoverying
build 06-Apr-2012 16:11:54[junit] junit.framework.AssertionFailedError: There are still nodes recoverying
build 06-Apr-2012 16:11:54[junit] at org.junit.Assert.fail(Assert.java:93)
build 06-Apr-2012 16:11:54[junit] at org.apache.solr.cloud.AbstractDistributedZkTestCase.waitForRecoveriesToFinish(AbstractDistributedZkTestCase.java:132)
build 06-Apr-2012 16:11:54[junit] at org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:84)
build 06-Apr-2012 16:11:54[junit] at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:670)
build 06-Apr-2012 16:11:54[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
build 06-Apr-2012 16:11:54[junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
build 06-Apr-2012 16:11:54[junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
build 06-Apr-2012 16:11:54[junit] at java.lang.reflect.Method.invoke(Method.java:597)
build 06-Apr-2012 16:11:54[junit] at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
build 06-Apr-2012 16:11:54[junit] at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
build 06-Apr-2012 16:11:54[junit] at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
build 06-Apr-2012 16:11:54[junit] at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
build 06-Apr-2012 16:11:54[junit] at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
build 06-Apr-2012 16:11:54[junit] at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754)
build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670)
build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:591)
build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642)
build 06-Apr-2012 16:11:54[junit] at org.junit.rules.RunRules.evaluate(RunRules.java:18)
build 06-Apr-2012 16:11:54[junit] at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
build 06-Apr-2012 16:11:54[junit] at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
build 06-Apr-2012 16:11:54[junit] at
[jira] [Commented] (SOLR-3335) testDistribSearch failure
[ https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249199#comment-13249199 ]

Dawid Weiss commented on SOLR-3335:

I'll wait a few days to give people a chance to object. If I hear nothing I will successively disable those tests that fail for me often (without much feedback).
[jira] [Commented] (SOLR-3335) testDistribSearch failure
[ https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249279#comment-13249279 ]

Dawid Weiss commented on SOLR-3335:

I couldn't reproduce it either. My test machine is an Ubuntu quad-core (i7) and it runs full Lucene builds much like Jenkins. There are a few recurring problems that I couldn't reproduce locally no matter what. This ALSO happens on the LUCENE-3808 branch, which leads me to believe the problem may stem from interaction between concurrently running JVMs, not the code itself (perhaps they're modifying each other's configs, perhaps something else). Does anything come to mind?
[jira] [Commented] (SOLR-3335) testDistribSearch failure
[ https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249290#comment-13249290 ] Dawid Weiss commented on SOLR-3335: --- Try looping over full ant test cycles (maybe limited to solr-core only). I did this a while back in a shell loop and redirected output to files. This brought back some failures after 30 iterations or so. I can also try to see if doing the above with 1 forked jvm is any different than with 3-4 forked jvms -- this would make it clear if it's a concurrent tests conflict or not (and possibly provide a way to reproduce). Thanks for trying to clean this up -- it's been bugging me for a while now. testDistribSearch failure - Key: SOLR-3335 URL: https://issues.apache.org/jira/browse/SOLR-3335 Project: Solr Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Trivial Fix For: 4.0 Happened on my test machine. Is there a way to disable these tests if we cannot fix them? There are two or three tests that fail most of the time and that apparently nobody knows how to fix (including me). There is also a typo in the error message (I'm away from home for Easter, can't do it now). 
{noformat} build 06-Apr-2012 16:11:54[junit] Testsuite: org.apache.solr.cloud.RecoveryZkTest build 06-Apr-2012 16:11:54[junit] Testcase: testDistribSearch(org.apache.solr.cloud.RecoveryZkTest): FAILED build 06-Apr-2012 16:11:54[junit] There are still nodes recoverying build 06-Apr-2012 16:11:54[junit] junit.framework.AssertionFailedError: There are still nodes recoverying build 06-Apr-2012 16:11:54[junit] at org.junit.Assert.fail(Assert.java:93) build 06-Apr-2012 16:11:54[junit] at org.apache.solr.cloud.AbstractDistributedZkTestCase.waitForRecoveriesToFinish(AbstractDistributedZkTestCase.java:132) build 06-Apr-2012 16:11:54[junit] at org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:84) build 06-Apr-2012 16:11:54[junit] at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:670) build 06-Apr-2012 16:11:54[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) build 06-Apr-2012 16:11:54[junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) build 06-Apr-2012 16:11:54[junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) build 06-Apr-2012 16:11:54[junit] at java.lang.reflect.Method.invoke(Method.java:597) build 06-Apr-2012 16:11:54[junit] at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) build 06-Apr-2012 16:11:54[junit] at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) build 06-Apr-2012 16:11:54[junit] at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) build 06-Apr-2012 16:11:54[junit] at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) build 06-Apr-2012 16:11:54[junit] at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) build 06-Apr-2012 16:11:54[junit] at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) build 06-Apr-2012 
16:11:54[junit] at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63) build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754) build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670) build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69) build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:591) build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75) build 06-Apr-2012 16:11:54[junit] at org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642) build 06-Apr-2012 16:11:54[junit] at org.junit.rules.RunRules.evaluate(RunRules.java:18) build 06-Apr-2012 16:11:54[junit] at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) build 06-Apr-2012 16:11:54[junit] at
[jira] [Commented] (SOLR-3328) executable bits of shellscripts in solr source release
[ https://issues.apache.org/jira/browse/SOLR-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13248138#comment-13248138 ] Dawid Weiss commented on SOLR-3328: --- http://ant.apache.org/manual/Tasks/zip.html bq. Starting with Ant 1.5.2, zip can store Unix permissions inside the archive (see description of the filemode and dirmode attributes for zipfileset). Unfortunately there is no portable way to store these permissions. Ant uses the algorithm used by Info-Zip's implementation of the zip and unzip commands - these are the default versions of zip and unzip for many Unix and Unix-like systems. I remember we used to ZIP with unix permissions and they unzipped just fine (with permission sets). executable bits of shellscripts in solr source release -- Key: SOLR-3328 URL: https://issues.apache.org/jira/browse/SOLR-3328 Project: Solr Issue Type: Improvement Components: Build Reporter: Robert Muir Fix For: 4.0 HossmanSays: in the solr src releases, some shell scripts are not executable by default. I don't know if we can improve this? Maybe it's an svn prop? Maybe something needs to be specified to the tar/zip process? Currently the 'source release' is really an svn export... Personally I always do 'sh foo.sh' rather than './foo.sh', but if it makes it more user-friendly we should figure it out. Just opening the issue so we don't forget about it; I think solr cloud adds some more shell scripts so we should at least figure out what we want to do.
[jira] [Commented] (LUCENE-3950) load rat via ivy for rat-sources task
[ https://issues.apache.org/jira/browse/LUCENE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13246323#comment-13246323 ] Dawid Weiss commented on LUCENE-3950: - +1. I think this, license checks, CRLFs and other non-code things should be part of an integration test target. That way, if you want to actually test code you can apply a filter and have a quick turnaround, and you can fire the full integration tests before the commit etc. load rat via ivy for rat-sources task - Key: LUCENE-3950 URL: https://issues.apache.org/jira/browse/LUCENE-3950 Project: Lucene - Java Issue Type: Improvement Components: general/build Reporter: Robert Muir We now fail the build on rat problems (LUCENE-1866), so we should make it easy to run rat-sources for people to test locally (it takes like 3 seconds total for the whole trunk). Also this is safer than putting rat in your ~/.ant/lib because that adds some classes from commons to your ant classpath (which we currently wrongly use in compile).
[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve
[ https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245139#comment-13245139 ] Dawid Weiss commented on LUCENE-3943: - This will require moving license checks until after the distribution is assembled, but it's a good idea. It's much like with Maven, where things get stored once and IDEs and the build system reuse the same artifacts. Use ivy cachepath and cachefileset instead of ivy retrieve -- Key: LUCENE-3943 URL: https://issues.apache.org/jira/browse/LUCENE-3943 Project: Lucene - Java Issue Type: Improvement Components: general/build Reporter: Chris Male In LUCENE-3930 we moved to resolving all external dependencies using ivy:retrieve. This process places the dependencies into the lib/ folder of the respective modules which was ideal since it replicated the existing build process and limited the number of changes to be made to the build. However it can lead to multiple jars for the same dependency in the lib folder when the dependency is upgraded, and just isn't the most efficient way to use Ivy. Uwe pointed out that we can remove the ivy:retrieve calls and make use of ivy:cachepath and ivy:cachefileset to build our classpaths and packages respectively, which will go some way to addressing these limitations
[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve
[ https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245147#comment-13245147 ] Dawid Weiss commented on LUCENE-3943: - Yep, sure. Use ivy cachepath and cachefileset instead of ivy retrieve -- Key: LUCENE-3943 URL: https://issues.apache.org/jira/browse/LUCENE-3943 Project: Lucene - Java Issue Type: Improvement Components: general/build Reporter: Chris Male In LUCENE-3930 we moved to resolving all external dependencies using ivy:retrieve. This process places the dependencies into the lib/ folder of the respective modules which was ideal since it replicated the existing build process and limited the number of changes to be made to the build. However it can lead to multiple jars for the same dependency in the lib folder when the dependency is upgraded, and just isn't the most efficient way to use Ivy. Uwe pointed out that we can remove the ivy:retrieve calls and make use of ivy:cachepath and ivy:cachefileset to build our classpaths and packages respectively, which will go some way to addressing these limitations
[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve
[ https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245551#comment-13245551 ] Dawid Weiss commented on LUCENE-3943: - bq. In my opinion, the ideal situation would be that we pass these filesets directly to the zip/tar/gz whatever in the binary release targets +1. Use ivy cachepath and cachefileset instead of ivy retrieve -- Key: LUCENE-3943 URL: https://issues.apache.org/jira/browse/LUCENE-3943 Project: Lucene - Java Issue Type: Improvement Components: general/build Reporter: Chris Male In LUCENE-3930 we moved to resolving all external dependencies using ivy:retrieve. This process places the dependencies into the lib/ folder of the respective modules which was ideal since it replicated the existing build process and limited the number of changes to be made to the build. However it can lead to multiple jars for the same dependency in the lib folder when the dependency is upgraded, and just isn't the most efficient way to use Ivy. Uwe pointed out that _when working from svn or in using src releases_ we can remove the ivy:retrieve calls and make use of ivy:cachepath and ivy:cachefileset to build our classpaths and packages respectively, which will go some way to addressing these limitations -- however we still need the build system capable of putting the actual jars into specific lib folders when assembling the binary artifacts
[jira] [Commented] (LUCENE-3944) ant clean should remove pom.xml's
[ https://issues.apache.org/jira/browse/LUCENE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245554#comment-13245554 ] Dawid Weiss commented on LUCENE-3944: - bq. I think Maven Ant Tasks' deploy target needs to be able to access the parent and grandparent POMs, which (I think) means either putting them into the user's local maven repository, or putting them at the relative location given in the parent POM section of each POM. I just recently peeked at Apache ANT's source distribution and this seems to be done this way (separate folder structure just for POMs with relative refs). ant clean should remove pom.xml's - Key: LUCENE-3944 URL: https://issues.apache.org/jira/browse/LUCENE-3944 Project: Lucene - Java Issue Type: Improvement Components: general/build Reporter: Chris Male Priority: Blocker Fix For: 3.6, 4.0 Attachments: LUCENE-3944.patch, LUCENE-3944.patch Currently once the pom.xml's are in place, it's hard to get them out. Having them can be a little trappy when you're trying to debug the bug. We should facilitate their removal during clean.
[jira] [Commented] (LUCENE-3945) we should include checksums for every jar ivy fetches in svn src releases to verify the jars are the ones we expect
[ https://issues.apache.org/jira/browse/LUCENE-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245632#comment-13245632 ] Dawid Weiss commented on LUCENE-3945: - {noformat} reader = new BufferedReader(new FileReader(f)); {noformat} Isn't this locale-sensitive? I think it should be explicit UTF-8 (or US-ASCII for that matter). {noformat} + String hexStr = Integer.toHexString(CHECKSUM_BYTE_MASK & digest[i]); + if (hexStr.length() < 2) { +checksum.append("0"); + } + checksum.append(hexStr); {noformat} Isn't any of these simpler? {noformat} checksum.append(String.format(Locale.ENGLISH, "%02x", CHECKSUM_BYTE_MASK & digest[i])); {noformat} or {noformat} char [] HEX = "0123456789abcdef".toCharArray(); int v = digest[i]; checksum.append(HEX[(v >> 4) & 0x0F]).append(HEX[v & 0x0F]); {noformat} we should include checksums for every jar ivy fetches in svn src releases to verify the jars are the ones we expect - Key: LUCENE-3945 URL: https://issues.apache.org/jira/browse/LUCENE-3945 Project: Lucene - Java Issue Type: Task Reporter: Hoss Man Fix For: 3.6, 4.0 Attachments: LUCENE-3945.patch Conversation with rmuir last night got me thinking about the fact that one thing we lose by using ivy is confidence that every user of a release is compiling against (and likely using at run time) the same dependencies as every other user. Up to 3.5, users of src and binary releases could be confident that the jars included in the release were the same jars the lucene devs vetted and tested against when voting on the release candidate, but with ivy there is now the possibility that after the source release is published, the owner of a domain where these dependencies are hosted might change the jars in some way w/o anyone knowing. 
Likewise: we as developers could commit an ivy.xml file pointing to a specific URL which we then use and test for months, and just prior to a release, the contents of the remote URL could change such that a JAR included in the binary artifacts might not match the ones we've vetted and tested leading up to that RC. So I propose that we include checksum files in svn and in our source releases that can be used by users to verify that the jars they get from ivy match the jars we tested against.
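The two review points in the comment above (read the checksum file with an explicit charset, and hex-encode the digest with a lookup table) can be sketched roughly as follows. This is only an illustration and not the code from LUCENE-3945.patch: the class name ChecksumHexSketch and its method names are hypothetical, SHA-1 is an assumed digest algorithm, and StandardCharsets and try-with-resources assume a newer JDK than the one the 3.x branch targeted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the actual patch.
final class ChecksumHexSketch {
  private static final char[] HEX = "0123456789abcdef".toCharArray();

  /** Hex-encodes bytes with a lookup table; no locale or formatter is involved. */
  static String toHex(byte[] digest) {
    StringBuilder checksum = new StringBuilder(digest.length * 2);
    for (byte b : digest) {
      // The byte is sign-extended to int, so mask both nibbles to 4 bits.
      checksum.append(HEX[(b >> 4) & 0x0F]).append(HEX[b & 0x0F]);
    }
    return checksum.toString();
  }

  /** SHA-1 of the text's UTF-8 bytes, hex-encoded (the algorithm choice is an assumption). */
  static String sha1Hex(String text) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-1");
      return toHex(md.digest(text.getBytes(StandardCharsets.UTF_8)));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-1 not available", e);
    }
  }

  /** Reads the first line of a checksum file with an explicit charset instead of the platform default. */
  static String readFirstLine(Path file) throws IOException {
    try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      return reader.readLine();
    }
  }

  public static void main(String[] args) {
    System.out.println(toHex(new byte[] {0x00, 0x0f, (byte) 0xff})); // prints 000fff
    System.out.println(sha1Hex("abc"));
  }
}
```

The table lookup avoids creating a formatter or an intermediate string per byte, which is the simplification the comment seems to be after; the String.format variant is shorter to write but does more work per byte.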
[jira] [Commented] (LUCENE-3945) we should include checksums for every jar ivy fetches in svn src releases to verify the jars are the ones we expect
[ https://issues.apache.org/jira/browse/LUCENE-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245634#comment-13245634 ] Dawid Weiss commented on LUCENE-3945: - Btw. you can also avoid a recrawl by passing a refid of the same fileset to two tasks rather than constructing a new one in each. I don't mind renaming the class either. we should include checksums for every jar ivy fetches in svn src releases to verify the jars are the ones we expect - Key: LUCENE-3945 URL: https://issues.apache.org/jira/browse/LUCENE-3945 Project: Lucene - Java Issue Type: Task Reporter: Hoss Man Fix For: 3.6, 4.0 Attachments: LUCENE-3945.patch Conversation with rmuir last night got me thinking about the fact that one thing we lose by using ivy is confidence that every user of a release is compiling against (and likely using at run time) the same dependencies as every other user. Up to 3.5, users of src and binary releases could be confident that the jars included in the release were the same jars the lucene devs vetted and tested against when voting on the release candidate, but with ivy there is now the possibility that after the source release is published, the owner of a domain where these dependencies are hosted might change the jars in some way w/o anyone knowing. Likewise: we as developers could commit an ivy.xml file pointing to a specific URL which we then use and test for months, and just prior to a release, the contents of the remote URL could change such that a JAR included in the binary artifacts might not match the ones we've vetted and tested leading up to that RC. So I propose that we include checksum files in svn and in our source releases that can be used by users to verify that the jars they get from ivy match the jars we tested against.
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244008#comment-13244008 ] Dawid Weiss commented on LUCENE-3930: - Looks good to me, Chris. Two minor things: 1) sourceDirectory and testSourceDirectory look like default values anyway? 2) there is a newer version of jsonic in maven repositories; don't know if this matters at all. nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-skip-sources-javadoc.patch, LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930__ivy_bootstrap_target.patch, LUCENE-3930_includetestlibs_excludeexamplexml.patch, ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, patch-jetty-build.patch, pom.xml As mentioned on the ML thread: switch jars to ivy mechanism?.
[jira] [Commented] (LUCENE-3774) check-legal isn't doing its job
[ https://issues.apache.org/jira/browse/LUCENE-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243162#comment-13243162 ] Dawid Weiss commented on LUCENE-3774: - I'm for pushing it to the top level. This will simplify handling of exceptional patterns and such too. Shouldn't be much of a problem to move it too. check-legal isn't doing its job --- Key: LUCENE-3774 URL: https://issues.apache.org/jira/browse/LUCENE-3774 Project: Lucene - Java Issue Type: Improvement Components: general/build Affects Versions: 3.6, 4.0 Reporter: Steven Rowe Assignee: Dawid Weiss Fix For: 3.6, 4.0 Attachments: LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, LUCENE3774.patch, backport.patch In trunk, the {{check-legal-lucene}} ant target is not checking any {{lucene/contrib/\*\*/lib/}} directories; the {{modules/**/lib/}} directories are not being checked; and {{check-legal-solr}} can't be checking {{solr/example/lib/\*\*/\*.jar}}, because there are currently {{.jar}} files in there that don't have a license. These targets are set up to take in a full list of {{lib/}} directories in which to check, but modules move around, and these lists are not being kept up-to-date. Instead, {{check-legal-\*}} should run for each module, if the module has a {{lib/}} directory, and it should be specialized for modules that have more than one ({{solr/core/}}) or that have a {{lib/}} directory in a non-standard place ({{lucene/core/}}).
[jira] [Commented] (SOLR-3296) Explore alternatives to Commons CSV
[ https://issues.apache.org/jira/browse/SOLR-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242104#comment-13242104 ] Dawid Weiss commented on SOLR-3296: --- BSD or ASL2 -- either is fine with another ASL2 project. Explore alternatives to Commons CSV --- Key: SOLR-3296 URL: https://issues.apache.org/jira/browse/SOLR-3296 Project: Solr Issue Type: Improvement Components: Build Reporter: Chris Male In LUCENE-3930 we're implementing some less than ideal solutions to make available the unreleased version of commons-csv. We could remove these solutions if we didn't rely on this lib. So I think we should explore alternatives. I think [opencsv|http://opencsv.sourceforge.net/] is an alternative to consider, I've used it in many commercial projects. Bizarrely Commons-CSV's website says that Opencsv uses a BSD license, but this isn't the case, OpenCSV uses ASL2.
[jira] [Commented] (SOLR-3295) Binaries contain 1.6 classes
[ https://issues.apache.org/jira/browse/SOLR-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242106#comment-13242106 ] Dawid Weiss commented on SOLR-3295: --- bq. if all tests pass without this jar, why do we need it? It's some obscure (?) data format that tika can convert to plain text. I've never seen it, don't know what it is. Uwe filed a bug for Tika. Binaries contain 1.6 classes Key: SOLR-3295 URL: https://issues.apache.org/jira/browse/SOLR-3295 Project: Solr Issue Type: Bug Reporter: Dawid Weiss Priority: Minor Fix For: 3.6 Attachments: output.log I've run this tool (does the job): http://code.google.com/p/versioncheck/ on the checkout of branch_3x. To my surprise there is a JAR which contains Java 1.6 code: {noformat} Major.Minor Version : 50.0 JAVA compatibility : Java 1.6 platform: 45.3-50.0 Number of classes : 60 Classes are : c:\Work\lucene-solr\.\solr\contrib\extraction\lib\netcdf-4.2-min.jar [:] ucar/unidata/geoloc/Bearing.class ... {noformat}
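The "Major.Minor Version : 50.0" figure in the report above comes straight from the class-file header, so the check itself is small. The following is a minimal sketch of that idea and not the versioncheck tool's code: the class name ClassVersionSketch and its methods are hypothetical, and the caller is assumed to have already read the class bytes out of the JAR.

```java
import java.nio.ByteBuffer;

// Hypothetical helper illustrating the class-file header layout; not the versioncheck tool.
final class ClassVersionSketch {
  private static final int MAGIC = 0xCAFEBABE;

  /** Returns the major class-file version (for example 49 for Java 5, 50 for Java 6). */
  static int majorVersion(byte[] classBytes) {
    // Header layout: u4 magic, u2 minor_version, u2 major_version, all big-endian.
    if (classBytes.length < 8) {
      throw new IllegalArgumentException("too short to be a class file");
    }
    ByteBuffer buf = ByteBuffer.wrap(classBytes); // ByteBuffer is big-endian by default
    if (buf.getInt() != MAGIC) {
      throw new IllegalArgumentException("missing 0xCAFEBABE magic");
    }
    buf.getShort(); // skip minor_version
    return buf.getShort() & 0xFFFF; // major_version as an unsigned 16-bit value
  }

  /** True if the class needs a newer JVM than the given maximum major version allows. */
  static boolean exceeds(int maxMajor, byte[] classBytes) {
    return majorVersion(classBytes) > maxMajor;
  }

  public static void main(String[] args) {
    byte[] java6Header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 50};
    System.out.println(majorVersion(java6Header)); // prints 50
    System.out.println(exceeds(49, java6Header));  // prints true
  }
}
```

A build-time check along these lines would iterate over the entries of each JAR in the lib directories and flag any class whose major version exceeds whatever level the branch is meant to support; that threshold is a project decision and is not stated in the issue.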
[jira] [Commented] (SOLR-3296) Explore alternatives to Commons CSV
[ https://issues.apache.org/jira/browse/SOLR-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242111#comment-13242111 ] Dawid Weiss commented on SOLR-3296: --- I used GSON (http://code.google.com/p/google-gson/) and was happy with it. It even contains sanity checks which come in handy if you're emitting insane data... Explore alternatives to Commons CSV --- Key: SOLR-3296 URL: https://issues.apache.org/jira/browse/SOLR-3296 Project: Solr Issue Type: Improvement Components: Build Reporter: Chris Male In LUCENE-3930 we're implementing some less than ideal solutions to make available the unreleased version of commons-csv. We could remove these solutions if we didn't rely on this lib. So I think we should explore alternatives. I think [opencsv|http://opencsv.sourceforge.net/] is an alternative to consider, I've used it in many commercial projects. Bizarrely Commons-CSV's website says that Opencsv uses a BSD license, but this isn't the case, OpenCSV uses ASL2.
[jira] [Commented] (SOLR-3295) Binaries contain 1.6 classes
[ https://issues.apache.org/jira/browse/SOLR-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242121#comment-13242121 ] Dawid Weiss commented on SOLR-3295: --- Climate format data? Man... This just calls for a custom simplified parser that would read the header and forget the rest. And it'd be 5mb less to distribute... Binaries contain 1.6 classes Key: SOLR-3295 URL: https://issues.apache.org/jira/browse/SOLR-3295 Project: Solr Issue Type: Bug Reporter: Dawid Weiss Priority: Minor Fix For: 3.6 Attachments: output.log I've run this tool (does the job): http://code.google.com/p/versioncheck/ on the checkout of branch_3x. To my surprise there is a JAR which contains Java 1.6 code: {noformat} Major.Minor Version : 50.0 JAVA compatibility : Java 1.6 platform: 45.3-50.0 Number of classes : 60 Classes are : c:\Work\lucene-solr\.\solr\contrib\extraction\lib\netcdf-4.2-min.jar [:] ucar/unidata/geoloc/Bearing.class ... {noformat}
[jira] [Commented] (LUCENE-3935) Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method
[ https://issues.apache.org/jira/browse/LUCENE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242183#comment-13242183 ] Dawid Weiss commented on LUCENE-3935: - bq. I did this hastily last night and results suggested that there wasn't a lot to be gained on Mac OS X I agree it may not be noticeable because there are so many factors kicking in here (smaller structure - better CPU cache utilization vs. larger structure - potentially faster access to each value but potential cache misses). Makes sense to keep short[] in place, ignore my comment. Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method --- Key: LUCENE-3935 URL: https://issues.apache.org/jira/browse/LUCENE-3935 Project: Lucene - Java Issue Type: Improvement Components: modules/analysis Affects Versions: 3.6, 4.0 Reporter: Christian Moen Assignee: Christian Moen Attachments: LUCENE-3935.patch I've been profiling Kuromoji, and not very surprisingly, method {{ConnectionCosts.get(int forwardId, int backwardId)}} that looks up costs in the Viterbi algorithm is called many, many times and contributes to more processing time than I had expected. This method is currently backed by a {{short[][]}}. The data structure is a two-dimensional array with both dimensions fixed at 1316 elements. (The data is {{matrix.def}} in MeCab-IPADIC.) We can rewrite this to use a single one-dimensional array instead, and we will at least save one bounds check and a pointer dereference, and we should also get much better cache utilization since this structure is likely to be in very local CPU cache. I think this will be a nice optimization. Working on it...
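The rewrite proposed in the issue description can be sketched as follows. This is only an illustration of flattening a rectangular short[][] into a single short[] with row-major indexing; the class name, the constructor, and the choice of forwardId as the row index are assumptions and are not taken from LUCENE-3935.patch.

```java
/** Hypothetical flattened cost table; not the actual Kuromoji ConnectionCosts class. */
final class FlatCostsSketch {
    private final short[] costs;     // all rows stored back to back
    private final int backwardSize;  // row length, used as the stride

    /** Copies a non-empty rectangular matrix into one contiguous array. */
    FlatCostsSketch(short[][] matrix) {
        int forwardSize = matrix.length;
        this.backwardSize = matrix[0].length;
        this.costs = new short[forwardSize * backwardSize];
        for (int f = 0; f < forwardSize; f++) {
            System.arraycopy(matrix[f], 0, costs, f * backwardSize, backwardSize);
        }
    }

    /** One array access and one bounds check instead of two of each. */
    int get(int forwardId, int backwardId) {
        return costs[forwardId * backwardSize + backwardId];
    }

    public static void main(String[] args) {
        short[][] matrix = { { 1, 2, 3 }, { 4, 5, 6 } };
        FlatCostsSketch flat = new FlatCostsSketch(matrix);
        System.out.println(flat.get(1, 2)); // prints 6
    }
}
```

With 1316 x 1316 entries the flat short[] holds 1,731,856 values, roughly 3.3 MiB, so the saving comes from the removed indirection rather than from a smaller footprint.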
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242185#comment-13242185 ] Dawid Weiss commented on LUCENE-3930: - Don't we need to get rid of the binary JAR anyway? If so, the alternatives are to either put all the sources in lucene repo or push a maven release of that JAR. SonaType accepts third-party JAR pushes too -- one can do it as a last resort option. https://docs.sonatype.org/display/Repository/Uploading+3rd-party+Artifacts+to+The+Central+Repository nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-skip-sources-javadoc.patch, LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930__ivy_bootstrap_target.patch, ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, patch-jetty-build.patch As mentioned on the ML thread: switch jars to ivy mechanism?. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242203#comment-13242203 ] Dawid Weiss commented on LUCENE-3930: - You can do it with Maven by specifying an optional system dependency off the project's basedir and fetching the JAR in a preliminary phase... I think. But it's a hack beyond dirty. And it doesn't make other people's lives any easier (if somebody uses your pom). nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-skip-sources-javadoc.patch, LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930__ivy_bootstrap_target.patch, ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, patch-jetty-build.patch As mentioned on the ML thread: switch jars to ivy mechanism?. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3296) Explore alternatives to Commons CSV
[ https://issues.apache.org/jira/browse/SOLR-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242204#comment-13242204 ] Dawid Weiss commented on SOLR-3296: --- I didn't know it's Yonik's actually. It even has a pom.xml file -- http://svn.apache.org/repos/asf/labs/noggit/? Yonik if you have an account at SonaType this takes as much as changing the revision number to something without a SNAPSHOT and an mvn deploy (plus accept from Nexus). Let me know if you need some guidance but it should be a 10 minute effort if you have the maven code ready. Explore alternatives to Commons CSV --- Key: SOLR-3296 URL: https://issues.apache.org/jira/browse/SOLR-3296 Project: Solr Issue Type: Improvement Components: Build Reporter: Chris Male In LUCENE-3930 we're implementing some less than ideal solutions to make available the unreleased version of commons-csv. We could remove these solutions if we didn't rely on this lib. So I think we should explore alternatives. I think [opencsv|http://opencsv.sourceforge.net/] is an alternative to consider, I've used it in many commercial projects. Bizarrely Commons-CSV's website says that Opencsv uses a BSD license, but this isn't the case, OpenCSV uses ASL2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242237#comment-13242237 ] Dawid Weiss commented on LUCENE-3930: - If we go the third party route I suggest to release an artifact with a -jdk15 classifier to make it explicit it's a 1.5 build. Perhaps we can suggest to the maintainer to compile with 1.5 compatibility if this doesn't involve any source code changes? nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-skip-sources-javadoc.patch, LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930__ivy_bootstrap_target.patch, ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, patch-jetty-build.patch As mentioned on the ML thread: switch jars to ivy mechanism?. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3296) Explore alternatives to Commons CSV
[ https://issues.apache.org/jira/browse/SOLR-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242374#comment-13242374 ] Dawid Weiss commented on SOLR-3296: --- I guess this means official apache releases but if the release is done in a private namespace then this isn't a problem? I mean -- I could probably take the source right now, change the group id to something I have access to (com.carrotsearch.thirdparty) and release it, but so can Yonik (under his own domain or whatever namespace he wishes that is different than Apache's)? I admit this is kind of weird that Solr is using something that cannot be officially released. Why not make it part of Solr then? Just copy the source code over and publish as a separate artefact? Explore alternatives to Commons CSV --- Key: SOLR-3296 URL: https://issues.apache.org/jira/browse/SOLR-3296 Project: Solr Issue Type: Improvement Components: Build Reporter: Chris Male In LUCENE-3930 we're implementing some less than ideal solutions to make available the unreleased version of commons-csv. We could remove these solutions if we didn't rely on this lib. So I think we should explore alternatives. I think [opencsv|http://opencsv.sourceforge.net/] is an alternative to consider, I've used it in many commercial projects. Bizarrely Commons-CSV's website says that Opencsv uses a BSD license, but this isn't the case, OpenCSV uses ASL2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242513#comment-13242513 ] Dawid Weiss commented on LUCENE-3930: - +1. Maybe it's good that this issue came out. I think it straightened a few things out. nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-skip-sources-javadoc.patch, LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930__ivy_bootstrap_target.patch, ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, patch-jetty-build.patch As mentioned on the ML thread: switch jars to ivy mechanism?. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3295) Binaries contain 1.6 classes
[ https://issues.apache.org/jira/browse/SOLR-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242517#comment-13242517 ] Dawid Weiss commented on SOLR-3295: --- bq. It's some obscure data format I meant no offense, just my take on how many people in the wild may be using it compared to how many download Solr in general. Binaries contain 1.6 classes Key: SOLR-3295 URL: https://issues.apache.org/jira/browse/SOLR-3295 Project: Solr Issue Type: Bug Reporter: Dawid Weiss Assignee: Robert Muir Priority: Minor Fix For: 3.6 Attachments: output.log I've run this tool (does the job): http://code.google.com/p/versioncheck/ on the checkout of branch_3x. To my surprise there is a JAR which contains Java 1.6 code: {noformat} Major.Minor Version : 50.0 JAVA compatibility : Java 1.6 platform: 45.3-50.0 Number of classes : 60 Classes are : c:\Work\lucene-solr\.\solr\contrib\extraction\lib\netcdf-4.2-min.jar [:] ucar/unidata/geoloc/Bearing.class ... {noformat}
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241056#comment-13241056 ] Dawid Weiss commented on LUCENE-3930: - bq. So you have to install ivy in your ~/.ant/lib I personally don't like it when I need to install ant dependencies in a global scope -- this may not be a problem from one project's perspective but if you're working on multiple projects then this can result in global dependencies shadowing project's local definitions and debugging this is a pain. Not to mention it's another requirement after checkout. I won't be able to look into this now, just expressing my opinion. nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, ant_-verbose_clean_test.out.txt, patch-jetty-build.patch As mentioned on the ML thread: switch jars to ivy mechanism?. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241083#comment-13241083 ] Dawid Weiss commented on LUCENE-3930: - bq. Having jars in our source release is. Is it a requirement that the source release must build out of the box? Or is it about code publication only? I don't know, I'm asking. This seems like a revolutionary build change before the release :) bq. Having the build OOM is. So I made a tradeoff. I understand this, but my gut feeling still says that if you need to install ivy in ant's global space you might as well set ANT_OPTS to increase permgen... The tradeoff made is one of many. bq. i need your help... its just that simple Can't jump into it right now, sorry. I'll take a look when I get a spare cycle though. I'm not sure it can be fixed but I'll take a look. nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, ant_-verbose_clean_test.out.txt, patch-jetty-build.patch As mentioned on the ML thread: switch jars to ivy mechanism?.
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241091#comment-13241091 ] Dawid Weiss commented on LUCENE-3930: - Clear. It's a pity we have to deal with it right before the release but I understand (or rather accept) the rationale. nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, ant_-verbose_clean_test.out.txt, patch-jetty-build.patch As mentioned on the ML thread: switch jars to ivy mechanism?. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241157#comment-13241157 ] Dawid Weiss commented on LUCENE-3930: - For git the revision hash (SHA-1) is unique and you can always do a checkout of a particular revision (typically using a so-called detached head). This just moves you to a particular version in the revision tree. nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, ant_-verbose_clean_test.out.txt, patch-jetty-build.patch As mentioned on the ML thread: switch jars to ivy mechanism?.
[jira] [Commented] (LUCENE-3935) Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method
[ https://issues.apache.org/jira/browse/LUCENE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241166#comment-13241166 ] Dawid Weiss commented on LUCENE-3935: - Ah, this brings back a small project that has been lying dormant for some time -- I've written an annotation processor that generates classes for handling arrays of struct-like types (objects with fields only), including flattened multi-dimensional arrays. The code is on a branch here -- https://github.com/carrotsearch/hppc/blob/structs/hppc-examples/src/main/java/com/carrotsearch/hppc/examples/BattleshipsCell.java But it's been a while, I need to get back to it, it may be useful. Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method --- Key: LUCENE-3935 URL: https://issues.apache.org/jira/browse/LUCENE-3935 Project: Lucene - Java Issue Type: Improvement Components: modules/analysis Affects Versions: 3.6, 4.0 Reporter: Christian Moen I've been profiling Kuromoji, and not very surprisingly, method {{ConnectionCosts.get(int forwardId, int backwardId)}} that looks up costs in the Viterbi algorithm is called many, many times and contributes to more processing time than I had expected. This method is currently backed by a {{short[][]}}. The data structure is a two-dimensional array with both dimensions fixed at 1316 elements. (The data is {{matrix.def}} in MeCab-IPADIC.) We can rewrite this to use a single one-dimensional array instead, and we will at least save one bounds check and a pointer dereference, and we should also get much better cache utilization since this structure is likely to be in very local CPU cache. I think this will be a nice optimization. Working on it...
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241276#comment-13241276 ] Dawid Weiss commented on LUCENE-3930: - Moving C2 JARs out means we will have to re-release an archival version of C2 just for this purpose. The reasons are that we depend on libraries which themselves don't have 1.5 equivalents (mahout-math) so any newer version would have to be re-released along with backcompat of these libraries too... long story. Anyway, I will release a weird-looking version 3.5.0.1 which will be 1.5 compatible. I will let you know (and possibly modify the branch) once this happens. Give me a few hours, it'll require some checks/ testing. nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, patch-jetty-build.patch As mentioned on the ML thread: switch jars to ivy mechanism?. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3935) Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method
[ https://issues.apache.org/jira/browse/LUCENE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241343#comment-13241343 ] Dawid Weiss commented on LUCENE-3935: - +1. If this is called very frequently and you can afford storing ints instead of shorts then an int[] will have better alignment properties (and will not require widening to an int). It may or may not make a difference depending on architecture (CPU cache sizes also matter here). Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method --- Key: LUCENE-3935 URL: https://issues.apache.org/jira/browse/LUCENE-3935 Project: Lucene - Java Issue Type: Improvement Components: modules/analysis Affects Versions: 3.6, 4.0 Reporter: Christian Moen Attachments: LUCENE-3935.patch I've been profiling Kuromoji, and not very surprisingly, method {{ConnectionCosts.get(int forwardId, int backwardId)}} that looks up costs in the Viterbi algorithm is called many, many times and contributes to more processing time than I had expected. This method is currently backed by a {{short[][]}}. The data structure is a two-dimensional array with both dimensions fixed at 1316 elements. (The data is {{matrix.def}} in MeCab-IPADIC.) We can rewrite this to use a single one-dimensional array instead, and we will at least save one bounds check and a pointer dereference, and we should also get much better cache utilization since this structure is likely to be in very local CPU cache. I think this will be a nice optimization. Working on it...
[jira] [Commented] (SOLR-3294) Remove binary carrot2.jar and replace it with a maven dependency.
[ https://issues.apache.org/jira/browse/SOLR-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241541#comment-13241541 ] Dawid Weiss commented on SOLR-3294: --- Oh, there are also binary file changes:
{noformat}
c:\Work\lucene-solr>git st
# new file: solr/contrib/clustering/lib/carrot2-core-3.5.0.1.jar
# deleted: solr/contrib/clustering/lib/carrot2-core-3.5.0.jar
# deleted: solr/contrib/clustering/lib/jackson-core-asl-1.5.2.jar
# new file: solr/contrib/clustering/lib/jackson-core-asl-1.7.4.jar
# deleted: solr/contrib/clustering/lib/jackson-mapper-asl-1.5.2.jar
# new file: solr/contrib/clustering/lib/jackson-mapper-asl-1.7.4.jar
# deleted: solr/contrib/clustering/lib/solr-carrot2-core-pom.xml.template
{noformat}
These can be fetched from Maven Central and Carrot2 pom has these dependencies too. I've excluded everything else. Remove binary carrot2.jar and replace it with a maven dependency. - Key: SOLR-3294 URL: https://issues.apache.org/jira/browse/SOLR-3294 Project: Solr Issue Type: Task Components: contrib - Clustering Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Blocker Fix For: 3.6 Attachments: SOLR-3294.patch The repo contains a manually retrowoven Carrot2 JAR which does not have a corresponding artefact in Maven Central (so won't work for ivy). We will make a release with 1.5 backport (I hate this!). http://issues.carrot2.org/browse/CARROT-902
[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy
[ https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241599#comment-13241599 ] Dawid Weiss commented on LUCENE-3930: - Hey Hoss, there's a typo in the target name :) ivy-availablity-check. nuke jars from source tree and use ivy -- Key: LUCENE-3930 URL: https://issues.apache.org/jira/browse/LUCENE-3930 Project: Lucene - Java Issue Type: Task Components: general/build Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.6 Attachments: LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930__ivy_bootstrap_target.patch, ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, patch-jetty-build.patch As mentioned on the ML thread: switch jars to ivy mechanism?. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3294) Remove binary carrot2.jar and replace it with a maven dependency.
[ https://issues.apache.org/jira/browse/SOLR-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241779#comment-13241779 ] Dawid Weiss commented on SOLR-3294: --- Thanks Steven! Since you have it open would you commit it in too? Remove that 'dist-maven' section, it isn't needed indeed. Thanks! Remove binary carrot2.jar and replace it with a maven dependency. - Key: SOLR-3294 URL: https://issues.apache.org/jira/browse/SOLR-3294 Project: Solr Issue Type: Task Components: contrib - Clustering Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Blocker Fix For: 3.6 Attachments: SOLR-3294.patch The repo contains a manually retrowoven Carrot2 JAR which does not have a corresponding artefact in Maven Central (so won't work for ivy). We will make a release with 1.5 backport (I hate this!). http://issues.carrot2.org/browse/CARROT-902 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3295) Binaries contain 1.6 classes
[ https://issues.apache.org/jira/browse/SOLR-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241780#comment-13241780 ] Dawid Weiss commented on SOLR-3295: --- Robert says this isn't the case. Also: this is THE only jar that requires 1.6 so I'd say it's probably a mistake? Binaries contain 1.6 classes Key: SOLR-3295 URL: https://issues.apache.org/jira/browse/SOLR-3295 Project: Solr Issue Type: Bug Reporter: Dawid Weiss Priority: Minor Fix For: 3.6 Attachments: output.log I've run this tool (does the job): http://code.google.com/p/versioncheck/ on the checkout of branch_3x. To my surprise there is a JAR which contains Java 1.6 code: {noformat} Major.Minor Version : 50.0 JAVA compatibility : Java 1.6 platform: 45.3-50.0 Number of classes : 60 Classes are : c:\Work\lucene-solr\.\solr\contrib\extraction\lib\netcdf-4.2-min.jar [:] ucar/unidata/geoloc/Bearing.class ... {noformat}
[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter
[ https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13240268#comment-13240268 ] Dawid Weiss commented on SOLR-3272: --- This is in trunk now, thanks Rafał. Solr filter factory for MorfologikFilter Key: SOLR-3272 URL: https://issues.apache.org/jira/browse/SOLR-3272 Project: Solr Issue Type: New Feature Components: Schema and Analysis Affects Versions: 4.0 Reporter: Rafał Kuć Assignee: Dawid Weiss Fix For: 4.0 Attachments: SOLR-3272-toupper-correction.patch, SOLR-3272-with-javadoc-example-usage.patch, SOLR-3272.patch, SOLR-3272.patch, SOLR-3272.patch, SOLR-3727-new.patch I didn't find MorfologikFilter factory in Solr, so here is a simple. Maybe someone will have make use of it :) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3927) allow running trunk tests with IBM JRE
[ https://issues.apache.org/jira/browse/LUCENE-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239277#comment-13239277 ] Dawid Weiss commented on LUCENE-3927: - bq. It uses a HashMap instead of LinkedHashMap as lookup cache and that mixes the SPI classes up. This is sad. It's a simple bug to fix but it probably never will be... allow running trunk tests with IBM JRE -- Key: LUCENE-3927 URL: https://issues.apache.org/jira/browse/LUCENE-3927 Project: Lucene - Java Issue Type: Task Components: general/test Affects Versions: 4.0 Reporter: Robert Muir Attachments: LUCENE-3927.patch This is currently not possible because of how the SPI loader works: we cannot simulate the Lucene3x codec with PreFlexRWCodec. But we should still allow basic testing (even though we cannot test preflex). After hacking around the issue, I get interesting failures with this JRE, so I think it's worth it.
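The quoted remark about HashMap versus LinkedHashMap comes down to iteration order: a LinkedHashMap iterates in insertion order, while a HashMap makes no ordering guarantee at all. The sketch below only illustrates that difference; the provider names and the helper method are hypothetical and do not reproduce the IBM JRE's service loader code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lookup-cache demo; unrelated to the real ServiceLoader internals. */
final class LookupOrderSketch {

    /** Registers providers in the given order and returns the cache's iteration order. */
    static List<String> registrationOrder(Map<String, String> cache, String... names) {
        for (String name : names) {
            cache.put(name, "provider for " + name);
        }
        return new ArrayList<>(cache.keySet());
    }

    public static void main(String[] args) {
        String[] names = { "PreFlexRW", "Lucene3x", "Lucene40" }; // assumed names, illustration only

        // Insertion order is preserved, so the first registered provider is seen first.
        System.out.println(registrationOrder(new LinkedHashMap<String, String>(), names));

        // Order depends on hash codes and table size; it may differ from registration order.
        System.out.println(registrationOrder(new HashMap<String, String>(), names));
    }
}
```

Any code that relies on "first registered wins" therefore needs the insertion-ordered map; with a plain HashMap the winner is effectively arbitrary.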
[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter
[ https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239793#comment-13239793 ]

Dawid Weiss commented on SOLR-3272:

I actually don't know what the policy is -- I asked on the dev list, we'll see what solr folks prefer.

Solr filter factory for MorfologikFilter

Key: SOLR-3272
URL: https://issues.apache.org/jira/browse/SOLR-3272
Project: Solr
Issue Type: New Feature
Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
Fix For: 4.0
Attachments: SOLR-3272-with-javadoc-example-usage.patch, SOLR-3272.patch, SOLR-3727-new.patch

I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe someone will make use of it :)
[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter
[ https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239811#comment-13239811 ]

Dawid Weiss commented on SOLR-3272:

Thanks Uwe. Btw. should we apply it to 3.x as well? This seems like a harmless patch and it'd be a nice-to-have feature.

Solr filter factory for MorfologikFilter

Key: SOLR-3272
URL: https://issues.apache.org/jira/browse/SOLR-3272
Project: Solr
Issue Type: New Feature
Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
Fix For: 4.0
Attachments: SOLR-3272-with-javadoc-example-usage.patch, SOLR-3272.patch, SOLR-3727-new.patch

I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe someone will make use of it :)
[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter
[ https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239824#comment-13239824 ]

Dawid Weiss commented on SOLR-3272:

Damn. [Blushing]. I could prepare a 1.5 compatible version with retroweaver and integrate it in. I guess now I don't have excuses, do I... Do we want to push it in at the last minute though?

Solr filter factory for MorfologikFilter

Key: SOLR-3272
URL: https://issues.apache.org/jira/browse/SOLR-3272
Project: Solr
Issue Type: New Feature
Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
Fix For: 4.0
Attachments: SOLR-3272-with-javadoc-example-usage.patch, SOLR-3272.patch, SOLR-3727-new.patch

I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe someone will make use of it :)
[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter
[ https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239939#comment-13239939 ]

Dawid Weiss commented on SOLR-3272:

Can I ask somebody to look at the build file changes (and determine if morfologik JARs should be copied and where)? Otherwise this is ready to be committed I think.

After some deliberation I won't rush to make Morfologik part of 3.x -- last minute features are the worst.

Solr filter factory for MorfologikFilter

Key: SOLR-3272
URL: https://issues.apache.org/jira/browse/SOLR-3272
Project: Solr
Issue Type: New Feature
Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
Fix For: 4.0
Attachments: SOLR-3272-toupper-correction.patch, SOLR-3272-with-javadoc-example-usage.patch, SOLR-3272.patch, SOLR-3272.patch, SOLR-3727-new.patch

I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe someone will make use of it :)
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238203#comment-13238203 ]

Dawid Weiss commented on LUCENE-3867:

For the historical record: the previous implementation of RamUsageEstimator was off by anything from 3% (random size objects, including arrays) to 20% (objects smaller than 80 bytes). Again -- these are perfect scenario measurements with empty heap and max. allocation until OOM, with a serial GC. With concurrent and parallel GCs the memory consumption estimation is still accurate but it's nearly impossible to tell when an OOM will occur or how the GC will manage the heap space.

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

Key: LUCENE-3867
URL: https://issues.apache.org/jira/browse/LUCENE-3867
Project: Lucene - Java
Issue Type: Bug
Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
Fix For: 3.6, 4.0
Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, LUCENE-3867.patch (multiple revisions)

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like this: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml

{quote}
A single-dimension array is a single object. As expected, the array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ...
{quote}

While on it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE, as static, stateless methods? It's not perfect, there's some room for improvement I'm sure, here it is:

{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
 * method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}

If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
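A minimal sketch of what the proposed primitive-array helpers might look like, assuming the layout described in the quoted page: an 8-byte object header plus a 4-byte length (so a 12-byte array header with no extra reference) and objects padded to 8-byte multiples. The class name, the constant values, and the method bodies are assumptions for illustration; they are not taken from any of the attached patches, and real header sizes vary by JVM and settings.

```java
/** Hypothetical sketch; the constants are assumptions, not values read from a running JVM. */
final class ArraySizeSketch {
    static final int NUM_BYTES_OBJECT_HEADER = 8; // assumed: 32-bit JVM object header
    static final int NUM_BYTES_INT = 4;
    // Array header = object header + int length field, with no extra object reference.
    static final int NUM_BYTES_ARRAY_HEADER = NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT;
    static final int OBJECT_ALIGNMENT = 8;        // assumed: sizes rounded up to 8-byte multiples

    /** Rounds a raw size up to the next multiple of the assumed object alignment. */
    static long alignObjectSize(long size) {
        return (size + OBJECT_ALIGNMENT - 1) / OBJECT_ALIGNMENT * OBJECT_ALIGNMENT;
    }

    static long sizeOf(byte[] a)   { return alignObjectSize(NUM_BYTES_ARRAY_HEADER + (long) a.length); }
    static long sizeOf(int[] a)    { return alignObjectSize(NUM_BYTES_ARRAY_HEADER + 4L * a.length); }
    static long sizeOf(long[] a)   { return alignObjectSize(NUM_BYTES_ARRAY_HEADER + 8L * a.length); }
    static long sizeOf(double[] a) { return alignObjectSize(NUM_BYTES_ARRAY_HEADER + 8L * a.length); }

    public static void main(String[] args) {
        System.out.println(sizeOf(new int[3]));  // 12-byte header + 12 bytes of data, already aligned
        System.out.println(sizeOf(new byte[1])); // 13 raw bytes, padded up to the alignment
    }
}
```

Rounding up explicitly replaces the fixed "+ 6" safety margin used in the sizeOf(String) helper; whether that is the better trade-off is a design choice for the issue, not something this sketch settles.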
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238208#comment-13238208 ]

Dawid Weiss commented on LUCENE-3867:

I didn't say it's wrong -- it is fine and accurate. What I'm saying is that it's not really suitable for predictions; for answering questions like: how many objects of a given type/types can I allocate before an OOM hits me? It doesn't really surprise me that much, but it would be nice. For measuring already allocated stuff it's more than fine of course.

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

Key: LUCENE-3867
URL: https://issues.apache.org/jira/browse/LUCENE-3867
Project: Lucene - Java
Issue Type: Bug
Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
Fix For: 3.6, 4.0
Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, LUCENE-3867.patch (multiple revisions)

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like this: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml

{quote}
A single-dimension array is a single object. As expected, the array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ...
{quote}

While on it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE, as static, stateless methods? It's not perfect, there's some room for improvement I'm sure, here it is:

{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
 * method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}

If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter
[ https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238327#comment-13238327 ]

Dawid Weiss commented on SOLR-3272:

Hi Michał. Could you modify this patch to include support for the three dictionaries (combined, morfeusz and morfologik)? This would be more flexible (and the combined dictionary is nearly twice as large as morfologik itself so it's worth it).

{code}
return new MorfologikFilter(ts, DICTIONARY.MORFOLOGIK, luceneMatchVersion);
{code}

Also, an example of use in the JavaDoc would be nice (see BeiderMorseFilterFactory for example). The test should be using DEFAULT_VERSION, not the fixed LUCENE_40. Thanks!

Solr filter factory for MorfologikFilter

Key: SOLR-3272
URL: https://issues.apache.org/jira/browse/SOLR-3272
Project: Solr
Issue Type: New Feature
Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
Fix For: 4.0
Attachments: SOLR-3272.patch

I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe someone will make use of it :)
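The request above (letting the factory pick one of three dictionaries instead of hardcoding one) could be handled roughly as sketched below. This is a self-contained illustration only: the `dictionary` attribute name, the stand-in `Dictionary` enum, and the choice of MORFOLOGIK as the default are assumptions, and neither the real MorfologikFilter classes nor the attached patches are used here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of parsing a dictionary choice from factory arguments. */
final class DictionaryArgSketch {

    /** Stand-in for the three dictionary choices named in the comment. */
    enum Dictionary { COMBINED, MORFEUSZ, MORFOLOGIK }

    static final String DICTIONARY_ARG = "dictionary"; // hypothetical attribute name

    /** Returns the requested dictionary, or MORFOLOGIK (assumed default) when none is given. */
    static Dictionary parse(Map<String, String> args) {
        String value = args.get(DICTIONARY_ARG);
        if (value == null) {
            return Dictionary.MORFOLOGIK;
        }
        try {
            // Locale.ROOT keeps upper-casing independent of the platform's default locale.
            return Dictionary.valueOf(value.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                "Unknown dictionary '" + value + "'; expected combined, morfeusz or morfologik", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> factoryArgs = new HashMap<String, String>();
        factoryArgs.put(DICTIONARY_ARG, "combined");
        System.out.println(parse(factoryArgs));
    }
}
```

Upper-casing with an explicit locale matters for case-insensitive attribute values: under some default locales (Turkish, for example) a plain toUpperCase() maps "i" to a dotted capital and the enum lookup fails.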
[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter
[ https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238332#comment-13238332 ]

Dawid Weiss commented on SOLR-3272:

Thanks. Sorry about the name confusion btw. Don't know where I took Michał from :)

Solr filter factory for MorfologikFilter

Key: SOLR-3272
URL: https://issues.apache.org/jira/browse/SOLR-3272
Project: Solr
Issue Type: New Feature
Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
Fix For: 4.0
Attachments: SOLR-3272.patch

I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe someone will make use of it :)
[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter
[ https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238715#comment-13238715 ]

Dawid Weiss commented on SOLR-3272:

Thanks Rafał.

Solr filter factory for MorfologikFilter

Key: SOLR-3272
URL: https://issues.apache.org/jira/browse/SOLR-3272
Project: Solr
Issue Type: New Feature
Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
Fix For: 4.0
Attachments: SOLR-3272.patch, SOLR-3727-new.patch

I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe someone will make use of it :)
[jira] [Commented] (SOLR-3268) remove write acess to source tree (chmod 555) when running tests in jenkins
[ https://issues.apache.org/jira/browse/SOLR-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238775#comment-13238775 ]

Dawid Weiss commented on SOLR-3268:

All of these are in .gitignore, Steven (and can be regenerated via dev-tools/scripts/gitignore-gen.sh).

remove write acess to source tree (chmod 555) when running tests in jenkins

Key: SOLR-3268
URL: https://issues.apache.org/jira/browse/SOLR-3268
Project: Solr
Issue Type: Bug
Reporter: Robert Muir
Assignee: Robert Muir
Fix For: 4.0
Attachments: SOLR-3268_sync.patch

Some tests are currently creating files under the source tree. This causes a lot of problems: it makes my checkout look dirty after running 'ant test' and I have to clean up. I opened an issue for this a month and a half ago for solrj/src/test-files/solrj/solr/shared/test-solr.xml (SOLR-3112), but now we have a second file (core/src/test-files/solr/conf/elevate-data-distrib.xml). So I think hudson needs to chmod these src directories to 555, so that solr tests that do this will fail.
[jira] [Commented] (LUCENE-3910) remove special hudson nightly linedocs
[ https://issues.apache.org/jira/browse/LUCENE-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237944#comment-13237944 ]

Dawid Weiss commented on LUCENE-3910:

I agree with you both. No, it's not a paradox. On one hand -- I agree that having larger test files is good and on the other I agree with Robert that not being able to reproduce locally because of different (or inconsistent) data is a pain.

At Carrot Search we have put all the big data into a separate git repository and this is simply mirrored across build servers and our local machines. Granted, the first clone takes a while, but then pulls of additional data are much faster and (which is a big plus) a git repo identifies each revision by a SHA-1 hash, so this can be emitted as a log upon failure (we don't do it because we're pretty much sure the checkouts are consistent, but it _could_ be done to ensure testing against the exact same test files).

Just thoughts to consider.

remove special hudson nightly linedocs

Key: LUCENE-3910
URL: https://issues.apache.org/jira/browse/LUCENE-3910
Project: Lucene - Java
Issue Type: Bug
Reporter: Robert Muir
Fix For: 4.0

Hudson has a special huge linedocs file that it sets via a -D parameter, but this means that anything using LineDocs won't reproduce via our home computers if it fails on hudson. I think we should disable this.
[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println
[ https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237479#comment-13237479 ]

Dawid Weiss commented on LUCENE-3877:

No worries Greg, really. For 3.x I think a manual check will do (or what I've done above with AspectJ). For 4.x it'd be nice to have findbugs lint anyway (for this and other issues). It'll most likely require some rules tuning too, so it can be a separate issue.

Lucene should not call System.out.println

Key: LUCENE-3877
URL: https://issues.apache.org/jira/browse/LUCENE-3877
Project: Lucene - Java
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.0
Attachments: IllegalSystemTest.java, IllegalSystemTest.java, SystemPrintCheck.java

We seem to have accumulated a few random sops... Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least. Can we somehow detect (eg, have a test failure) if we accidentally leave errant System.out.println's (leftover from debugging)...?
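For readers who want to try the idea without AspectJ or findbugs, one simple runtime variant is to swap System.out for a recording stream while the code under test runs. To be clear, this is a different technique from the AspectJ aspect and the lint rule mentioned in the comment (neither is shown in this thread), and the class and method names below are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

/** Hypothetical sketch: detect System.out usage at runtime by swapping the stream. */
final class SysOutGuardSketch {

    /** Runs the task with System.out redirected and returns the number of bytes it wrote there. */
    static int bytesWrittenToSystemOut(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        PrintStream replacement = new PrintStream(captured, true);
        System.setOut(replacement);
        try {
            task.run();
        } finally {
            replacement.flush();
            System.setOut(original); // always restore the real stream
        }
        return captured.size();
    }

    public static void main(String[] args) {
        int written = bytesWrittenToSystemOut(new Runnable() {
            @Override
            public void run() {
                System.out.println("leftover debug output");
            }
        });
        System.out.println("bytes written to System.out: " + written);
    }
}
```

A check like this only sees code paths that actually execute, and System.out is global state, so it is unreliable when other threads print concurrently; a static check does not have either limitation.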
[jira] [Commented] (SOLR-3268) remove write acess to source tree (chmod 555) when running tests in jenkins
[ https://issues.apache.org/jira/browse/SOLR-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236489#comment-13236489 ]

Dawid Weiss commented on SOLR-3268:

I agree that tests modifying sources or writing to source areas are a pain. I know these files can be svn:ignored but it just... feels dirty somehow.

On a constructive note -- maybe we can use this: http://ant.apache.org/manual/Tasks/sync.html and mirror whatever src folder structure is required for these tests in the build area?

remove write acess to source tree (chmod 555) when running tests in jenkins

Key: SOLR-3268
URL: https://issues.apache.org/jira/browse/SOLR-3268
Project: Solr
Issue Type: Bug
Reporter: Robert Muir
Fix For: 3.6, 4.0

Some tests are currently creating files under the source tree. This causes a lot of problems: it makes my checkout look dirty after running 'ant test' and I have to clean up. I opened an issue for this a month and a half ago for solrj/src/test-files/solrj/solr/shared/test-solr.xml (SOLR-3112), but now we have a second file (core/src/test-files/solr/conf/elevate-data-distrib.xml). So I think hudson needs to chmod these src directories to 555, so that solr tests that do this will fail.
[jira] [Commented] (SOLR-3268) remove write acess to source tree (chmod 555) when running tests in jenkins
[ https://issues.apache.org/jira/browse/SOLR-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236494#comment-13236494 ]

Dawid Weiss commented on SOLR-3268:

Oh... in that case the tests just need to be fixed :)

remove write acess to source tree (chmod 555) when running tests in jenkins

Key: SOLR-3268
URL: https://issues.apache.org/jira/browse/SOLR-3268
Project: Solr
Issue Type: Bug
Reporter: Robert Muir
Fix For: 3.6, 4.0

Some tests are currently creating files under the source tree. This causes a lot of problems: it makes my checkout look dirty after running 'ant test' and I have to clean up. I opened an issue for this a month and a half ago for solrj/src/test-files/solrj/solr/shared/test-solr.xml (SOLR-3112), but now we have a second file (core/src/test-files/solr/conf/elevate-data-distrib.xml). So I think hudson needs to chmod these src directories to 555, so that solr tests that do this will fail.
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236626#comment-13236626 ]

Dawid Weiss commented on LUCENE-3867:

Thanks Uwe. I'll be working in the evening again but if you're faster go ahead and commit it in.

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

Key: LUCENE-3867
URL: https://issues.apache.org/jira/browse/LUCENE-3867
Project: Lucene - Java
Issue Type: Bug
Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
Fix For: 3.6, 4.0
Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, LUCENE-3867.patch (multiple revisions)

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like this: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml

{quote}
A single-dimension array is a single object. As expected, the array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ...
{quote}

While on it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE, as static, stateless methods? It's not perfect, there's some room for improvement I'm sure, here it is:

{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
 * method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}

If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println
[ https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237117#comment-13237117 ]

Dawid Weiss commented on LUCENE-3877:

I'd push it to 4.0 (automation in whatever form).

Lucene should not call System.out.println

Key: LUCENE-3877
URL: https://issues.apache.org/jira/browse/LUCENE-3877
Project: Lucene - Java
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 3.6, 4.0
Attachments: IllegalSystemTest.java, IllegalSystemTest.java, SystemPrintCheck.java

We seem to have accumulated a few random sops... Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least. Can we somehow detect (eg, have a test failure) if we accidentally leave errant System.out.println's (leftover from debugging)...?
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237172#comment-13237172 ] Dawid Weiss commented on LUCENE-3867: - I've been thinking how one can assess the estimation quality of the new code. I cam up with this: - I allocate an Object[] half the size of estimated maximum available RAM (just to make sure all objects will fit without the need to reallocate), - I precompute shallow sizes for instances of all wild classes (classes with random fields, including arrays). - I then fill in the vault array above with random instances of wild classes, summing up the estimated size UNTIL I HIT OOM. - Once I git OOM I know how much we actually allocated vs. how much space we thought we did allocate. The results are very accurate on HotSpot if one is using serial GC. For example: {noformat} [JVM: Java HotSpot(TM) 64-Bit Server VM, 20.4-b02, Sun Microsystems Inc., Sun Microsystems Inc., 1.6.0_29] Max: 483.4 MB, Used: 698.9 KB, Committed: 123.8 MB Expected free: 240.9 MB, Allocated estimation: 240.8 MB, Difference: -0.05% (113.6 KB) {noformat} If one runs with a parallel GC things do get out of hand because the GC is not keeping up with allocations (although I'm not sure how I should interpret this because we only allocate; it's not possible to free any space -- maybe there are different GC pools or something): {noformat} [JVM: Java HotSpot(TM) 64-Bit Server VM, 20.4-b02, Sun Microsystems Inc., Sun Microsystems Inc., 1.6.0_29] Max: 444.5 MB, Used: 655.4 KB, Committed: 122.7 MB Expected free: 221.5 MB, Allocated estimation: 174.2 MB, Difference: -21.34% (47.3 MB) {noformat} JRockit: {noformat} [JVM: Oracle JRockit(R), R28.1.4-7-144370-1.6.0_26-20110617-2130-windows-x86_64, Oracle Corporation, Oracle Corporation, 1.6.0_26] Max: 500 MB, Used: 3.5 MB, Committed: 64 MB Expected free: 247.7 MB, Allocated estimation: 249.5 MB, Difference: 0.74% (1.8 MB) {noformat} I think we're good. 
If somebody wishes to experiment, the spike is here: https://github.com/dweiss/java-sizeof {noformat} mvn test mvn dependency:copy-dependencies java -cp target\classes:target\test-classes:target\dependency\junit-4.10.jar \ com.carrotsearch.sizeof.TestEstimationQuality {noformat} RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect -- Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Uwe Schindler Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml {quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ... {quote} While on it, I wrote a sizeOf(String) impl, and I wonder how do people feel about including such helper methods in RUE, as static, stateless, methods? 
It's not perfect, there's some room for improvement I'm sure, here it is: {code} /** * Computes the approximate size of a String object. Note that if this object * is also referenced by another object, you should add * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this * method. */ public static int sizeOf(String str) { return 2 * str.length() + 6 // chars + additional safeness for arrays alignment + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object } {code} If
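The "Difference" figures in the estimation-quality reports above look like the relative gap between the estimated allocation and the expected free space. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and the formula is inferred from the reported numbers rather than taken from the java-sizeof sources.

```java
import java.util.Locale;

/** Hypothetical helper; not part of java-sizeof or Lucene. */
final class EstimationDifference {
  /**
   * Relative difference, in percent, between what the estimator summed up
   * before the OOM and what was expected to fit.
   * Assumption: difference = (estimated - expected) / expected.
   */
  static double differencePercent(double expectedFreeMb, double estimatedAllocatedMb) {
    return 100.0 * (estimatedAllocatedMb - expectedFreeMb) / expectedFreeMb;
  }

  public static void main(String[] args) {
    // Rounded figures from the parallel GC and JRockit runs quoted above;
    // the original reports were computed from unrounded byte counts,
    // so the last digit may differ slightly.
    System.out.println(String.format(Locale.ROOT, "%.2f%%", differencePercent(221.5, 174.2)));
    System.out.println(String.format(Locale.ROOT, "%.2f%%", differencePercent(247.7, 249.5)));
  }
}
```

A negative value means the estimator accounted for less memory than was expected to be free when the OOM hit, which is what the parallel GC run shows.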
[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println
[ https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235422#comment-13235422 ] Dawid Weiss commented on LUCENE-3877: - bq. I have seen it not work in the past for obscure reasons Most likely the reasons were incorrect pointcut definitions? These can be tricky, I agree. Nonetheless, I've been using AspectJ for a long time and it always fits my needs and expectations. I'm not saying it doesn't have any bugs -- I'm sure it has. But the right tool for the right job; it took me about 5 mins to write and apply that aspect (with follow ups, I sent an e-mail to the mailing list, JIRA didn't work at the time). I'm not advocating for any tool, really. To me aspectj is a fast tool for expressing where I want a given snippet of code to be injected (or what I want excluded) and for such tasks I don't see a faster or more pleasant to use alternative. Oh, I've been using asmlib too; extensively in fact; so it's not lack of knowledge about the tool itself. Lucene should not call System.out.println - Key: LUCENE-3877 URL: https://issues.apache.org/jira/browse/LUCENE-3877 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Fix For: 3.6, 4.0 Attachments: IllegalSystemTest.java, IllegalSystemTest.java, SystemPrintCheck.java We seem to have accumulated a few random sops... Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least. Can we somehow detect (eg, have a test failure) if we accidentally leave errant System.out.println's (leftover from debugging)...? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println
[ https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235424#comment-13235424 ] Dawid Weiss commented on LUCENE-3877: - My aspectj experiments from yesterday when JIRA was dead. I applied that aspect just to see what happens. {noformat} ajc -sourceroots aspects \ -inpath lucene-core-3.6-SNAPSHOT.jar \ -d none \ -cp aspectjrt.jar \ -showWeaveInfo {noformat} Here's what I got: {noformat} Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.analysis.PorterStemmer' (PorterStemmer.java:529) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.analysis.PorterStemmer' (PorterStemmer.java:534) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.analysis.PorterStemmer' (PorterStemmer.java:542) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:989) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:996) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1003) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1012) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' 
(CheckIndex.java:1013) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1038) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1043) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1047) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1056) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1057) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1062) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1071) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1073) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1074) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' 
(CheckIndex.java:1077) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1079) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1081) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1082) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream java.lang.System.out)' in Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1085) advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6) Join point 'field-get(java.io.PrintStream
[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println
[ https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235428#comment-13235428 ] Dawid Weiss commented on LUCENE-3877: - Oh, btw. I think a FindBugs rule for detecting sysouts/syserrs would be a great addition to FindBugs -- you should definitely file it as an improvement there. In reality at least class-level exclusions will be needed to avoid legitimate matches like the ones shown above (main methods, exception handlers), but these can be lived with.
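As a runtime alternative to bytecode weaving or a static-analysis rule, a test harness could swap System.out for a recording stream and fail when anything was written. This is a different technique from the aspect discussed earlier in the thread, shown only as a sketch; the class name SysOutGuard is hypothetical and only standard java.io calls are used.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

/** Hypothetical guard; captures everything written to System.out while a task runs. */
final class SysOutGuard {
  /** Runs the task with System.out redirected and returns whatever it printed. */
  static String capture(Runnable task) {
    PrintStream original = System.out;
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    PrintStream recording = new PrintStream(sink, true);
    System.setOut(recording);
    try {
      task.run();
    } finally {
      recording.flush();
      System.setOut(original); // always restore the real stream
    }
    return sink.toString();
  }

  public static void main(String[] args) {
    String leaked = capture(() -> System.out.println("debug leftover"));
    if (!leaked.isEmpty()) {
      // A real test would fail here; legitimate printers (main methods,
      // exception handlers) would need to be excluded by the caller.
      System.err.println("Unexpected System.out output: " + leaked.trim());
    }
  }
}
```

Unlike weaving, this only catches prints on code paths the tests actually execute, so it complements rather than replaces a static check.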
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235494#comment-13235494 ] Dawid Weiss commented on LUCENE-3867: - I've been experimenting a bit with the new code. Field offsets for three classes in a hierarchy with unalignable fields (byte, long combinations at all levels). Note unaligned reordering of byte field in JRockit - nice. {noformat} JVM: [JVM: HotSpot, Sun Microsystems Inc., 1.6.0_31] (compressed OOPs) @12 4 Super.superByte @16 8 Super.subLong @24 8 Sub.subLong @32 4 Sub.subByte @36 4 SubSub.subSubByte @40 8 SubSub.subSubLong @48sizeOf(SubSub.class instance) JVM: [JVM: HotSpot, Sun Microsystems Inc., 1.6.0_31] (normal OOPs) @16 8 Super.subLong @24 8 Super.superByte @32 8 Sub.subLong @40 8 Sub.subByte @48 8 SubSub.subSubLong @56 8 SubSub.subSubByte @64sizeOf(SubSub.class instance) JVM: [JVM: J9, IBM Corporation, 1.6.0] @24 8 Super.subLong @32 4 Super.superByte @36 4 Sub.subByte @40 8 Sub.subLong @48 8 SubSub.subSubLong @56 8 SubSub.subSubByte @64sizeOf(SubSub.class instance) JVM: [JVM: JRockit, Oracle Corporation, 1.6.0_26] (64-bit JVM!) 
@ 8 8 Super.subLong @16 1 Super.superByte @17 7 Sub.subByte @24 8 Sub.subLong @32 8 SubSub.subSubLong @40 8 SubSub.subSubByte @48sizeOf(SubSub.class instance) {noformat} RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect -- Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Uwe Schindler Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml {quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ... {quote} While on it, I wrote a sizeOf(String) impl, and I wonder how do people feel about including such helper methods in RUE, as static, stateless, methods? It's not perfect, there's some room for improvement I'm sure, here it is: {code} /** * Computes the approximate size of a String object. 
Note that if this object * is also referenced by another object, you should add * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this * method. */ public static int sizeOf(String str) { return 2 * str.length() + 6 // chars + additional safeness for arrays alignment + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object } {code} If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
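To make the correction proposed in the issue description concrete: with the object reference term dropped, the array header is just the object header plus a four-byte length. The sketch below assumes 8-byte object alignment and takes the object header size as a parameter, since it differs between JVMs; ArraySizeSketch and its methods are hypothetical names, not RamUsageEstimator API.

```java
/** Hypothetical illustration; not RamUsageEstimator code. */
final class ArraySizeSketch {
  static final int NUM_BYTES_INT = 4;
  static final int ALIGNMENT = 8; // assumption: objects are padded to 8 bytes

  /** Rounds a size up to the next multiple of ALIGNMENT. */
  static long alignUp(long size) {
    return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
  }

  /** Array header = object header + int length (no extra object reference). */
  static int arrayHeader(int objectHeaderBytes) {
    return objectHeaderBytes + NUM_BYTES_INT;
  }

  /** Shallow size of an array: header plus element data, padded to the alignment. */
  static long shallowSizeOfArray(int objectHeaderBytes, int bytesPerElement, int length) {
    return alignUp(arrayHeader(objectHeaderBytes) + (long) bytesPerElement * length);
  }

  public static void main(String[] args) {
    // An 8-byte object header gives the 12-byte array header from the quoted page.
    System.out.println(arrayHeader(8));
    // int[10] under the same assumptions: 12 + 40 = 52, padded to 56.
    System.out.println(shallowSizeOfArray(8, 4, 10));
  }
}
```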
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235501#comment-13235501 ] Dawid Weiss commented on LUCENE-3867: - bq. I hope my explanation was understandable... Perfectly well. Yes, I agree, it's possible to fill in the holes packing them with fields from subclasses. It would be a nice vm-level optimization in fact! I'm still experimenting on this code and cleaning/ adding javadocs -- I'll patch this and provide a complete patch once I'm done, ok?
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235506#comment-13235506 ] Dawid Weiss commented on LUCENE-3867: - Maybe it does such things already. I didn't check extensively.
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235570#comment-13235570 ] Dawid Weiss commented on LUCENE-3867: - I confirmed that this packing indeed takes place. Wrote a pseudo-random test with lots of classes and fields. Here's an offender on J9 for example (Wild_{inheritance-level}_{field-number}): {noformat} @24 4 Wild_0_92.fld_0_0_92 @28 4 Wild_0_92.fld_1_0_92 @32 4 Wild_0_92.fld_2_0_92 @36 4 Wild_0_92.fld_3_0_92 @40 4 Wild_0_92.fld_4_0_92 @44 4 Wild_0_92.fld_5_0_92 @48 4 Wild_0_92.fld_6_0_92 @52 4 Wild_2_5.fld_0_2_5 @56 8 Wild_1_85.fld_0_1_85 @64 8 Wild_1_85.fld_1_1_85 @72sizeOf(Wild_2_5 instance) {noformat} HotSpot and JRockit don't seem to do this (at least it didn't fail on the example).
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235588#comment-13235588 ] Dawid Weiss commented on LUCENE-3867: - Yep, that assumption was wrong -- indeed: {noformat} WildClasses.Wild_2_5 wc = new WildClasses.Wild_2_5(); wc.fld_6_0_92 = 0x1122; wc.fld_0_2_5 = Float.intBitsToFloat(0xa1a2a3a4); wc.fld_0_1_85 = Double.longBitsToDouble(0xb1b2b3b4b5b6b7L); System.out.println(ExpMemoryDumper.dumpObjectMem(wc)); {noformat} results in: {noformat} 0x b0 3d 6f 01 00 00 00 00 0e 80 79 01 00 00 00 00 0x0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0030 22 11 00 00 a4 a3 a2 a1 b7 b6 b5 b4 b3 b2 b1 00 0x0040 00 00 00 00 00 00 00 00 {noformat} And you can see they are reordered and longs are aligned. I'll provide a cumulative patch of changes in the evening, there's one more thing I wanted to add (cache of fields) because this affects processing speed.
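The 0x0030 row of the dump can be reproduced without touching real object memory by writing the same three values into a little-endian buffer at the J9 offsets reported earlier in the thread (@48, @52, @56). This is only a simulation under that assumption; LittleEndianDumpSketch is a hypothetical name and ExpMemoryDumper itself is not used.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical simulation of the dumped field area; it does not read real object memory. */
final class LittleEndianDumpSketch {
  /** Formats 16 bytes starting at the given offset as space-separated hex. */
  static String hexRow(ByteBuffer buf, int offset) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 16; i++) {
      if (i > 0) sb.append(' ');
      sb.append(String.format("%02x", buf.get(offset + i) & 0xff));
    }
    return sb.toString();
  }

  /** Writes the three test values at the field offsets taken from the J9 layout. */
  static ByteBuffer simulate() {
    ByteBuffer buf = ByteBuffer.allocate(72).order(ByteOrder.LITTLE_ENDIAN);
    buf.putInt(48, 0x1122);                                         // fld_6_0_92
    buf.putFloat(52, Float.intBitsToFloat(0xa1a2a3a4));             // fld_0_2_5
    buf.putDouble(56, Double.longBitsToDouble(0xb1b2b3b4b5b6b7L));  // fld_0_1_85
    return buf;
  }

  public static void main(String[] args) {
    // prints "22 11 00 00 a4 a3 a2 a1 b7 b6 b5 b4 b3 b2 b1 00"
    System.out.println(hexRow(simulate(), 0x30));
  }
}
```

The match with the dumped row is consistent with the int, float and double fields sitting back to back at offsets 48, 52 and 56 on a little-endian machine.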
[jira] [Commented] (LUCENE-3847) LuceneTestCase should check for modifications on System properties
[ https://issues.apache.org/jira/browse/LUCENE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235710#comment-13235710 ] Dawid Weiss commented on LUCENE-3847: - Well... something is changing it, the question is what it is. I'll take a look. LuceneTestCase should check for modifications on System properties -- Key: LUCENE-3847 URL: https://issues.apache.org/jira/browse/LUCENE-3847 Project: Lucene - Java Issue Type: Improvement Components: general/test Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 3.6, 4.0 Attachments: LUCENE-3847.patch - fail the test if changes have been detected. - revert the state of system properties before the suite. - cleanup after the suite. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3847) LuceneTestCase should check for modifications on System properties
[ https://issues.apache.org/jira/browse/LUCENE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235749#comment-13235749 ] Dawid Weiss commented on LUCENE-3847: - I know what's changing it. Eh. So -- there is a warning being printed: {noformat} Mar 22, 2012 6:20:33 PM org.apache.solr.core.Config parseLuceneVersionString WARNING: You should not use LUCENE_CURRENT as luceneMatchVersion property: if you use this setting, and then Solr upgrades to a newer release of Lucene, sizable changes may happen. If precise back compatibility is important then you should instead explicitly specify an actual Lucene version. Mar 22, 2012 6:20:33 PM org.apache.solr.analysis.BaseTokenStreamFactory warnDeprecated WARNING: RussianLetterTokenizerFactory is deprecated. Use StandardTokenizerFactory instead. {noformat} These warnings go through Java logging and this in turn is localized (date format, warning info, etc.). This in turn asks for the default TimeZone and this in turn sets the system property (I mentioned it a while ago). I suggest that we just ignore user.timezone as it is triggered from multiple locations and doesn't seem that important?
[jira] [Commented] (LUCENE-3847) LuceneTestCase should check for modifications on System properties
[ https://issues.apache.org/jira/browse/LUCENE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235995#comment-13235995 ] Dawid Weiss commented on LUCENE-3847: - Applied a fix for this. user.timezone is ignored (and is not reset).
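The check described in this issue can be pictured as a before/after snapshot of the system properties, with user.timezone left out of the comparison. The following is a sketch under that reading, not the code from LUCENE-3847.patch; SystemPropertiesGuard and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical guard; not the LuceneTestCase implementation. */
final class SystemPropertiesGuard {
  /** Properties excluded from the comparison, as suggested in the thread. */
  static final Set<String> IGNORED = Collections.singleton("user.timezone");

  /** Copies the current string-valued system properties. */
  static Map<String, String> snapshot() {
    Map<String, String> copy = new TreeMap<>();
    for (String name : System.getProperties().stringPropertyNames()) {
      copy.put(name, System.getProperty(name));
    }
    return copy;
  }

  /** Returns the keys that were added, removed or modified, skipping ignored ones. */
  static Set<String> changedKeys(Map<String, String> before, Map<String, String> after) {
    Set<String> all = new TreeSet<>(before.keySet());
    all.addAll(after.keySet());
    Set<String> changed = new TreeSet<>();
    for (String key : all) {
      if (IGNORED.contains(key)) continue;
      if (!Objects.equals(before.get(key), after.get(key))) changed.add(key);
    }
    return changed;
  }

  public static void main(String[] args) {
    Map<String, String> before = snapshot();
    System.setProperty("sketch.leaked.property", "1"); // simulated test side effect
    Set<String> changed = changedKeys(before, snapshot());
    System.clearProperty("sketch.leaked.property");    // cleanup after the "suite"
    System.out.println("Changed: " + changed);
  }
}
```

A test rule built on this would fail when changedKeys is non-empty and then restore the saved values, leaving user.timezone alone.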
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236028#comment-13236028 ] Dawid Weiss commented on LUCENE-3867: - Ok, I admit J9 is fascinating... ;) How much memory does this take? {code} class X { byte a = 0x11; byte b = 0x22; } {code} Here is the memory layout: {code} [JVM: IBM J9 VM, 2.6, IBM Corporation, IBM Corporation, 1.7.0] 0x 00 b8 21 c4 5f 7f 00 00 00 00 00 00 00 00 00 00 0x0010 11 00 00 00 22 00 00 00 @16 4 Super.b1 @20 4 Super.b2 @24sizeOf(Super instance) {code} I don't think I screwed up anything. It really is 4 byte alignment _on all fields_. RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect -- Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Uwe Schindler Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml {quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. 
The memory usage for one element is 4 bytes for an object reference ... {quote} While on it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE as static, stateless methods. It's not perfect, and I'm sure there's some room for improvement. Here it is:
{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
 * method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6 // chars + extra slack for array alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}
If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
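To make the arithmetic in this thread concrete, here is a minimal sketch of the size calculations being discussed. Everything in it is an assumption for illustration: the class name SizeSketch, the constants (an 8-byte object header, 4-byte references and 8-byte object alignment, roughly a 32-bit JVM), and the idea that a String holds three ints plus one reference to a separate char[]. It is not RamUsageEstimator's code, and real values differ between JVMs, as the J9 dump shows.

```java
/** Back-of-the-envelope size arithmetic. All constants are assumptions, not RamUsageEstimator's values. */
public final class SizeSketch {
    // Assumed values for a 32-bit JVM; real values depend on the JVM and its flags.
    static final int OBJECT_HEADER = 8;
    static final int OBJECT_REF = 4;
    static final int INT_BYTES = 4;
    static final int OBJECT_ALIGNMENT = 8;

    // Array header as the issue description argues it should be:
    // object header plus the int length, with no extra reference.
    static final int ARRAY_HEADER = OBJECT_HEADER + INT_BYTES;

    /** Rounds size up to the next multiple of alignment. */
    static long alignUp(long size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    /** Shallow size of an object whose fields are each padded to fieldAlignment bytes. */
    static long shallowSize(int header, int[] fieldSizes, int fieldAlignment) {
        long size = header;
        for (int fieldSize : fieldSizes) {
            size += alignUp(fieldSize, fieldAlignment);
        }
        return alignUp(size, OBJECT_ALIGNMENT);
    }

    /** Approximate size of a String with the given length, including its char[]. */
    static long sizeOfString(int length) {
        // char[]: array header plus two bytes per char, padded to the object alignment.
        long charArray = alignUp(ARRAY_HEADER + 2L * length, OBJECT_ALIGNMENT);
        // String object: header, three ints and one reference to the char[] (assumed layout).
        long stringObject = alignUp(OBJECT_HEADER + 3 * INT_BYTES + OBJECT_REF, OBJECT_ALIGNMENT);
        return charArray + stringObject;
    }

    public static void main(String[] args) {
        // Two byte fields, each padded to 4 bytes, after a 16-byte header (as in the J9 dump).
        System.out.println(shallowSize(16, new int[] {1, 1}, 4));
        System.out.println(sizeOfString(5));
    }
}
```

With a 16-byte header and every field padded to 4 bytes, shallowSize(16, new int[] {1, 1}, 4) comes out at 24, which agrees with the @24 reported in the J9 layout above. Rounding with alignUp instead of adding a fixed slack of 6 bytes is just one possible choice for the String estimate.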
[jira] [Commented] (LUCENE-3895) Not getting random-seed/reproduce-with if a test fails from another thread
[ https://issues.apache.org/jira/browse/LUCENE-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234185#comment-13234185 ] Dawid Weiss commented on LUCENE-3895: - bq. Hopefully Dawid hates it and knows of a way to fix it cleanly It's fine for the trunk. It will be redundant in LUCENE-3808 (the seed is reported at master build level there + exceptions have an injected fake stack trace entry with the current master/test seed combination, even though the test seed is redundant most of the time because it's derived). Not getting random-seed/reproduce-with if a test fails from another thread -- Key: LUCENE-3895 URL: https://issues.apache.org/jira/browse/LUCENE-3895 Project: Lucene - Java Issue Type: Bug Components: general/test Affects Versions: 4.0 Reporter: Robert Muir Attachments: LUCENE-3895.patch See https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12822/console as an example. This is at least affecting 4.0, maybe 3.x too
[jira] [Commented] (LUCENE-3895) Not getting random-seed/reproduce-with if a test fails from another thread
[ https://issues.apache.org/jira/browse/LUCENE-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234189#comment-13234189 ] Dawid Weiss commented on LUCENE-3895: - Feel free to commit in (4.0/3.x?), Robert. Not getting random-seed/reproduce-with if a test fails from another thread -- Key: LUCENE-3895 URL: https://issues.apache.org/jira/browse/LUCENE-3895 Project: Lucene - Java Issue Type: Bug Components: general/test Affects Versions: 4.0 Reporter: Robert Muir Attachments: LUCENE-3895.patch, LUCENE-3895.patch See https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12822/console as an example. This is at least affecting 4.0, maybe 3.x too
[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println
[ https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234881#comment-13234881 ] Dawid Weiss commented on LUCENE-3877: - You can just as well substitute your own implementation of PrintStream using System.setOut/setErr and check stacks on printlns... But I agree with Benson that a static analysis approach is much cleaner. Don't know if there's anything out of the box in findbugs/pmd, but even if not, this can be done as a 10-liner by applying an aspect to classes via aspectj and parsing the output logs to detect whether the aspect has been applied (it shouldn't match anywhere). Lucene should not call System.out.println - Key: LUCENE-3877 URL: https://issues.apache.org/jira/browse/LUCENE-3877 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Fix For: 3.6, 4.0 Attachments: IllegalSystemTest.java, IllegalSystemTest.java We seem to have accumulated a few random sops... Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least. Can we somehow detect (eg, have a test failure) if we accidentally leave errant System.out.println's (leftover from debugging)...?
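The runtime variant mentioned in this comment (swap in your own PrintStream via System.setOut and look at the stack on every print) could look roughly like the sketch below. The names SysOutGuard, Offender and watchedPrefix are made up for illustration; this is not the attached IllegalSystemTest.java, just one way the idea might be wired up. It overrides the two byte-level write methods because the print/println variants of java.io.PrintStream end up writing through them.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** PrintStream that remembers which watched classes printed through it (hypothetical helper). */
public final class SysOutGuard extends PrintStream {
    private final String watchedPrefix;
    private final Set<String> offenders =
        Collections.synchronizedSet(new LinkedHashSet<String>());

    public SysOutGuard(OutputStream sink, String watchedPrefix) {
        super(sink, true);
        this.watchedPrefix = watchedPrefix;
    }

    @Override
    public void write(byte[] buf, int off, int len) {
        recordCaller();
        super.write(buf, off, len);
    }

    @Override
    public void write(int b) {
        recordCaller();
        super.write(b);
    }

    /** Records the innermost stack frame whose class name starts with the watched prefix. */
    private void recordCaller() {
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            if (frame.getClassName().startsWith(watchedPrefix)) {
                offenders.add(frame.toString());
                return;
            }
        }
    }

    public Set<String> offenders() {
        return offenders;
    }

    /** Stand-in for library code that left a debugging println behind. */
    public static final class Offender {
        public static void run() {
            System.out.println("debug");
        }
    }

    public static void main(String[] args) {
        PrintStream original = System.out;
        SysOutGuard guard = new SysOutGuard(original, Offender.class.getName());
        System.setOut(guard);
        try {
            Offender.run();
        } finally {
            System.setOut(original); // always restore the real stream
        }
        original.println("offending frames: " + guard.offenders());
    }
}
```

In a test run the prefix would presumably be something like org.apache.lucene, and the test would fail if offenders() is non-empty at the end. This only catches printlns on code paths that actually execute, which is part of why the static approaches discussed in this issue look cleaner.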
[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println
[ https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235087#comment-13235087 ] Dawid Weiss commented on LUCENE-3877: - fyi. PMD has a rule for this -- SystemPrintln. http://pmd.sourceforge.net/rules/index.html Didn't check the details though. Lucene should not call System.out.println - Key: LUCENE-3877 URL: https://issues.apache.org/jira/browse/LUCENE-3877 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Fix For: 3.6, 4.0 Attachments: IllegalSystemTest.java, IllegalSystemTest.java, SystemPrintCheck.java We seem to have accumulated a few random sops... Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least. Can we somehow detect (eg, have a test failure) if we accidentally leave errant System.out.println's (leftover from debugging)...?
[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println
[ https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235112#comment-13235112 ] Dawid Weiss commented on LUCENE-3877: - I don't like PMD that much either, I'm just saying it seems to have it. If I were to choose though, I'd use aspectj rather than asm-based code. It just seems cleaner to me.
{code}
public aspect NoSysOuts {
  // Matches any read of a static PrintStream field of System (out, err) from Lucene code.
  before(): within(org.apache.lucene..*) && get(static java.io.PrintStream System.*) {
    throw new RuntimeException("Attempted sysout/syserr/sysin access.");
  }
}
{code}
You don't even need to run it, just weave with verbose output and see if the aspect matched anywhere. Lucene should not call System.out.println - Key: LUCENE-3877 URL: https://issues.apache.org/jira/browse/LUCENE-3877 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Fix For: 3.6, 4.0 Attachments: IllegalSystemTest.java, IllegalSystemTest.java, SystemPrintCheck.java We seem to have accumulated a few random sops... Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least. Can we somehow detect (eg, have a test failure) if we accidentally leave errant System.out.println's (leftover from debugging)...?
[jira] [Commented] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format
[ https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233408#comment-13233408 ] Dawid Weiss commented on SOLR-3258: --- And here comes the moment where my knowledge of Solr ends :) I'd say there is definitely a bug in improper handling of HTTP response status (and this should be fixed), unless there is a filter somewhere that emits this HTML and fakes HTTP 200... But as for the cause of why this happens in general -- no idea. Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format Key: SOLR-3258 URL: https://issues.apache.org/jira/browse/SOLR-3258 Project: Solr Issue Type: Bug Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 13:55:51 Reporter: Markus Jelsma Fix For: 4.0 Attachments: debugging.patch In a test set-up with nodes=2, shards=3 and cores=6 we often see this exception in the logs. It is thrown once every few ping requests; the other requests return a proper OK.
Ping request handler:
{code}
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="qt">select</str>
    <str name="q">*:*</str>
    <int name="rows">0</int>
  </lst>
  <lst name="defaults">
    <str name="wt">json</str>
    <str name="echoParams">all</str>
    <bool name="omitHeader">true</bool>
  </lst>
</requestHandler>
{code}
Exception:
{code}
2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] webapp=/solr path=/admin/ping params={} status=500 QTime=7
2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - [http-80-18] - : null:org.apache.solr.common.SolrException: Ping query caused exception: org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
  at org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
  at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
  at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
  at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
  at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
  at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
  at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
  at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
  at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
  at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
  at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
  at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
  at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
  at org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
  ... 16 more
Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
  at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
  at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
  at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at
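A note on the "expected 2, but 60" part of the message: 60 is the ASCII code of '<', which fits the guess in the comment that the client was handed an HTML page where it expected binary data. The sketch below only illustrates that first-byte reasoning. FirstByteCheck and describe are hypothetical names, and the code does not reproduce Solr's actual codec; the value 2 is taken from the error message itself.

```java
import java.nio.charset.StandardCharsets;

/** Illustrates how a leading '<' shows up as "60" in a version check (hypothetical helper). */
public final class FirstByteCheck {
    // Taken from the error message above ("expected 2"); not read from Solr's source.
    static final int EXPECTED_VERSION = 2;

    /** Describes the first byte of a response body. */
    static String describe(byte[] body) {
        if (body.length == 0) {
            return "empty body";
        }
        int first = body[0] & 0xff; // treat the byte as unsigned
        if (first == EXPECTED_VERSION) {
            return "first byte is " + first + ": looks like the expected binary format";
        }
        if (first == '<') {
            return "first byte is " + first + " ('<'): probably HTML or XML, e.g. an error page";
        }
        return "first byte is " + first + ": unexpected format";
    }

    public static void main(String[] args) {
        byte[] htmlErrorPage = "<html><body>error</body></html>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(describe(htmlErrorPage));
        System.out.println(describe(new byte[] {2, 0, 0}));
    }
}
```

If that reading is right, checking the HTTP status and content type before handing the body to the binary parser would turn this into a much clearer error, which is what the comment suggests should be fixed.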
[jira] [Commented] (LUCENE-3893) TermsFilter should use AutomatonQuery
[ https://issues.apache.org/jira/browse/LUCENE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233711#comment-13233711 ] Dawid Weiss commented on LUCENE-3893: - bq. Dahiwikwukblabla Daciuk, the name is Jan Daciuk :) Although the same algorithm has been discovered independently by Stoyan Mihov and (I think) Bruce W. Watson. TermsFilter should use AutomatonQuery - Key: LUCENE-3893 URL: https://issues.apache.org/jira/browse/LUCENE-3893 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Labels: gsoc2012, lucene-gsoc-12 I think we could see perf gains if TermsFilter sorted the terms, built a minimal automaton, and used TermsEnum.intersect to visit the terms... This idea came up on the dev list recently.
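For readers who have not seen it, the "sort the terms, build a minimal automaton" step can be sketched in plain Java as below. This is a compact take on the incremental construction for sorted input that the comment attributes to Daciuk (and others), written from the general description of the algorithm. It is not Lucene's implementation, it works on UTF-16 chars rather than bytes, and MinimalTermAutomaton and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Minimal acyclic automaton over a set of terms, built incrementally from sorted input (sketch). */
public final class MinimalTermAutomaton {
    /** Equality is structural (same accept flag, same labels, identical targets) so equivalent states merge. */
    static final class State {
        final TreeMap<Character, State> next = new TreeMap<>();
        boolean accept;

        @Override public boolean equals(Object o) {
            if (!(o instanceof State)) return false;
            State s = (State) o;
            if (accept != s.accept || next.size() != s.next.size()) return false;
            for (Map.Entry<Character, State> e : next.entrySet()) {
                if (s.next.get(e.getKey()) != e.getValue()) return false; // targets compared by identity
            }
            return true;
        }

        @Override public int hashCode() {
            int h = accept ? 1 : 0;
            for (Map.Entry<Character, State> e : next.entrySet()) {
                h = 31 * h + e.getKey() * 31 + System.identityHashCode(e.getValue());
            }
            return h;
        }
    }

    private final State root = new State();
    private final Map<State, State> register = new HashMap<>(); // one representative per equivalence class

    public MinimalTermAutomaton(Collection<String> terms) {
        for (String term : new TreeSet<>(terms)) { // the construction relies on sorted, distinct input
            State state = root;
            int i = 0;
            // Follow the prefix shared with the previously added term.
            while (i < term.length() && state.next.containsKey(term.charAt(i))) {
                state = state.next.get(term.charAt(i++));
            }
            // The rest of the previous term can no longer change, so minimize it.
            if (!state.next.isEmpty()) replaceOrRegister(state);
            // Append the remaining suffix as a fresh chain of states.
            for (; i < term.length(); i++) {
                State fresh = new State();
                state.next.put(term.charAt(i), fresh);
                state = fresh;
            }
            state.accept = true;
        }
        if (!root.next.isEmpty()) replaceOrRegister(root);
    }

    /** Minimizes the most recently added path below the given state, deepest states first. */
    private void replaceOrRegister(State state) {
        Map.Entry<Character, State> last = state.next.lastEntry();
        State child = last.getValue();
        if (!child.next.isEmpty()) replaceOrRegister(child);
        State equivalent = register.get(child);
        if (equivalent != null) {
            state.next.put(last.getKey(), equivalent); // reuse the existing equivalent state
        } else {
            register.put(child, child);
        }
    }

    public boolean accepts(String term) {
        State state = root;
        for (int i = 0; i < term.length() && state != null; i++) {
            state = state.next.get(term.charAt(i));
        }
        return state != null && state.accept;
    }

    /** Number of distinct states reachable from the root. */
    public int stateCount() {
        Set<State> seen = Collections.newSetFromMap(new IdentityHashMap<State, Boolean>());
        Deque<State> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            State s = todo.pop();
            if (seen.add(s)) todo.addAll(s.next.values());
        }
        return seen.size();
    }
}
```

For the terms car, cars, cat and cats a plain trie needs 7 states, while this construction ends up with 5, because the states reached after "car" and "cat" are equivalent and get merged. How such an automaton would then be intersected with the terms dictionary is a separate question that this sketch does not touch.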