
[jira] [Commented] (LUCENE-3977) generated/duplicated javadocs are wasteful and bloat the release

2012-04-20 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258443#comment-13258443
 ] 

Dawid Weiss commented on LUCENE-3977:
-

It's funny -- I feel the same way Uwe does, but at the same time I have
absolutely never looked at the offline javadocs that I downloaded with
distributions of open source projects. It's usually faster to just find them
online.

 generated/duplicated javadocs are wasteful and bloat the release
 

 Key: LUCENE-3977
 URL: https://issues.apache.org/jira/browse/LUCENE-3977
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/javadocs
Reporter: Robert Muir
Priority: Blocker
 Fix For: 4.0

 Attachments: LUCENE-3977-triplication.patch, LUCENE-3977.patch, 
 LUCENE-3977.patch, LUCENE-3977.patch


 Some stats for the generated javadocs of 3.6:
 * 9,146 files
 * 161,872 KB uncompressed
 * 25MB compressed (this is responsible for nearly half of our binary release)
 The fact that we intentionally double our javadocs size with the
 'javadocs-all' thing is truly wasteful, and compression doesn't help at all.
 Just testing, I nuked 'all' and found:
 * 4,944 files
 * 81,084 KB uncompressed
 * 12.8MB compressed
 We need to clean this up for 4.0. We only need to ship javadocs 'one way'.
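
As a quick sanity check on the figures above, the savings from dropping 'javadocs-all' can be worked out directly. This is only a sketch: the class and method names are made up for illustration, and the numbers are the ones quoted in the report.

```java
import java.util.Locale;

/** Illustrative only: works out the savings from the figures quoted in the issue. */
final class JavadocSavings {

    /** Percentage by which {@code after} is smaller than {@code before}. */
    static double reductionPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // Figures from the issue: with 'javadocs-all' versus without it.
        System.out.printf(Locale.ROOT, "files:        %.1f%% fewer%n",
                reductionPercent(9_146, 4_944));
        System.out.printf(Locale.ROOT, "uncompressed: %.1f%% smaller%n",
                reductionPercent(161_872, 81_084));
        System.out.printf(Locale.ROOT, "compressed:   %.1f%% smaller%n",
                reductionPercent(25.0, 12.8));
    }
}
```

All three measures drop by close to half, which is consistent with the claim that 'javadocs-all' roughly doubles the documentation.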

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-4000) Non-redirected JVM output causes build errors

2012-04-19 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257884#comment-13257884
 ] 

Dawid Weiss commented on LUCENE-4000:
-

Not so harmless after all. Code cache exhaustion seems to trigger a fallback to 
interpreted mode and this makes tests run forever.
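
One way to see how close a JVM is to exhausting its code cache is to read the memory pools through the standard java.lang.management API. The sketch below is an assumption about how one might inspect this, not something taken from the build: the class name is hypothetical, and the pool-name match relies on HotSpot naming its pools "Code Cache" or "CodeHeap ...", which other JVMs may not do.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative only: lists memory pools that look like JIT code storage. */
final class CodeCacheReport {

    /** Returns one line per memory pool whose name mentions "code". */
    static List<String> describeCodePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: HotSpot-style pool names; other JVMs may name these differently.
            if (!pool.getName().toLowerCase(Locale.ROOT).contains("code")) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() is -1 when the pool has no defined upper limit.
            lines.add(pool.getName() + ": used=" + usage.getUsed() + " max=" + usage.getMax());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCodePools().forEach(System.out::println);
    }
}
```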

 Non-redirected JVM output causes build errors
 -

 Key: LUCENE-4000
 URL: https://issues.apache.org/jira/browse/LUCENE-4000
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
 Fix For: 4.0


 https://builds.apache.org/job/Lucene-Trunk/1899/consoleText
 Code cache JVM warning. Harmless but causes build errors. 


[jira] [Commented] (LUCENE-3994) some nightly tests take hours

2012-04-18 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256550#comment-13256550
 ] 

Dawid Weiss commented on LUCENE-3994:
-

I've already fixed that constant per-suite randomization on GitHub, but I'll
need some time to push it to Maven Central, etc.

 some nightly tests take hours
 -

 Key: LUCENE-3994
 URL: https://issues.apache.org/jira/browse/LUCENE-3994
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3994.patch


 The nightly builds are taking 4-7 hours.
 This is caused by a few bad apples (as can be seen at
 https://builds.apache.org/job/Lucene-trunk/1896/testReport/).
 The top 5 are (all in analysis):
 * TestSynonymMapFilter: 1 hr 54 min
 * TestRandomChains: 1 hr 22 min
 * TestRemoveDuplicatesTokenFilter: 32 min
 * TestMappingCharFilter: 28 min
 * TestWordDelimiterFilter: 22 min
 So that's about 4.5 hours right there for that run.
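
The total for the five suites listed above can be checked with java.time.Duration. This is a small sketch with a hypothetical class name; only the five timings come from the report.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: adds up the five slowest suite times from the report. */
final class SlowSuiteTotal {

    /** Sums a list of durations. */
    static Duration total(List<Duration> durations) {
        Duration sum = Duration.ZERO;
        for (Duration d : durations) {
            sum = sum.plus(d);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Duration> top5 = Arrays.asList(
                Duration.ofMinutes(114), // TestSynonymMapFilter: 1 hr 54 min
                Duration.ofMinutes(82),  // TestRandomChains: 1 hr 22 min
                Duration.ofMinutes(32),  // TestRemoveDuplicatesTokenFilter
                Duration.ofMinutes(28),  // TestMappingCharFilter
                Duration.ofMinutes(22)); // TestWordDelimiterFilter
        Duration sum = total(top5);
        System.out.println(sum.toHours() + " h " + (sum.toMinutes() % 60) + " min");
    }
}
```

The sum is 278 minutes, a little over the 4.5 hours mentioned in the report.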


[jira] [Commented] (LUCENE-3995) In LuceneTestCase.beforeClass, make a new random (also using the class hashcode) to vary defaults

2012-04-18 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256650#comment-13256650
 ] 

Dawid Weiss commented on LUCENE-3995:
-

Robert, this would mean it works fine, right (note the randomVal dumped for
each suite)?
{noformat}
Executing 296 suites with 4 JVMs.
Suite: org.apache.lucene.util.TestCloseableThreadLocal
(@BeforeClass output)
  1 randomVal: 9
  1 

OK  0.05s J1 | TestCloseableThreadLocal.testDefaultValueWithoutSetting
OK  0.01s J1 | TestCloseableThreadLocal.testInitValue
OK  0.01s J1 | TestCloseableThreadLocal.testNullValue
Completed on J1 in 0.27s, 3 tests
 
Suite: org.apache.lucene.util.TestTwoPhaseCommitTool
(@BeforeClass output)
  1 randomVal: 6
  1 

OK  0.04s J2 | TestTwoPhaseCommitTool.testRollback
OK  0.01s J2 | TestTwoPhaseCommitTool.testNullTPCs
OK  0.01s J2 | TestTwoPhaseCommitTool.testWrapper
OK  0.01s J2 | TestTwoPhaseCommitTool.testPrepareThenCommit
Completed on J2 in 0.37s, 4 tests
 
Suite: org.apache.lucene.util.TestNamedSPILoader
(@BeforeClass output)
  1 randomVal: 7
  1 

OK  0.04s J0 | TestNamedSPILoader.testAvailableServices
OK  0.01s J0 | TestNamedSPILoader.testBogusLookup
OK  0.01s J0 | TestNamedSPILoader.testLookup
Completed on J0 in 0.34s, 3 tests
 
Suite: org.apache.lucene.util.TestSmallFloat
(@BeforeClass output)
  1 randomVal: 2
  1 

OK  0.20s J3 | TestSmallFloat.testFloatToByte
OK  0.01s J3 | TestSmallFloat.testByteToFloat
Completed on J3 in 0.48s, 2 tests
 
Suite: org.apache.lucene.index.TestTerm
(@BeforeClass output)
  1 randomVal: 0
  1  
{noformat}

 In LuceneTestCase.beforeClass, make a new random (also using the class 
 hashcode) to vary defaults
 -

 Key: LUCENE-3995
 URL: https://issues.apache.org/jira/browse/LUCENE-3995
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Dawid Weiss

 In LuceneTestCase, we set many static defaults like:
 * default codec
 * default infostream impl
 * default locale
 * default timezone
 * default similarity
 Currently each test run gets a single seed for the run, which means, for
 example, that across one test run every single test will have, say,
 SimpleText + infostream=off + Locale=german + timezone=EDT + similarity=BM25.
 Because of that, we lose lots of basic mixed coverage across tests, and it
 also means the unfortunate individual who gets SimpleText or other slow
 options gets a REALLY SLOW test run, rather than amortizing this across all
 test runs.
 We should at least make a new random (getRandom() ^ className.hashCode()) to
 fix this so it works like before, but unfortunately that only fixes it for
 LuceneTestCase.
 Won't any subclasses that make random decisions in @BeforeClass (and we have
 many) still have the same problem?
 Maybe RandomizedRunner can instead be improved here?
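
The idea of mixing the run-wide seed with the class name can be sketched as below. This is not the LuceneTestCase or RandomizedRunner code; the class, the method names and the example seed are all hypothetical, and the only thing taken from the issue is the "seed XOR className.hashCode()" idea.

```java
import java.util.Random;

/** Illustrative only: derives a reproducible per-suite Random from one run seed. */
final class PerClassSeed {

    /** Mixes the run-wide seed with the suite's class name. */
    static long seedFor(long runSeed, Class<?> suite) {
        // String.hashCode() is specified by the JDK, so this is stable across JVMs.
        return runSeed ^ suite.getName().hashCode();
    }

    /** A Random that is the same for (runSeed, suite) on every run. */
    static Random randomFor(long runSeed, Class<?> suite) {
        return new Random(seedFor(runSeed, suite));
    }

    public static void main(String[] args) {
        long runSeed = 0xDEADBEEFL; // hypothetical master seed for this run
        for (Class<?> suite : new Class<?>[] {String.class, Integer.class}) {
            int randomVal = randomFor(runSeed, suite).nextInt(10);
            System.out.println(suite.getSimpleName() + " randomVal: " + randomVal);
        }
    }
}
```

The sketch hashes the class name rather than calling hashCode() on the Class object, because Class inherits the identity-based Object.hashCode(), which would not reproduce from one JVM run to the next.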


[jira] [Commented] (LUCENE-3987) Ivy/maven config to pull from sonatype releases

2012-04-18 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256657#comment-13256657
 ] 

Dawid Weiss commented on LUCENE-3987:
-

After some deliberation I would like to add an ivysettings.xml to the
test-framework module, which would allow this module to fetch dependencies from
an additional repository (Sonatype releases). I will also add this to the
corresponding Maven descriptor so the two stay in sync.

Maintenance-wise this is not an issue -- Sonatype mirrors to Central, so
effectively they're the same, except that there is no lag between releases and
syncs.

 Ivy/maven config to pull from sonatype releases
 ---

 Key: LUCENE-3987
 URL: https://issues.apache.org/jira/browse/LUCENE-3987
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Attachments: ivy-sonatype.patch





[jira] [Commented] (LUCENE-3994) some nightly tests take hours

2012-04-17 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255790#comment-13255790
 ] 

Dawid Weiss commented on LUCENE-3994:
-

You could also update the statistics -- remove the previous ones and run two or
three times, then update.

Alternatively, we could have jenkins update stats and fetch these from time to 
time.



[jira] [Commented] (LUCENE-3994) some nightly tests take hours

2012-04-17 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255803#comment-13255803
 ] 

Dawid Weiss commented on LUCENE-3994:
-

Ok. I'll recalculate them from time to time. There is a large variance in tests 
anyway (this can also be computed from log stats because we can keep a history 
of N runs... it'd be interesting to see which tests have the largest variance).
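
Ranking suites by run-time variance over a history of N runs could look roughly like this. It is a sketch under assumptions: the history is taken to be a map from suite name to recorded times in seconds, the class name is hypothetical, and the sample timings in main are invented for illustration, not measured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: orders suites by the variance of their recorded run times. */
final class SuiteTimeVariance {

    /** Population variance of the recorded run times (in seconds). */
    static double variance(List<Double> seconds) {
        if (seconds.isEmpty()) {
            throw new IllegalArgumentException("no samples");
        }
        double mean = 0;
        for (double s : seconds) {
            mean += s;
        }
        mean /= seconds.size();
        double sumOfSquares = 0;
        for (double s : seconds) {
            sumOfSquares += (s - mean) * (s - mean);
        }
        return sumOfSquares / seconds.size();
    }

    /** Suite names ordered from largest to smallest variance. */
    static List<String> byVarianceDescending(Map<String, List<Double>> history) {
        List<String> names = new ArrayList<>(history.keySet());
        names.sort(Comparator.comparingDouble((String n) -> variance(history.get(n))).reversed());
        return names;
    }

    public static void main(String[] args) {
        // Invented sample data: three recorded runs per suite.
        Map<String, List<Double>> history = new LinkedHashMap<>();
        history.put("TestSteady", Arrays.asList(10.0, 10.5, 9.5));
        history.put("TestErratic", Arrays.asList(30.0, 900.0, 120.0));
        System.out.println(byVarianceDescending(history));
    }
}
```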



[jira] [Commented] (LUCENE-3995) In LuceneTestCase.beforeClass, make a new random (also using the class hashcode) to vary defaults

2012-04-17 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255961#comment-13255961
 ] 

Dawid Weiss commented on LUCENE-3995:
-

Note to myself - this also affects test coverage because it reduces static
context entropy (as pointed out by Robert and Uwe).



[jira] [Commented] (LUCENE-3988) improve test output to be nicer to 80chars long terminals

2012-04-16 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254734#comment-13254734
 ] 

Dawid Weiss commented on LUCENE-3988:
-

So change it like I suggested -- I can't please everybody. If it bothers you, 
change it:
{noformat}
useSimpleNames=false
maxClassNameColumns=100 
{noformat}
or remove maxClassNameColumns entirely.

 improve test output to be nicer to 80chars long terminals
 -

 Key: LUCENE-3988
 URL: https://issues.apache.org/jira/browse/LUCENE-3988
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Reporter: Robert Muir
 Fix For: 4.0


 These lines tend to always use 82 chars:
 {noformat}
 [junit4] Tests run:   4, Failures:   0, Errors:   0, Skipped:   0, Time:  3.97s
 {noformat}
 Can we remove some of the spaces so it fits? Maybe remove the word 'run' from
 'Tests run'.
 Occasionally (not always) long classnames wrap too: 'Running
 org.apache.lucene.this.that.TestFoo' ... maybe just print the short classname?


[jira] [Commented] (LUCENE-3992) TestIndexWriterOnJRECrash failure

2012-04-16 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254908#comment-13254908
 ] 

Dawid Weiss commented on LUCENE-3992:
-

I see why it slipped through -- I ran @Nightly only once or twice, while the
build server was running the regular daily tests... Thanks for fixing it.

 TestIndexWriterOnJRECrash failure
 -

 Key: LUCENE-3992
 URL: https://issues.apache.org/jira/browse/LUCENE-3992
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-3992.patch


 triggered this beasting a bunch of tests... gonna probably be hard to 
 reproduce...


[jira] [Commented] (LUCENE-3987) Ivy/maven config to pull from sonatype releases

2012-04-15 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254325#comment-13254325
 ] 

Dawid Weiss commented on LUCENE-3987:
-

I don't want to merge this in (note no fix version). I just filed it for 
reference in case somebody needs it.





[jira] [Commented] (SOLR-2161) BasicDistributedZkTest.testDistribSearch test failure

2012-04-15 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254416#comment-13254416
 ] 

Dawid Weiss commented on SOLR-2161:
---

This test fails very frequently. The most recent failure is here:
https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2235/

I am putting it in an @AwaitsFix group.

 BasicDistributedZkTest.testDistribSearch test failure
 -

 Key: SOLR-2161
 URL: https://issues.apache.org/jira/browse/SOLR-2161
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.0
 Environment: Hudson
Reporter: Robert Muir
 Fix For: 4.0


 BasicDistributedZkTest.testDistribSearch failed in Hudson.
 Here is the stacktrace:
 {noformat}
 [junit] Testsuite: org.apache.solr.cloud.BasicDistributedZkTest
 [junit] Testcase: testDistribSearch(org.apache.solr.cloud.BasicDistributedZkTest): Caused an ERROR
 [junit] Error executing query
 [junit] org.apache.solr.client.solrj.SolrServerException: Error executing query
 [junit]   at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
 [junit]   at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:119)
 [junit]   at org.apache.solr.BaseDistributedSearchTestCase.queryServer(BaseDistributedSearchTestCase.java:290)
 [junit]   at org.apache.solr.cloud.BasicDistributedZkTest.queryServer(BasicDistributedZkTest.java:256)
 [junit]   at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:305)
 [junit]   at org.apache.solr.cloud.BasicDistributedZkTest.doTest(BasicDistributedZkTest.java:227)
 [junit]   at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:562)
 [junit]   at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:795)
 [junit]   at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:768)
 [junit] Caused by: org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond  org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond   at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:318)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1325)at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
  at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
   at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388) 
 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)   
   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) 
 at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) 
 at org.mortbay.jetty.Server.handle(Server.java:326) at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)  
 at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)  at 
 org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at 
 org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at 
 org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)  
at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) 
 Caused by: org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond at 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483)
   at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.reque
 [junit] 
 [junit] org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond  org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respondat 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:318)
 at

[jira] [Commented] (LUCENE-3988) improve test output to be nicer to 80chars long terminals

2012-04-15 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254422#comment-13254422
 ] 

Dawid Weiss commented on LUCENE-3988:
-

I thought about this a bit. The previous output was a mirror of surefire. After 
some deliberation I don't think it makes sense to present the information so 
verbosely (0 errors, 0 failures, etc.). How about this:

{noformat}
   [junit4] Suite: TestReversedWildcardFilterFactory
   [junit4] Time:  3.00s, 4 tests
   [junit4]  
   [junit4] Suite: [...]r.update.processor.UniqFieldsUpdateProcessorFactoryTest
   [junit4] Time:  3.00s, 4 tests, 1 skipped
   [junit4]  
   [junit4] Running org.apache.solr.spelling.SpellPossibilityIteratorTest
   [junit4] Time:  3.00s, 4 tests, 1 error   FAILURES!
   [junit4]  
   [junit4] Suite: org.buhu.update.processor.BlahBlag
   [junit4] Time:  3.00s, 4 tests, 1 error, 2 failures, 1 skipped
{noformat}

The test name would be displayed in full, or truncated (with an ellipsis) to
fit into the desired number of columns (80 by default)?
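
Truncating a long suite name from the left, as in the "[...]r.update.processor..." line of the mock-up, could be done along these lines. This is a sketch of the idea only, not the junit4 task's actual code; the class and method names are hypothetical.

```java
/** Illustrative only: fits a class name into a fixed number of columns. */
final class SuiteNameFormatter {

    private static final String ELLIPSIS = "[...]";

    /**
     * Returns the name unchanged if it fits, otherwise the ellipsis marker
     * followed by as much of the tail of the name as the width allows.
     */
    static String fit(String className, int maxColumns) {
        if (className.length() <= maxColumns) {
            return className;
        }
        if (maxColumns < ELLIPSIS.length()) {
            throw new IllegalArgumentException("maxColumns too small: " + maxColumns);
        }
        int keep = maxColumns - ELLIPSIS.length();
        return ELLIPSIS + className.substring(className.length() - keep);
    }

    public static void main(String[] args) {
        String name = "org.apache.solr.update.processor.UniqFieldsUpdateProcessorFactoryTest";
        System.out.println(fit(name, 80)); // short enough, printed in full
        System.out.println(fit(name, 60)); // truncated from the left
    }
}
```

Keeping the tail rather than the head preserves the simple class name, which is usually the part that identifies the suite.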



[jira] [Commented] (LUCENE-3971) MappingCharFilter rarely has wrong correctOffset (for finalOffset)

2012-04-14 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254017#comment-13254017
 ] 

Dawid Weiss commented on LUCENE-3971:
-

Passes for me with multiple runs. I'll commit it in.

 MappingCharFilter rarely has wrong correctOffset (for finalOffset) 
 ---

 Key: LUCENE-3971
 URL: https://issues.apache.org/jira/browse/LUCENE-3971
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/analysis
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3971.patch, LUCENE-3971_test.patch


 Found this bug over on LUCENE-3969, but I'm currently tracking a ton of bugs,
 so I figured I would open an issue and see if this one is obvious to anyone.
 Consider this input string: gzw f quaxot (length = 12) with a
 WhitespaceTokenizer.
 If I have mapping rules like this, then it works:
 {noformat}
 t = 
 {noformat}
 But if I have mapping rules like this:
 {noformat}
 t = 
 tmakdbl = c
 {noformat}
 Then it will compute final offset wrong:
 {noformat}
 [junit] junit.framework.AssertionFailedError: finalOffset  expected:<12> but was:<11>
 {noformat}
 Looks like some logic/recursion bug in the correctOffset method? The second
 rule is not even used for this string; it just happens to also start with 't'.


[jira] [Commented] (LUCENE-3808) Switch LuceneTestCaseRunner to RandomizedRunner. Enforce Random sharing contracts. Enforce thread leaks.

2012-04-14 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254054#comment-13254054
]

Dawid Weiss commented on LUCENE-3808:
-

I'm planning to merge the code from the GitHub branch into trunk this weekend.
It's been running in parallel for some time now on my build server and it seems
to have the same failure coverage, and at the same time it is a start on
cleaning up LuceneTestCase and the associated test code.

I hope you'll also like the new infrastructure -- I will elaborate on this a
bit once it is merged.

Switch LuceneTestCaseRunner to RandomizedRunner. Enforce Random sharing
contracts. Enforce thread leaks.

Key: LUCENE-3808
URL: https://issues.apache.org/jira/browse/LUCENE-3808
Project: Lucene - Java
Issue Type: Sub-task
Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
Fix For: 4.0

Dev. branch at: https://github.com/dweiss/lucene_solr/tree/rr
Switch the runner to RandomizedRunner. Enforce the following:
- (/) Random sharing will result in a failure/ exception.
- (/) -Add a validator for testXXX without @Test annotation.- (custom test
provider added).
- (/) Make sure tests are executed with assertions enabled (at least for
solr/lucene packages).
- (/) Add a validator for static hook shadowing (no-no).
- (/) Modify custom execution groups in LTC to be real @Groups.
- Thread leaks will result in a failure (add lingering if needed, but no
ignores). [this is done, but disabled]
- Add a validator for @Test method overrides (check how many of these we
already have first).
- What to do with thread-shared Random instances copies in MockIndexWriter
and MockAnalyzer?


[jira] [Commented] (LUCENE-3984) Add a target to recalculate SHA1 checksums for JAR

2012-04-14 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254170#comment-13254170
 ] 

Dawid Weiss commented on LUCENE-3984:
-

Can I commit this in as a top-level target? It shouldn't matter for svn/git 
files that don't change (their timestamps will but contents will not) and it 
helps folks on Windows who can't use Hoss's magic bash pipe (doesn't this sound 
wrong somehow?).

 Add a target to recalculate SHA1 checksums for JAR
 --

 Key: LUCENE-3984
 URL: https://issues.apache.org/jira/browse/LUCENE-3984
 Project: Lucene - Java
  Issue Type: Task
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 4.0


 Something like this. Either top-level or common-build.xml?
 {noformat}
   <target name="refresh-checksums">
     <checksum algorithm="SHA1">
       <fileset dir="${basedir}">
         <include name="**/*.jar"/>
       </fileset>
     </checksum>
   </target>
 {noformat}
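
For readers who want to verify a checksum outside Ant, the same SHA-1 digest can be computed with the JDK's MessageDigest. This is a minimal sketch with a hypothetical class name; it is not part of the proposed build target and reads each file fully into memory for simplicity.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative only: computes the hex SHA-1 digest of a file or byte array. */
final class Sha1Checksum {

    /** Lower-case hex SHA-1 of the given bytes. */
    static String sha1Hex(byte[] data) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(data)) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** Lower-case hex SHA-1 of the file's contents. */
    static String sha1Hex(Path file) throws IOException {
        return sha1Hex(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(sha1Hex(Paths.get(arg)) + "  " + arg);
        }
    }
}
```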


[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs

2012-04-13 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253180#comment-13253180
]

Dawid Weiss commented on LUCENE-3973:
-

bq. Unless you run into the same taskdef/classloader/sub-build/permgen-OOM

I was just saying to fetch them via ivy and then spawn a separate jvm to run
them, much like you'd do anyway if they are separate installations.

Besides -- we already have an 'ivy warning with instructions'; the same can be
done for permgen/OOM problems -- detect the settings of the current (Ant's) VM
(this can be done via an MX bean) and warn or fail the build if the defaults
are too low, instructing the user to set up ANT_OPTS properly...
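
A rough sketch of what such a check could look like, using the standard java.lang.management beans. The class name, the thresholds and the "Perm Gen" name fragment are assumptions for illustration; pool names vary between JVM vendors and versions, and newer HotSpot releases have no permanent generation at all.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical check, not part of the Lucene build.
final class VmSettingsCheck {
  /** Max size in bytes of the first memory pool whose name contains the fragment, or -1 if unknown. */
  static long poolMaxBytes(String nameFragment) {
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      if (pool.getName().contains(nameFragment)) {
        MemoryUsage usage = pool.getUsage();
        return usage == null ? -1 : usage.getMax();
      }
    }
    return -1;
  }

  /** Returns a warning message, or null if the limit is high enough or unknown. */
  static String check(String what, long actualBytes, long requiredBytes) {
    if (actualBytes < 0 || actualBytes >= requiredBytes) {
      return null;
    }
    long mb = 1024L * 1024L;
    return what + " limit is " + (actualBytes / mb) + " MB, below the suggested "
        + (requiredBytes / mb) + " MB; consider raising it via ANT_OPTS";
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    // Thresholds are illustrative only.
    String[] warnings = {
      check("max heap", Runtime.getRuntime().maxMemory(), 256 * mb),
      check("PermGen", poolMaxBytes("Perm Gen"), 128 * mb)
    };
    for (String w : warnings) {
      if (w != null) {
        System.err.println("WARNING: " + w);
      }
    }
  }
}
```

Since the check inspects the JVM it runs in, it would have to run inside Ant's own VM (for example from a custom task) to say anything about ANT_OPTS.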

I'm not pressing on this; it's a non-issue.

Incorporate PMD / FindBugs
--

Key: LUCENE-3973
URL: https://issues.apache.org/jira/browse/LUCENE-3973
Project: Lucene - Java
Issue Type: Improvement
Components: general/build
Reporter: Chris Male

This has been touched on a few times over the years. Having static analysis
as part of our build seems like a big win. For example, we could use PMD to
look for {{System.out.println}} statements, as discussed in LUCENE-3877, and
we could possibly incorporate the nocommit / @author checks as well.
There are a few things to work out as part of this:
- Should we use both PMD and FindBugs or just one of them? They look at code
from different perspectives (source code vs bytecode, respectively) and target
different issues. At the moment I'm in favour of trying both but that might be
too heavy-handed for our needs.
- What checks should we use? There's no point having the analysis if it's
going to raise too many false-positives or problems we don't deem
problematic.
- How should the analysis be integrated in our build? Need to work out when
the analysis should run, how it should be incorporated in Ant and/or Maven,
what impact errors should have.


[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs

2012-04-12 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252245#comment-13252245
]

Dawid Weiss commented on LUCENE-3973:
-

There is also this interesting tool: http://babelfish.arc.nasa.gov/trac/jpf

I haven't used it and I don't know if it can handle a Lucene-sized codebase (the
number of execution paths will be astronomical) but if somebody has some time to
play with it, it'd be interesting to hear what it can do.

Incorporate PMD / FindBugs
--

Key: LUCENE-3973
URL: https://issues.apache.org/jira/browse/LUCENE-3973
Project: Lucene - Java
Issue Type: Improvement
Components: general/build
Reporter: Chris Male


[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs

2012-04-12 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252244#comment-13252244
]

Dawid Weiss commented on LUCENE-3973:
-

Both are helpful. We use both and I think FindBugs is slightly more useful than
PMD, but that's a subjective opinion, not anything I measured.

Also, both can be verbose and a pain in the ass at times when you know the code
is right and they still complain... And they take a long time to run, so I think
they should be part of jenkins nightly/smoke tests, not regular builds (and
definitely not ant test...).

Incorporate PMD / FindBugs
--

Key: LUCENE-3973
URL: https://issues.apache.org/jira/browse/LUCENE-3973
Project: Lucene - Java
Issue Type: Improvement
Components: general/build
Reporter: Chris Male


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

2012-04-12 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252463#comment-13252463
]

Dawid Weiss commented on LUCENE-3972:
-

Yes, sorry -- hash of course. The hash method that should redistribute the key
space into buckets (but currently doesn't).

As for BytesRefHash vs. BytesRef instances -- maybe that's the source of the
speedup, who knows. I would try the hash method though, if for nothing else than
curiosity. I would also patch it for the future in either case. Not
rehashing input keys is a flaw in my opinion (again -- backed by real-life
experience from HPPC).
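
To illustrate the rehashing point in general terms (this is not the code under discussion in the patch): with a power-of-two table, keys that differ only in their high bits all fall into one bucket unless they are mixed first. The sketch below uses the 32-bit finalizer from MurmurHash3 as the mixing step; the class and method names are made up for the example.

```java
import java.util.BitSet;

// Hypothetical demo class; only the mix() constants come from MurmurHash3.
final class KeyMixing {
  /** MurmurHash3 32-bit finalizer: spreads a change in any input bit over the output bits. */
  static int mix(int h) {
    h ^= h >>> 16;
    h *= 0x85ebca6b;
    h ^= h >>> 13;
    h *= 0xc2b2ae35;
    h ^= h >>> 16;
    return h;
  }

  /** Bucket index in a power-of-two sized table, with or without rehashing the key. */
  static int bucket(int key, int tableSize, boolean rehash) {
    return (rehash ? mix(key) : key) & (tableSize - 1);
  }

  /** Number of distinct buckets hit by the keys 0, stride, 2*stride, ... */
  static int distinctBuckets(int count, int stride, int tableSize, boolean rehash) {
    BitSet used = new BitSet(tableSize);
    for (int i = 0; i < count; i++) {
      used.set(bucket(i * stride, tableSize, rehash));
    }
    return used.cardinality();
  }

  public static void main(String[] args) {
    // Keys that are multiples of the table size: the worst case without mixing.
    System.out.println("no rehash: " + distinctBuckets(100, 1024, 1024, false));
    System.out.println("rehash:    " + distinctBuckets(100, 1024, 1024, true));
  }
}
```

Without the mixing step every key in this example maps to bucket 0; with it the keys spread over many buckets. The cost is a few extra arithmetic operations per lookup, which is the average-case trade-off raised in the follow-up comment.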

Improve AllGroupsCollector implementations
--

Key: LUCENE-3972
URL: https://issues.apache.org/jira/browse/LUCENE-3972
Project: Lucene - Java
Issue Type: Improvement
Components: modules/grouping
Reporter: Martijn van Groningen
Attachments: LUCENE-3972.patch, LUCENE-3972.patch

I think that the performance of TermAllGroupsCollector,
DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by
using BytesRefHash to store the groups instead of an ArrayList.


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

2012-04-12 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252486#comment-13252486
 ] 

Dawid Weiss commented on LUCENE-3972:
-

Hmmm... it's not collisions then, it was worth a try. I still find the 
difference puzzling -- I can't justify your version being 3x faster. Curious 
what it might be.

bq. But we know a lot about docids, and extra hashing should just lead to an 
average-case slowdown.

Ok.

 Improve AllGroupsCollector implementations
 --

 Key: LUCENE-3972
 URL: https://issues.apache.org/jira/browse/LUCENE-3972
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/grouping
Reporter: Martijn van Groningen
 Attachments: LUCENE-3972.patch, LUCENE-3972.patch



[jira] [Commented] (LUCENE-3973) Incorporate PMD / FindBugs

2012-04-12 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252758#comment-13252758
]

Dawid Weiss commented on LUCENE-3973:
-

I believe both pmd and findbugs are on maven repos so one could use ivy to
fetch them automatically. One thing less to think about.

Incorporate PMD / FindBugs
--

Key: LUCENE-3973
URL: https://issues.apache.org/jira/browse/LUCENE-3973
Project: Lucene - Java
Issue Type: Improvement
Components: general/build
Reporter: Chris Male


[jira] [Commented] (LUCENE-3971) MappingCharFilter rarely has wrong correctOffset (for finalOffset)

2012-04-11 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251633#comment-13251633
 ] 

Dawid Weiss commented on LUCENE-3971:
-

I think this bug is similar (if not identical) to what I fixed a while ago in 
PatternReplaceCharFilter -- I remember it suffered from an off-by-one error as 
well, and looking at the code it may be similar in structure (linked list and 
all). 

There is also the question of how this filter _should_ work -- should it be 
reluctant or greedy (match the first pattern or the longest pattern)? 
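
The difference between the two policies can be shown with a toy replacer. This is a sketch for discussion only and says nothing about how MappingCharFilter is actually implemented; the class name is made up, and "first pattern" is read here as "the shortest rule that matches at the current position".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy replacer; not Lucene code.
final class MatchPolicy {
  /**
   * Replaces rule keys found in the input, scanning left to right.
   * At each position the longest (greedy) or shortest (reluctant) matching key wins.
   */
  static String apply(String input, Map<String, String> rules, boolean longest) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    while (i < input.length()) {
      String best = null;
      for (String key : rules.keySet()) {
        if (key.isEmpty() || !input.startsWith(key, i)) {
          continue;
        }
        if (best == null
            || (longest ? key.length() > best.length() : key.length() < best.length())) {
          best = key;
        }
      }
      if (best == null) {
        out.append(input.charAt(i++)); // no rule applies, copy the character
      } else {
        out.append(rules.get(best));   // emit the replacement, skip the matched text
        i += best.length();
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    Map<String, String> rules = new LinkedHashMap<String, String>();
    rules.put("t", "");
    rules.put("tmakdbl", "c");
    System.out.println(apply("tmakdbl", rules, true));  // greedy
    System.out.println(apply("tmakdbl", rules, false)); // reluctant
  }
}
```

With the two rules from the issue, the greedy policy turns "tmakdbl" into "c" while the reluctant one only drops the leading "t". For the reported input, where only the single-character rule can match, both policies give the same text, which fits the observation that the second rule is never used there.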

 MappingCharFilter rarely has wrong correctOffset (for finalOffset) 
 ---

 Key: LUCENE-3971
 URL: https://issues.apache.org/jira/browse/LUCENE-3971
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/analysis
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3971_test.patch


 Found this bug over on LUCENE-3969, but I'm currently tracking a ton of bugs, 
 so I figure I would open an issue and see if this one is obvious to anyone.
 Consider this input string: "gzw f quaxot" (length = 12) with a 
 WhitespaceTokenizer.
 If I have mapping rules like this, then it works:
 {noformat}
 "t" => ""
 {noformat}
 But if I have mapping rules like this:
 {noformat}
 "t" => ""
 "tmakdbl" => "c"
 {noformat}
 Then it will compute the final offset wrong:
 {noformat}
 [junit] junit.framework.AssertionFailedError: finalOffset expected:<12> 
 but was:<11>
 {noformat}
 Looks like some logic/recursion bug in the correctOffset method? The second 
 rule is not even used for this string;
 it just happens to also start with 't'.


[jira] [Commented] (SOLR-3335) testDistribSearch failure

2012-04-10 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250518#comment-13250518
 ] 

Dawid Weiss commented on SOLR-3335:
---

@Yonik: I run trunk tests in non-nightly mode and I see at least 1-2 failures a 
day (it runs every two hours). This does change over time though as I merge 
with new commits. Some tests are frequent offenders though, like the latest one --
{noformat}
build   10-Apr-2012 00:25:25[junit] Testsuite: 
org.apache.solr.cloud.OverseerTest
build   10-Apr-2012 00:25:25[junit] Testcase: 
testShardLeaderChange(org.apache.solr.cloud.OverseerTest):FAILED
build   10-Apr-2012 00:25:25[junit] Unexpected shard leader 
coll:collection1 shard:shard1 expected:<core4> but was:<null>
build   10-Apr-2012 00:25:25[junit] 
junit.framework.AssertionFailedError: Unexpected shard leader coll:collection1 
shard:shard1 expected:<core4> but was:<null>
build   10-Apr-2012 00:25:25[junit] at 
org.junit.Assert.fail(Assert.java:93)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.Assert.failNotEquals(Assert.java:647)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.Assert.assertEquals(Assert.java:128)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.solr.cloud.OverseerTest.verifyShardLeader(OverseerTest.java:549)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.solr.cloud.OverseerTest.testShardLeaderChange(OverseerTest.java:711)
build   10-Apr-2012 00:25:25[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
build   10-Apr-2012 00:25:25[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
build   10-Apr-2012 00:25:25[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
build   10-Apr-2012 00:25:25[junit] at 
java.lang.reflect.Method.invoke(Method.java:597)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:591)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.rules.RunRules.evaluate(RunRules.java:18)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
build   10-Apr-2012 00:25:25[junit] at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
build   10-Apr-2012 00:25:25[junit] at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
build   10-Apr-2012

[jira] [Commented] (SOLR-3237) OverseerTest failure (non-reproducible)

2012-04-10 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250564#comment-13250564
 ] 

Dawid Weiss commented on SOLR-3237:
---

I have more if you need logs, Sami. Thanks for taking care of this one!

 OverseerTest failure (non-reproducible)
 ---

 Key: SOLR-3237
 URL: https://issues.apache.org/jira/browse/SOLR-3237
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Sami Siren
Priority: Minor
 Fix For: 4.0


 Nightly log harvest. Couldn't reproduce, unfortunately.
 {noformat}
 build 13-Mar-2012 06:08:43[junit] Testsuite: 
 org.apache.solr.cloud.OverseerTest
 build 13-Mar-2012 06:08:43[junit] Testcase: 
 testShardLeaderChange(org.apache.solr.cloud.OverseerTest):FAILED
 build 13-Mar-2012 06:08:43[junit] Unexpected shard leader 
 coll:collection1 shard:shard1 expected:<core4> but was:<null>
 build 13-Mar-2012 06:08:43[junit] 
 junit.framework.AssertionFailedError: Unexpected shard leader 
 coll:collection1 shard:shard1 expected:<core4> but was:<null>
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.solr.cloud.OverseerTest.verifyShardLeader(OverseerTest.java:549)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.solr.cloud.OverseerTest.testShardLeaderChange(OverseerTest.java:711)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:20)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:729)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:645)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:556)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:20)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21)
 build 13-Mar-2012 06:08:43[junit] at 
 org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
 build 13-Mar-2012 06:08:43[junit] 
 build 13-Mar-2012 06:08:43[junit] 
 build 13-Mar-2012 06:08:43[junit] Tests run: 7, Failures: 1, Errors: 
 0, Time elapsed: 74.666 sec
 build 13-Mar-2012 06:08:43[junit] 
 build 13-Mar-2012 06:08:43[junit] - Standard Error 
 -
 build 13-Mar-2012 06:08:43[junit] NOTE: reproduce with: ant test 
 -Dtestcase=OverseerTest -Dtestmethod=testShardLeaderChange 
 -Dtests.seed=48c9960216b3d5d:6c1600de0df53cdd:69c37083161d807d 
 -Dargs=-Dfile.encoding=UTF-8
 build 13-Mar-2012 06:08:43[junit] WARNING: test class left thread 
 running: Session Sets (4):
 build 13-Mar-2012 06:08:43[junit] 0 expire at Mon Mar 12 22:08:45 MST 
 2012:
 build 13-Mar-2012 06:08:43[junit] 0 expire at Mon Mar 12 22:08:48 MST 
 2012:
 build 13-Mar-2012 06:08:43[junit] 0 expire at Mon Mar 12 22:08:51 MST 
 2012:
 build 13-Mar-2012 06:08:43[junit] 0 expire at Mon Mar 12 22:08:54 MST 
 2012:
 build 13-Mar-2012 06:08:43[junit] 
 build 13-Mar-2012 06:08:43[junit] RESOURCE LEAK: test class left 1 
 thread(s) running
 build 13-Mar-2012 06:08:43[junit] NOTE: test params are: 
 codec=Lucene40: {}, sim=DefaultSimilarity, locale=zh_TW, 
 timezone=Mexico/BajaSur
 build 13-Mar-2012 06:08:43[junit] NOTE: all tests run in this JVM:
 build 13-Mar-2012 06:08:43[junit] [BasicFunctionalityTest, 
 SolrInfoMBeanTest, SnowballPorterFilterFactoryTest, TestCJKTokenizerFactory, 
 TestCJKWidthFilterFactory,

[jira] [Commented] (SOLR-3335) testDistribSearch failure

2012-04-08 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249492#comment-13249492
 ] 

Dawid Weiss commented on SOLR-3335:
---

This is weird. I've had something like this before on the branch -- see 
SOLR-3233. If you go back to that particular revision it was reproducible (but 
no longer is with that seed). I didn't investigate further.

 testDistribSearch failure
 -

 Key: SOLR-3335
 URL: https://issues.apache.org/jira/browse/SOLR-3335
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 4.0


 Happened on my test machine. Is there a way to disable these tests if we 
 cannot fix them? There are two or three tests that fail most of the time and 
 that apparently nobody knows how to fix (including me).
 There is also a typo in the error message (I'm away from home for Easter, 
 can't fix it now).
 {noformat}
 build 06-Apr-2012 16:11:54[junit] Testsuite: 
 org.apache.solr.cloud.RecoveryZkTest
 build 06-Apr-2012 16:11:54[junit] Testcase: 
 testDistribSearch(org.apache.solr.cloud.RecoveryZkTest):  FAILED
 build 06-Apr-2012 16:11:54[junit] There are still nodes recoverying
 build 06-Apr-2012 16:11:54[junit] 
 junit.framework.AssertionFailedError: There are still nodes recoverying
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.Assert.fail(Assert.java:93)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.solr.cloud.AbstractDistributedZkTestCase.waitForRecoveriesToFinish(AbstractDistributedZkTestCase.java:132)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:84)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:670)
 build 06-Apr-2012 16:11:54[junit] at 
 sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 build 06-Apr-2012 16:11:54[junit] at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 build 06-Apr-2012 16:11:54[junit] at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 build 06-Apr-2012 16:11:54[junit] at 
 java.lang.reflect.Method.invoke(Method.java:597)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:591)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.rules.RunRules.evaluate(RunRules.java:18)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
 build 06-Apr-2012 16:11:54[junit] at

[jira] [Commented] (SOLR-3335) testDistribSearch failure

2012-04-07 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249199#comment-13249199
 ] 

Dawid Weiss commented on SOLR-3335:
---

I'll wait a few days to give people a chance to object. If I hear nothing I 
will successively disable those tests that fail for me often (without much 
feedback).

 testDistribSearch failure
 -

 Key: SOLR-3335
 URL: https://issues.apache.org/jira/browse/SOLR-3335
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 4.0


 Happened on my test machine. Is there a way to disable these tests if we 
 cannot fix them? There are two three tests that fail most of the time and 
 that apparently nobody knows how to fix (including me).
 There is also a typo in the error message (I'm away from home for Easter, 
 can't do it now).
 {noformat}
 build 06-Apr-2012 16:11:54[junit] Testsuite: 
 org.apache.solr.cloud.RecoveryZkTest
 build 06-Apr-2012 16:11:54[junit] Testcase: 
 testDistribSearch(org.apache.solr.cloud.RecoveryZkTest):  FAILED
 build 06-Apr-2012 16:11:54[junit] There are still nodes recoverying
 build 06-Apr-2012 16:11:54[junit] 
 junit.framework.AssertionFailedError: There are still nodes recoverying
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.Assert.fail(Assert.java:93)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.solr.cloud.AbstractDistributedZkTestCase.waitForRecoveriesToFinish(AbstractDistributedZkTestCase.java:132)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:84)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:670)
 build 06-Apr-2012 16:11:54[junit] at 
 sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 build 06-Apr-2012 16:11:54[junit] at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 build 06-Apr-2012 16:11:54[junit] at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 build 06-Apr-2012 16:11:54[junit] at 
 java.lang.reflect.Method.invoke(Method.java:597)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:591)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.rules.RunRules.evaluate(RunRules.java:18)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
 build 06-Apr-2012

[jira] [Commented] (SOLR-3335) testDistribSearch failure

2012-04-07 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249279#comment-13249279
 ] 

Dawid Weiss commented on SOLR-3335:
---

I couldn't reproduce it either. My test machine is an Ubuntu quad-core (i7) and 
it runs full Lucene builds, much like Jenkins. There are a few recurring 
problems that I couldn't reproduce locally no matter what. This ALSO happens on 
the LUCENE-3808 branch, which leads me to believe the problem may stem from 
interaction between concurrently running JVMs, not the code itself (perhaps 
they're modifying each other's configs, perhaps something else).

Does anything come to mind?

 testDistribSearch failure
 -

 Key: SOLR-3335
 URL: https://issues.apache.org/jira/browse/SOLR-3335
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 4.0



[jira] [Commented] (SOLR-3335) testDistribSearch failure

2012-04-07 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249290#comment-13249290
 ] 

Dawid Weiss commented on SOLR-3335:
---

Try looping over full ant test cycles (maybe limited to solr-core only). I did 
this a while back in a shell loop and redirected output to files. This brought 
back some failures after 30 iterations or so.

I can also try to see if doing the above with 1 forked jvm is any different 
than with 3-4 forked jvms -- this would make it clear if it's a concurrent 
tests conflict or not (and possibly provide a way to reproduce).

Thanks for trying to clean this up -- it's been bugging me for a while now. 
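
A rough sketch of that kind of loop, written in Java rather than shell to match the other code in this thread. `RepeatRunner` and its parameters are hypothetical names, and the command you pass in (for example `ant test`) and the iteration count are assumptions for illustration, not the exact setup I used:

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical runner; the command to repeat is supplied by the caller.
final class RepeatRunner {
  // Runs the command `iterations` times, sending stdout and stderr of run i
  // to logDir/run-i.log (logDir must already exist). Returns the iteration
  // numbers whose exit code was non-zero.
  static List<Integer> run(List<String> command, int iterations, File logDir)
      throws IOException, InterruptedException {
    List<Integer> failed = new ArrayList<Integer>();
    for (int i = 1; i <= iterations; i++) {
      ProcessBuilder pb = new ProcessBuilder(command);
      pb.redirectErrorStream(true); // merge stderr into stdout
      pb.redirectOutput(new File(logDir, "run-" + i + ".log"));
      int exit = pb.start().waitFor();
      if (exit != 0) {
        failed.add(i);
      }
    }
    return failed;
  }
}
```

Calling something like `RepeatRunner.run(Arrays.asList("ant", "test"), 30, new File("logs"))` from the module directory would leave one log per iteration, so only the failed runs need to be inspected.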

 testDistribSearch failure
 -

 Key: SOLR-3335
 URL: https://issues.apache.org/jira/browse/SOLR-3335
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 4.0


 Happened on my test machine. Is there a way to disable these tests if we 
 cannot fix them? There are two or three tests that fail most of the time and 
 that apparently nobody knows how to fix (including me).
 There is also a typo in the error message (I'm away from home for Easter, 
 can't do it now).
 {noformat}
 build 06-Apr-2012 16:11:54[junit] Testsuite: 
 org.apache.solr.cloud.RecoveryZkTest
 build 06-Apr-2012 16:11:54[junit] Testcase: 
 testDistribSearch(org.apache.solr.cloud.RecoveryZkTest):  FAILED
 build 06-Apr-2012 16:11:54[junit] There are still nodes recoverying
 build 06-Apr-2012 16:11:54[junit] 
 junit.framework.AssertionFailedError: There are still nodes recoverying
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.Assert.fail(Assert.java:93)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.solr.cloud.AbstractDistributedZkTestCase.waitForRecoveriesToFinish(AbstractDistributedZkTestCase.java:132)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:84)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:670)
 build 06-Apr-2012 16:11:54[junit] at 
 sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 build 06-Apr-2012 16:11:54[junit] at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 build 06-Apr-2012 16:11:54[junit] at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 build 06-Apr-2012 16:11:54[junit] at 
 java.lang.reflect.Method.invoke(Method.java:597)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:754)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:670)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:591)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
 build 06-Apr-2012 16:11:54[junit] at 
 org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.rules.RunRules.evaluate(RunRules.java:18)
 build 06-Apr-2012 16:11:54[junit] at 
 org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
 build 06-Apr-2012 16:11:54[junit] at

[jira] [Commented] (SOLR-3328) executable bits of shellscripts in solr source release

2012-04-06 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13248138#comment-13248138
]

Dawid Weiss commented on SOLR-3328:
---

http://ant.apache.org/manual/Tasks/zip.html

bq. Starting with Ant 1.5.2, zip can store Unix permissions inside the
archive (see description of the filemode and dirmode attributes for
zipfileset). Unfortunately there is no portable way to store these
permissions. Ant uses the algorithm used by Info-Zip's implementation of the
zip and unzip commands - these are the default versions of zip and unzip for
many Unix and Unix-like systems.

I remember we used to ZIP with unix permissions and they unzipped just fine
(with permissions set).

executable bits of shellscripts in solr source release
--

Key: SOLR-3328
URL: https://issues.apache.org/jira/browse/SOLR-3328
Project: Solr
Issue Type: Improvement
Components: Build
Reporter: Robert Muir
Fix For: 4.0

HossmanSays: in the solr src releases, some shell scripts are not executable
by default.
I don't know if we can improve this? Maybe it's an svn prop?
Maybe something needs to be specified to the tar/zip process?
Currently the 'source release' is really an svn export...
Personally I always do 'sh foo.sh' rather than './foo.sh',
but if it makes it more user-friendly we should figure it out.
Just opening the issue so we don't forget about it. I think solr cloud
adds some more shell scripts so we should at least figure out what we want to
do.


[jira] [Commented] (LUCENE-3950) load rat via ivy for rat-sources task

2012-04-04 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13246323#comment-13246323
 ] 

Dawid Weiss commented on LUCENE-3950:
-

+1. I think this, license checks, CRLFs and other non-code things should be 
part of an integration test target, so that if you only want to test code you 
can apply a filter and get a quick turnaround, and you can fire the full 
integration tests before a commit.

 load rat via ivy for rat-sources task
 -

 Key: LUCENE-3950
 URL: https://issues.apache.org/jira/browse/LUCENE-3950
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/build
Reporter: Robert Muir

 we now fail the build on rat problems (LUCENE-1866),
 so we should make it easy to run rat-sources for people
 to test locally (it takes like 3 seconds total for the whole trunk)
 Also this is safer than putting rat in your ~/.ant/lib because that 
 adds some classes from commons to your ant classpath (which we currently
 wrongly use in compile).


[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve

2012-04-03 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245139#comment-13245139
 ] 

Dawid Weiss commented on LUCENE-3943:
-

This will require moving license checks till after the distribution is 
assembled, but it's a good idea. It's much like with Maven, where things get 
stored once and IDEs and the build system reuse the same artifacts.

 Use ivy cachepath and cachefileset instead of ivy retrieve
 --

 Key: LUCENE-3943
 URL: https://issues.apache.org/jira/browse/LUCENE-3943
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/build
Reporter: Chris Male

 In LUCENE-3930 we moved to resolving all external dependencies using 
 ivy:retrieve.  This process places the dependencies into the lib/ folder of 
 the respective modules which was ideal since it replicated the existing build 
 process and limited the number of changes to be made to the build.
 However it can lead to multiple jars for the same dependency in the lib 
 folder when the dependency is upgraded, and just isn't the most efficient way 
 to use Ivy.
 Uwe pointed out that we can remove the ivy:retrieve calls and make use of 
 ivy:cachepath and ivy:cachefileset to build our classpaths and packages 
 respectively, which will go some way to addressing these limitations


[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve

2012-04-03 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245147#comment-13245147
 ] 

Dawid Weiss commented on LUCENE-3943:
-

Yep, sure.



[jira] [Commented] (LUCENE-3943) Use ivy cachepath and cachefileset instead of ivy retrieve

2012-04-03 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245551#comment-13245551
 ] 

Dawid Weiss commented on LUCENE-3943:
-

bq. In my opinion, the ideal situation would be that we pass these filesets 
directly to the zip/tar/gz whatever in the binary release targets

+1.

 Use ivy cachepath and cachefileset instead of ivy retrieve
 --

 Key: LUCENE-3943
 URL: https://issues.apache.org/jira/browse/LUCENE-3943
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/build
Reporter: Chris Male

 In LUCENE-3930 we moved to resolving all external dependencies using 
 ivy:retrieve.  This process places the dependencies into the lib/ folder of 
 the respective modules which was ideal since it replicated the existing build 
 process and limited the number of changes to be made to the build.
 However it can lead to multiple jars for the same dependency in the lib 
 folder when the dependency is upgraded, and just isn't the most efficient way 
 to use Ivy.
 Uwe pointed out that _when working from svn or in using src releases_ we can 
 remove the ivy:retrieve calls and make use of ivy:cachepath and 
 ivy:cachefileset to build our classpaths and packages respectively, which 
 will go some way to addressing these limitations -- however we still need the 
 build system capable of putting the actual jars into specific lib folders 
 when assembling the binary artifacts


[jira] [Commented] (LUCENE-3944) ant clean should remove pom.xml's

2012-04-03 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245554#comment-13245554
 ] 

Dawid Weiss commented on LUCENE-3944:
-

bq. I think Maven Ant Tasks' deploy target needs to be able to access the 
parent and grandparent POMs, which (I think) means either putting them into the 
user's local maven repository, or putting them at the relative location given 
in the parent POM section of each POM. 

I just recently peeked at Apache ANT's source distribution and this seems to be 
done this way (separate folder structure just for POMs with relative refs).

 ant clean should remove pom.xml's
 -

 Key: LUCENE-3944
 URL: https://issues.apache.org/jira/browse/LUCENE-3944
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/build
Reporter: Chris Male
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3944.patch, LUCENE-3944.patch


 Currently once the pom.xml's are in place, it's hard to get them out.  Having 
 them can be a little trappy when you're trying to debug the bug.  We should 
 facilitate their removal during clean.


[jira] [Commented] (LUCENE-3945) we should include checksums for every jar ivy fetches in svn & src releases to verify the jars are the ones we expect

2012-04-03 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245632#comment-13245632
 ] 

Dawid Weiss commented on LUCENE-3945:
-

{noformat}
reader = new BufferedReader(new FileReader(f));
{noformat}

Isn't this locale-sensitive? I think it should be explicit UTF-8 (or US-ASCII 
for that matter).
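
To make the concern concrete: `FileReader` decodes with the platform default charset, so the same file can read differently on differently configured machines. A minimal sketch of the explicit-charset variant follows. It assumes Java 7 or later (for `StandardCharsets` and try-with-resources), and `ChecksumFileReader` and `readFirstLine` are hypothetical names, not taken from the patch:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the LUCENE-3945 patch.
final class ChecksumFileReader {
  // Reads the first line of a file, decoding bytes as UTF-8 regardless of
  // the platform default charset. Returns null for an empty file.
  static String readFirstLine(File f) throws IOException {
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new FileInputStream(f), StandardCharsets.UTF_8))) {
      return reader.readLine();
    }
  }
}
```

Since a hex checksum is plain ASCII, UTF-8 and US-ASCII decode it identically; the point is only that the charset is stated rather than inherited from the environment.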

{noformat}
+  String hexStr = Integer.toHexString(CHECKSUM_BYTE_MASK & digest[i]);
+  if (hexStr.length() < 2) {
+    checksum.append("0");
+  }
+  checksum.append(hexStr);
{noformat}

Wouldn't either of these be simpler?
{noformat}
checksum.append(String.format(Locale.ENGLISH, "%02x", CHECKSUM_BYTE_MASK & digest[i]));
{noformat}
or
{noformat}
char [] HEX = "0123456789abcdef".toCharArray();
int v = digest[i];
checksum.append(HEX[(v >> 4) & 0x0F]).append(HEX[v & 0x0F]);
{noformat}
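
For completeness, a self-contained sketch of the lookup-table variant. `HexChecksum`, `toHex` and `sha1Hex` are hypothetical names, and SHA-1 is only an assumption about the digest algorithm, since the snippets above do not show which one the patch uses:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper illustrating the lookup-table variant.
final class HexChecksum {
  private static final char[] HEX = "0123456789abcdef".toCharArray();

  // Encodes each byte as two lowercase hex digits, high nibble first.
  static String toHex(byte[] digest) {
    StringBuilder checksum = new StringBuilder(digest.length * 2);
    for (byte b : digest) {
      int v = b & 0xFF; // mask so negative bytes do not sign-extend
      checksum.append(HEX[v >>> 4]).append(HEX[v & 0x0F]);
    }
    return checksum.toString();
  }

  // Hex-encoded SHA-1 of a UTF-8 string (SHA-1 is an assumption here).
  static String sha1Hex(String s) throws NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("SHA-1");
    return toHex(md.digest(s.getBytes(StandardCharsets.UTF_8)));
  }
}
```

The table lookup avoids both the per-byte `String` allocation of `Integer.toHexString` and the manual zero-padding branch.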

 we should include checksums for every jar ivy fetches in svn & src releases 
 to verify the jars are the ones we expect
 -

 Key: LUCENE-3945
 URL: https://issues.apache.org/jira/browse/LUCENE-3945
 Project: Lucene - Java
  Issue Type: Task
Reporter: Hoss Man
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3945.patch


 Conversation with rmuir last night got me thinking about the fact that one 
 thing we lose by using ivy is confidence that every user of a release is 
 compiling against (and likely using at run time) the same dependencies as 
 every other user.
 Up to 3.5, users of src and binary releases could be confident that the jars 
 included in the release were the same jars the lucene devs vetted and tested 
 against when voting on the release candidate, but with ivy there is now the 
 possibility that after the source release is published, the owner of a domain 
 where these dependencies are hosted might change the jars in some way w/o 
 anyone knowing.  Likewise: we as developers could commit an ivy.xml file 
 pointing to a specific URL which we then use for and test for months, and 
 just prior to a release, the contents of the remote URL could change such 
 that a JAR included in the binary artifacts might not match the ones we've 
 vetted and tested leading up to that RC.
 So i propose that we include checksum files in svn and in our source releases 
 that can be used by users to verify that the jars they get from ivy match the 
 jars we tested against.


[jira] [Commented] (LUCENE-3945) we should include checksums for every jar ivy fetches in svn & src releases to verify the jars are the ones we expect

2012-04-03 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245634#comment-13245634
 ] 

Dawid Weiss commented on LUCENE-3945:
-

Btw. you can also avoid a recrawl by passing a refid of the same fileset to two 
tasks rather than constructing a new one in each. I don't mind renaming the 
class either.



[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-04-02 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244008#comment-13244008
 ] 

Dawid Weiss commented on LUCENE-3930:
-

Looks good to me, Chris. Two minor things:
1) sourceDirectory and testSourceDirectory look like default values anyway?
2) there is a newer version of jsonic in maven repositories; don't know if this 
matters at all.

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-skip-sources-javadoc.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, 
 LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930__ivy_bootstrap_target.patch, 
 LUCENE-3930_includetestlibs_excludeexamplexml.patch, 
 ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, 
 patch-jetty-build.patch, pom.xml


 As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (LUCENE-3774) check-legal isn't doing its job

2012-03-31 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243162#comment-13243162
 ] 

Dawid Weiss commented on LUCENE-3774:
-

I'm for pushing it to the top level. This will simplify handling of exceptional 
patterns and such too. Shouldn't be much of a problem to move it too.

 check-legal isn't doing its job
 ---

 Key: LUCENE-3774
 URL: https://issues.apache.org/jira/browse/LUCENE-3774
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/build
Affects Versions: 3.6, 4.0
Reporter: Steven Rowe
Assignee: Dawid Weiss
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, 
 LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, LUCENE3774.patch, 
 backport.patch


 In trunk, the {{check-legal-lucene}} ant target is not checking any 
 {{lucene/contrib/\*\*/lib/}} directories; the {{modules/**/lib/}} directories 
 are not being checked; and {{check-legal-solr}} can't be checking 
 {{solr/example/lib/\*\*/\*.jar}}, because there are currently {{.jar}} files 
 in there that don't have a license.
 These targets are set up to take in a full list of {{lib/}} directories in 
 which to check, but modules move around, and these lists are not being kept 
 up-to-date.
 Instead, {{check-legal-\*}} should run for each module, if the module has a 
 {{lib/}} directory, and it should be specialized for modules that have more 
 than one ({{solr/core/}}) or that have a {{lib/}} directory in a non-standard 
 place ({{lucene/core/}}).


[jira] [Commented] (SOLR-3296) Explore alternatives to Commons CSV

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242104#comment-13242104
 ] 

Dawid Weiss commented on SOLR-3296:
---

BSD or ASL2 -- either is fine with another ASL2 project.

 Explore alternatives to Commons CSV
 ---

 Key: SOLR-3296
 URL: https://issues.apache.org/jira/browse/SOLR-3296
 Project: Solr
  Issue Type: Improvement
  Components: Build
Reporter: Chris Male

 In LUCENE-3930 we're implementing some less than ideal solutions to make 
 available the unreleased version of commons-csv.  We could remove these 
 solutions if we didn't rely on this lib.  So I think we should explore 
 alternatives. 
 I think [opencsv|http://opencsv.sourceforge.net/] is an alternative to 
 consider, I've used it in many commercial projects.  Bizarrely Commons-CSV's 
 website says that Opencsv uses a BSD license, but this isn't the case, 
 OpenCSV uses ASL2.


[jira] [Commented] (SOLR-3295) Binaries contain 1.6 classes

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242106#comment-13242106
 ] 

Dawid Weiss commented on SOLR-3295:
---

bq. if all tests pass without this jar, why do we need it?

It's some obscure (?) data format that tika can convert to plain text. I've 
never seen it, don't know what it is. Uwe filed a bug for Tika.

 Binaries contain 1.6 classes
 

 Key: SOLR-3295
 URL: https://issues.apache.org/jira/browse/SOLR-3295
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Priority: Minor
 Fix For: 3.6

 Attachments: output.log


 I ran this tool (does the job): http://code.google.com/p/versioncheck/ on 
 the checkout of branch_3x. To my surprise there is a JAR which contains Java 
 1.6 code:
 {noformat}
 Major.Minor Version : 50.0 JAVA compatibility : Java 1.6 
 platform: 45.3-50.0
 Number of classes : 60
 Classes are : 
 c:\Work\lucene-solr\.\solr\contrib\extraction\lib\netcdf-4.2-min.jar [:] 
 ucar/unidata/geoloc/Bearing.class
 ...
 {noformat}
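
 For reference, the major version reported above comes from the class file header: a 4-byte magic number 0xCAFEBABE, then a 2-byte minor and a 2-byte major version, where 50 corresponds to Java 1.6 and 49 to Java 5. A minimal sketch of such a check follows. The class and method names are hypothetical, this is not the versioncheck tool's code, and treating 49 as the limit assumes the branch targets Java 5:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical checker; not the versioncheck tool mentioned above.
final class ClassVersion {
  // Returns the major version from a class file header, e.g. 50 for Java 1.6.
  static int majorVersion(InputStream in) throws IOException {
    DataInputStream data = new DataInputStream(in);
    if (data.readInt() != 0xCAFEBABE) {
      throw new IOException("not a class file (bad magic)");
    }
    data.readUnsignedShort();        // minor_version, ignored here
    return data.readUnsignedShort(); // major_version
  }

  // True if the class needs a newer runtime than the given major version.
  static boolean isNewerThan(InputStream in, int maxMajor) throws IOException {
    return majorVersion(in) > maxMajor;
  }
}
```

 Running this over every `.class` entry of each JAR in a `lib/` directory would flag files like the one listed above.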


[jira] [Commented] (SOLR-3296) Explore alternatives to Commons CSV

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242111#comment-13242111
 ] 

Dawid Weiss commented on SOLR-3296:
---

I used GSON (http://code.google.com/p/google-gson/) and was happy with it. It 
even contains sanity checks which come in handy if you're emitting insane 
data...



[jira] [Commented] (SOLR-3295) Binaries contain 1.6 classes

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242121#comment-13242121
 ] 

Dawid Weiss commented on SOLR-3295:
---

Climate format data? Man... This just calls for a custom simplified parser that 
would read the header and forget the rest. And it'd be 5mb less to distribute...



[jira] [Commented] (LUCENE-3935) Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242183#comment-13242183
]

Dawid Weiss commented on LUCENE-3935:
-

bq. I did this hastily last night and results suggested that there wasn't a lot
to be gained on Mac OS X

I agree it may not be noticeable because there are so many factors kicking in
here (smaller structure - better cpu cache utilization vs. larger structure -
potentially faster access to each value but potential cache misses).

Makes sense to keep short[] in place, ignore my comment.

Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method
---

Key: LUCENE-3935
URL: https://issues.apache.org/jira/browse/LUCENE-3935
Project: Lucene - Java
Issue Type: Improvement
Components: modules/analysis
Affects Versions: 3.6, 4.0
Reporter: Christian Moen
Assignee: Christian Moen
Attachments: LUCENE-3935.patch

I've been profiling Kuromoji, and not very surprisingly, method
{{ConnectionCosts.get(int forwardId, int backwardId)}} that looks up costs in
the Viterbi is called many many times and contributes to more processing time
than I had expected.
This method is currently backed by a {{short[][]}}. The data structure stored
here is a two-dimensional array with both dimensions fixed at 1316 elements.
(The data is {{matrix.def}} in MeCab-IPADIC.)
We can rewrite this to use a single one-dimensional array instead, and we
will at least save one bounds check, a pointer reference, and we should also
get much better cache utilization since this structure is likely to be in
very local CPU cache.
I think this will be a nice optimization. Working on it...
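
The rewrite described above can be sketched as follows. This is only an illustration of the flattening idea, not the patch attached to the issue: the class name FlatConnectionCosts is hypothetical, and the row-major layout (forwardId selects the row) is an assumption.

```java
/** Hypothetical sketch: a square cost matrix stored in one flat short[] in row-major order. */
final class FlatConnectionCosts {
    private final int backwardSize;   // number of columns, i.e. the row stride
    private final short[] costs;      // all rows laid out back to back

    /** Copies a rectangular short[][] into a single array. Assumes all rows have equal length. */
    FlatConnectionCosts(short[][] matrix) {
        int forwardSize = matrix.length;
        this.backwardSize = forwardSize == 0 ? 0 : matrix[0].length;
        this.costs = new short[forwardSize * backwardSize];
        for (int f = 0; f < forwardSize; f++) {
            System.arraycopy(matrix[f], 0, costs, f * backwardSize, backwardSize);
        }
    }

    /** One multiply-add and one array access instead of two dependent array loads. */
    int get(int forwardId, int backwardId) {
        return costs[forwardId * backwardSize + backwardId];
    }

    public static void main(String[] args) {
        FlatConnectionCosts c = new FlatConnectionCosts(new short[][] {{1, 2, 3}, {4, 5, 6}});
        System.out.println(c.get(1, 2));   // prints 6
    }
}
```

One trade-off of this layout: an out-of-range backwardId is no longer caught by its own bounds check and may silently read from a neighbouring row, so callers have to pass valid ids.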


[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242185#comment-13242185
 ] 

Dawid Weiss commented on LUCENE-3930:
-

Don't we need to get rid of the binary JAR anyway? If so, the alternatives are 
to either put all the sources in the Lucene repo or push a Maven release of that 
JAR. Sonatype accepts third-party JAR pushes too -- one can do it as a last 
resort.

https://docs.sonatype.org/display/Repository/Uploading+3rd-party+Artifacts+to+The+Central+Repository

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-skip-sources-javadoc.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, 
 LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930__ivy_bootstrap_target.patch, ant_-verbose_clean_test.out.txt, 
 noggit-commons-csv.patch, patch-jetty-build.patch


 As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242203#comment-13242203
 ] 

Dawid Weiss commented on LUCENE-3930:
-

You can do it with Maven by specifying an optional system dependency off the 
project's basedir and fetching the JAR in a preliminary phase... I think. But 
it's a hack beyond dirty. And it doesn't make other people's lives any easier 
(if somebody uses your pom).

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-skip-sources-javadoc.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, 
 LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930__ivy_bootstrap_target.patch, ant_-verbose_clean_test.out.txt, 
 noggit-commons-csv.patch, patch-jetty-build.patch


 As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (SOLR-3296) Explore alternatives to Commons CSV

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242204#comment-13242204
]

Dawid Weiss commented on SOLR-3296:
---

I didn't know it's actually Yonik's. It even has a pom.xml file --
http://svn.apache.org/repos/asf/labs/noggit/?

Yonik, if you have an account at Sonatype, this takes no more than changing the
version number to something without a SNAPSHOT suffix and running mvn deploy
(plus acceptance in Nexus). Let me know if you need some guidance, but it should
be a 10-minute effort if you have the Maven code ready.

Explore alternatives to Commons CSV
---

Key: SOLR-3296
URL: https://issues.apache.org/jira/browse/SOLR-3296
Project: Solr
Issue Type: Improvement
Components: Build
Reporter: Chris Male

In LUCENE-3930 we're implementing some less than ideal solutions to make
available the unreleased version of commons-csv. We could remove these
solutions if we didn't rely on this lib. So I think we should explore
alternatives.
I think [opencsv|http://opencsv.sourceforge.net/] is an alternative to
consider, I've used it in many commercial projects. Bizarrely Commons-CSV's
website says that Opencsv uses a BSD license, but this isn't the case,
OpenCSV uses ASL2.


[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242237#comment-13242237
 ] 

Dawid Weiss commented on LUCENE-3930:
-

If we go the third-party route, I suggest releasing an artifact with a -jdk15 
classifier to make it explicit that it's a 1.5 build. Perhaps we can suggest that 
the maintainer compile with 1.5 compatibility if this doesn't involve any source 
code changes?

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-skip-sources-javadoc.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, 
 LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930__ivy_bootstrap_target.patch, ant_-verbose_clean_test.out.txt, 
 noggit-commons-csv.patch, patch-jetty-build.patch


 As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (SOLR-3296) Explore alternatives to Commons CSV

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242374#comment-13242374
 ] 

Dawid Weiss commented on SOLR-3296:
---

I guess this means official apache releases but if the release is done in a 
private namespace then this isn't a problem? I mean -- I could probably take 
the source right now, change the group id to something I have access to 
(com.carrotsearch.thirdparty) and release it, but so can Yonik (under his own 
domain or whatever namespace he wishes that is different than Apache's)?

I admit it is kind of weird that Solr is using something that cannot be 
officially released. Why not make it part of Solr then? Just copy the source 
code over and publish it as a separate artefact?

 Explore alternatives to Commons CSV
 ---

 Key: SOLR-3296
 URL: https://issues.apache.org/jira/browse/SOLR-3296
 Project: Solr
  Issue Type: Improvement
  Components: Build
Reporter: Chris Male

 In LUCENE-3930 we're implementing some less than ideal solutions to make 
 available the unreleased version of commons-csv.  We could remove these 
 solutions if we didn't rely on this lib.  So I think we should explore 
 alternatives. 
 I think [opencsv|http://opencsv.sourceforge.net/] is an alternative to 
 consider, I've used it in many commercial projects.  Bizarrely Commons-CSV's 
 website says that Opencsv uses a BSD license, but this isn't the case, 
 OpenCSV uses ASL2.


[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242513#comment-13242513
 ] 

Dawid Weiss commented on LUCENE-3930:
-

+1. Maybe it's good that this issue came out. I think it straightened a few 
things out.

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-skip-sources-javadoc.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, 
 LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930__ivy_bootstrap_target.patch, ant_-verbose_clean_test.out.txt, 
 noggit-commons-csv.patch, patch-jetty-build.patch


 As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (SOLR-3295) Binaries contain 1.6 classes

2012-03-30 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242517#comment-13242517
 ] 

Dawid Weiss commented on SOLR-3295:
---

bq. It's some obscure data format

I meant no offense, it's just my take on how many people in the wild may be 
using it compared to how many download Solr in general.


 Binaries contain 1.6 classes
 

 Key: SOLR-3295
 URL: https://issues.apache.org/jira/browse/SOLR-3295
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Robert Muir
Priority: Minor
 Fix For: 3.6

 Attachments: output.log


 I ran this tool (it does the job): http://code.google.com/p/versioncheck/ on 
 the checkout of branch_3x. To my surprise, there is a JAR that contains Java 
 1.6 code:
 {noformat}
 Major.Minor Version : 50.0 JAVA compatibility : Java 1.6 
 platform: 45.3-50.0
 Number of classes : 60
 Classes are : 
 c:\Work\lucene-solr\.\solr\contrib\extraction\lib\netcdf-4.2-min.jar [:] 
 ucar/unidata/geoloc/Bearing.class
 ...
 {noformat}


[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241056#comment-13241056
 ] 

Dawid Weiss commented on LUCENE-3930:
-

bq. So you have to install ivy in your ~/.ant/lib

I personally don't like it when I need to install ant dependencies in a global 
scope -- this may not be a problem from one project's perspective but if you're 
working on multiple projects then this can result in global dependencies 
shadowing project's local definitions and debugging this is a pain.

Not to mention it's another requirement after checkout. I won't be able to look 
into this now, just expressing my opinion.


 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-solr-example.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930.patch, ant_-verbose_clean_test.out.txt, patch-jetty-build.patch


 As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241083#comment-13241083
]

Dawid Weiss commented on LUCENE-3930:
-

bq. Having jars in our source release is.

Is it a requirement that the source release must build out of the box? Or is it
about code publication only? I don't know, I'm asking. This seems like a
revolutionary build change before the release :)

bq. Having the build OOM is. So I made a tradeoff.

I understand this, but my gut feeling still says that if you need to install
ivy in an ant global space you might as well set ANT_OPTS to increase
permgen... The tradeoff made is one of many.

bq. i need your help... its just that simple

Can't jump into it right now, sorry. I'll take a look when I get a spare cycle
though. I'm not sure it can be fixed but I'll take a look.

nuke jars from source tree and use ivy
--

Key: LUCENE-3930
URL: https://issues.apache.org/jira/browse/LUCENE-3930
Project: Lucene - Java
Issue Type: Task
Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
Fix For: 3.6

Attachments: LUCENE-3930-solr-example.patch,
LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch,
LUCENE-3930.patch, ant_-verbose_clean_test.out.txt, patch-jetty-build.patch

As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241091#comment-13241091
 ] 

Dawid Weiss commented on LUCENE-3930:
-

Clear. It's a pity we have to deal with it right before the release but I 
understand (or rather accept) the rationale.

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-solr-example.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930.patch, ant_-verbose_clean_test.out.txt, patch-jetty-build.patch


 As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241157#comment-13241157
 ] 

Dawid Weiss commented on LUCENE-3930:
-

For git, the revision hash (SHA-1, not md5) is unique and you can always check 
out a particular revision (typically as a so-called detached HEAD). This just 
moves you to a particular version in the revision tree.


 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-solr-example.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930.patch, ant_-verbose_clean_test.out.txt, patch-jetty-build.patch


 As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (LUCENE-3935) Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241166#comment-13241166
 ] 

Dawid Weiss commented on LUCENE-3935:
-

Ah, this brings back a small project that has been lying dormant for some 
time -- I wrote an annotation processor that generates classes for handling 
arrays of struct-like types (objects with fields only), including flattened 
multi-dimensional arrays. The code is on a branch here --

https://github.com/carrotsearch/hppc/blob/structs/hppc-examples/src/main/java/com/carrotsearch/hppc/examples/BattleshipsCell.java

But it's been a while, I need to get back to it, it may be useful.

 Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method
 ---

 Key: LUCENE-3935
 URL: https://issues.apache.org/jira/browse/LUCENE-3935
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 3.6, 4.0
Reporter: Christian Moen

 I've been profiling Kuromoji, and not very surprisingly, method 
 {{ConnectionCosts.get(int forwardId, int backwardId)}} that looks up costs in 
 the Viterbi is called many many times and contributes to more processing time 
 than I had expected.
 This method is currently backed by a {{short[][]}}.  The data structure stored 
 here is a two-dimensional array with both dimensions fixed at 1316 elements.  
 (The data is {{matrix.def}} in MeCab-IPADIC.)
 We can rewrite this to use a single one-dimensional array instead, and we 
 will at least save one bounds check, a pointer reference, and we should also 
 get much better cache utilization since this structure is likely to be in 
 very local CPU cache.
 I think this will be a nice optimization.  Working on it... 


[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241276#comment-13241276
 ] 

Dawid Weiss commented on LUCENE-3930:
-

Moving C2 JARs out means we will have to re-release an archival version of C2 
just for this purpose. The reasons are that we depend on libraries which 
themselves don't have 1.5 equivalents (mahout-math) so any newer version would 
have to be re-released along with backcompat of these libraries too... long 
story.

Anyway, I will release a weird-looking version 3.5.0.1 which will be 1.5 
compatible. I will let you know (and possibly modify the branch) once this 
happens. Give me a few hours, it'll require some checks/ testing.

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-solr-example.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930.patch, ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, 
 patch-jetty-build.patch


 As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (LUCENE-3935) Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241343#comment-13241343
 ] 

Dawid Weiss commented on LUCENE-3935:
-

+1. If this is called very frequently and you can afford storing ints instead 
of shorts, then an int[] will have better alignment properties (and will not 
require widening each value to an int). It may or may not make a difference 
depending on the architecture (CPU cache sizes also matter here). 
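
To put rough numbers on the trade-off, the arithmetic below computes the raw element payload of a 1316 x 1316 table stored as shorts versus ints. It ignores array object headers and padding, and it says nothing about which variant is faster on a given CPU; the class name is made up for illustration.

```java
/** Hypothetical helper: raw payload size of a square cost table for a given element width. */
final class CostTableFootprint {
    /** Returns dimension * dimension * bytesPerElement, computed in long to avoid overflow. */
    static long payloadBytes(int dimension, int bytesPerElement) {
        return (long) dimension * dimension * bytesPerElement;
    }

    public static void main(String[] args) {
        System.out.println(payloadBytes(1316, Short.BYTES));    // prints 3463712 (about 3.3 MB)
        System.out.println(payloadBytes(1316, Integer.BYTES));  // prints 6927424 (about 6.6 MB)
    }
}
```

So switching to int[] doubles the table from roughly 3.3 MB to 6.6 MB, which is why cache size is part of the picture.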

 Optimize Kuromoji inner loop - rewrite ConnectionCosts.get() method
 ---

 Key: LUCENE-3935
 URL: https://issues.apache.org/jira/browse/LUCENE-3935
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 3.6, 4.0
Reporter: Christian Moen
 Attachments: LUCENE-3935.patch


 I've been profiling Kuromoji, and not very surprisingly, method 
 {{ConnectionCosts.get(int forwardId, int backwardId)}} that looks up costs in 
 the Viterbi is called many many times and contributes to more processing time 
 than I had expected.
 This method is currently backed by a {{short[][]}}.  The data structure stored 
 here is a two-dimensional array with both dimensions fixed at 1316 elements.  
 (The data is {{matrix.def}} in MeCab-IPADIC.)
 We can rewrite this to use a single one-dimensional array instead, and we 
 will at least save one bounds check, a pointer reference, and we should also 
 get much better cache utilization since this structure is likely to be in 
 very local CPU cache.
 I think this will be a nice optimization.  Working on it... 


[jira] [Commented] (SOLR-3294) Remove binary carrot2.jar and replace it with a maven dependency.

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241541#comment-13241541
 ] 

Dawid Weiss commented on SOLR-3294:
---

Oh, there are also binary file changes:
{noformat}
c:\Work\lucene-solr>git st
#   new file:   solr/contrib/clustering/lib/carrot2-core-3.5.0.1.jar
#   deleted:    solr/contrib/clustering/lib/carrot2-core-3.5.0.jar
#   deleted:    solr/contrib/clustering/lib/jackson-core-asl-1.5.2.jar
#   new file:   solr/contrib/clustering/lib/jackson-core-asl-1.7.4.jar
#   deleted:    solr/contrib/clustering/lib/jackson-mapper-asl-1.5.2.jar
#   new file:   solr/contrib/clustering/lib/jackson-mapper-asl-1.7.4.jar
#   deleted:    solr/contrib/clustering/lib/solr-carrot2-core-pom.xml.template
{noformat}

These can be fetched from Maven Central and Carrot2 pom has these dependencies 
too. I've excluded everything else.

 Remove binary carrot2.jar and replace it with a maven dependency.
 -

 Key: SOLR-3294
 URL: https://issues.apache.org/jira/browse/SOLR-3294
 Project: Solr
  Issue Type: Task
  Components: contrib - Clustering
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Blocker
 Fix For: 3.6

 Attachments: SOLR-3294.patch


 The repo contains a manually retrowoven Carrot2 JAR which does not have a 
 corresponding artefact in Maven Central (so won't work for ivy).
 We will make a release with 1.5 backport (I hate this!).
 http://issues.carrot2.org/browse/CARROT-902


[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241599#comment-13241599
 ] 

Dawid Weiss commented on LUCENE-3930:
-

Hey Hoss, there's a typo in the target name :) ivy-availablity-check.

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-solr-example.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930.patch, LUCENE-3930__ivy_bootstrap_target.patch, 
 ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, 
 patch-jetty-build.patch


 As mentioned on the ML thread: switch jars to ivy mechanism?.


[jira] [Commented] (SOLR-3294) Remove binary carrot2.jar and replace it with a maven dependency.

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241779#comment-13241779
 ] 

Dawid Weiss commented on SOLR-3294:
---

Thanks Steven! Since you have it open would you commit it in too? Remove that 
'dist-maven' section, it isn't needed indeed. Thanks!

 Remove binary carrot2.jar and replace it with a maven dependency.
 -

 Key: SOLR-3294
 URL: https://issues.apache.org/jira/browse/SOLR-3294
 Project: Solr
  Issue Type: Task
  Components: contrib - Clustering
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Blocker
 Fix For: 3.6

 Attachments: SOLR-3294.patch


 The repo contains a manually retrowoven Carrot2 JAR which does not have a 
 corresponding artefact in Maven Central (so won't work for ivy).
 We will make a release with 1.5 backport (I hate this!).
 http://issues.carrot2.org/browse/CARROT-902


[jira] [Commented] (SOLR-3295) Binaries contain 1.6 classes

2012-03-29 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241780#comment-13241780
 ] 

Dawid Weiss commented on SOLR-3295:
---

Robert says this isn't the case. Also: this is THE only jar that requires 1.6 
so I'd say it's probably a mistake?

 Binaries contain 1.6 classes
 

 Key: SOLR-3295
 URL: https://issues.apache.org/jira/browse/SOLR-3295
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Priority: Minor
 Fix For: 3.6

 Attachments: output.log


 I ran this tool (it does the job): http://code.google.com/p/versioncheck/ on 
 the checkout of branch_3x. To my surprise, there is a JAR that contains Java 
 1.6 code:
 {noformat}
 Major.Minor Version : 50.0 JAVA compatibility : Java 1.6 
 platform: 45.3-50.0
 Number of classes : 60
 Classes are : 
 c:\Work\lucene-solr\.\solr\contrib\extraction\lib\netcdf-4.2-min.jar [:] 
 ucar/unidata/geoloc/Bearing.class
 ...
 {noformat}


[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter

2012-03-28 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13240268#comment-13240268
 ] 

Dawid Weiss commented on SOLR-3272:
---

This is in trunk now, thanks Rafał.

 Solr filter factory for MorfologikFilter
 

 Key: SOLR-3272
 URL: https://issues.apache.org/jira/browse/SOLR-3272
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
 Fix For: 4.0

 Attachments: SOLR-3272-toupper-correction.patch, 
 SOLR-3272-with-javadoc-example-usage.patch, SOLR-3272.patch, SOLR-3272.patch, 
 SOLR-3272.patch, SOLR-3727-new.patch


 I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe 
 someone will make use of it :)


[jira] [Commented] (LUCENE-3927) allow running trunk tests with IBM JRE

2012-03-27 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239277#comment-13239277
 ] 

Dawid Weiss commented on LUCENE-3927:
-

bq. It uses a HashMap instead of LinkedHashMap as lookup cache and that mixes 
the SPI classes up. 

This is sad. It's a simple bug to fix but it probably never will be...
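
The ordering difference behind the quoted remark can be illustrated with a small sketch. A LinkedHashMap iterates in insertion order, while a plain HashMap makes no ordering guarantee, so a lookup cache built on the latter may hand entries back in a different order than they were registered. The class, method and codec names below are hypothetical and only serve the illustration; this is not the JRE's actual loader code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LookupOrderSketch {
  /** Registers the names in a LinkedHashMap-backed cache and returns the iteration order. */
  static List<String> registrationOrder(String... names) {
    Map<String, Object> cache = new LinkedHashMap<String, Object>();
    for (String name : names) {
      cache.put(name, new Object()); // placeholder for a provider instance
    }
    // LinkedHashMap guarantees that keys come back in insertion order.
    return new ArrayList<String>(cache.keySet());
  }
}
```

With a HashMap in place of the LinkedHashMap the returned order depends on hash codes and table size, which is why code relying on "first registered wins" can break.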

 allow running trunk tests with IBM JRE
 --

 Key: LUCENE-3927
 URL: https://issues.apache.org/jira/browse/LUCENE-3927
 Project: Lucene - Java
  Issue Type: Task
  Components: general/test
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-3927.patch


 This is currently not possible because of how the SPI loader works:
 we cannot simulate Lucene3x codec with PreFlexRWCodec.
 But we should still allow basic testing (even though we cannot test preflex).
 After hacking around the issue, I get interesting failures with this JRE, so I 
 think it's worth it.


[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter

2012-03-27 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239793#comment-13239793
 ] 

Dawid Weiss commented on SOLR-3272:
---

I actually don't know what the policy is -- I asked on the dev list, we'll see 
what solr folks prefer.

 Solr filter factory for MorfologikFilter
 

 Key: SOLR-3272
 URL: https://issues.apache.org/jira/browse/SOLR-3272
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
 Fix For: 4.0

 Attachments: SOLR-3272-with-javadoc-example-usage.patch, 
 SOLR-3272.patch, SOLR-3727-new.patch


 I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe 
 someone will make use of it :)


[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter

2012-03-27 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239811#comment-13239811
 ] 

Dawid Weiss commented on SOLR-3272:
---

Thanks Uwe. Btw. should we apply it to 3.x as well? This seems like a harmless 
patch and it'd be a nice-to-have feature.

 Solr filter factory for MorfologikFilter
 

 Key: SOLR-3272
 URL: https://issues.apache.org/jira/browse/SOLR-3272
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
 Fix For: 4.0

 Attachments: SOLR-3272-with-javadoc-example-usage.patch, 
 SOLR-3272.patch, SOLR-3727-new.patch


 I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe 
 someone will make use of it :)


[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter

2012-03-27 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239824#comment-13239824
 ] 

Dawid Weiss commented on SOLR-3272:
---

Damn. [Blushing]. 

I could prepare a 1.5 compatible version with retroweaver and integrate it in. 
I guess now I don't have excuses, do I... Do we want to push it in at the last 
minute though? 

 Solr filter factory for MorfologikFilter
 

 Key: SOLR-3272
 URL: https://issues.apache.org/jira/browse/SOLR-3272
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
 Fix For: 4.0

 Attachments: SOLR-3272-with-javadoc-example-usage.patch, 
 SOLR-3272.patch, SOLR-3727-new.patch


 I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe 
 someone will make use of it :)


[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter

2012-03-27 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239939#comment-13239939
 ] 

Dawid Weiss commented on SOLR-3272:
---

Can I ask somebody to look at the build file changes (and determine if 
morfologik JARs should be copied and where)? Otherwise this is ready to be 
committed I think.

After some deliberation I won't rush to make Morfologik part of 3.x -- last 
minute features are the worst.

 Solr filter factory for MorfologikFilter
 

 Key: SOLR-3272
 URL: https://issues.apache.org/jira/browse/SOLR-3272
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
 Fix For: 4.0

 Attachments: SOLR-3272-toupper-correction.patch, 
 SOLR-3272-with-javadoc-example-usage.patch, SOLR-3272.patch, SOLR-3272.patch, 
 SOLR-3727-new.patch


 I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe 
 someone will make use of it :)


[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-26 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238203#comment-13238203
 ] 

Dawid Weiss commented on LUCENE-3867:
-

For the historical record: the previous implementation of RamUsageEstimator was 
off by anywhere from 3% (random size objects, including arrays) to 20% 
(objects smaller than 80 bytes). Again -- these are perfect-scenario 
measurements with an empty heap and max. allocation until OOM, with a serial GC. 
With concurrent and parallel GCs the memory consumption estimation is still 
accurate but it's nearly impossible to tell when an OOM will occur or how the 
GC will manage the heap space. 

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: 
 NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
 NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
 this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
 {quote}
 A single-dimension array is a single object. As expected, the array has the 
 usual object header. However, this object head is 12 bytes to accommodate a 
 four-byte array length. Then comes the actual array data which, as you might 
 expect, consists of the number of elements multiplied by the number of bytes 
 required for one element, depending on its type. The memory usage for one 
 element is 4 bytes for an object reference ...
 {quote}
 While at it, I wrote a sizeOf(String) impl, and I wonder how people feel 
 about including such helper methods in RUE, as static, stateless methods? 
 It's not perfect, there's some room for improvement I'm sure, here it is:
 {code}
   /**
    * Computes the approximate size of a String object. Note that if this object
    * is also referenced by another object, you should add
    * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
    * method.
    */
   public static int sizeOf(String str) {
     return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
         + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
         + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
         + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
   }
 {code}
 If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
 long[] / double[] ... and String[]).
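
 The discrepancy described above can be made concrete with a small sketch. It
 assumes the figures from the quoted page (an 8-byte object header, a 4-byte
 array length, 4-byte references); the class and method names are hypothetical
 and the values are parameters, not what any particular JVM is guaranteed to use.

```java
final class ArrayHeaderSketch {
  static final int NUM_BYTES_INT = 4; // size of the array length field

  /** Array header as described in the issue: object header + length + one reference. */
  static int headerIncludingRef(int objectHeader, int objectRef) {
    return objectHeader + NUM_BYTES_INT + objectRef;
  }

  /** Array header per the quoted page: object header + length only. */
  static int headerWithoutRef(int objectHeader) {
    return objectHeader + NUM_BYTES_INT;
  }

  /** Overestimate, in bytes per array, caused by counting the reference in the header. */
  static int overestimatePerArray(int objectHeader, int objectRef) {
    return headerIncludingRef(objectHeader, objectRef) - headerWithoutRef(objectHeader);
  }
}
```

 With an 8-byte object header and 4-byte references, headerWithoutRef gives the
 12 bytes mentioned in the quote, while the formula that includes the reference
 gives 16.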


[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-26 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238208#comment-13238208
 ] 

Dawid Weiss commented on LUCENE-3867:
-

I didn't say it's wrong -- it is fine and accurate. What I'm saying is that 
it's not really suitable for predictions; for answering questions like: how 
many objects of a given type/types can I allocate before an OOM hits me? It 
doesn't really surprise me that much, but it would be nice. For measuring 
already allocated stuff it's more than fine of course.

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: 
 NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
 NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
 this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
 {quote}
 A single-dimension array is a single object. As expected, the array has the 
 usual object header. However, this object head is 12 bytes to accommodate a 
 four-byte array length. Then comes the actual array data which, as you might 
 expect, consists of the number of elements multiplied by the number of bytes 
 required for one element, depending on its type. The memory usage for one 
 element is 4 bytes for an object reference ...
 {quote}
 While at it, I wrote a sizeOf(String) impl, and I wonder how people feel 
 about including such helper methods in RUE, as static, stateless methods? 
 It's not perfect, there's some room for improvement I'm sure, here it is:
 {code}
   /**
    * Computes the approximate size of a String object. Note that if this object
    * is also referenced by another object, you should add
    * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
    * method.
    */
   public static int sizeOf(String str) {
     return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
         + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
         + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
         + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
   }
 {code}
 If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
 long[] / double[] ... and String[]).


[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter

2012-03-26 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238327#comment-13238327
 ] 

Dawid Weiss commented on SOLR-3272:
---

Hi Michał. Could you modify this patch to include support for the three 
dictionaries (combined, morfeusz and morfologik)? This would be more flexible 
(and the combined dictionary is nearly twice as large as morfologik itself, so 
it's worth it).
{code}
return new MorfologikFilter(ts, DICTIONARY.MORFOLOGIK, luceneMatchVersion);
{code}

Also, an example of use in the JavaDoc would be nice (see 
BeiderMorseFilterFactory for example). The test should be using DEFAULT_VERSION 
not the fixed LUCENE_40. Thanks!
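
A rough sketch of what selecting one of the three dictionaries from factory arguments could look like. Everything here is an assumption made for illustration: the enum, the "dictionary" attribute name and the choice of MORFOLOGIK as the default are hypothetical, and the sketch deliberately avoids the real Solr and Morfologik classes.

```java
import java.util.Locale;
import java.util.Map;

final class DictionaryArgSketch {
  /** Hypothetical stand-in for the three dictionaries named in the comment above. */
  enum Dictionary { COMBINED, MORFEUSZ, MORFOLOGIK }

  /** Hypothetical attribute name a schema entry might use. */
  static final String DICTIONARY_ATTRIBUTE = "dictionary";

  /** Resolves the dictionary from factory arguments, defaulting to MORFOLOGIK. */
  static Dictionary parse(Map<String, String> args) {
    String value = args.get(DICTIONARY_ATTRIBUTE);
    if (value == null) {
      return Dictionary.MORFOLOGIK; // assumed default when nothing is configured
    }
    try {
      return Dictionary.valueOf(value.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException("Unknown dictionary: " + value, e);
    }
  }
}
```

The resolved value would then be passed to the filter constructor in place of the hard-coded constant.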

 Solr filter factory for MorfologikFilter
 

 Key: SOLR-3272
 URL: https://issues.apache.org/jira/browse/SOLR-3272
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
 Fix For: 4.0

 Attachments: SOLR-3272.patch


 I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe 
 someone will make use of it :)


[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter

2012-03-26 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238332#comment-13238332
 ] 

Dawid Weiss commented on SOLR-3272:
---

Thanks. Sorry about the name confusion btw. Don't know where I took Michał from 
:)

 Solr filter factory for MorfologikFilter
 

 Key: SOLR-3272
 URL: https://issues.apache.org/jira/browse/SOLR-3272
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
 Fix For: 4.0

 Attachments: SOLR-3272.patch


 I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe 
 someone will make use of it :)


[jira] [Commented] (SOLR-3272) Solr filter factory for MorfologikFilter

2012-03-26 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238715#comment-13238715
 ] 

Dawid Weiss commented on SOLR-3272:
---

Thanks Rafał.

 Solr filter factory for MorfologikFilter
 

 Key: SOLR-3272
 URL: https://issues.apache.org/jira/browse/SOLR-3272
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Rafał Kuć
Assignee: Dawid Weiss
 Fix For: 4.0

 Attachments: SOLR-3272.patch, SOLR-3727-new.patch


 I didn't find a MorfologikFilter factory in Solr, so here is a simple one. Maybe 
 someone will make use of it :)


[jira] [Commented] (SOLR-3268) remove write acess to source tree (chmod 555) when running tests in jenkins

2012-03-26 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238775#comment-13238775
 ] 

Dawid Weiss commented on SOLR-3268:
---

All of these are in .gitignore, Steven (and can be regenerated via 
dev-tools/scripts/gitignore-gen.sh).

 remove write acess to source tree (chmod 555) when running tests in jenkins
 ---

 Key: SOLR-3268
 URL: https://issues.apache.org/jira/browse/SOLR-3268
 Project: Solr
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 4.0

 Attachments: SOLR-3268_sync.patch


 Some tests are currently creating files under the source tree.
 This causes a lot of problems: it makes my checkout look dirty after running 
 'ant test' and I have to clean up.
 I opened an issue for this a month and a half ago for 
 solrj/src/test-files/solrj/solr/shared/test-solr.xml (SOLR-3112), 
 but now we have a second file 
 (core/src/test-files/solr/conf/elevate-data-distrib.xml).
 So I think hudson needs to chmod these src directories to 555, so that solr 
 tests that do this will fail.


[jira] [Commented] (LUCENE-3910) remove special hudson nightly linedocs

2012-03-25 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237944#comment-13237944
 ] 

Dawid Weiss commented on LUCENE-3910:
-

I agree with you both. No, it's not a paradox. On one hand -- I agree that 
having larger test files is good and on the other I agree with Robert that not 
being able to reproduce locally because of different (or inconsistent) data is 
a pain.

At Carrot Search we have put all the big data into a separate git repository 
and this is simply mirrored across build servers and our local machines. 
Granted, the first clone takes a while, but then pulls of additional data are 
much faster and (which is a big plus) a git repo identifies each revision by a 
SHA-1 hash, so this can be emitted in the log upon failure (we don't do it because 
we're pretty much sure the checkouts are consistent, but it _could_ be done to 
ensure testing against the exact same test files).

Just thoughts to consider.



 remove special hudson nightly linedocs
 --

 Key: LUCENE-3910
 URL: https://issues.apache.org/jira/browse/LUCENE-3910
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir
 Fix For: 4.0


 Hudson has a special huge linedocs file that it sets via a -D parameter,
 but this means that anything using LineDocs won't reproduce via our home
 computers if it fails on hudson.
 I think we should disable this.


[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-24 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237479#comment-13237479
 ] 

Dawid Weiss commented on LUCENE-3877:
-

No worries Greg, really. For 3.x I think a manual check will do (or what I've 
done above with AspectJ). For 4.x it'd be nice to have findbugs lint anyway 
(for this and other issues). It'll most likely require some rules tuning too, 
so it can be a separate issue.

 Lucene should not call System.out.println
 -

 Key: LUCENE-3877
 URL: https://issues.apache.org/jira/browse/LUCENE-3877
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: IllegalSystemTest.java, IllegalSystemTest.java, 
 SystemPrintCheck.java


 We seem to have accumulated a few random sops...
 Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least.
 Can we somehow detect (eg, have a test failure) if we accidentally leave 
 errant System.out.println's (leftover from debugging)...?
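
 One way such a check might look at runtime is to swap System.out for a
 capturing stream while the code under test runs, and fail if anything was
 written. The sketch below is only an illustration of that idea: the class and
 method names are hypothetical, it is not the approach taken in the attached
 files, and it is not thread-safe because System.out is global state.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

final class SysoutCaptureSketch {
  /** Runs the task with System.out redirected and returns how many bytes it printed. */
  static int bytesPrintedBy(Runnable task) {
    PrintStream original = System.out;
    ByteArrayOutputStream captured = new ByteArrayOutputStream();
    PrintStream capture = new PrintStream(captured, true);
    System.setOut(capture);
    try {
      task.run();
    } finally {
      capture.flush();
      System.setOut(original); // always restore the real stream
    }
    return captured.size();
  }
}
```

 A test could then assert that bytesPrintedBy(...) returns 0 for the code it
 exercises. A static check over the compiled classes would catch calls on code
 paths the tests never reach, which this runtime approach cannot.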


[jira] [Commented] (SOLR-3268) remove write acess to source tree (chmod 555) when running tests in jenkins

2012-03-23 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236489#comment-13236489
 ] 

Dawid Weiss commented on SOLR-3268:
---

I agree that tests modifying sources or writing to source areas are a pain. I 
know these files can be svn:ignored but it just... feels dirty somehow. On a 
constructive note -- maybe we can use this:

http://ant.apache.org/manual/Tasks/sync.html

and mirror whatever src folder structure is required for these tests in the 
build area?
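
To make the mirroring idea concrete, here is a rough Java sketch that copies a source test-files tree into a build directory so tests can write there instead. It assumes Java 8 or newer, and the class and method names are hypothetical. Unlike Ant's sync task it does not delete files in the target that no longer exist in the source, so it is only a partial stand-in for it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

final class MirrorTestFilesSketch {
  /** Copies every directory and file under source into the same relative place under target. */
  static void mirror(Path source, Path target) {
    try (Stream<Path> paths = Files.walk(source)) {
      // Files.walk visits a directory before its contents, so parents exist first.
      paths.forEach(path -> {
        Path dest = target.resolve(source.relativize(path).toString());
        try {
          if (Files.isDirectory(path)) {
            Files.createDirectories(dest);
          } else {
            Files.createDirectories(dest.getParent());
            Files.copy(path, dest, StandardCopyOption.REPLACE_EXISTING);
          }
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Tests would then be pointed at the mirrored copy, leaving the checkout itself untouched.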

 remove write acess to source tree (chmod 555) when running tests in jenkins
 ---

 Key: SOLR-3268
 URL: https://issues.apache.org/jira/browse/SOLR-3268
 Project: Solr
  Issue Type: Bug
Reporter: Robert Muir
 Fix For: 3.6, 4.0


 Some tests are currently creating files under the source tree.
 This causes a lot of problems: it makes my checkout look dirty after running 
 'ant test' and I have to clean up.
 I opened an issue for this a month and a half ago for 
 solrj/src/test-files/solrj/solr/shared/test-solr.xml (SOLR-3112), 
 but now we have a second file 
 (core/src/test-files/solr/conf/elevate-data-distrib.xml).
 So I think hudson needs to chmod these src directories to 555, so that solr 
 tests that do this will fail.


[jira] [Commented] (SOLR-3268) remove write acess to source tree (chmod 555) when running tests in jenkins

2012-03-23 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236494#comment-13236494
 ] 

Dawid Weiss commented on SOLR-3268:
---

Oh... in that case the tests just need to be fixed :)

 remove write acess to source tree (chmod 555) when running tests in jenkins
 ---

 Key: SOLR-3268
 URL: https://issues.apache.org/jira/browse/SOLR-3268
 Project: Solr
  Issue Type: Bug
Reporter: Robert Muir
 Fix For: 3.6, 4.0


 Some tests are currently creating files under the source tree.
 This causes a lot of problems: it makes my checkout look dirty after running 
 'ant test' and I have to clean up.
 I opened an issue for this a month and a half ago for 
 solrj/src/test-files/solrj/solr/shared/test-solr.xml (SOLR-3112), 
 but now we have a second file 
 (core/src/test-files/solr/conf/elevate-data-distrib.xml).
 So I think hudson needs to chmod these src directories to 555, so that solr 
 tests that do this will fail.


[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-23 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236626#comment-13236626
 ] 

Dawid Weiss commented on LUCENE-3867:
-

Thanks Uwe. I'll be working in the evening again but if you're faster go ahead 
and commit it in.

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: 
 NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
 NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
 this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
 {quote}
 A single-dimension array is a single object. As expected, the array has the 
 usual object header. However, this object head is 12 bytes to accommodate a 
 four-byte array length. Then comes the actual array data which, as you might 
 expect, consists of the number of elements multiplied by the number of bytes 
 required for one element, depending on its type. The memory usage for one 
 element is 4 bytes for an object reference ...
 {quote}
 While at it, I wrote a sizeOf(String) impl, and I wonder how people feel 
 about including such helper methods in RUE, as static, stateless methods? 
 It's not perfect, there's some room for improvement I'm sure, here it is:
 {code}
   /**
    * Computes the approximate size of a String object. Note that if this object
    * is also referenced by another object, you should add
    * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
    * method.
    */
   public static int sizeOf(String str) {
     return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
         + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
         + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
         + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
   }
 {code}
 If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
 long[] / double[] ... and String[]).


[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-23 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237117#comment-13237117
 ] 

Dawid Weiss commented on LUCENE-3877:
-

I'd push it to 4.0 (automation in whatever form).

 Lucene should not call System.out.println
 -

 Key: LUCENE-3877
 URL: https://issues.apache.org/jira/browse/LUCENE-3877
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.6, 4.0

 Attachments: IllegalSystemTest.java, IllegalSystemTest.java, 
 SystemPrintCheck.java


 We seem to have accumulated a few random sops...
 Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least.
 Can we somehow detect (eg, have a test failure) if we accidentally leave 
 errant System.out.println's (leftover from debugging)...?


[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-23 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237172#comment-13237172
 ] 

Dawid Weiss commented on LUCENE-3867:
-

I've been thinking about how one can assess the estimation quality of the new 
code. I came up with this:
- I allocate an Object[] half the size of the estimated maximum available RAM 
(just to make sure all objects will fit without the need to reallocate),
- I precompute shallow sizes for instances of all wild classes (classes with 
random fields, including arrays),
- I then fill in the vault array above with random instances of wild classes, 
summing up the estimated size UNTIL I HIT OOM,
- once I hit OOM I know how much we actually allocated vs. how much space we 
thought we did allocate.
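
The steps above can be sketched roughly as follows. This is not the code from the spike: it uses a single payload type (a byte[]) with a caller-supplied size estimate instead of random wild classes, and the class and method names are made up. The difference formula is an assumption that appears consistent with the reported figures.

```java
import java.util.Arrays;

/** Hypothetical sketch of the fill-until-OOM experiment. */
final class OomEstimationSketch {
  /**
   * Fills the vault with byte[] payloads until the heap is exhausted (or the
   * vault is full, as a safety stop) and returns the summed size estimate.
   */
  static long fillUntilOom(Object[] vault, int payloadLength, long estimatedSizePerObject) {
    long estimated = 0;
    try {
      for (int i = 0; i < vault.length; i++) {
        vault[i] = new byte[payloadLength];
        estimated += estimatedSizePerObject;
      }
    } catch (OutOfMemoryError e) {
      // Expected end of the experiment: nothing more fits on the heap.
    }
    Arrays.fill(vault, null); // release the payloads so the JVM can continue
    return estimated;
  }

  /** Relative error of the estimate, in percent of the expected free space. */
  static double differencePercent(long expectedFreeBytes, long estimatedAllocatedBytes) {
    return 100.0 * (estimatedAllocatedBytes - expectedFreeBytes) / expectedFreeBytes;
  }
}
```

To reproduce the real experiment one would size the vault from Runtime.getRuntime().maxMemory() and compare the returned sum with the free heap measured before filling.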

The results are very accurate on HotSpot if one is using serial GC. For example:
{noformat}
[JVM: Java HotSpot(TM) 64-Bit Server VM, 20.4-b02, Sun Microsystems Inc., Sun 
Microsystems Inc., 1.6.0_29]
Max: 483.4 MB, Used: 698.9 KB, Committed: 123.8 MB
Expected free: 240.9 MB, Allocated estimation: 240.8 MB, Difference: -0.05% 
(113.6 KB)
{noformat}

If one runs with a parallel GC things do get out of hand because the GC is not 
keeping up with allocations (although I'm not sure how I should interpret this 
because we only allocate; it's not possible to free any space -- maybe there 
are different GC pools or something):
{noformat}
[JVM: Java HotSpot(TM) 64-Bit Server VM, 20.4-b02, Sun Microsystems Inc., Sun 
Microsystems Inc., 1.6.0_29]
Max: 444.5 MB, Used: 655.4 KB, Committed: 122.7 MB
Expected free: 221.5 MB, Allocated estimation: 174.2 MB, Difference: -21.34% 
(47.3 MB)
{noformat}

JRockit:
{noformat}
[JVM: Oracle JRockit(R), 
R28.1.4-7-144370-1.6.0_26-20110617-2130-windows-x86_64, Oracle Corporation, 
Oracle Corporation, 1.6.0_26]
Max: 500 MB, Used: 3.5 MB, Committed: 64 MB
Expected free: 247.7 MB, Allocated estimation: 249.5 MB, Difference: 0.74% (1.8 
MB)
{noformat}

I think we're good. If somebody wishes to experiment, the spike is here:
https://github.com/dweiss/java-sizeof
{noformat}
mvn test
mvn dependency:copy-dependencies
java -cp target/classes:target/test-classes:target/dependency/junit-4.10.jar \
  com.carrotsearch.sizeof.TestEstimationQuality
{noformat}

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: 
 NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
 NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
 this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
 {quote}
 A single-dimension array is a single object. As expected, the array has the 
 usual object header. However, this object head is 12 bytes to accommodate a 
 four-byte array length. Then comes the actual array data which, as you might 
 expect, consists of the number of elements multiplied by the number of bytes 
 required for one element, depending on its type. The memory usage for one 
 element is 4 bytes for an object reference ...
 {quote}
 While on it, I wrote a sizeOf(String) impl, and I wonder how do people feel 
 about including such helper methods in RUE, as static, stateless, methods? 
 It's not perfect, there's some room for improvement I'm sure, here it is:
 {code}
   /**
    * Computes the approximate size of a String object. Note that if this object
    * is also referenced by another object, you should add
    * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
    * method.
    */
   public static int sizeOf(String str) {
     return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
         + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
         + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
         + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
   }
 {code}
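 If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
 long[] / double[] ... and String[]).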

[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235422#comment-13235422
]

Dawid Weiss commented on LUCENE-3877:
-

bq. I have seen it not work in the past for obscure reasons

Most likely the reasons were incorrect pointcut definitions? These can be
tricky, I agree. Nonetheless, I've been using AspectJ for a long time and it
always fits my needs and expectations. I'm not saying it doesn't have any bugs
-- I'm sure it has. But the right tool for the right job; it took me about 5
mins to write and apply that aspect (with follow ups, I sent an e-mail to the
mailing list, JIRA didn't work at the time).

I'm not advocating for any tool, really. To me aspectj is a fast tool for
expressing where I want a given snippet of code to be injected (or what I want
excluded) and for such tasks I don't see a faster or more pleasant to use
alternative. Oh, I've been using asmlib too; extensively in fact; so it's not
lack of knowledge about the tool itself.


[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235424#comment-13235424
 ] 

Dawid Weiss commented on LUCENE-3877:
-

My aspectj experiments from yesterday when JIRA was dead.

I applied that aspect just to see what happens.
{noformat}
ajc -sourceroots aspects \
   -inpath lucene-core-3.6-SNAPSHOT.jar \
   -d none \
   -cp aspectjrt.jar \
   -showWeaveInfo
{noformat}
Here's what I got:
{noformat}
Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.analysis.PorterStemmer'
(PorterStemmer.java:529) advised by before advice from
'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.analysis.PorterStemmer'
(PorterStemmer.java:534) advised by before advice from
'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.analysis.PorterStemmer'
(PorterStemmer.java:542) advised by before advice from
'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:989)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:996)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1003)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1012)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1013)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1038)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1043)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1047)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1056)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1057)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1062)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1071)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1073)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1074)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1077)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1079)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1081)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1082)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream java.lang.System.out)' in
Type 'org.apache.lucene.index.CheckIndex' (CheckIndex.java:1085)
advised by before advice from 'spikes.NoSysOuts' (NoSysOuts.aj:6)

Join point 'field-get(java.io.PrintStream

[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235428#comment-13235428
 ] 

Dawid Weiss commented on LUCENE-3877:
-

Oh, btw. I think a FindBugs rule for detecting sysouts/syserrs would be a great 
addition to FindBugs -- you should definitely file it as an improvement there. 
In reality at least class-level exclusions will be needed to avoid legitimate 
matches like the ones shown above (main methods, exception handlers), but these 
can be lived with.


[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235494#comment-13235494
 ] 

Dawid Weiss commented on LUCENE-3867:
-

I've been experimenting a bit with the new code. Below are the field offsets 
for three classes in a hierarchy with unalignable fields (byte and long 
combinations at all levels). Note the unaligned reordering of the byte field in 
JRockit - nice.

{noformat}
JVM: [JVM: HotSpot, Sun Microsystems Inc., 1.6.0_31] (compressed OOPs)
@12  4 Super.superByte
@16  8 Super.subLong
@24  8 Sub.subLong
@32  4 Sub.subByte
@36  4 SubSub.subSubByte
@40  8 SubSub.subSubLong
@48    sizeOf(SubSub.class instance)

JVM: [JVM: HotSpot, Sun Microsystems Inc., 1.6.0_31] (normal OOPs)
@16  8 Super.subLong
@24  8 Super.superByte
@32  8 Sub.subLong
@40  8 Sub.subByte
@48  8 SubSub.subSubLong
@56  8 SubSub.subSubByte
@64    sizeOf(SubSub.class instance)

JVM: [JVM: J9, IBM Corporation, 1.6.0]
@24  8 Super.subLong
@32  4 Super.superByte
@36  4 Sub.subByte
@40  8 Sub.subLong
@48  8 SubSub.subSubLong
@56  8 SubSub.subSubByte
@64    sizeOf(SubSub.class instance)

JVM: [JVM: JRockit, Oracle Corporation, 1.6.0_26] (64-bit JVM!)
@ 8  8 Super.subLong
@16  1 Super.superByte
@17  7 Sub.subByte
@24  8 Sub.subLong
@32  8 SubSub.subSubLong
@40  8 SubSub.subSubByte
@48    sizeOf(SubSub.class instance)
{noformat}
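
One way to read these dumps: the instance size is the end of the last occupied slot, rounded up to the object alignment. The sketch below recomputes the sizeOf line from the (offset, slot size) pairs. It is only an interpretation of the output, not the patch itself; the class names and the 8-byte alignment are assumptions.

```java
/** Hypothetical sketch: derive an instance size from field slots. */
final class LayoutSketch {
  /** One dump line: field offset and the size of the slot it occupies, in bytes. */
  static final class Slot {
    final int offset;
    final int size;

    Slot(int offset, int size) {
      this.offset = offset;
      this.size = size;
    }
  }

  /** Rounds size up to the next multiple of alignment. */
  static long alignUp(long size, int alignment) {
    return (size + alignment - 1) / alignment * alignment;
  }

  /** End of the furthest slot, padded to the (assumed) object alignment. */
  static long instanceSize(Slot[] slots, int alignment) {
    long end = 0;
    for (Slot s : slots) {
      end = Math.max(end, (long) s.offset + s.size);
    }
    return alignUp(end, alignment);
  }

  public static void main(String[] args) {
    // HotSpot with compressed OOPs, copied from the dump above.
    Slot[] hotspotCompressed = {
      new Slot(12, 4), new Slot(16, 8), new Slot(24, 8),
      new Slot(32, 4), new Slot(36, 4), new Slot(40, 8)
    };
    System.out.println(instanceSize(hotspotCompressed, 8)); // 48, as in the dump
  }
}
```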


[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235501#comment-13235501
 ] 

Dawid Weiss commented on LUCENE-3867:
-

bq. I hope my explanation was understandable... 

Perfectly well. Yes, I agree, it's possible to fill in the holes packing them 
with fields from subclasses. It would be a nice vm-level optimization in fact! 

I'm still experimenting with this code and cleaning up / adding javadocs -- 
I'll provide a complete patch once I'm done, ok?


[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235506#comment-13235506
 ] 

Dawid Weiss commented on LUCENE-3867:
-

Maybe it does such things already. I didn't check extensively.


[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235570#comment-13235570
 ] 

Dawid Weiss commented on LUCENE-3867:
-

I confirmed that this packing indeed takes place. Wrote a pseudo-random test 
with lots of classes and fields. Here's an offender on J9 for example 
(Wild_{inheritance-level}_{field-number}):
{noformat}
@24  4 Wild_0_92.fld_0_0_92
@28  4 Wild_0_92.fld_1_0_92
@32  4 Wild_0_92.fld_2_0_92
@36  4 Wild_0_92.fld_3_0_92
@40  4 Wild_0_92.fld_4_0_92
@44  4 Wild_0_92.fld_5_0_92
@48  4 Wild_0_92.fld_6_0_92
@52  4 Wild_2_5.fld_0_2_5
@56  8 Wild_1_85.fld_0_1_85
@64  8 Wild_1_85.fld_1_1_85
@72    sizeOf(Wild_2_5 instance)
{noformat}

HotSpot and JRockit don't seem to do this (at least it didn't fail on the 
example).



[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235588#comment-13235588
 ] 

Dawid Weiss commented on LUCENE-3867:
-

Yep, that assumption was wrong -- indeed:
{noformat}
WildClasses.Wild_2_5 wc = new WildClasses.Wild_2_5();
wc.fld_6_0_92 = 0x1122;
wc.fld_0_2_5 = Float.intBitsToFloat(0xa1a2a3a4);
wc.fld_0_1_85 = Double.longBitsToDouble(0xb1b2b3b4b5b6b7L);
System.out.println(ExpMemoryDumper.dumpObjectMem(wc));
{noformat}
results in:
{noformat}
0x0000 b0 3d 6f 01 00 00 00 00 0e 80 79 01 00 00 00 00
0x0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0030 22 11 00 00 a4 a3 a2 a1 b7 b6 b5 b4 b3 b2 b1 00
0x0040 00 00 00 00 00 00 00 00
{noformat}
And you can see they are reordered and longs are aligned.
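
To double-check the reading of the dump, the row at 0x0030 can be decoded as little-endian values. The sketch below does that with a ByteBuffer; the class and helper names are made up. The offsets 48, 52 and 56 match the J9 field offsets listed earlier in this thread for fld_6_0_92, fld_0_2_5 and fld_0_1_85, assuming the dump comes from that same layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch: decode one row of the memory dump. */
final class DumpDecodeSketch {
  /** Parses space-separated hex bytes, e.g. "22 11 00 00". */
  static byte[] parseHexBytes(String row) {
    String[] parts = row.trim().split("\\s+");
    byte[] bytes = new byte[parts.length];
    for (int i = 0; i < parts.length; i++) {
      bytes[i] = (byte) Integer.parseInt(parts[i], 16);
    }
    return bytes;
  }

  /** Reads a little-endian 32-bit value at the given index. */
  static int intAt(byte[] bytes, int index) {
    return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt(index);
  }

  /** Reads a little-endian 64-bit value at the given index. */
  static long longAt(byte[] bytes, int index) {
    return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getLong(index);
  }

  public static void main(String[] args) {
    // Row 0x0030 of the dump, i.e. object offsets 48..63.
    byte[] row = parseHexBytes("22 11 00 00 a4 a3 a2 a1 b7 b6 b5 b4 b3 b2 b1 00");
    System.out.printf("@48 int bits    0x%x%n", intAt(row, 0));
    System.out.printf("@52 float bits  0x%x%n", intAt(row, 4));
    System.out.printf("@56 double bits 0x%x%n", longAt(row, 8));
  }
}
```

The three decoded values are 0x1122, 0xa1a2a3a4 and 0xb1b2b3b4b5b6b7, i.e. exactly the values assigned in the snippet above, with the 8-byte field sitting on an 8-byte boundary.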

I'll provide a cumulative patch of changes in the evening, there's one more 
thing I wanted to add (cache of fields) because this affects processing speed.


[jira] [Commented] (LUCENE-3847) LuceneTestCase should check for modifications on System properties

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235710#comment-13235710
 ] 

Dawid Weiss commented on LUCENE-3847:
-

Well... something is changing it, the question is what it is. I'll take a look.

 LuceneTestCase should check for modifications on System properties
 --

 Key: LUCENE-3847
 URL: https://issues.apache.org/jira/browse/LUCENE-3847
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3847.patch


 - fail the test if changes have been detected.
 - revert the state of system properties before the suite.
 - cleanup after the suite.


[jira] [Commented] (LUCENE-3847) LuceneTestCase should check for modifications on System properties

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235749#comment-13235749
]

Dawid Weiss commented on LUCENE-3847:
-

I know what's changing it. Eh. So -- there is a warning being printed:
{noformat}
Mar 22, 2012 6:20:33 PM org.apache.solr.core.Config parseLuceneVersionString
WARNING: You should not use LUCENE_CURRENT as luceneMatchVersion property: if
you use this setting, and then Solr upgrades to a newer release of Lucene,
sizable changes may happen. If precise back compatibility is important then you
should instead explicitly specify an actual Lucene version.
Mar 22, 2012 6:20:33 PM org.apache.solr.analysis.BaseTokenStreamFactory
warnDeprecated
WARNING: RussianLetterTokenizerFactory is deprecated. Use
StandardTokenizerFactory instead.
{noformat}

These warnings go through Java logging, whose output is localized (date
format, warning info, etc.). Formatting that output asks for the default
TimeZone, and that lookup sets the user.timezone system property (I mentioned
it a while ago).

I suggest that we just ignore user.timezone as it is triggered from multiple
locations and doesn't seem that important?
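
A minimal sketch of the kind of check discussed here: snapshot the system properties before a suite, report anything that changed afterwards, and restore the old values, while skipping an ignore list such as user.timezone. The class and method names are hypothetical and this is not the code from LUCENE-3847.patch.

```java
import java.util.Objects;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

final class SystemPropertiesGuard {
    private final Properties snapshot;
    private final Set<String> ignored;

    /** Takes a snapshot of the current system properties. */
    SystemPropertiesGuard(Set<String> ignoredKeys) {
        this.snapshot = (Properties) System.getProperties().clone();
        this.ignored = new TreeSet<>(ignoredKeys);
    }

    /** Names of non-ignored properties added, removed or changed since the snapshot. */
    Set<String> changedKeys() {
        Properties now = System.getProperties();
        Set<String> keys = new TreeSet<>(snapshot.stringPropertyNames());
        keys.addAll(now.stringPropertyNames());
        Set<String> changed = new TreeSet<>();
        for (String key : keys) {
            if (ignored.contains(key)) continue;
            if (!Objects.equals(snapshot.getProperty(key), now.getProperty(key))) {
                changed.add(key);
            }
        }
        return changed;
    }

    /** Puts every changed, non-ignored property back to its snapshot value. */
    void restore() {
        for (String key : changedKeys()) {
            String before = snapshot.getProperty(key);
            if (before == null) {
                System.clearProperty(key);
            } else {
                System.setProperty(key, before);
            }
        }
    }
}
```

A test rule could create the guard before the suite, fail if changedKeys() is non-empty afterwards, and call restore() in either case. Ignored keys are neither reported nor reset.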

LuceneTestCase should check for modifications on System properties
--

Key: LUCENE-3847
URL: https://issues.apache.org/jira/browse/LUCENE-3847
Project: Lucene - Java
Issue Type: Improvement
Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
Fix For: 3.6, 4.0

Attachments: LUCENE-3847.patch

- fail the test if changes have been detected.
- revert the state of system properties before the suite.
- cleanup after the suite.


[jira] [Commented] (LUCENE-3847) LuceneTestCase should check for modifications on System properties

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235995#comment-13235995
 ] 

Dawid Weiss commented on LUCENE-3847:
-

Applied a fix for this. user.timezone is ignored (and is not reset).

 LuceneTestCase should check for modifications on System properties
 --

 Key: LUCENE-3847
 URL: https://issues.apache.org/jira/browse/LUCENE-3847
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3847.patch


 - fail the test if changes have been detected.
 - revert the state of system properties before the suite.
 - cleanup after the suite.


[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-22 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236028#comment-13236028
 ] 

Dawid Weiss commented on LUCENE-3867:
-

Ok, I admit J9 is fascinating... ;) How much memory does this take?
{code}
class X {
  byte a = 0x11;
  byte b = 0x22;
}
{code}
Here is the memory layout:
{code}
[JVM: IBM J9 VM, 2.6, IBM Corporation, IBM Corporation, 1.7.0]
0x0000 00 b8 21 c4 5f 7f 00 00 00 00 00 00 00 00 00 00
0x0010 11 00 00 00 22 00 00 00
@16  4 Super.b1
@20  4 Super.b2
@24    sizeOf(Super instance)
{code}

I don't think I screwed up anything. It really is 4 byte alignment _on all 
fields_.
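
The arithmetic behind that dump can be sketched as follows: with a 16-byte header and every field padded to a 4-byte slot, two byte fields land at offsets 16 and 20 and the instance ends at 24. The class name, the method and the 8-byte object alignment are assumptions for illustration only; the header size and slot size are read off the dump above.

```java
final class FieldLayoutSketch {
    /** Rounds value up to the next multiple of alignment. */
    static int alignUp(int value, int alignment) {
        return (value + alignment - 1) / alignment * alignment;
    }

    /**
     * Lays fields out in declaration order. Each field occupies at least
     * minFieldSlot bytes and starts on a multiple of its slot size.
     * Returns the field offsets followed by the total instance size.
     */
    static int[] layout(int headerBytes, int[] fieldSizes, int minFieldSlot, int objectAlignment) {
        int[] result = new int[fieldSizes.length + 1];
        int offset = headerBytes;
        for (int i = 0; i < fieldSizes.length; i++) {
            int slot = Math.max(fieldSizes[i], minFieldSlot);
            offset = alignUp(offset, slot);
            result[i] = offset;
            offset += slot;
        }
        result[fieldSizes.length] = alignUp(offset, objectAlignment);
        return result;
    }
}
```

For comparison, layout(16, new int[] {1, 1}, 1, 8) packs the two bytes at offsets 16 and 17. The total is still 24 under the assumed object alignment, but a class with many byte fields would diverge quickly.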

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch




[jira] [Commented] (LUCENE-3895) Not getting random-seed/reproduce-with if a test fails from another thread

2012-03-21 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234185#comment-13234185
 ] 

Dawid Weiss commented on LUCENE-3895:
-

bq. Hopefully Dawid hates it and knows of a way to fix it cleanly 

It's fine for trunk. It will become redundant with LUCENE-3808: there the seed 
is reported at the master build level, and exceptions carry an injected fake 
stack trace entry with the current master/test seed combination (the test seed 
is redundant most of the time because it's derived from the master seed).
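
A minimal sketch of the "injected fake stack trace entry" idea: prepend a synthetic frame that carries the seeds, so they travel with the exception no matter which thread threw it. The class name, the frame naming and the seed derivation below are hypothetical and are not the LUCENE-3808 implementation.

```java
import java.util.Locale;

final class SeedStackTraceSketch {
    /** Derives a per-test seed from the master seed (hypothetical mixing, for illustration). */
    static long deriveTestSeed(long masterSeed, String testName) {
        return masterSeed ^ testName.hashCode();
    }

    /** Formats the seed pair as MASTER:TEST in upper-case hex. */
    static String format(long masterSeed, long testSeed) {
        return (Long.toHexString(masterSeed) + ":" + Long.toHexString(testSeed))
            .toUpperCase(Locale.ROOT);
    }

    /** Prepends a synthetic frame naming the seeds to the throwable's stack trace. */
    static <T extends Throwable> T withSeedInfo(T t, long masterSeed, long testSeed) {
        StackTraceElement[] original = t.getStackTrace();
        StackTraceElement[] augmented = new StackTraceElement[original.length + 1];
        // Fake frame: no source file, line 0; the method name carries the seeds.
        augmented[0] = new StackTraceElement(
            "SeedInfo", "seed([" + format(masterSeed, testSeed) + "])", null, 0);
        System.arraycopy(original, 0, augmented, 1, original.length);
        t.setStackTrace(augmented);
        return t;
    }
}
```

Because the frame is part of the throwable itself, any reporter that prints the stack trace shows the seeds without extra plumbing.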



 Not getting random-seed/reproduce-with if a test fails from another thread
 --

 Key: LUCENE-3895
 URL: https://issues.apache.org/jira/browse/LUCENE-3895
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/test
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-3895.patch


 See https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12822/console 
 as an example.
 This is at least affecting 4.0, maybe 3.x too


[jira] [Commented] (LUCENE-3895) Not getting random-seed/reproduce-with if a test fails from another thread

2012-03-21 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234189#comment-13234189
 ] 

Dawid Weiss commented on LUCENE-3895:
-

Feel free to commit in (4.0/3.x?), Robert.

 Not getting random-seed/reproduce-with if a test fails from another thread
 --

 Key: LUCENE-3895
 URL: https://issues.apache.org/jira/browse/LUCENE-3895
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/test
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-3895.patch, LUCENE-3895.patch


 See https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12822/console 
 as an example.
 This is at least affecting 4.0, maybe 3.x too


[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-21 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234881#comment-13234881
 ] 

Dawid Weiss commented on LUCENE-3877:
-

You can just as well substitute your own implementation of PrintStream using 
System.setOut/setErr and check the stack on each println... But I agree with 
Benson that a static analysis approach is much cleaner. I don't know if there's 
anything out of the box in findbugs/pmd, but even if not, this can be done as a 
10-liner: apply an aspect to the classes via aspectj and parse the output logs 
to detect whether the aspect has been applied (it shouldn't match anywhere). 
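
The runtime variant mentioned first (swap in your own PrintStream and inspect the stack on each write) could look roughly like the sketch below. All names here are hypothetical; it only covers System.out, and System.err would need the same treatment.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class SysOutTrapSketch {
    /** Frames from the watched package that were on the stack during a write. */
    static final Set<StackTraceElement> OFFENDERS =
        Collections.synchronizedSet(new LinkedHashSet<>());

    /** Returns the first frame whose class name starts with classPrefix, or null. */
    static StackTraceElement firstFrameWithin(StackTraceElement[] stack, String classPrefix) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().startsWith(classPrefix)) return frame;
        }
        return null;
    }

    /** Replaces System.out with a checking stream and returns the previous one. */
    static PrintStream install(final String classPrefix) {
        final PrintStream previous = System.out;
        OutputStream checking = new OutputStream() {
            private void check() {
                StackTraceElement frame =
                    firstFrameWithin(Thread.currentThread().getStackTrace(), classPrefix);
                if (frame != null) OFFENDERS.add(frame);
            }
            @Override public void write(int b) {
                check();
                previous.write(b);
            }
            @Override public void write(byte[] buf, int off, int len) {
                check();
                previous.write(buf, off, len);
            }
        };
        System.setOut(new PrintStream(checking, true));
        return previous;
    }
}
```

A test suite could call install("org.apache.lucene.") before running, restore the returned stream in a finally block, and fail if OFFENDERS is non-empty. Unlike static analysis, this only catches printlns on code paths the tests actually execute.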

 Lucene should not call System.out.println
 -

 Key: LUCENE-3877
 URL: https://issues.apache.org/jira/browse/LUCENE-3877
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
 Fix For: 3.6, 4.0

 Attachments: IllegalSystemTest.java, IllegalSystemTest.java


 We seem to have accumulated a few random sops...
 Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least.
 Can we somehow detect (eg, have a test failure) if we accidentally leave 
 errant System.out.println's (leftover from debugging)...?


[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-21 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235087#comment-13235087
 ] 

Dawid Weiss commented on LUCENE-3877:
-

fyi. PMD has a rule for this -- SystemPrintln.
http://pmd.sourceforge.net/rules/index.html

Didn't check the details though.

 Lucene should not call System.out.println
 -

 Key: LUCENE-3877
 URL: https://issues.apache.org/jira/browse/LUCENE-3877
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
 Fix For: 3.6, 4.0

 Attachments: IllegalSystemTest.java, IllegalSystemTest.java, 
 SystemPrintCheck.java


 We seem to have accumulated a few random sops...
 Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least.
 Can we somehow detect (eg, have a test failure) if we accidentally leave 
 errant System.out.println's (leftover from debugging)...?


[jira] [Commented] (LUCENE-3877) Lucene should not call System.out.println

2012-03-21 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235112#comment-13235112
 ] 

Dawid Weiss commented on LUCENE-3877:
-

I don't like PMD that much either, I'm just saying it seems to have it. If I 
were to choose though, I'd use aspectj rather than asm-based code. It just 
seems cleaner to me.
{code}
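import java.io.PrintStream;

public aspect NoSysOuts {
  // Matches any read of a static PrintStream field of System (System.out,
  // System.err) from code within org.apache.lucene and its subpackages.
  before(): within(org.apache.lucene..*) && get(static PrintStream System.*) {
    throw new RuntimeException("Attempted sysout/syserr/sysin access.");
  }
}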
{code}
You don't even need to run it, just weave with verbose output and see if the 
aspect matched anywhere.

 Lucene should not call System.out.println
 -

 Key: LUCENE-3877
 URL: https://issues.apache.org/jira/browse/LUCENE-3877
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
 Fix For: 3.6, 4.0

 Attachments: IllegalSystemTest.java, IllegalSystemTest.java, 
 SystemPrintCheck.java


 We seem to have accumulated a few random sops...
 Eg, PairOutputs.java (oal.util.fst) and MultiDocValues.java, at least.
 Can we somehow detect (eg, have a test failure) if we accidentally leave 
 errant System.out.println's (leftover from debugging)...?


[jira] [Commented] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233408#comment-13233408
 ] 

Dawid Weiss commented on SOLR-3258:
---

And here comes the moment where my knowledge of Solr ends :) I'd say there is 
definitely a bug in improper handling of HTTP response status (and this should 
be fixed), unless there is a filter somewhere that emits this HTML and fakes 
HTTP 200... But as for the cause of why this happens in general -- no idea.
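
For context, byte 60 in the error message is the ASCII code of '<', which fits a client that was handed an HTML (or XML) page where it expected javabin data. A minimal sketch of the kind of guard hinted at above: check the HTTP status and the first byte before parsing. The class and method are hypothetical and are not SolrJ code; the expected version 2 is taken from the error message in this issue.

```java
final class ResponseSanitySketch {
    static final int EXPECTED_JAVABIN_VERSION = 2; // from the error message in this issue

    /**
     * Returns a reason not to hand the response to a javabin parser,
     * or null if the status and first byte look plausible.
     */
    static String rejectReason(int httpStatus, int firstByte) {
        if (httpStatus != 200) {
            return "HTTP status " + httpStatus;
        }
        if (firstByte == '<') {
            return "body starts with '<' (byte 60), looks like HTML or XML";
        }
        if (firstByte != EXPECTED_JAVABIN_VERSION) {
            return "unexpected version byte " + firstByte;
        }
        return null;
    }
}
```

With a check like this, an HTML error page served with a non-200 status would be reported as an HTTP failure rather than as a confusing version mismatch.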

 Ping query caused exception..Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 

 Key: SOLR-3258
 URL: https://issues.apache.org/jira/browse/SOLR-3258
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
 13:55:51 
Reporter: Markus Jelsma
 Fix For: 4.0

 Attachments: debugging.patch


 In a test set-up with nodes=2, shards=3 and cores=6 we often see this 
 exception in the logs. It is thrown once every few ping requests; other 
 requests return a proper OK.
 Ping request handler:
 {code}
 <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
   <lst name="invariants">
     <str name="qt">select</str>
     <str name="q">*:*</str>
     <int name="rows">0</int>
   </lst>
   <lst name="defaults">
     <str name="wt">json</str>
     <str name="echoParams">all</str>
     <bool name="omitHeader">true</bool>
   </lst>
 </requestHandler>
 {code}
 Exception:
 {code}
 2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] 
 webapp=/solr path=/admin/ping params={} status=500 QTime=7 
 2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - 
 [http-80-18] - : null:org.apache.solr.common.SolrException: Ping query caused 
 exception: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
 at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
 Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
 ... 16 more
 Caused by: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at

[jira] [Commented] (LUCENE-3893) TermsFilter should use AutomatonQuery

2012-03-20 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233711#comment-13233711
 ] 

Dawid Weiss commented on LUCENE-3893:
-

bq. Dahiwikwukblabla 

Daciuk, the name is Jan Daciuk :) Although the same algorithm has been 
discovered independently by Stoyan Mihov and (I think) Bruce W. Watson.

 TermsFilter should use AutomatonQuery
 -

 Key: LUCENE-3893
 URL: https://issues.apache.org/jira/browse/LUCENE-3893
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
  Labels: gsoc2012, lucene-gsoc-12

 I think we could see perf gains if TermsFilter sorted the terms, built a 
 minimal automaton, and used TermsEnum.intersect to visit the terms...
 This idea came up on the dev list recently.
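
A compact sketch of the "sort the terms, build a minimal automaton" step, in the spirit of the incremental construction for sorted input that Daciuk's name is attached to. It is a simplified illustration over chars with hypothetical class and method names, not Lucene's automaton code, and it assumes the input list is already sorted.

```java
import java.util.*;

final class MinimalAutomatonSketch {
    static final class State {
        final TreeMap<Character, State> next = new TreeMap<>();
        boolean accept;

        // Two states are equivalent if they agree on acceptance and have the
        // same labels leading to the very same (already canonical) targets.
        @Override public boolean equals(Object o) {
            if (!(o instanceof State)) return false;
            State other = (State) o;
            if (accept != other.accept || next.size() != other.next.size()) return false;
            for (Map.Entry<Character, State> e : next.entrySet()) {
                if (other.next.get(e.getKey()) != e.getValue()) return false;
            }
            return true;
        }

        @Override public int hashCode() {
            int h = accept ? 1 : 0;
            for (Map.Entry<Character, State> e : next.entrySet()) {
                h = 31 * h + e.getKey() * 17 + System.identityHashCode(e.getValue());
            }
            return h;
        }
    }

    /** Builds a minimal acyclic automaton; sortedTerms must be in ascending order. */
    static State build(List<String> sortedTerms) {
        Map<State, State> register = new HashMap<>();
        State root = new State();
        for (String term : sortedTerms) {
            // Walk the prefix shared with what has been added so far.
            State s = root;
            int i = 0;
            while (i < term.length() && s.next.containsKey(term.charAt(i))) {
                s = s.next.get(term.charAt(i));
                i++;
            }
            // The previous term's suffix below s is final now: minimize it.
            if (!s.next.isEmpty()) replaceOrRegister(s, register);
            // Append the remaining suffix of the new term.
            for (; i < term.length(); i++) {
                State t = new State();
                s.next.put(term.charAt(i), t);
                s = t;
            }
            s.accept = true;
        }
        if (!root.next.isEmpty()) replaceOrRegister(root, register);
        return root;
    }

    private static void replaceOrRegister(State s, Map<State, State> register) {
        Map.Entry<Character, State> last = s.next.lastEntry();
        State child = last.getValue();
        if (!child.next.isEmpty()) replaceOrRegister(child, register);
        State canonical = register.get(child);
        if (canonical != null) {
            s.next.put(last.getKey(), canonical); // reuse the equivalent state
        } else {
            register.put(child, child);
        }
    }

    static boolean accepts(State root, String term) {
        State s = root;
        for (int i = 0; i < term.length() && s != null; i++) {
            s = s.next.get(term.charAt(i));
        }
        return s != null && s.accept;
    }

    /** Counts states reachable from root, by identity. */
    static int countStates(State root) {
        Set<State> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<State> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            State s = stack.pop();
            if (seen.add(s)) {
                for (State t : s.next.values()) stack.push(t);
            }
        }
        return seen.size();
    }
}
```

For the sorted terms tap, taps, top, tops the shared "p" and "ps" endings are merged, so the two branches after "t" point at the same state. Intersecting such an automaton with the terms dictionary is the part TermsEnum.intersect would handle and is not shown here.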

