Re: JCC linux patch
On Sep 30, 2012, at 21:35, Caleb Burns cpbu...@gmail.com wrote:
> Hi,
> The current method to build JCC on Linux requires patching the setuptools package. Would you be interested in a patch to JCC that monkey-patches the setuptools Library and Extension classes to avoid the manual patch? It works with setuptools-0.6c7-11 and distribute-0.6.1+ without the need to manually patch setuptools. If this is not the proper place to propose a fix, please let me know.

That would be great! Could you please make your monkey patch detect the version of setuptools/distribute used and issue the same error message as is currently emitted by the JCC Linux setup.py code when the version is not supported by your monkey patch, i.e., when manual patching is still needed?

Thanks!
Andi..

> Thanks,
> Caleb Burns
[jira] [Created] (SOLR-3907) PERF: Add squared euclidean distance for geodist(). Name it geosqedist()
Bill Bell created SOLR-3907:
Summary: PERF: Add squared euclidean distance for geodist(). Name it geosqedist()
Key: SOLR-3907
URL: https://issues.apache.org/jira/browse/SOLR-3907
Project: Solr
Issue Type: Improvement
Affects Versions: 4.0
Reporter: Bill Bell

geodist() does the exact haversine calculation. Add a new function similar to geodist(), called geosqedist(). This should improve performance when you want the ordering to be right but don't care about exact distances: do the haversine calculation but do NOT take the square root.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
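To illustrate why skipping the final step can keep the ordering intact, here is a minimal sketch in Java. It is not Solr code: the class name GeoOrderingSketch, its method names, and the Earth radius constant are assumptions made for this example. It follows the issue body (haversine without the square root) rather than the "squared euclidean" wording of the summary. The haversine term a lies in [0, 1], and the full distance is 2R * asin(sqrt(a)); since asin(sqrt(a)) increases with a on that interval, sorting by a gives the same order as sorting by the exact distance.

```java
/** Hypothetical helper for illustration only; not part of Solr or Lucene. */
final class GeoOrderingSketch {
    /** Approximate mean Earth radius in km (an assumption for this sketch). */
    static final double EARTH_RADIUS_KM = 6371.0;

    /**
     * Haversine term a = sin^2(dLat/2) + cos(lat1) * cos(lat2) * sin^2(dLon/2).
     * Inputs are in degrees. No square root or arcsine is taken here.
     */
    static double haversineTerm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinHalfDLat = Math.sin((phi2 - phi1) / 2);
        double sinHalfDLon = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        return sinHalfDLat * sinHalfDLat
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLon * sinHalfDLon;
    }

    /** Full great-circle distance in km, built on top of the term above. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        // Clamp to 1.0 to guard against rounding slightly above the valid range.
        double a = Math.min(1.0, haversineTerm(lat1, lon1, lat2, lon2));
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // Compare two candidate points against one origin using only the cheap term.
        double nearTerm = haversineTerm(42.36, -71.06, 40.71, -74.01);
        double farTerm = haversineTerm(42.36, -71.06, 34.05, -118.24);
        System.out.println("near sorts first: " + (nearTerm < farTerm));
    }
}
```

Whether the saved sqrt and asin calls matter in practice would need measuring; the sketch only shows that the sort order is unaffected.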
[jira] [Created] (LUCENE-4450) Distance boost added to Suggester
Bill Bell created LUCENE-4450:
Summary: Distance boost added to Suggester
Key: LUCENE-4450
URL: https://issues.apache.org/jira/browse/LUCENE-4450
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0
Reporter: Bill Bell

A common Suggester use case is to boost the results by proximity (auto-suggest across the whole USA, but boost the suggester's results by geodistance). Would love to get faster response with that. At Lucene Revolution 2012 in Boston a speaker did discuss using WFST to do this, but I have yet to figure out how to do it.
[jira] [Commented] (LUCENE-4450) Distance boost added to Suggester
[ https://issues.apache.org/jira/browse/LUCENE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466431#comment-13466431 ]
Bill Bell commented on LUCENE-4450:
See video on http://vimeo.com/43281536
[JENKINS] Lucene-Solr-trunk-Linux (32bit/ibm-j9-jdk6) - Build # 1457 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/1457/ Java: 32bit/ibm-j9-jdk6 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} All tests passed Build Log: [...truncated 1718 lines...] [junit4:junit4] ERROR: JVM J0 ended with an exception, command line: /opt/ibm/java-i386-60/jre/bin/java -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} -Dtests.prefix=tests -Dtests.seed=4AC300401D518FFA -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.lockdir=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build -Dtests.codec=random -Dtests.postingsformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perMethod -Djava.util.logging.config.file=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=3 -DtempDir=. -Djava.io.tmpdir=. 
-Dtests.sandbox.dir=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/core -Dclover.db.dir=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Dfile.encoding=US-ASCII -classpath /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/codecs/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/test-framework/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/test-framework/lib/junit-4.10.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/test-framework/lib/randomizedtesting-runner-2.0.1.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/core/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/core/classes/test:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-launcher.jar:/var/lib/jenkins/.ant/lib/ivy-2.2.0.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-jdepend.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-netrexx.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-antlr.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-commons-net.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-javamail.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-regexp.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-jsch.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-xalan2.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-junit4.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-jmf.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-junit.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-bcel.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-jai.jar:/var/lib/
jenkins/tools/Ant/ANT_1.8.2/lib/ant-commons-logging.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-resolver.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-oro.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-swing.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-bsf.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-log4j.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-testutil.jar:/opt/ibm/java-i386-60/lib/tools.jar:/var/lib/jenkins/.ivy2/cache/com.carrotsearch.randomizedtesting/junit4-ant/jars/junit4-ant-2.0.1.jar -ea:org.apache.lucene... -ea:org.apache.solr... com.carrotsearch.ant.tasks.junit4.slave.SlaveMainSafe -eventsfile /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/core/test/junit4-J0-20120929_222706_774.events @/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/core/test/junit4-J0-20120929_222706_774.suites [junit4:junit4] ERROR: JVM J0 ended with an exception: Forked process exited with an error code: 137 [junit4:junit4] at com.carrotsearch.ant.tasks.junit4.JUnit4.forkProcess(JUnit4.java:1312) [junit4:junit4] at com.carrotsearch.ant.tasks.junit4.JUnit4.executeSlave(JUnit4.java:1178) [junit4:junit4] at com.carrotsearch.ant.tasks.junit4.JUnit4.access$000(JUnit4.java:65) [junit4:junit4] at com.carrotsearch.ant.tasks.junit4.JUnit4$2.call(JUnit4.java:813) [junit4:junit4] at com.carrotsearch.ant.tasks.junit4.JUnit4$2.call(JUnit4.java:810) [junit4:junit4] at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [junit4:junit4] at java.util.concurrent.FutureTask.run(FutureTask.java:138) [junit4:junit4] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [junit4:junit4] at
RE: [JENKINS] Lucene-Solr-trunk-Linux (32bit/ibm-j9-jdk6) - Build # 1457 - Still Failing!
This one hung in the new test (TestPostingsFormat.testDocsAndFreqsAndPositionsAndOffsets) for 10 hrs. Unfortunately the kill -3 approach did not produce a stack trace in IBM J9.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b51) - Build # 1447 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/1447/ Java: 32bit/jdk1.8.0-ea-b51 -client -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 985 lines...] [junit4:junit4] ERROR: JVM J0 ended with an exception, command line: /mnt/ssd/jenkins/tools/java/32bit/jdk1.8.0-ea-b51/jre/bin/java -client -XX:+UseConcMarkSweepGC -Dtests.prefix=tests -Dtests.seed=512270084037FEE6 -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.lockdir=/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build -Dtests.codec=random -Dtests.postingsformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=4.1 -Dtests.cleanthreads=perMethod -Djava.util.logging.config.file=/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=3 -DtempDir=. -Djava.io.tmpdir=. 
-Dtests.sandbox.dir=/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/core -Dclover.db.dir=/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/junit4/tests.policy -Dlucene.version=4.1-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -classpath /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/test-framework/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/codecs/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/test-framework/lib/junit-4.10.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/test-framework/lib/randomizedtesting-runner-2.0.1.jar:/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/core/classes/java:/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/core/classes/test:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-launcher.jar:/var/lib/jenkins/.ant/lib/ivy-2.2.0.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-jdepend.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-netrexx.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-antlr.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-commons-net.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-javamail.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-regexp.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-jsch.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-xalan2.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-junit4.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-jmf.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-junit.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-bcel.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-jai.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-commons
-logging.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-resolver.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-oro.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-swing.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-bsf.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-apache-log4j.jar:/var/lib/jenkins/tools/Ant/ANT_1.8.2/lib/ant-testutil.jar:/mnt/ssd/jenkins/tools/java/32bit/jdk1.8.0-ea-b51/lib/tools.jar:/var/lib/jenkins/.ivy2/cache/com.carrotsearch.randomizedtesting/junit4-ant/jars/junit4-ant-2.0.1.jar -ea:org.apache.lucene... -ea:org.apache.solr... com.carrotsearch.ant.tasks.junit4.slave.SlaveMainSafe -eventsfile /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/core/test/junit4-J0-20120930_091328_540.events @/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/core/test/junit4-J0-20120930_091328_540.suites [junit4:junit4] ERROR: JVM J0 ended with an exception: Forked process exited with an error code: 134 [junit4:junit4] at com.carrotsearch.ant.tasks.junit4.JUnit4.forkProcess(JUnit4.java:1312) [junit4:junit4] at com.carrotsearch.ant.tasks.junit4.JUnit4.executeSlave(JUnit4.java:1178) [junit4:junit4] at com.carrotsearch.ant.tasks.junit4.JUnit4.access$000(JUnit4.java:65) [junit4:junit4] at com.carrotsearch.ant.tasks.junit4.JUnit4$2.call(JUnit4.java:813) [junit4:junit4] at com.carrotsearch.ant.tasks.junit4.JUnit4$2.call(JUnit4.java:810) [junit4:junit4] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [junit4:junit4] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) [junit4:junit4] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) [junit4:junit4] at java.lang.Thread.run(Thread.java:722) BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:38: The following error
[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 49 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/49/ 3 tests failed. FAILED: org.apache.lucene.codecs.memory.TestDirectPostingsFormat.testDocsAndFreqsAndPositionsAndOffsetsAndPayloads Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at __randomizedtesting.SeedInfo.seed([4F0EC1174AC1F6D9:A12BD8EFF2366AD7]:0) at org.apache.lucene.codecs.memory.DirectPostingsFormat$DirectField.init(DirectPostingsFormat.java:394) at org.apache.lucene.codecs.memory.DirectPostingsFormat$DirectFields.init(DirectPostingsFormat.java:128) at org.apache.lucene.codecs.memory.DirectPostingsFormat.fieldsProducer(DirectPostingsFormat.java:112) at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.init(PerFieldPostingsFormat.java:194) at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:233) at org.apache.lucene.index.BasePostingsFormatTestCase.buildIndex(BasePostingsFormatTestCase.java:487) at org.apache.lucene.index.BasePostingsFormatTestCase.testFull(BasePostingsFormatTestCase.java:949) at org.apache.lucene.index.BasePostingsFormatTestCase.testDocsAndFreqsAndPositionsAndOffsetsAndPayloads(BasePostingsFormatTestCase.java:990) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) FAILED: org.apache.lucene.codecs.memory.TestDirectPostingsFormat.testRandom Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at __randomizedtesting.SeedInfo.seed([4F0EC1174AC1F6D9:3D42E418FBA140AA]:0) at org.apache.lucene.codecs.memory.DirectPostingsFormat$DirectField.init(DirectPostingsFormat.java:394) at org.apache.lucene.codecs.memory.DirectPostingsFormat$DirectFields.init(DirectPostingsFormat.java:128) at 
org.apache.lucene.codecs.memory.DirectPostingsFormat.fieldsProducer(DirectPostingsFormat.java:112) at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.init(PerFieldPostingsFormat.java:194) at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:233) at org.apache.lucene.index.BasePostingsFormatTestCase.buildIndex(BasePostingsFormatTestCase.java:487) at org.apache.lucene.index.BasePostingsFormatTestCase.testRandom(BasePostingsFormatTestCase.java:1003) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at
[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_35) - Build # 985 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/985/ Java: 32bit/jdk1.6.0_35 -server -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 23303 lines...] BUILD FAILED C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:352: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:65: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\lucene\build.xml:511: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\lucene\common-build.xml:1911: Can't get https://issues.apache.org/jira/rest/api/2/project/LUCENE to C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\lucene\build\docs\changes\jiraVersionList.json Total time: 54 minutes 33 seconds Build step 'Invoke Ant' marked build as failure Recording test results Description set: Java: 32bit/jdk1.6.0_35 -server -XX:+UseConcMarkSweepGC Email was triggered for: Failure Sending email for trigger: Failure
[jira] [Created] (LUCENE-4451) Memory leak per unique thread caused by RandomizedContext.contexts static map
Michael McCandless created LUCENE-4451:
Summary: Memory leak per unique thread caused by RandomizedContext.contexts static map
Key: LUCENE-4451
URL: https://issues.apache.org/jira/browse/LUCENE-4451
Project: Lucene - Core
Issue Type: Bug
Reporter: Michael McCandless

In digging into the hard-to-understand OOMEs with TestDirectPostingsFormat, I found (thank you YourKit) that RandomizedContext (in the randomizedtesting JAR) seems to be holding onto all threads created by the test. The test does create many very short-lived threads (testing the thread safety of the postings format, in BasePostingsFormatTestCase.testTerms), and somehow these seem to tie up a lot (~100 MB) of RAM in the RandomizedContext.contexts static map. For now I've disabled all thread testing (committed {{false }} inside {{BPFTC.testTerms}}), but hopefully we can fix the root cause here, e.g. when a thread exits, can we clear it from that map?
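The "clear it from that map when a thread exits" idea can be sketched as follows. This is not the randomizedtesting implementation: the class PerThreadContextsSketch, its static map, and the byte[] payload are hypothetical stand-ins, and the sketch assumes the code that creates the threads can wrap their bodies. Each thread registers itself on entry and removes its own entry in a finally block, so short-lived threads do not accumulate in the static map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a static per-thread context registry. */
final class PerThreadContextsSketch {
    // Static map keyed by Thread, analogous in shape to the map described in the issue.
    private static final Map<Thread, byte[]> CONTEXTS = new ConcurrentHashMap<>();

    /** Starts a thread that registers a context and always unregisters it on exit. */
    static Thread startTracked(Runnable body) {
        Thread t = new Thread(() -> {
            CONTEXTS.put(Thread.currentThread(), new byte[1024]); // placeholder payload
            try {
                body.run();
            } finally {
                // Runs on normal return and on exceptions, so the entry never outlives the thread.
                CONTEXTS.remove(Thread.currentThread());
            }
        });
        t.start();
        return t;
    }

    /** Runs n short-lived threads, waits for them, and returns the number of entries left. */
    static int runShortLivedThreads(int n) {
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = startTracked(() -> { });
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return CONTEXTS.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left: " + runShortLivedThreads(100));
    }
}
```

This only works when thread creation goes through a wrapper like startTracked; threads created elsewhere would need a different hook.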
Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/ibm-j9-jdk6) - Build # 1457 - Still Failing!
That's strange -- should have timed out. I'm guessing the process never exited for some reason. It'd be nice if you could add a config so that all *.events files are archived after a build; we had that with Robert -- what you need to do is add an artefact to be stored after a build and specify **/*.events,README.txt as the pattern. The README.txt is so that jenkins doesn't fail with an empty fileset in case event files are removed (successful builds). I'd appreciate this, it'd help in debugging stuff.
Dawid

On Sun, Sep 30, 2012 at 11:15 AM, Uwe Schindler u...@thetaphi.de wrote:
> This one hung in the new test (TestPostingsFormat.testDocsAndFreqsAndPositionsAndOffsets) for 10 hrs. Unfortunately the kill -3 approach did not produce a stack trace in IBM J9.
[jira] [Assigned] (LUCENE-4451) Memory leak per unique thread caused by RandomizedContext.contexts static map
[ https://issues.apache.org/jira/browse/LUCENE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dawid Weiss reassigned LUCENE-4451:
Assignee: Dawid Weiss
[jira] [Commented] (LUCENE-4451) Memory leak per unique thread caused by RandomizedContext.contexts static map
[ https://issues.apache.org/jira/browse/LUCENE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13466469#comment-13466469 ] Dawid Weiss commented on LUCENE-4451: - This is a border case. The problem is most likely not the threads themselves but also the fact that they keep refs to some other data structures (in thread locals)? I'll see what I can do not to keep hard refs to those thread/context pairs though. I remember there was a reason I didn't use soft refs for this from the beginning but I can't tell you right now what it was. Memory leak per unique thread caused by RandomizedContext.contexts static map - Key: LUCENE-4451 URL: https://issues.apache.org/jira/browse/LUCENE-4451 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Dawid Weiss In digging on the hard-to-understand OOMEs with TestDirectPostingsFormat ... I found (thank you YourKit) that RandomizedContext (in randomizedtesting JAR) seems to be holding onto all threads created by the test. The test does create many very short lived threads (testing the thread safety of the postings format), in BasePostingsFormatTestCase.testTerms), and somehow these seem to tie up a lot (~100 MB) of RAM in RandomizedContext.contexts static map. For now I've disabled all thread testing (committed {{false }} inside {{BPFTC.testTerms}}), but hopefully we can fix the root cause here, eg when a thread exits can we clear it from that map? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
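The idea of not keeping hard references to thread/context pairs can be sketched roughly as follows. This is only an illustration: ThreadContexts and its methods are hypothetical names, it is not the randomizedtesting implementation, and it uses weak keys (java.util.WeakHashMap) rather than the soft references mentioned in the comment above.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for a per-thread context registry. It is NOT the
// randomizedtesting code, only an illustration of weakly held keys.
final class ThreadContexts {
    // WeakHashMap holds its keys weakly, so an entry becomes collectable once
    // nothing else references the Thread. The stored value must not itself
    // reference the Thread strongly, or the entry can never be cleared.
    private static final Map<Thread, Object> CONTEXTS =
            Collections.synchronizedMap(new WeakHashMap<Thread, Object>());

    static void register(Thread t, Object context) {
        CONTEXTS.put(t, context);
    }

    static Object lookup(Thread t) {
        return CONTEXTS.get(t);
    }

    // Explicit cleanup for callers that know a thread has finished.
    static Object release(Thread t) {
        return CONTEXTS.remove(t);
    }
}
```

When the exit of a thread is known to the caller, release() gives deterministic cleanup; the weak keys only act as a safety net for threads nobody cleaned up after.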
[jira] [Commented] (LUCENE-4451) Memory leak per unique thread caused by RandomizedContext.contexts static map
[ https://issues.apache.org/jira/browse/LUCENE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13466471#comment-13466471 ] Michael McCandless commented on LUCENE-4451: OK that's a good clue: the Threads are created as anon classes, which means they are holding references to lots of extra stuff. Let me try making them static classes, and explicitly null out the stuff when they are done running... Memory leak per unique thread caused by RandomizedContext.contexts static map - Key: LUCENE-4451 URL: https://issues.apache.org/jira/browse/LUCENE-4451 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Dawid Weiss In digging on the hard-to-understand OOMEs with TestDirectPostingsFormat ... I found (thank you YourKit) that RandomizedContext (in randomizedtesting JAR) seems to be holding onto all threads created by the test. The test does create many very short lived threads (testing the thread safety of the postings format), in BasePostingsFormatTestCase.testTerms), and somehow these seem to tie up a lot (~100 MB) of RAM in RandomizedContext.contexts static map. For now I've disabled all thread testing (committed {{false }} inside {{BPFTC.testTerms}}), but hopefully we can fix the root cause here, eg when a thread exits can we clear it from that map? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
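A minimal sketch of the approach described in the comment above (a static nested thread class that nulls out its inputs when it finishes). WorkerDemo and Worker are hypothetical names and the summing workload is a placeholder; this is not the code in BasePostingsFormatTestCase.

```java
final class WorkerDemo {
    // A static nested class carries no hidden reference to an enclosing
    // instance, so it retains only what is passed to it explicitly.
    static final class Worker extends Thread {
        private int[] data;          // potentially large input, dropped after use
        private volatile long sum;   // small result that may safely outlive the run

        Worker(int[] data) {
            this.data = data;
        }

        @Override
        public void run() {
            try {
                long s = 0;
                for (int v : data) {
                    s += v;
                }
                sum = s;
            } finally {
                // Release the input even if something still holds this Thread object.
                data = null;
            }
        }

        long sum() {
            return sum;
        }

        boolean holdsData() {
            return data != null;
        }
    }
}
```

In real use the worker would be started with start() and awaited with join(); after that, anything that still references the Worker keeps only the small result alive, not the input.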
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466478#comment-13466478 ] mailformailingli...@yahoo.de commented on LUCENE-2899: -- Could you please create a new patch for the current trunk? I had some problems applying it to my working copy... I am not entirely sure whether it's the trunk or your code, but it seems like your OpenNLP code only works for the first request. As far as I was able to debug, the create() method of the TokenFilterFactory is only called every now and again (are created TokenFilters reused for longer than one call in Solr?). If create() of your FilterFactory was called, everything works. However, if the TokenFilter is somehow reused, it fails. Is this a bug in Solr or in your patch? Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp
[jira] [Commented] (LUCENE-4450) Distance boost added to Suggester
[ https://issues.apache.org/jira/browse/LUCENE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466493#comment-13466493 ] Sudarshan Gaikaiwari commented on LUCENE-4450: -- I am sorry that my talk was confusing. The modifications I made to the WFST suggester are not for boosting the suggestions by geodistance but for restricting the suggestions to a particular geographical location. The code to do this is available here: https://github.com/sudarshang/lucene-solr Please see https://github.com/sudarshang/lucene-solr/blob/master/lucene/spatial-suggest/src/java/org/apache/lucene/search/spatial_suggest/WFSTGeoSpatialLookup.java. I am interested in bringing this branch up to date with the current trunk and contributing it to Lucene. Distance boost added to Suggester - Key: LUCENE-4450 URL: https://issues.apache.org/jira/browse/LUCENE-4450 Project: Lucene - Core Issue Type: Improvement Affects Versions: 4.0 Reporter: Bill Bell A common Suggester use case is to boost the results by closest (auto suggest the whole USA but boost the results in the suggester by geodistance). Would love to get faster response with that. At the Lucene Revolution 2012 in Boston a speaker did discuss using WFST to do this, but I have yet to figure out how to do it).
[jira] [Updated] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Em updated LUCENE-2899: --- Attachment: OpenNLPFilter.java Fixed OpenNLPFilter.java indexToken-Attribute has been reset. Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, OpenNLPFilter.java, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466495#comment-13466495 ] Em edited comment on LUCENE-2899 at 10/1/12 1:42 AM: - Fixed OpenNLPFilter.java: the indexToken attribute is now reset. Since I had some trouble applying the patch, I think it is safer to provide the source code than to create a patch from a possibly corrupted working copy. was (Author: em): Fixed OpenNLPFilter.java indexToken-Attribute has been reset. Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, OpenNLPFilter.java, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp
[jira] [Created] (LUCENE-4452) Need to test BlockPostings when payloads/offsets are indexed, but DPEnum flags=0
Robert Muir created LUCENE-4452: --- Summary: Need to test BlockPostings when payloads/offsets are indexed, but DPEnum flags=0 Key: LUCENE-4452 URL: https://issues.apache.org/jira/browse/LUCENE-4452 Project: Lucene - Core Issue Type: Sub-task Components: core/codecs Reporter: Robert Muir In this case we get a BlockDocsAndPositionsEnum just reading positions and ignoring the stuff in the .pay: but this is untested. see BlockDocsAndPositionsEnum.refillPositions in https://builds.apache.org/job/Lucene-Solr-Clover-4.x/34/clover-report/org/apache/lucene/codecs/block/BlockPostingsReader.html#BlockPostingsReader -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-4453) Need to test BlockPostings with term df/ttfs that = blockSize or are a multiple of blockSize
Robert Muir created LUCENE-4453: --- Summary: Need to test BlockPostings with term df/ttfs that = blockSize or are a multiple of blockSize Key: LUCENE-4453 URL: https://issues.apache.org/jira/browse/LUCENE-4453 Project: Lucene - Core Issue Type: Sub-task Components: core/codecs Reporter: Robert Muir This is a special case in some of the code, but it's currently very rare (i.e. typically untested). We should add a test just for these corner cases: see https://builds.apache.org/job/Lucene-Solr-Clover-4.x/34/clover-report/org/apache/lucene/codecs/block/BlockPostingsReader.html#BlockPostingsReader and https://builds.apache.org/job/Lucene-Solr-Clover-4.x/34/clover-report/org/apache/lucene/codecs/block/BlockSkipReader.html#BlockSkipReader
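One way to enumerate the corner cases named in the issue above is to generate document frequencies that sit exactly on, and immediately around, each multiple of the block size. The helper below is a hypothetical sketch, not the test that was eventually committed; BlockBoundaryCases, docFreqs, and the block size of 128 used in the usage note are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for picking doc-frequency values around block boundaries.
final class BlockBoundaryCases {
    // For k = 1..maxBlocks, returns k*blockSize - 1, k*blockSize and
    // k*blockSize + 1, in ascending order. blockSize must be at least 3 so
    // that neighbouring triples cannot overlap.
    static List<Integer> docFreqs(int blockSize, int maxBlocks) {
        if (blockSize < 3 || maxBlocks < 1) {
            throw new IllegalArgumentException("blockSize must be >= 3 and maxBlocks >= 1");
        }
        List<Integer> out = new ArrayList<Integer>();
        for (int k = 1; k <= maxBlocks; k++) {
            int exact = k * blockSize;
            out.add(exact - 1);   // last block is one short of full
            out.add(exact);       // every block is exactly full
            out.add(exact + 1);   // one posting spills into a new partial block
        }
        return out;
    }
}
```

For example, docFreqs(128, 2) yields 127, 128, 129, 255, 256, 257; a test could index one term per value and verify that the postings read back correctly.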
[jira] [Updated] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Em updated LUCENE-2899: --- Comment: was deleted (was: Fixed OpenNLPFilter.java indexToken-Attribute has been reset. Since I had some trouble while applying your patch, I think it is more save to provide you the source code than to create a patch from a maybe-corrupted working copy.) Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Em updated LUCENE-2899: --- Attachment: (was: OpenNLPFilter.java) Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4452) Need to test BlockPostings when payloads/offsets are indexed, but DPEnum flags=0
[ https://issues.apache.org/jira/browse/LUCENE-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466498#comment-13466498 ] Robert Muir commented on LUCENE-4452: - I was actually surprised to see this, esp. since MockAnalyzer randomly indexes payloads and this should be the case for things like PhraseQuery. We should ensure this isn't dead code or a bug: it seems weird to me that we would need to read this useless data here: partial blocks for these elements are still written to the .pay file, right? Need to test BlockPostings when payloads/offsets are indexed, but DPEnum flags=0 - Key: LUCENE-4452 URL: https://issues.apache.org/jira/browse/LUCENE-4452 Project: Lucene - Core Issue Type: Sub-task Components: core/codecs Reporter: Robert Muir In this case we get a BlockDocsAndPositionsEnum just reading positions and ignoring the stuff in the .pay: but this is untested. see BlockDocsAndPositionsEnum.refillPositions in https://builds.apache.org/job/Lucene-Solr-Clover-4.x/34/clover-report/org/apache/lucene/codecs/block/BlockPostingsReader.html#BlockPostingsReader
[jira] [Resolved] (LUCENE-4453) Need to test BlockPostings with term df/ttfs that = blockSize or are a multiple of blockSize
[ https://issues.apache.org/jira/browse/LUCENE-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-4453. - Resolution: Fixed Fix Version/s: 5.0 4.1 I added a simple test (TestBlockPostingsFormat2) with this. Need to test BlockPostings with term df/ttfs that = blockSize or are a multiple of blockSize Key: LUCENE-4453 URL: https://issues.apache.org/jira/browse/LUCENE-4453 Project: Lucene - Core Issue Type: Sub-task Components: core/codecs Reporter: Robert Muir Fix For: 4.1, 5.0 This is a special case in some of the code, but it's currently very rare (i.e. typically untested). We should add a test just for these corner cases: see https://builds.apache.org/job/Lucene-Solr-Clover-4.x/34/clover-report/org/apache/lucene/codecs/block/BlockPostingsReader.html#BlockPostingsReader and https://builds.apache.org/job/Lucene-Solr-Clover-4.x/34/clover-report/org/apache/lucene/codecs/block/BlockSkipReader.html#BlockSkipReader
RE: [jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
Hello Grant, Lance and Joern, I have been developing a 'similarity' component for OpenNLP that can be plugged into Solr. This component does relevance assessment based on matching the parse tree of the query with the parse trees of candidate answers. The idea of this component is that a search engineer does not need to be familiar with linguistics: they just plug in SyntGenRequestHandler for longer queries or longer texts and check whether it improves relevance. There are many other applications of the similarity component of OpenNLP besides search, which exist as JUnit tests, such as semantic filtering for speech recognition, content generation, and automatic code generation from natural language. This component is about to be released, hopefully, and is currently here: https://issues.apache.org/jira/browse/OPENNLP-497 It sounds like it is complementary to LUCENE-2899. Regards, Boris Date: Mon, 1 Oct 2012 00:35:07 +1100 From: j...@apache.org To: dev@lucene.apache.org Subject: [jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module [ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466478#comment-13466478 ] mailformailingli...@yahoo.de commented on LUCENE-2899: -- Could you please create a new patch for the current trunk? I had some problems applying it to my working copy... I am not entirely sure whether it's the trunk or your code, but it seems like your OpenNLP code only works for the first request. As far as I was able to debug, the create() method of the TokenFilterFactory is only called every now and again (are created TokenFilters reused for longer than one call in Solr?). If create() of your FilterFactory was called, everything works. However, if the TokenFilter is somehow reused, it fails. Is this a bug in Solr or in your patch?
Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_35) - Build # 988 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/988/ Java: 32bit/jdk1.6.0_35 -client -XX:+UseSerialGC All tests passed Build Log: [...truncated 25115 lines...] BUILD FAILED C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:352: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:65: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\common-build.xml:427: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\common-build.xml:410: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\common-build.xml:359: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\common-build.xml:397: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\build.xml:42: impossible to ivy retrieve: java.lang.RuntimeException: problem during retrieve of org.apache.solr#example: java.text.ParseException: failed to parse report: C:\Users\JenkinsSlave\.ivy2\cache\org.apache.solr-example-default.xml: XML document structures must start and end within the same entity. 
at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:206) at org.apache.ivy.Ivy.retrieve(Ivy.java:540) at org.apache.ivy.ant.IvyRetrieve.doExecute(IvyRetrieve.java:67) at org.apache.ivy.ant.IvyTask.execute(IvyTask.java:277) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) at org.apache.tools.ant.Task.perform(Task.java:348) at org.apache.tools.ant.taskdefs.Sequential.execute(Sequential.java:68) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) at org.apache.tools.ant.Task.perform(Task.java:348) at org.apache.tools.ant.Target.execute(Target.java:390) at org.apache.tools.ant.Target.performTasks(Target.java:411) at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399) at org.apache.tools.ant.helper.SingleCheckExecutor.executeTargets(SingleCheckExecutor.java:38) at org.apache.tools.ant.Project.executeTargets(Project.java:1251) at org.apache.tools.ant.taskdefs.Ant.execute(Ant.java:442) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) at org.apache.tools.ant.Task.perform(Task.java:348) at 
org.apache.tools.ant.Target.execute(Target.java:390) at org.apache.tools.ant.Target.performTasks(Target.java:411) at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399) at org.apache.tools.ant.helper.SingleCheckExecutor.executeTargets(SingleCheckExecutor.java:38) at org.apache.tools.ant.Project.executeTargets(Project.java:1251) at org.apache.tools.ant.taskdefs.Ant.execute(Ant.java:442) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) at org.apache.tools.ant.Task.perform(Task.java:348) at org.apache.tools.ant.Target.execute(Target.java:390) at org.apache.tools.ant.Target.performTasks(Target.java:411) at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399) at org.apache.tools.ant.helper.SingleCheckExecutor.executeTargets(SingleCheckExecutor.java:38) at org.apache.tools.ant.Project.executeTargets(Project.java:1251) at org.apache.tools.ant.taskdefs.Ant.execute(Ant.java:442) at
[jira] [Updated] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Em updated LUCENE-2899: --- Attachment: OpenNLPFilter.java OpenNLPTokenizer.java Some Attributes were not reset (i.e. first-Attribute in OpenNLPTokenizer and indexToken in OpenNLPFilter) correctly. Since I had trouble applying your patch, I'd like to provide the working source code. Please, create a patch from the current Trunk. Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
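The failure mode described above (a reused analysis component that works on the first request and fails afterwards because per-input state was not reset) can be illustrated with a small, Lucene-free sketch. ReusableSplitter is a hypothetical stand-in that only mimics a "reset before each reuse" contract; it is not the Lucene TokenStream API and not the attached OpenNLPFilter or OpenNLPTokenizer code, and its index field merely plays the role of a per-input cursor such as indexToken.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Lucene-free stand-in for a reusable analysis stage.
final class ReusableSplitter {
    private String[] parts = new String[0];
    private int index;   // per-input cursor; stale if reset() is skipped

    void setInput(String text) {
        String trimmed = text.trim();
        parts = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Must be called before consuming each new input: clears per-input state.
    void reset() {
        index = 0;
    }

    // Returns the next token, or null when the current input is exhausted.
    String next() {
        return index < parts.length ? parts[index++] : null;
    }

    // Correct reuse: set the input, reset the state, then consume all tokens.
    static List<String> analyze(ReusableSplitter s, String text) {
        s.setInput(text);
        s.reset();
        List<String> out = new ArrayList<String>();
        for (String t = s.next(); t != null; t = s.next()) {
            out.add(t);
        }
        return out;
    }
}
```

If reset() is skipped on the second input, the cursor left over from the first input makes next() return null straight away, which matches the "only works for the first request" symptom reported earlier in the thread.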
[jira] [Comment Edited] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13466527#comment-13466527 ] Em edited comment on LUCENE-2899 at 10/1/12 4:39 AM: - Some Attributes were not reset (i.e. first-Attribute in OpenNLPTokenizer and indexToken in OpenNLPFilter) correctly. Since I had trouble applying your patch, I'd like to provide the working source code. Please, create a patch for the current Trunk. was (Author: em): Some Attributes were not reset (i.e. first-Attribute in OpenNLPTokenizer and indexToken in OpenNLPFilter) correctly. Since I had trouble applying your patch, I'd like to provide the working source code. Please, create a patch from the current Trunk. Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp -- This message is automatically generated by JIRA. 
[jira] [Commented] (SOLR-3449) QueryComponent.doFieldSortValues throw ArrayIndexOutOfBoundsException when has maxDoc=0 Segment
[ https://issues.apache.org/jira/browse/SOLR-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466545#comment-13466545 ]

Bruce Butterfield commented on SOLR-3449:
-----------------------------------------

This patch fixed an issue in our system; we do lots of merges and were getting sporadic ArrayIndexOutOfBoundsExceptions whenever the query included any sort parameters. Please incorporate this into 3.6.2.

QueryComponent.doFieldSortValues throw ArrayIndexOutOfBoundsException when has maxDoc=0 Segment
-----------------------------------------------------------------------------------------------

Key: SOLR-3449
URL: https://issues.apache.org/jira/browse/SOLR-3449
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.5, 3.6
Reporter: Linbin Chen
Fix For: 3.6.2
Attachments: SOLR-3449.patch

Given an index with these segments:

{code}
Segment name=_9, offest=[docBase=0, maxDoc=245] idx=0
Segment name=_a, offest=[docBase=245, maxDoc=3] idx=1
Segment name=_b, offest=[docBase=248, maxDoc=0] idx=2
Segment name=_c, offest=[docBase=248, maxDoc=1] idx=3
Segment name=_d, offest=[docBase=249, maxDoc=0] idx=4
Segment name=_e, offest=[docBase=249, maxDoc=1] idx=5
Segment name=_f, offest=[docBase=250, maxDoc=0] idx=6
Segment name=_g, offest=[docBase=250, maxDoc=3] idx=7
Segment name=_h, offest=[docBase=253, maxDoc=0] idx=8
{code}

A segment with maxDoc=0 may be created by mergeIndexes. (We could make sure that maxDoc=0 segments are not merged, but we cannot always control which indexes get merged.)

When fsv=true is used to get sort values, a hit on docId=249 throws ArrayIndexOutOfBoundsException:

{code}
2012-5-11 14:28:28 org.apache.solr.common.SolrException log
ERROR: java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.lucene.search.FieldComparator$LongComparator.copy(FieldComparator.java:600)
    at org.apache.solr.handler.component.QueryComponent.doFieldSortValues(QueryComponent.java:463)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:400)
{code}

Reason:

{code}
//idx              0    1    2    3    4    5    6    7    8
//int[] maxDocs={245,   3,   0,   1,   0,   1,   0,   3,   0};
int[] offsets = {  0, 245, 248, 248, 249, 249, 250, 250, 253};
org.apache.solr.search.SolrIndexReader.readerIndex(249, offsets) returns idx=4, not 5.
{code}

The correct result is idx=5: segments 4 and 5 share docBase=249, but segment 4 is empty, so the document lives in segment 5.

Patch:

{code}
Index: solr/core/src/java/org/apache/solr/search/SolrIndexReader.java
===================================================================
--- solr/core/src/java/org/apache/solr/search/SolrIndexReader.java (revision 1337028)
+++ solr/core/src/java/org/apache/solr/search/SolrIndexReader.java (working copy)
@@ -138,6 +138,16 @@
       }
       else {
         // exact match on the offset.
+        // skip over equal offsets
+        for(int i=mid+1; i<=high; i++) {
+          if(doc == offsets[i]) {
+            // offsets[i] == doc, so keep skipping
+            mid = i;
+          } else {
+            // stop skipping
+            break;
+          }
+        }
         return mid;
       }
     }
{code}
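As a minimal sketch of the fix described in SOLR-3449 above, the standalone Java class below re-creates a readerIndex-style binary search over segment docBase offsets and, on an exact match, skips forward over entries with the same offset. The class name ReaderIndexSketch and the surrounding binary-search body are assumptions written for illustration; only the skip-forward step mirrors the posted patch, and this is not the actual SolrIndexReader source.

```java
// Illustrative sketch only; ReaderIndexSketch is a hypothetical name, not a Solr class.
final class ReaderIndexSketch {

    /**
     * Returns the index of the segment that holds {@code doc}.
     * offsets[i] is the docBase of segment i, in ascending order. An empty
     * segment (maxDoc=0) has the same docBase as the segment that follows it.
     */
    static int readerIndex(int doc, int[] offsets) {
        int high = offsets.length - 1;
        // Fast path: a single segment, or the doc falls in the first segment.
        if (high <= 0 || doc < offsets[1]) {
            return 0;
        }
        int low = 1;
        while (high >= low) {
            int mid = (low + high) >>> 1;
            int offset = offsets[mid];
            if (doc < offset) {
                high = mid - 1;
            } else if (doc > offset) {
                low = mid + 1;
            } else {
                // Exact match: move to the last entry with this docBase, so that
                // empty segments sharing the offset are skipped.
                while (mid < high && offsets[mid + 1] == doc) {
                    mid++;
                }
                return mid;
            }
        }
        // No exact match: high is the last segment whose docBase is below doc.
        return high;
    }

    public static void main(String[] args) {
        // Offsets taken from the segment listing in the report.
        int[] offsets = {0, 245, 248, 248, 249, 249, 250, 250, 253};
        System.out.println(readerIndex(249, offsets)); // prints 5
    }
}
```

With the offsets from the report, a search that returns on the first exact match lands on idx=4 (the empty segment), while the skip-forward step moves on to idx=5.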
[jira] [Commented] (LUCENE-4450) Distance boost added to Suggester
[ https://issues.apache.org/jira/browse/LUCENE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466549#comment-13466549 ]

Simon Willnauer commented on LUCENE-4450:
-----------------------------------------

bq. I am interested in bringing this branch up to date with the current trunk and contributing it to Lucene.

+1

Distance boost added to Suggester
---------------------------------

Key: LUCENE-4450
URL: https://issues.apache.org/jira/browse/LUCENE-4450
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0
Reporter: Bill Bell

A common Suggester use case is to boost results by proximity (auto-suggest across the whole USA, but boost the suggester results by geodistance). Would love to get a faster response with that. At Lucene Revolution 2012 in Boston a speaker discussed using WFST to do this, but I have yet to figure out how to do it.
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466565#comment-13466565 ]

Lance Norskog commented on LUCENE-2899:
---------------------------------------

Thank you! This worked when I posted it. There have been many changes in 4.x and trunk since then. For example, all of the tokenizer and filter factories moved to Lucene from Solr. I'm waiting until 4.0 is finished before I redo this patch.

Add OpenNLP Analysis capabilities as a module
---------------------------------------------

Key: LUCENE-2899
URL: https://issues.apache.org/jira/browse/LUCENE-2899
Project: Lucene - Core
Issue Type: New Feature
Components: modules/analysis
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch

Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does:
* Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens)
* NamedEntity recognition as a TokenFilter

We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position.

I'd propose it go under: modules/analysis/opennlp
[jira] [Commented] (SOLR-1875) per-segment single valued string faceting
[ https://issues.apache.org/jira/browse/SOLR-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466589#comment-13466589 ]

Erik Hatcher commented on SOLR-1875:
------------------------------------

Isn't this fully resolved for 4.0 (and alpha/beta as well)?

per-segment single valued string faceting
-----------------------------------------

Key: SOLR-1875
URL: https://issues.apache.org/jira/browse/SOLR-1875
Project: Solr
Issue Type: New Feature
Reporter: Yonik Seeley
Assignee: Yonik Seeley
Fix For: 4.1
Attachments: ASF.LICENSE.NOT.GRANTED--SOLR-1875.patch, ASF.LICENSE.NOT.GRANTED--SOLR-1875.patch

A little stepping stone to NRT: Per-segment single-valued string faceting using the Lucene FieldCache.
[jira] [Created] (LUCENE-4454) QueryParser throws exception for string queries ending with ! (exclamation)
Andrew Gibb created LUCENE-4454:
--------------------------------

Summary: QueryParser throws exception for string queries ending with ! (exclamation)
Key: LUCENE-4454
URL: https://issues.apache.org/jira/browse/LUCENE-4454
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 3.6.1, 3.5
Reporter: Andrew Gibb
Priority: Minor

Query parser does not handle the ! (exclamation point) in the same way as the - (hyphen). A string query with the final character being a ! causes a ParseException.

foo- OK
foo+ OK
foo! ERROR - Parse Exception
foo! OK

I am using version 3.5 (have tried 3.6.1 - same issue).
I was hoping 2566 may also have fixed this issue.
[jira] [Updated] (LUCENE-4454) QueryParser throws exception for string queries ending with ! (exclamation)
[ https://issues.apache.org/jira/browse/LUCENE-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Gibb updated LUCENE-4454:
-------------------------------

Description:
Query parser does not handle the ! (exclamation point) in the same way as the - (hyphen). A string query with the final character being a ! causes a ParseException.

foo- OK
foo+ OK
foo! ERROR - Parse Exception
foo! OK (trailing white space)

I am using version 3.5 (have tried 3.6.1 - same issue).
I was hoping 2566 may also have fixed this issue.

was:
Query parser does not handle the ! (exclamation point) in the same way as the - (hyphen). A string query with the final character being a ! causes a ParseException.

foo- OK
foo+ OK
foo! ERROR - Parse Exception
foo! OK

I am using version 3.5 (have tried 3.6.1 - same issue).
I was hoping 2566 may also have fixed this issue.

QueryParser throws exception for string queries ending with ! (exclamation)
---------------------------------------------------------------------------

Key: LUCENE-4454
URL: https://issues.apache.org/jira/browse/LUCENE-4454
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 3.5, 3.6.1
Reporter: Andrew Gibb
Priority: Minor

Query parser does not handle the ! (exclamation point) in the same way as the - (hyphen). A string query with the final character being a ! causes a ParseException.

foo- OK
foo+ OK
foo! ERROR - Parse Exception
foo! OK (trailing white space)

I am using version 3.5 (have tried 3.6.1 - same issue).
I was hoping 2566 may also have fixed this issue.