Re: [JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 434 - Still Failing
Test bug: I committed a fix.

On Thu, Nov 7, 2013 at 9:09 PM, Apache Jenkins Server wrote:
> Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/434/
>
> 1 tests failed.
> REGRESSION:  org.apache.lucene.codecs.pulsing.TestPulsingPostingsFormat.testInvertedWrite
>
> Error Message:
> Captured an uncaught exception in thread: Thread[id=12, name=Lucene Merge Thread #0, state=RUNNABLE, group=TGRP-TestPulsingPostingsFormat]
>
> Stack Trace:
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=12, name=Lucene Merge Thread #0, state=RUNNABLE, group=TGRP-TestPulsingPostingsFormat]
> Caused by: org.apache.lucene.index.MergePolicy$MergeException: java.util.ConcurrentModificationException
>     at __randomizedtesting.SeedInfo.seed([FB3E7DB6675B0F39]:0)
>     at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
> Caused by: java.util.ConcurrentModificationException
>     at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>     at java.util.HashMap$KeyIterator.next(HashMap.java:928)
>     at org.apache.lucene.index.BasePostingsFormatTestCase$1$1$1.write(BasePostingsFormatTestCase.java:1478)
>     at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.write(PerFieldPostingsFormat.java:178)
>     at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:381)
>     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:103)
>     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4001)
>     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3598)
>     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
>
> Build Log:
> [...truncated 6901 lines...]
>    [junit4] Suite: org.apache.lucene.codecs.pulsing.TestPulsingPostingsFormat
>    [junit4]   2> Lap 08, 2013 10:02:26 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
>    [junit4]   2> WARNING: Uncaught exception in thread: Thread[Lucene Merge Thread #0,6,TGRP-TestPulsingPostingsFormat]
>    [junit4]   2> org.apache.lucene.index.MergePolicy$MergeException: java.util.ConcurrentModificationException
>    [junit4]   2>     at __randomizedtesting.SeedInfo.seed([FB3E7DB6675B0F39]:0)
>    [junit4]   2>     at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
>    [junit4]   2>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
>    [junit4]   2> Caused by: java.util.ConcurrentModificationException
>    [junit4]   2>     at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
>    [junit4]   2>     at java.util.HashMap$KeyIterator.next(HashMap.java:928)
>    [junit4]   2>     at org.apache.lucene.index.BasePostingsFormatTestCase$1$1$1.write(BasePostingsFormatTestCase.java:1478)
>    [junit4]   2>     at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.write(PerFieldPostingsFormat.java:178)
>    [junit4]   2>     at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:381)
>    [junit4]   2>     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:103)
>    [junit4]   2>     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4001)
>    [junit4]   2>     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3598)
>    [junit4]   2>     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
>    [junit4]   2>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
>    [junit4]   2>
>    [junit4]   2> Lap 08, 2013 10:02:26 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
>    [junit4]   2> WARNING: Uncaught exception in thread: Thread[Lucene Merge Thread #1,6,TGRP-TestPulsingPostingsFormat]
>    [junit4]   2> org.apache.lucene.index.MergePolicy$MergeException: java.util.ConcurrentModificationException
>    [junit4]   2>     at __randomizedtesting.SeedInfo.seed([FB3E7DB6675B0F39]:0)
>    [junit4]   2>     at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
>    [junit4]   2>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
>    [junit4]   2> Caused by: java.util.ConcurrentModificationException
>    [junit4]   2>
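For context, the `java.util.ConcurrentModificationException` in the trace is `HashMap`'s fail-fast iterator check firing: a map backing the test's postings view was structurally modified while the merge thread was still iterating its key set. A minimal single-threaded sketch of that failure mode (illustrative only, not the actual test code):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class FailFastDemo {
    // Returns true if iterating a HashMap's key set while structurally
    // modifying the map triggers the fail-fast check.
    static boolean triggersCme() {
        Map<String, Integer> fields = new HashMap<>();
        fields.put("body", 1);
        fields.put("title", 2);
        try {
            for (String field : fields.keySet()) {
                fields.put("id", 3);  // adding a new key invalidates the live iterator
            }
        } catch (ConcurrentModificationException expected) {
            return true;  // next() sees the modCount mismatch and throws
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(triggersCme());  // prints true
    }
}
```

Consistent with the reply above ("test bug"), the fix belongs in the test's iteration logic (iterating a stable copy or a collection that is not mutated mid-iteration), not in the merge code.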
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 999 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/999/
Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseG1GC

All tests passed

Build Log:
[...truncated 10462 lines...]
   [junit4] JVM J0: stderr was not empty, see: /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20131108_235715_812.syserr
   [junit4] >>> JVM J0: stderr (verbatim)
   [junit4] java(186,0x14ebd5000) malloc: *** error for object 0x14ebc3f90: pointer being freed was not allocated
   [junit4] *** set a breakpoint in malloc_error_break to debug
   [junit4] <<< JVM J0: EOF
[...truncated 1 lines...]
   [junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/bin/java -XX:+UseCompressedOops -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=2CD7DE196210E842 -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=.
-Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Dtests.disableHdfs=true -classpath /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/classes/test:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-test-framework/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/test-framework/lib/junit4-ant-2.0.13.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test-files:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/test-framework/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/codecs/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-solrj/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/analysis/common/lucene-analyzers-common-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/analysis/kuromoji/lucene-analyzers-kuromoji-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/analysis/phonetic/lucene-analyzers-phonetic-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/codecs/lucene-codecs-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/highlighter/lucene-highlighter-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/memory/lucene-memory-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/bu
ild/misc/lucene-misc-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/spatial/lucene-spatial-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/suggest/lucene-suggest-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/grouping/lucene-grouping-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/queries/lucene-queries-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/queryparser/lucene-queryparser-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/join/lucene-join-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/core/lib/commons-cli-1.2.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/core/lib/commons-codec-1.7.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/core/lib/commons-configuration-1.6.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/core/lib/commons-fileupload-1.2.1.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/core/lib/commons-lang-2.6.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/core/lib/concurrentlinkedhashmap-lru-1.2.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/core/lib/dom4j-1.6.1.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/core/lib/guava-14.0.1.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-Mac
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1021: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1021/

2 tests failed.

REGRESSION:  org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch

Error Message:
shard3 is not consistent. Got 34 from http://127.0.0.1:29017/uil/co/collection1lastClient and got 23 from http://127.0.0.1:27870/uil/co/collection1

Stack Trace:
java.lang.AssertionError: shard3 is not consistent. Got 34 from http://127.0.0.1:29017/uil/co/collection1lastClient and got 23 from http://127.0.0.1:27870/uil/co/collection1
    at __randomizedtesting.SeedInfo.seed([2B5402E5F393C677:AAB28CFD84CCA64B]:0)
    at org.junit.Assert.fail(Assert.java:93)
    at org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1151)
    at org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:135)

REGRESSION:  org.apache.solr.cloud.SyncSliceTest.testDistribSearch

Error Message:
Test Setup Failure: shard1 should have just been set up to be inconsistent - but it's still consistent. Leader:http://127.0.0.1:16328/collection1 Dead Guy:http://127.0.0.1:34134/collection1 skip list:[CloudJettyRunner [url=http://127.0.0.1:18849/collection1], CloudJettyRunner [url=http://127.0.0.1:18849/collection1]]

Stack Trace:
java.lang.AssertionError: Test Setup Failure: shard1 should have just been set up to be inconsistent - but it's still consistent. Leader:http://127.0.0.1:16328/collection1 Dead Guy:http://127.0.0.1:34134/collection1 skip list:[CloudJettyRunner [url=http://127.0.0.1:18849/collection1], CloudJettyRunner [url=http://127.0.0.1:18849/collection1]]
    at __randomizedtesting.SeedInfo.seed([81EEC327BDDABBEA:84D3FCA85DBD6]:0)
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.assertTrue(Assert.java:43)
    at org.junit.Assert.assertNotNull(Assert.java:526)
    at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:216)

Build Log:
[...truncated 41966 lines...]
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Windows (64bit/jdk1.6.0_45) - Build # 3367 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/3367/
Java: 64bit/jdk1.6.0_45 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
REGRESSION:  org.apache.solr.core.TestNonNRTOpen.testReaderIsNotNRT

Error Message:
expected:<3> but was:<2>

Stack Trace:
java.lang.AssertionError: expected:<3> but was:<2>
    at __randomizedtesting.SeedInfo.seed([A219E0541BF21F8C:179F81D3A433AD78]:0)
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.failNotEquals(Assert.java:647)
    at org.junit.Assert.assertEquals(Assert.java:128)
    at org.junit.Assert.assertEquals(Assert.java:472)
    at org.junit.Assert.assertEquals(Assert.java:456)
    at org.apache.solr.core.TestNonNRTOpen.assertNotNRT(TestNonNRTOpen.java:133)
    at org.apache.solr.core.TestNonNRTOpen.testReaderIsNotNRT(TestNonNRTOpen.java:94)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLea
[jira] [Updated] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
     [ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jack Conradson updated LUCENE-5336:
-----------------------------------
    Attachment: LUCENE-5336.patch

I have attached a patch for this JIRA.

> Add a simple QueryParser to parse human-entered queries.
> --------------------------------------------------------
>
>                 Key: LUCENE-5336
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5336
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Jack Conradson
>         Attachments: LUCENE-5336.patch
>
>
> I would like to add a new simple QueryParser to Lucene that is designed to parse human-entered queries. This parser will operate on an entire entered query using a specified single field or a set of weighted fields (using term boost).
> All features/operations in this parser can be enabled or disabled depending on what is necessary for the user. A default operator may be specified as either 'MUST' representing 'and' or 'SHOULD' representing 'or.' The features/operations that this parser will include are the following:
> * AND specified as '+'
> * OR specified as '|'
> * NOT specified as '-'
> * PHRASE surrounded by double quotes
> * PREFIX specified as '*'
> * PRECEDENCE surrounded by '(' and ')'
> * WHITESPACE specified as ' ' '\n' '\r' and '\t' will cause the default operator to be used
> * ESCAPE specified as '\' will allow operators to be used in terms
> The key differences between this parser and other existing parsers will be the following:
> * No exceptions will be thrown, and errors in syntax will be ignored. The parser will do a best-effort interpretation of any query entered.
> * It uses minimal syntax to express queries. All available operators are single characters or pairs of single characters.
> * The parser is hand-written and in a single Java file making it easy to modify.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Created] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
Jack Conradson created LUCENE-5336:
--------------------------------------

             Summary: Add a simple QueryParser to parse human-entered queries.
                 Key: LUCENE-5336
                 URL: https://issues.apache.org/jira/browse/LUCENE-5336
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Jack Conradson


I would like to add a new simple QueryParser to Lucene that is designed to parse human-entered queries. This parser will operate on an entire entered query using a specified single field or a set of weighted fields (using term boost).

All features/operations in this parser can be enabled or disabled depending on what is necessary for the user. A default operator may be specified as either 'MUST' representing 'and' or 'SHOULD' representing 'or.' The features/operations that this parser will include are the following:

* AND specified as '+'
* OR specified as '|'
* NOT specified as '-'
* PHRASE surrounded by double quotes
* PREFIX specified as '*'
* PRECEDENCE surrounded by '(' and ')'
* WHITESPACE specified as ' ' '\n' '\r' and '\t' will cause the default operator to be used
* ESCAPE specified as '\' will allow operators to be used in terms

The key differences between this parser and other existing parsers will be the following:

* No exceptions will be thrown, and errors in syntax will be ignored. The parser will do a best-effort interpretation of any query entered.
* It uses minimal syntax to express queries. All available operators are single characters or pairs of single characters.
* The parser is hand-written and in a single Java file making it easy to modify.
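To make the "best-effort, no exceptions" design concrete, here is a deliberately tiny sketch of that parsing philosophy: operators are honored when they make sense and silently ignored otherwise, so no input ever throws. This is hypothetical illustration code, not taken from the attached LUCENE-5336.patch (which also handles phrases, prefixes, precedence, and escapes):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical best-effort interpretation of '+', '|', '-' and whitespace only.
public class BestEffortParse {
    public final List<String> terms = new ArrayList<>();
    public final List<String> negated = new ArrayList<>();

    public static BestEffortParse parse(String query) {
        BestEffortParse result = new BestEffortParse();
        for (String token : query.trim().split("[ \n\r\t]+")) {
            if (token.isEmpty() || token.equals("-") || token.equals("+") || token.equals("|")) {
                continue;  // a bare operator is a syntax error: ignore it, never throw
            }
            if (token.startsWith("-")) {
                result.negated.add(token.substring(1));
            } else {
                result.terms.add(token.replaceAll("^[+|]+", ""));  // strip leading operators
            }
        }
        return result;
    }

    public static void main(String[] args) {
        BestEffortParse p = parse("lucene +query  -slow |");
        System.out.println(p.terms);    // [lucene, query]
        System.out.println(p.negated);  // [slow]
    }
}
```

Note how the dangling `|` at the end, which many parsers would reject with a ParseException, is simply dropped.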
[jira] [Commented] (LUCENE-5334) Add Namespaces to Expressions Javascript Compiler
    [ https://issues.apache.org/jira/browse/LUCENE-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817809#comment-13817809 ]

Jack Conradson commented on LUCENE-5334:
----------------------------------------

Thanks for committing, Ryan.

> Add Namespaces to Expressions Javascript Compiler
> -------------------------------------------------
>
>                 Key: LUCENE-5334
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5334
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Jack Conradson
>            Assignee: Ryan Ernst
>            Priority: Minor
>         Attachments: LUCENE-5334.patch
>
>
> I would like to add the concept of namespaces to the expressions javascript compiler using '.' as the operator.
>
> Example of namespace usage in functions:
> AccurateMath.sqrt(field)
> FastMath.sqrt(field)
>
> Example of namespace usage in variables:
> location.latitude
> location.longitude
ConjunctionScorer floating point precision for score()
Hello,

I have been investigating an issue with document scoring and found that the ConjunctionScorer implements the score method in a way that can cause floating point precision rounding issues. I noticed in some of my test cases that documents that have not been merged/optimized (I'm not sure of the correct terminology; they have a docNum of 0) have scorers added in a different order than optimized documents.

Using a float to maintain the sum of scores introduces the potential for floating point precision errors. In turn, this causes the score returned from the ConjunctionScorer to be different for some merged/unmerged documents that should have identical scores. Example:

float sum1 = 0.0061859353f + 0.0061859353f + 0.0030929677f + 0.0030929677f + 0.0030929677f + 0.5010608f + 0.0061859353f;
float sum2 = 0.0061859353f + 0.0061859353f + 0.0061859353f + 0.0030929677f + 0.0030929677f + 0.0030929677f + 0.5010608f;

sum1 == 0.5288975;  // Incorrect
sum2 == 0.52889746; // Correct

I am currently running Solr/Lucene 3.6.2 from source and have two potential solutions, but I am not an expert on floating point precision, rounding, or Lucene performance implications. I also noticed that there is a comment in the 4.5.1 version of Lucene to the effect of:

// TODO: sum into a double and cast to float if we ever send required clauses to BS1

My questions are as follows: Is this currently expected behavior that should not be patched? If not, would either of these potential solutions be maintained by the Lucene development community?
Current:

public float score() throws IOException {
  float sum = 0.0f;
  for (int i = 0; i < scorers.length; i++) {
    sum += scorers[i].score();
  }
  return sum;
}

Option 1:

public float score() throws IOException {
  double sum = 0.0d;
  for (int i = 0; i < scorers.length; i++) {
    sum += scorers[i].score();
  }
  return (float) sum;
}

Option 2:

public float score() throws IOException {
  BigDecimal sum = new BigDecimal(0.0f);
  for (int i = 0; i < scorers.length; i++) {
    sum = sum.add(new BigDecimal(scorers[i].score()));
  }
  return sum.floatValue();
}

--
View this message in context: http://lucene.472066.n3.nabble.com/ConjunctionScorer-floating-point-precision-for-score-tp4100051.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
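To see the order sensitivity concretely, the two orderings from this thread can be summed both ways. The sketch below corresponds to the "Current" float accumulation versus Option 1's double accumulation; for these particular values every partial sum of 24-bit float significands fits exactly in double's 53-bit significand, so the double variant is order-independent here:

```java
public class SumOrder {
    // Sum in float, left to right: rounding after each add makes the result order-dependent.
    static float sumFloat(float[] values) {
        float sum = 0.0f;
        for (float v : values) {
            sum += v;
        }
        return sum;
    }

    // Option 1 from the thread: accumulate in double, round to float once at the end.
    static float sumDouble(float[] values) {
        double sum = 0.0d;
        for (float v : values) {
            sum += v;
        }
        return (float) sum;
    }

    public static void main(String[] args) {
        float a = 0.0061859353f, b = 0.0030929677f, c = 0.5010608f;
        float[] order1 = {a, a, b, b, b, c, a};  // unmerged-segment ordering from the thread
        float[] order2 = {a, a, a, b, b, b, c};  // merged-segment ordering

        // The thread reports the float sums differ: 0.5288975 vs 0.52889746.
        System.out.println(sumFloat(order1) + " vs " + sumFloat(order2));
        // Double accumulation gives the same float either way.
        System.out.println(sumDouble(order1) == sumDouble(order2));
    }
}
```

Option 2's BigDecimal route is also order-independent but allocates per clause; the single cast in Option 1 matches the existing TODO comment and costs essentially nothing.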
[jira] [Resolved] (LUCENE-5334) Add Namespaces to Expressions Javascript Compiler
     [ https://issues.apache.org/jira/browse/LUCENE-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Ernst resolved LUCENE-5334.
--------------------------------
    Resolution: Fixed

Thanks Jack!

> Add Namespaces to Expressions Javascript Compiler
> -------------------------------------------------
>
>                 Key: LUCENE-5334
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5334
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Jack Conradson
>            Assignee: Ryan Ernst
>            Priority: Minor
>         Attachments: LUCENE-5334.patch
>
>
> I would like to add the concept of namespaces to the expressions javascript compiler using '.' as the operator.
>
> Example of namespace usage in functions:
> AccurateMath.sqrt(field)
> FastMath.sqrt(field)
>
> Example of namespace usage in variables:
> location.latitude
> location.longitude
[jira] [Commented] (LUCENE-5334) Add Namespaces to Expressions Javascript Compiler
    [ https://issues.apache.org/jira/browse/LUCENE-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817713#comment-13817713 ]

ASF subversion and git services commented on LUCENE-5334:
---------------------------------------------------------

Commit 1540195 from [~rjernst] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1540195 ]

LUCENE-5334: Add Namespaces to Expressions Javascript Compiler

> Add Namespaces to Expressions Javascript Compiler
> -------------------------------------------------
>
>                 Key: LUCENE-5334
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5334
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Jack Conradson
>            Assignee: Ryan Ernst
>            Priority: Minor
>         Attachments: LUCENE-5334.patch
>
>
> I would like to add the concept of namespaces to the expressions javascript compiler using '.' as the operator.
>
> Example of namespace usage in functions:
> AccurateMath.sqrt(field)
> FastMath.sqrt(field)
>
> Example of namespace usage in variables:
> location.latitude
> location.longitude
[jira] [Commented] (LUCENE-4335) Builds should regenerate all generated sources
    [ https://issues.apache.org/jira/browse/LUCENE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817683#comment-13817683 ]

ASF subversion and git services commented on LUCENE-4335:
---------------------------------------------------------

Commit 1540187 from [~rjernst] in branch 'dev/trunk'
[ https://svn.apache.org/r1540187 ]

LUCENE-4335: Add Namespaces to Expressions Javascript Compiler

> Builds should regenerate all generated sources
> ----------------------------------------------
>
>                 Key: LUCENE-4335
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4335
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-4335.patch, LUCENE-4335.patch, LUCENE-4335.patch
>
>
> We have more and more sources that are generated programmatically (query parsers, fuzzy levN tables from Moman, packed ints specialized decoders, etc.), and it's dangerous because developers may directly edit the generated sources and forget to edit the meta-source. It's happened to me several times ... most recently just after landing the BlockPostingsFormat branch.
> I think we should re-gen all of these in our builds and fail the build if this creates a difference. I know some generators (eg JavaCC) embed timestamps and so always create mods ... we can leave them out of this for starters (or maybe post-process the sources to remove the timestamps) ...
[jira] [Commented] (LUCENE-5332) SpanNearQuery with multiple terms does not find match
    [ https://issues.apache.org/jira/browse/LUCENE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817640#comment-13817640 ]

Jerry Zhou commented on LUCENE-5332:
------------------------------------

The attached test case in this ticket does not have overlapping terms in the query. We just use a simple SpanNearQuery: b d g, and it fails. The other issue, LUCENE-5331, is about repeats in a nested SpanNearQuery.

> SpanNearQuery with multiple terms does not find match
> -----------------------------------------------------
>
>                 Key: LUCENE-5332
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5332
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Jerry Zhou
>         Attachments: MultiTermFlatSpanNearTest.java
>
>
> A flat structure (non-nested) for a SpanNearQuery containing multiple terms does not always find the correct match.
> Test case is attached ...
Re: Estimating peak memory use for UnInvertedField faceting
Hi Yonik,

I don't know enough about JVM tuning and monitoring to do this in a clean way, so I just tried setting the max heap at 8GB and then 6GB to force garbage collection. With it set to 6GB it goes into a long GC loop and then runs out of heap (see below). The stack trace says the issue is with DocTermOrds.uninvert:

Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.lucene.index.DocTermOrds.uninvert(DocTermOrds.java:405)

I'm guessing the actual peak is somewhere between 6 and 8 GB.

BTW: is there some documentation somewhere that explains what the stats output to INFO mean?

Tom

java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.RuntimeException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:548)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.lucene.index.DocTermOrds.uninvert(DocTermOrds.java:405)
    at org.apache.solr.request.UnInvertedField.<init>(UnInvertedField.java:179)
    at org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:664)
    at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:426)
    at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:517)
    at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:252)
    at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
    ... 16 more

---

Nov 08, 2013 1:39:26 PM org.apache.solr.request.UnInvertedField
INFO: UnInverted multi-valued field {field=topicStr, memSize=1,768,101,824, tindexSize=86,028, time=45,854, phase1=41,039, nTerms=271,987, bigTerms=0, termInstances=569,429,716, uses=0}
Nov 08, 2013 1:39:28 PM org.apache.solr.core.SolrCore execute
INFO: [core] webapp=/dev-3 path=/select params={facet=true&facet.mincount=100&indent=true&q=ocr:the&facet.limit=30&facet.field=topicStr&wt=xml} hits=138,605,690 status=0 QTime=49,797

On Fri, Nov 8, 2013 at 2:01 PM, Yonik Seeley wrote:
> On Fri, Nov 8, 2013 at 1:56 PM, Tom Burton-West wrote:
> > When testing an index of about 200 million documents, when we do a first
> > faceting on one field (query appended below), the memory use rises from
> > about 2.5 GB to 13GB. If I run GC after the query the memory use goes down
> > to about 3GB and subsequent queries don't significantly increase the memory
> > use.
>
> Is there a way to tell what the real max memory usage is? I assume
> 13GB is just the peak heap usage, but that could include a lot of
> garbage.
>
> -Yonik
> http://heliosearch.com -- making solr shine
Re: Estimating peak memory use for UnInvertedField faceting
On Fri, Nov 8, 2013 at 1:56 PM, Tom Burton-West wrote: > When testing an index of about 200 million documents, when we do a first > faceting on one field (query appended below), the memory use rises from > about 2.5 GB to 13GB. If I run GC after the query the memory use goes down > to about 3GB and subsequent queries don't significantly increase the memory > use. Is there a way to tell what the real max memory usage is? I assume 13GB is just the peak heap usage, but that could include a lot of garbage. -Yonik http://heliosearch.com -- making solr shine - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817573#comment-13817573 ]

David commented on SOLR-5027:
-

Joel, I submitted a fix in https://issues.apache.org/jira/browse/SOLR-5416. Let me know if you think this is problematic.

> Field Collapsing PostFilter
> ---
>
> Key: SOLR-5027
> URL: https://issues.apache.org/jira/browse/SOLR-5027
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 5.0
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Priority: Minor
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch,
> SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch,
> SOLR-5027.patch, SOLR-5027.patch
>
>
> This ticket introduces the *CollapsingQParserPlugin*.
> The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.
> This is a high performance alternative to standard Solr field collapsing
> (with *ngroups*) when the number of distinct groups in the result set is high.
> For example in one performance test, a search with 10 million full results
> and 1 million collapsed groups:
> Standard grouping with ngroups : 17 seconds.
> CollapsingQParserPlugin: 300 milli-seconds.
> Sample syntax:
> Collapse based on the highest scoring document:
> {code}
> fq={!collapse field=}
> {code}
> Collapse based on the min value of a numeric field:
> {code}
> fq={!collapse field= min=}
> {code}
> Collapse based on the max value of a numeric field:
> {code}
> fq={!collapse field= max=}
> {code}
> Collapse with a null policy:
> {code}
> fq={!collapse field= nullPolicy=}
> {code}
> There are three null policies:
> ignore : removes docs with a null value in the collapse field (default).
> expand : treats each doc with a null value in the collapse field as a
> separate group.
> collapse : collapses all docs with a null value into a single group using
> either highest score, or min/max.
> The CollapsingQParserPlugin also fully supports the QueryElevationComponent.
> *Note:* The July 16 patch also includes an ExpandComponent that expands the
> collapsed groups for the current search result page. This functionality will
> be moved to its own ticket.

--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
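For anyone wanting to try the syntax from the issue description above from a client: the angle-bracket placeholders in the examples were evidently eaten in transit, so the field names below are invented for illustration. A minimal sketch (plain Python) that only builds the query string and doesn't hit a server:

```python
from urllib.parse import urlencode

params = {
    "q": "laptop",
    # Collapse on a made-up group_id field, keeping the document with the
    # max value of a made-up numeric popularity field in each group.
    "fq": "{!collapse field=group_id max=popularity}",
    "fl": "id,score",
}
query_string = urlencode(params)
print("/select?" + query_string)
```

The local-params braces and bang get percent-encoded (`%7B%21collapse...`), which is what Solr expects on the wire.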
Estimating peak memory use for UnInvertedField faceting
We are considering indexing our 11 million books at a page level, which comes to about 3 billion Solr documents. Our subject field is by necessity multi-valued, so the UnInvertedField is used for faceting.

When testing an index of about 200 million documents, when we do a first faceting on one field (query appended below), the memory use rises from about 2.5 GB to 13 GB. If I run GC after the query the memory use goes down to about 3 GB, and subsequent queries don't significantly increase the memory use.

After the query is run, various statistics from UnInvertedField are sent to the log (see below), but they seem to represent the final data structure rather than the peak. For example, memSize is listed as 1.8 GB, while the temporary data structure was probably closer to 10 GB (13 GB total).

Is there a formula for estimating the peak memory size? Can the statistics spit out to INFO be used to somehow estimate the peak memory size?

Tom

-

Nov 08, 2013 1:39:26 PM org.apache.solr.request.UnInvertedField
INFO: UnInverted multi-valued field {field=topicStr, memSize=1,768,101,824, tindexSize=86,028, time=45,854, phase1=41,039, nTerms=271,987, bigTerms=0, termInstances=569,429,716, uses=0}
Nov 08, 2013 1:39:28 PM org.apache.solr.core.SolrCore execute
INFO: [core] webapp=/dev-3 path=/select params={facet=true&facet.mincount=100&indent=true&q=ocr:the&facet.limit=30&facet.field=topicStr&wt=xml} hits=138,605,690 status=0 QTime=49,797
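Lacking a published formula, the best that can be done with the figures in this message is simple arithmetic: convert the logged numbers into per-term-instance costs. The sketch below (plain Python) does only that; note the observed 2.5 GB to 13 GB growth may include collectible garbage, so the peak ratio is an upper bound, not a model of what DocTermOrds actually allocates during phase one.

```python
# Figures reported in this thread; "observed_peak" is the heap growth seen
# during the first faceting query (about 2.5 GB -> 13 GB), which may
# overstate live memory.
mem_size       = 1_768_101_824     # final UnInvertedField memSize, bytes
term_instances = 569_429_716       # total term occurrences uninverted
observed_peak  = 13 * 2**30 - int(2.5 * 2**30)

final_bytes_per_instance = mem_size / term_instances       # about 3.1
peak_bytes_per_instance  = observed_peak / term_instances  # about 19.8

print(f"final: {final_bytes_per_instance:.1f} B/instance, "
      f"peak: {peak_bytes_per_instance:.1f} B/instance")
```

Scaling that naively, a field with N term instances would peak somewhere between roughly 3 and 20 bytes per instance; pinning it down further needs someone who knows DocTermOrds' phase-one temporaries.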
[jira] [Assigned] (LUCENE-5334) Add Namespaces to Expressions Javascript Compiler
[ https://issues.apache.org/jira/browse/LUCENE-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Ernst reassigned LUCENE-5334: -- Assignee: Ryan Ernst > Add Namespaces to Expressions Javascript Compiler > - > > Key: LUCENE-5334 > URL: https://issues.apache.org/jira/browse/LUCENE-5334 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Jack Conradson >Assignee: Ryan Ernst >Priority: Minor > Attachments: LUCENE-5334.patch > > > I would like to add the concept of namespaces to the expressions javascript > compiler using '.' as the operator. > Example of namespace usage in functions: > AccurateMath.sqrt(field) > FastMath.sqrt(field) > Example of namespace usage in variables: > location.latitude > location.longitude -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5287) Allow at least solrconfig.xml and schema.xml to be edited via the admin screen
[ https://issues.apache.org/jira/browse/SOLR-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817459#comment-13817459 ]

Erick Erickson commented on SOLR-5287:
--

I'm coming back around to this. It _looks_ (and we'll have more info on this next week when [~sarowe] has had a chance to straighten me out) like it'll be relatively easy to piggy-back on the REST-API work and allow it to handle arbitrary files in the conf directory.

I'm envisioning a new option in the managed schema config for solrconfig. Currently, the tag takes a tag like: managed-schema So what if we allowed something like managed-all-conf, or managed-schema plus managed-all-conf, or just assumed that managed-all-conf enables both the "push the whole file" option and using the managed schema. I think the managed schema will allow for a really nice UI that'll be valuable, and the people "who don't need no stinking wizard" can just freely edit the raw files. That gives us the "managed-all-conf" list plus CRUD operations on any file in the conf directory (maybe more later).

The infrastructure is in place to push this to SolrCloud, so it seems like about the same amount of work to do it all. That takes care of the ability to restrict this capability by configuration, getting things to the cloud, etc. From a UI perspective, it's just a POST to the right URL with the body of the file.

Anyway, that's the current thinking...

> Allow at least solrconfig.xml and schema.xml to be edited via the admin screen
> --
>
> Key: SOLR-5287
> URL: https://issues.apache.org/jira/browse/SOLR-5287
> Project: Solr
> Issue Type: Improvement
> Components: Schema and Analysis, web gui
> Affects Versions: 4.5, 5.0
> Reporter: Erick Erickson
> Assignee: Erick Erickson
>
> A user asking a question on the Solr list got me to thinking about editing
> the main config files from the Solr admin screen. I chatted briefly with
> [~steffkes] about the mechanics of this on the browser side, he doesn't see a
> problem on that end. His comment is there's no end point that'll write the
> file back.
> Am I missing something here or is this actually not a hard problem? I see a
> couple of issues off the bat, neither of which seem troublesome.
> 1> file permissions. I'd imagine lots of installations will get file
> permission exceptions if Solr tries to write the file out. Well, do a
> chmod/chown.
> 2> screwing up the system maliciously or not. I don't think this is an issue,
> this would be part of the admin handler after all.
> Does anyone have objections to the idea? And how does this fit into the work
> that [~sar...@syr.edu] has been doing?
> I can imagine this extending to SolrCloud with a "push this to ZK" option or
> something like that, perhaps not in V1 unless it's easy.
> Of course any pointers gratefully received. Especially ones that start with
> "Don't waste your effort, it'll never work (or be accepted)"...
> Because what scares me is this seems like such an easy thing to do that would
> be a significant ease-of-use improvement, so there _has_ to be something I'm
> missing.
> So if we go forward with this we'll make this the umbrella JIRA, the two
> immediate sub-JIRAs that spring to mind will be the UI work and the endpoints
> for the UI work to use.
> I think there are only two end-points here
> 1> list all the files in the conf (or arbitrary from /collection)
> directory.
> 2> write this text to this file
> Possibly later we could add "clone the configs from coreX to coreY".
> BTW, I've assigned this to myself so I don't lose it, but if anyone wants to
> take it over it won't hurt my feelings a bit

--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
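On the "just a POST to the right URL with the body of the file" point, a sketch of what the client side might look like (plain Python). The endpoint path below is purely illustrative, since picking the actual URL scheme is exactly what this issue still has to decide; the request object is built but never sent:

```python
from urllib.request import Request

# Hypothetical endpoint for the "write this text to this file" operation;
# nothing in the issue has settled on this path.
conf_body = b"<config>...</config>"
req = Request(
    "http://localhost:8983/solr/admin/cores/core1/conf/solrconfig.xml",
    data=conf_body,
    headers={"Content-Type": "application/xml"},
    method="POST",
)
print(req.get_method(), req.full_url)
```

The corresponding "list files" endpoint would then just be a GET against the conf directory URL.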
[jira] [Commented] (SOLR-5432) Allow simple editing for solrconfig.xml and schema.xml from the admin interface
[ https://issues.apache.org/jira/browse/SOLR-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817444#comment-13817444 ]

Erick Erickson commented on SOLR-5432:
--

Well, this might be a short-lived JIRA. I looked at the managed schema code and chatted with Steve Rowe. It _looks_ like it would be relatively straightforward to leverage the managed schema infrastructure to allow for CRUD on arbitrary files, at least in the conf directory. On a quick glance it doesn't appear to be much, if any, more work than the "simple" way of doing things. And it would get us SolrCloud support "for free", or at least using prior art. We'll be able to look at this a bit more next week. Any red flags here?

It seems like schema.xml lends itself to a whole series of specific calls (the addfield, updatefield, copyfield sort of thing), but the other files don't really; they're more blobs. Which does leave us with the question of how the managed index schema for schema.xml should play if we also have generic file CRUD operations. Does it make sense to prevent the managed-file manipulation of schema.xml if they have configured the managed schema option? My personal take is that, barring having a hard time making this happen in the code, the two options are not mutually exclusive and we shouldn't worry about it.

> Allow simple editing for solrconfig.xml and schema.xml from the admin
> interface
> ---
>
> Key: SOLR-5432
> URL: https://issues.apache.org/jira/browse/SOLR-5432
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 4.6, 5.0
> Reporter: Erick Erickson
> Assignee: Erick Erickson
>
> [~steffkes] OK, let's see if we can make the simple case work as per the
> discussion in SOLR-5287 and reserve the rest of the enhancements for later?
> I'll try to create an end-point Real Soon Now and we can try it out. This
> will be the simple case of just writing basically anything in the conf
> directory. Specifically _excluding_ anything in the sub-directories for the
> time being.

--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817332#comment-13817332 ]

Markus Jelsma commented on LUCENE-2899:
---

Hi - any chance this is going to get committed some day?

> Add OpenNLP Analysis capabilities as a module
> -
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 4.6
>
> Attachments: LUCENE-2899-RJN.patch, LUCENE-2899.patch,
> OpenNLPFilter.java, OpenNLPTokenizer.java
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice
> to have a submodule (under analysis) that exposed capabilities for it. Drew
> Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it
> would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as
> either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp

--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #501: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/501/ No tests ran. Build Log: [...truncated 20123 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5333) Support sparse faceting for heterogeneous indices
[ https://issues.apache.org/jira/browse/LUCENE-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817327#comment-13817327 ]

Shai Erera commented on LUCENE-5333:
-

Duh, good point! :) I think then an AllFacetsAccumulator.create() with two variants - one that takes the members required to create TaxonomyFA and another to create SortedSetFA, and then returns a FacetsAccumulator which *extends* either of the two - would work better ... but limited to CountingFacetRequest. I think, since it's so simple, we may not even need to bother with other aggregation functions, as this will be an example of how to achieve this functionality at all, and then an app could copy the code to create other FacetRequests (e.g. SumScoreFacetRequest)?

> Support sparse faceting for heterogeneous indices
> -
>
> Key: LUCENE-5333
> URL: https://issues.apache.org/jira/browse/LUCENE-5333
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/facet
> Reporter: Michael McCandless
>
> In some search apps, e.g. a large e-commerce site, the index can have
> a mix of wildly different product categories and facet dimensions, and
> the number of dimensions could be huge.
> E.g. maybe the index has shirts, computer memory, hard drives, etc.,
> and each of these many categories has different attributes.
> In such an index, when someone searches for "so dimm", which should
> match a bunch of laptop memory modules, you can't (easily) know up
> front which facet dimensions will be important.
> But, I think this is very easy for the facet module, since ords are
> stored "row stride" (each doc lists all facet labels it has), we could
> simply count all facets that the hits actually saw, and then in the
> end see which ones "got traction" and return facet results for these
> top dims.
> I'm not sure what the API would look like, but conceptually this
> should work very well, because of how the facet module works.
> You shouldn't have to state up front exactly which facet dimensions > to count... -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5433) Use the Schema REST api for editing the schema file
Erick Erickson created SOLR-5433: Summary: Use the Schema REST api for editing the schema file Key: SOLR-5433 URL: https://issues.apache.org/jira/browse/SOLR-5433 Project: Solr Issue Type: Improvement Affects Versions: 4.6, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Minor [~sarowe] [~steffkes] Marker for going forward with some kind of wizard for editing the schema.xml file. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5432) Allow simple editing for solrconfig.xml and schema.xml from the admin interface
Erick Erickson created SOLR-5432: Summary: Allow simple editing for solrconfig.xml and schema.xml from the admin interface Key: SOLR-5432 URL: https://issues.apache.org/jira/browse/SOLR-5432 Project: Solr Issue Type: Improvement Affects Versions: 4.6, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson [~steffkes] OK, let's see if we can make the simple case work as per the discussion in SOLR-5287 and reserve the rest of the enhancements for later? I'll try to create an end-point Real Soon Now and we can try it out. This will be the simple case of just writing basically anything in the conf directory. Specifically _excluding_ anything in the sub-directories for the time being. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5042) MoreLikeThis doesn't return a score when mlt.count is set to 10
[ https://issues.apache.org/jira/browse/SOLR-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817294#comment-13817294 ]

Markus Jelsma commented on SOLR-5042:
-

Great work! Thanks!

> MoreLikeThis doesn't return a score when mlt.count is set to 10
> ---
>
> Key: SOLR-5042
> URL: https://issues.apache.org/jira/browse/SOLR-5042
> Project: Solr
> Issue Type: Bug
> Components: MoreLikeThis
> Affects Versions: 4.3
> Reporter: Josh Curran
> Assignee: Shawn Heisey
> Priority: Minor
> Attachments: SOLR-5042.patch
>
>
> The problem appears to be around the mlt.count within the solrconfig.xml.
> When this value is set to 10, the 10 values that have been identified as
> 'most like this' are returned with the original query, however the 'score'
> field is missing.
> Changing the mlt.count to, say, 11 and issuing the same query, the 'score'
> field is then returned. This appears to be the workaround; 11 was just an
> arbitrary value, 12 or 15 also work.
> The same problem was raised on stackoverflow:
> http://stackoverflow.com/questions/16513719/solr-more-like-this-dont-return-score-while-specify-mlt-count

--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5042) MoreLikeThis doesn't return a score when mlt.count is set to 10
[ https://issues.apache.org/jira/browse/SOLR-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817291#comment-13817291 ]

Anshum Gupta commented on SOLR-5042:
-

[~markus17] Yes, this issue was resolved, but I just didn't get time to add unit tests for it yet. I have, however, tested this manually.

> MoreLikeThis doesn't return a score when mlt.count is set to 10
> ---
>
> Key: SOLR-5042
> URL: https://issues.apache.org/jira/browse/SOLR-5042
> Project: Solr
> Issue Type: Bug
> Components: MoreLikeThis
> Affects Versions: 4.3
> Reporter: Josh Curran
> Assignee: Shawn Heisey
> Priority: Minor
> Attachments: SOLR-5042.patch
>
>
> The problem appears to be around the mlt.count within the solrconfig.xml.
> When this value is set to 10, the 10 values that have been identified as
> 'most like this' are returned with the original query, however the 'score'
> field is missing.
> Changing the mlt.count to, say, 11 and issuing the same query, the 'score'
> field is then returned. This appears to be the workaround; 11 was just an
> arbitrary value, 12 or 15 also work.
> The same problem was raised on stackoverflow:
> http://stackoverflow.com/questions/16513719/solr-more-like-this-dont-return-score-while-specify-mlt-count

--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5335) Change raw Map type to Map for ValueSource context
[ https://issues.apache.org/jira/browse/LUCENE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817290#comment-13817290 ] Yonik Seeley commented on LUCENE-5335: -- bq. I dont think things like SumTotalTermFreqValueSource should be a blocker to fixing this map API The whole point of the current context is so you can do stuff like the current code does, so it's really unclear how you would "fix" it (or why it needs fixing), without first figuring out a different way to implement stuff like SumTotalTermFreqValueSource. In that sense, it certainly does seem like a blocker. > Change raw Map type to Map for ValueSource context > - > > Key: LUCENE-5335 > URL: https://issues.apache.org/jira/browse/LUCENE-5335 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Ryan Ernst > Attachments: LUCENE-5335.patch > > > Just as the title says. Simple refactoring. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5042) MoreLikeThis doesn't return a score when mlt.count is set to 10
[ https://issues.apache.org/jira/browse/SOLR-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817271#comment-13817271 ]

Markus Jelsma commented on SOLR-5042:
-

Guys, I see some commits in trunk and 4x; is this issue resolved? Thanks

> MoreLikeThis doesn't return a score when mlt.count is set to 10
> ---
>
> Key: SOLR-5042
> URL: https://issues.apache.org/jira/browse/SOLR-5042
> Project: Solr
> Issue Type: Bug
> Components: MoreLikeThis
> Affects Versions: 4.3
> Reporter: Josh Curran
> Assignee: Shawn Heisey
> Priority: Minor
> Attachments: SOLR-5042.patch
>
>
> The problem appears to be around the mlt.count within the solrconfig.xml.
> When this value is set to 10, the 10 values that have been identified as
> 'most like this' are returned with the original query, however the 'score'
> field is missing.
> Changing the mlt.count to, say, 11 and issuing the same query, the 'score'
> field is then returned. This appears to be the workaround; 11 was just an
> arbitrary value, 12 or 15 also work.
> The same problem was raised on stackoverflow:
> http://stackoverflow.com/questions/16513719/solr-more-like-this-dont-return-score-while-specify-mlt-count

--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5149) Query facet to respect mincount
[ https://issues.apache.org/jira/browse/SOLR-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817259#comment-13817259 ] Markus Jelsma commented on SOLR-5149: - Any more comments to this? Change stuff? We're using it in production for two months now and are happy with the results. > Query facet to respect mincount > --- > > Key: SOLR-5149 > URL: https://issues.apache.org/jira/browse/SOLR-5149 > Project: Solr > Issue Type: Bug > Components: SearchComponents - other >Affects Versions: 4.4 >Reporter: Markus Jelsma >Priority: Minor > Fix For: 4.6 > > Attachments: SOLR-5149-trunk.patch, SOLR-5149-trunk.patch, > SOLR-5149-trunk.patch, SOLR-5149-trunk.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817260#comment-13817260 ]

Joel Bernstein commented on SOLR-5027:
--

Greg,

Are you asking for the ability to use a full sort spec as the collapse criteria? I believe you are, but I just want to clarify. You can currently use the full sort spec to sort the collapsed result set, but only the min/max of a numeric field as the collapse criteria.

Joel

> Field Collapsing PostFilter
> ---
>
> Key: SOLR-5027
> URL: https://issues.apache.org/jira/browse/SOLR-5027
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 5.0
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Priority: Minor
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch,
> SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch,
> SOLR-5027.patch, SOLR-5027.patch
>
>
> This ticket introduces the *CollapsingQParserPlugin*.
> The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.
> This is a high performance alternative to standard Solr field collapsing
> (with *ngroups*) when the number of distinct groups in the result set is high.
> For example in one performance test, a search with 10 million full results
> and 1 million collapsed groups:
> Standard grouping with ngroups : 17 seconds.
> CollapsingQParserPlugin: 300 milli-seconds.
> Sample syntax:
> Collapse based on the highest scoring document:
> {code}
> fq={!collapse field=}
> {code}
> Collapse based on the min value of a numeric field:
> {code}
> fq={!collapse field= min=}
> {code}
> Collapse based on the max value of a numeric field:
> {code}
> fq={!collapse field= max=}
> {code}
> Collapse with a null policy:
> {code}
> fq={!collapse field= nullPolicy=}
> {code}
> There are three null policies:
> ignore : removes docs with a null value in the collapse field (default).
> expand : treats each doc with a null value in the collapse field as a
> separate group.
> collapse : collapses all docs with a null value into a single group using
> either highest score, or min/max.
> The CollapsingQParserPlugin also fully supports the QueryElevationComponent.
> *Note:* The July 16 patch also includes an ExpandComponent that expands the
> collapsed groups for the current search result page. This functionality will
> be moved to its own ticket.

--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817257#comment-13817257 ] Joel Bernstein edited comment on SOLR-5027 at 11/8/13 1:09 PM: --- David, I was reading your comments while I was away on vacation but my mobile device wasn't playing nicely with the jira site, so I held off on replying until I got back. I see the issue that you've reported and I'll be working on it through the jira that you created. I'll be posting to that jira with my thoughts soon. Joel was (Author: joel.bernstein): David, I've was reading your comments while I was away on vacation but my mobile device wasn't playing nicely with the jira site, so I held off on replying until I got back. I see the issue that you've reported and I'll be working on it through the jira that you created. I'll be posting to that jira with my thoughts soon. Joel > Field Collapsing PostFilter > --- > > Key: SOLR-5027 > URL: https://issues.apache.org/jira/browse/SOLR-5027 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 5.0 >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Minor > Fix For: 4.6, 5.0 > > Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, > SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, > SOLR-5027.patch, SOLR-5027.patch > > > This ticket introduces the *CollapsingQParserPlugin* > The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. > This is a high performance alternative to standard Solr field collapsing > (with *ngroups*) when the number of distinct groups in the result set is high. > For example in one performance test, a search with 10 million full results > and 1 million collapsed groups: > Standard grouping with ngroups : 17 seconds. > CollapsingQParserPlugin: 300 milli-seconds. 
> Sample syntax:
> Collapse based on the highest scoring document:
> {code}
> fq={!collapse field=}
> {code}
> Collapse based on the min value of a numeric field:
> {code}
> fq={!collapse field= min=}
> {code}
> Collapse based on the max value of a numeric field:
> {code}
> fq={!collapse field= max=}
> {code}
> Collapse with a null policy:
> {code}
> fq={!collapse field= nullPolicy=}
> {code}
> There are three null policies:
> ignore : removes docs with a null value in the collapse field (default).
> expand : treats each doc with a null value in the collapse field as a
> separate group.
> collapse : collapses all docs with a null value into a single group using
> either highest score, or min/max.
> The CollapsingQParserPlugin also fully supports the QueryElevationComponent.
> *Note:* The July 16 patch also includes an ExpandComponent that expands the
> collapsed groups for the current search result page. This functionality will
> be moved to its own ticket.

--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817257#comment-13817257 ]

Joel Bernstein commented on SOLR-5027:
--

David,

I was reading your comments while I was away on vacation, but my mobile device wasn't playing nicely with the jira site, so I held off on replying until I got back. I see the issue that you've reported and I'll be working on it through the jira that you created. I'll be posting to that jira with my thoughts soon.

Joel

> Field Collapsing PostFilter
> ---
>
> Key: SOLR-5027
> URL: https://issues.apache.org/jira/browse/SOLR-5027
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 5.0
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Priority: Minor
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch,
> SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch,
> SOLR-5027.patch, SOLR-5027.patch
>
>
> This ticket introduces the *CollapsingQParserPlugin*.
> The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.
> This is a high performance alternative to standard Solr field collapsing
> (with *ngroups*) when the number of distinct groups in the result set is high.
> For example in one performance test, a search with 10 million full results
> and 1 million collapsed groups:
> Standard grouping with ngroups : 17 seconds.
> CollapsingQParserPlugin: 300 milli-seconds.
> Sample syntax:
> Collapse based on the highest scoring document:
> {code}
> fq={!collapse field=}
> {code}
> Collapse based on the min value of a numeric field:
> {code}
> fq={!collapse field= min=}
> {code}
> Collapse based on the max value of a numeric field:
> {code}
> fq={!collapse field= max=}
> {code}
> Collapse with a null policy:
> {code}
> fq={!collapse field= nullPolicy=}
> {code}
> There are three null policies:
> ignore : removes docs with a null value in the collapse field (default).
> expand : treats each doc with a null value in the collapse field as a > separate group. > collapse : collapses all docs with a null value into a single group using > either highest score, or min/max. > The CollapsingQParserPlugin also fully supports the QueryElevationComponent. > *Note:* The July 16 patch also includes an ExpandComponent that expands the > collapsed groups for the current search result page. This functionality will > be moved to its own ticket. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
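To make the collapse semantics above concrete, here is a small plain-Java sketch of the core idea: keep only the top-scoring document per value of the collapse field, in a single pass over the hits. The `Doc` class and field values are invented for the demo; this is illustrative logic, not the CollapsingQParserPlugin's actual implementation.

```java
import java.util.*;

// Illustrative sketch of field collapsing on the highest-scoring document,
// as in fq={!collapse field=field_name}. Doc is a stand-in for a scored hit;
// this is NOT the plugin's real code.
public class CollapseSketch {
    static class Doc {
        final int id;
        final String groupKey; // value of the collapse field for this doc
        final float score;
        Doc(int id, String groupKey, float score) {
            this.id = id;
            this.groupKey = groupKey;
            this.score = score;
        }
    }

    // One pass over the hits: remember only the best-scoring doc per group.
    // Memory is O(distinct groups), which is why this approach stays fast
    // even when the number of distinct groups is very large.
    static Collection<Doc> collapseByMaxScore(List<Doc> hits) {
        Map<String, Doc> best = new LinkedHashMap<>();
        for (Doc d : hits) {
            Doc cur = best.get(d.groupKey);
            if (cur == null || d.score > cur.score) {
                best.put(d.groupKey, d);
            }
        }
        return best.values();
    }

    public static void main(String[] args) {
        List<Doc> hits = Arrays.asList(
            new Doc(1, "brandA", 0.5f),
            new Doc(2, "brandA", 0.9f),
            new Doc(3, "brandB", 0.3f));
        for (Doc d : collapseByMaxScore(hits)) {
            System.out.println(d.id + " (" + d.groupKey + ", score=" + d.score + ")");
        }
    }
}
```

Standard grouping with ngroups must track per-group structures for every group; the single-map pass above hints at why collapsing this way can be orders of magnitude cheaper when groups number in the millions.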
[jira] [Commented] (LUCENE-5333) Support sparse faceting for heterogeneous indices
[ https://issues.apache.org/jira/browse/LUCENE-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817196#comment-13817196 ] Michael McCandless commented on LUCENE-5333: Hmm, if we wrap another FacetsAccumulator, is it the user's job to first create that accumulator (with no facet requests)? But then how do we create another one, with all the facet requests we derived from ROOT? I guess we could just switch on the N types we have? But then maybe we should just add static methods to each to make this "All" accumulator for each? > Support sparse faceting for heterogeneous indices > - > > Key: LUCENE-5333 > URL: https://issues.apache.org/jira/browse/LUCENE-5333 > Project: Lucene - Core > Issue Type: New Feature > Components: modules/facet >Reporter: Michael McCandless > > In some search apps, e.g. a large e-commerce site, the index can have > a mix of wildly different product categories and facet dimensions, and > the number of dimensions could be huge. > E.g. maybe the index has shirts, computer memory, hard drives, etc., > and each of these many categories has different attributes. > In such an index, when someone searches for "so dimm", which should > match a bunch of laptop memory modules, you can't (easily) know up > front which facet dimensions will be important. > But I think this is very easy for the facet module: since ords are > stored "row stride" (each doc lists all facet labels it has), we could > simply count all facets that the hits actually saw, and then in the > end see which ones "got traction" and return facet results for these > top dims. > I'm not sure what the API would look like, but conceptually this > should work very well, because of how the facet module works. > You shouldn't have to state up front exactly which facet dimensions > to count...
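The "count what the hits actually saw" idea is simple enough to sketch in plain Java. This is only a toy illustration of the proposal, not the facet module's API; the "dimension/label" path strings and the minCount traction threshold are invented for the demo.

```java
import java.util.*;

// Toy sketch of sparse faceting: walk the hits "row stride" (each hit
// carries the facet labels it has), tally a count per dimension, and keep
// only the dimensions that got traction. Not the facet module's real API.
public class SparseFacetSketch {

    // hitLabels: one list of "dimension/label" paths per matching document.
    // Returns counts for dimensions seen at least minCount times.
    static Map<String, Integer> topDims(List<List<String>> hitLabels, int minCount) {
        Map<String, Integer> dimCounts = new HashMap<>();
        for (List<String> labels : hitLabels) {
            for (String path : labels) {
                String dim = path.substring(0, path.indexOf('/'));
                dimCounts.merge(dim, 1, Integer::sum);
            }
        }
        // Drop dimensions that didn't get traction in this result set.
        dimCounts.values().removeIf(count -> count < minCount);
        return dimCounts;
    }

    public static void main(String[] args) {
        // e.g. hits for a "so dimm" search: mostly memory modules, one shirt
        List<List<String>> hits = Arrays.asList(
            Arrays.asList("memoryType/DDR3", "capacity/8GB"),
            Arrays.asList("memoryType/DDR3", "capacity/4GB"),
            Arrays.asList("shirtSize/L"));
        System.out.println(topDims(hits, 2)); // only dims with traction survive
    }
}
```

The point of the sketch: no dimension is named up front; "memoryType" and "capacity" emerge from the hits themselves, while the lone "shirtSize" label is dropped.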
-- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5283) Fail the build if ant test didn't execute any tests (everything filtered out).
[ https://issues.apache.org/jira/browse/LUCENE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-5283. - Resolution: Fixed > Fail the build if ant test didn't execute any tests (everything filtered out). > -- > > Key: LUCENE-5283 > URL: https://issues.apache.org/jira/browse/LUCENE-5283 > Project: Lucene - Core > Issue Type: Wish >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: 4.6, 5.0 > > Attachments: LUCENE-5283-permgen.patch, LUCENE-5283.patch, > LUCENE-5283.patch, LUCENE-5283.patch > > > This should be an optional setting that defaults to 'false' (the build > proceeds). -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5379) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817174#comment-13817174 ] Otis Gospodnetic commented on SOLR-5379: [~bsteele] - maybe you had your colleagues test other patches, like SOLR-4381? > Query-time multi-word synonym expansion > --- > > Key: SOLR-5379 > URL: https://issues.apache.org/jira/browse/SOLR-5379 > Project: Solr > Issue Type: Improvement > Components: query parsers >Reporter: Nguyen Manh Tien > Labels: multi-word, queryparser, synonym > Fix For: 4.5.1, 4.6 > > Attachments: quoted.patch, synonym-expander.patch > > > When dealing with synonyms at query time, Solr fails to work with multi-word > synonyms for two reasons: > - First, the Lucene query parser tokenizes the user query on whitespace, so it > splits a multi-word term into separate terms before feeding them to the synonym > filter; the synonym filter therefore can't recognize the multi-word term and expand it. > - Second, if the synonym filter expands into multiple terms that contain a > multi-word synonym: SolrQueryParserBase currently uses MultiPhraseQuery to > handle synonyms, but MultiPhraseQuery doesn't work with terms that have > different numbers of words. > For the first issue, we can quote all multi-word synonyms in the user query > so that the Lucene query parser doesn't split them. There is a jira task related to > this one: https://issues.apache.org/jira/browse/LUCENE-2605. > For the second, we can replace MultiPhraseQuery with an appropriate BooleanQuery > of SHOULD clauses containing multiple PhraseQuery instances when the token stream > has a multi-word synonym. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
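The first workaround described in the issue (quote multi-word synonyms so the query parser doesn't split them) can be sketched in a few lines of plain Java. This is a deliberately naive illustration, not code from the attached quoted.patch: real handling would need tokenization-aware matching and would have to skip text the user already quoted; the helper method and synonym dictionary here are hypothetical.

```java
import java.util.*;

// Naive sketch of pre-quoting multi-word synonyms so that a whitespace-based
// query parser keeps each one as a single unit for the synonym filter.
// Hypothetical helper, not code from the quoted.patch attachment.
public class SynonymQuoter {

    // Wrap every known multi-word synonym found in the query in quotes.
    // Plain substring replacement: overlapping or nested synonyms, and
    // already-quoted input, would need smarter handling in real code.
    static String quoteMultiWordSynonyms(String query, Collection<String> multiWordTerms) {
        String out = query;
        for (String term : multiWordTerms) {
            out = out.replace(term, "\"" + term + "\"");
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("hard disk", "sea biscuit"));
        System.out.println(quoteMultiWordSynonyms("fastest hard disk", dict));
        // -> fastest "hard disk"
    }
}
```

After this pre-pass, the parser hands "hard disk" to the analysis chain as one phrase, so the synonym filter can match it against its multi-word entries.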
[jira] [Commented] (LUCENE-5283) Fail the build if ant test didn't execute any tests (everything filtered out).
[ https://issues.apache.org/jira/browse/LUCENE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817175#comment-13817175 ] ASF subversion and git services commented on LUCENE-5283: - Commit 1539975 from [~dawidweiss] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1539975 ] LUCENE-5283: Fail the build if ant test didn't execute any tests (everything filtered out). > Fail the build if ant test didn't execute any tests (everything filtered out). > -- > > Key: LUCENE-5283 > URL: https://issues.apache.org/jira/browse/LUCENE-5283 > Project: Lucene - Core > Issue Type: Wish >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: 4.6, 5.0 > > Attachments: LUCENE-5283-permgen.patch, LUCENE-5283.patch, > LUCENE-5283.patch, LUCENE-5283.patch > > > This should be an optional setting that defaults to 'false' (the build > proceeds). -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5325) Move ValueSource and FunctionValues under core/
[ https://issues.apache.org/jira/browse/LUCENE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817138#comment-13817138 ] Shai Erera commented on LUCENE-5325: bq. Shai Erera, have you already started on this? If not I'd be happy to take it on. No, I haven't started any work yet and won't be able to work on it in the next few weeks, so feel free to take it! > Move ValueSource and FunctionValues under core/ > --- > > Key: LUCENE-5325 > URL: https://issues.apache.org/jira/browse/LUCENE-5325 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Reporter: Shai Erera > > Spinoff from LUCENE-5298: ValueSource and FunctionValues are abstract APIs > which exist under the queries/ module. That causes any module which wants to > depend on these APIs (but not necessarily on any of their actual > implementations!), to depend on the queries/ module. If we move these APIs > under core/, we can eliminate these dependencies and add some mock impls for > testing purposes. > Quoting Robert from LUCENE-5298: > {quote} > we should eliminate the suggest/ dependencies on expressions and queries, the > expressions/ on queries, the grouping/ dependency on queries, the spatial/ > dependency on queries, its a mess. > {quote} > To add to that list, facet/ should not depend on queries too. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5379) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817133#comment-13817133 ] Markus Jelsma commented on SOLR-5379: - Oh, I interpreted your comment as meaning you had tested it against the other patches linked to this one. > Query-time multi-word synonym expansion > --- > > Key: SOLR-5379 > URL: https://issues.apache.org/jira/browse/SOLR-5379 > Project: Solr > Issue Type: Improvement > Components: query parsers >Reporter: Nguyen Manh Tien > Labels: multi-word, queryparser, synonym > Fix For: 4.5.1, 4.6 > > Attachments: quoted.patch, synonym-expander.patch > > > When dealing with synonyms at query time, Solr fails to work with multi-word > synonyms for two reasons: > - First, the Lucene query parser tokenizes the user query on whitespace, so it > splits a multi-word term into separate terms before feeding them to the synonym > filter; the synonym filter therefore can't recognize the multi-word term and expand it. > - Second, if the synonym filter expands into multiple terms that contain a > multi-word synonym: SolrQueryParserBase currently uses MultiPhraseQuery to > handle synonyms, but MultiPhraseQuery doesn't work with terms that have > different numbers of words. > For the first issue, we can quote all multi-word synonyms in the user query > so that the Lucene query parser doesn't split them. There is a jira task related to > this one: https://issues.apache.org/jira/browse/LUCENE-2605. > For the second, we can replace MultiPhraseQuery with an appropriate BooleanQuery > of SHOULD clauses containing multiple PhraseQuery instances when the token stream > has a multi-word synonym. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org