[jira] [Updated] (SOLR-4903) Solr sends all doc ids to all shards in the query counting facets
[ https://issues.apache.org/jira/browse/SOLR-4903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitry Kan updated SOLR-4903:
-----------------------------
    Affects Version/s: 4.3

> Solr sends all doc ids to all shards in the query counting facets
> ------------------------------------------------------------------
>
>                 Key: SOLR-4903
>                 URL: https://issues.apache.org/jira/browse/SOLR-4903
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 3.4, 4.3
>            Reporter: Dmitry Kan
>
> Setup: a front-end Solr and shards.
> Summary: the Solr frontend sends all doc ids received from QueryComponent to all shards, which causes a POST request buffer size overflow.
> Symptoms:
> The query is: http://pastebin.com/0DndK1Cs (I have omitted the shards parameter.)
> The router log: http://pastebin.com/FTVH1WF3
> Note the port of the affected shard; that port changes all the time, even for the same request.
> The log entry is prepended with these lines (they are not in the pastebin link):
> SEVERE: org.apache.solr.common.SolrException: Internal Server Error
> Internal Server Error
> The shard log: http://pastebin.com/exwCx3LX
> Suggestion: change the data structure in FacetComponent to send each shard only the doc ids that belong to it, not a concatenation of all doc ids.
> Why this is important: scaling. Adding more shards will overflow the POST request buffer at some point anyway.
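As a rough sketch of the suggested change, with invented names rather than Solr's actual FacetComponent internals: group the ids returned by the first query phase by the shard that returned them, so each refinement request carries only that shard's ids instead of the full concatenation.

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; ShardHit and this helper are not Solr classes.
class PerShardIdPartitioner {

  static class ShardHit {
    final String docId;        // unique key of the matching document
    final String shardAddress; // shard that returned this hit
    ShardHit(String docId, String shardAddress) {
      this.docId = docId;
      this.shardAddress = shardAddress;
    }
  }

  /** Group ids by originating shard instead of sending all ids everywhere. */
  static Map<String, List<String>> partition(List<ShardHit> hits) {
    Map<String, List<String>> idsPerShard = new HashMap<String, List<String>>();
    for (ShardHit hit : hits) {
      List<String> ids = idsPerShard.get(hit.shardAddress);
      if (ids == null) {
        ids = new ArrayList<String>();
        idsPerShard.put(hit.shardAddress, ids);
      }
      ids.add(hit.docId);
    }
    // The refinement request to a given shard would then include only
    // idsPerShard.get(shardAddress), keeping the POST body bounded.
    return idsPerShard;
  }
}
{code}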
[jira] [Created] (SOLR-4904) Send internal doc ids and index version in distributed faceting to make queries more compact
Dmitry Kan created SOLR-4904:
--------------------------------

             Summary: Send internal doc ids and index version in distributed faceting to make queries more compact
                 Key: SOLR-4904
                 URL: https://issues.apache.org/jira/browse/SOLR-4904
             Project: Solr
          Issue Type: Improvement
          Components: search
    Affects Versions: 4.3, 3.4
            Reporter: Dmitry Kan

This was suggested by [~ab] at the bbuzz conf 2013. It makes a lot of sense and fits nicely with fixing the root cause of SOLR-4903. Basically, QueryComponent could send the internal Lucene ids along with the index version number, so that subsequent requests to other Solr components, like FacetComponent, would carry the internal ids. The index version is required to ensure we are dealing with the same index.
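A rough sketch of that idea, using invented class names (this is not Solr's actual wire format): the first-phase response would carry the internal ids together with the index version they are valid for, and a shard would accept them only while its index version still matches (DirectoryReader.getVersion() is one way a shard could obtain it).

{code}
// Hypothetical illustration; not Solr's actual refinement protocol.
class InternalIdRefinement {

  static class Payload {
    final long indexVersion;    // e.g. DirectoryReader.getVersion() on the shard
    final int[] internalDocIds; // Lucene internal ids, valid only for that version
    Payload(long indexVersion, int[] internalDocIds) {
      this.indexVersion = indexVersion;
      this.internalDocIds = internalDocIds;
    }
  }

  /** Shard side: trust internal ids only if the index has not changed. */
  static int[] resolve(Payload payload, long currentIndexVersion) {
    if (payload.indexVersion != currentIndexVersion) {
      // The index changed between phases; internal ids may now point at
      // different documents, so the caller must fall back to unique keys.
      throw new IllegalStateException("index version mismatch");
    }
    return payload.internalDocIds;
  }
}
{code}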
[jira] [Created] (SOLR-4903) Solr sends all doc ids to all shards in the query counting facets
Dmitry Kan created SOLR-4903:
--------------------------------

             Summary: Solr sends all doc ids to all shards in the query counting facets
                 Key: SOLR-4903
                 URL: https://issues.apache.org/jira/browse/SOLR-4903
             Project: Solr
          Issue Type: Improvement
          Components: search
    Affects Versions: 3.4
            Reporter: Dmitry Kan

Setup: a front-end Solr and shards.
Summary: the Solr frontend sends all doc ids received from QueryComponent to all shards, which causes a POST request buffer size overflow.
Symptoms:
The query is: http://pastebin.com/0DndK1Cs (I have omitted the shards parameter.)
The router log: http://pastebin.com/FTVH1WF3
Note the port of the affected shard; that port changes all the time, even for the same request.
The log entry is prepended with these lines (they are not in the pastebin link):
SEVERE: org.apache.solr.common.SolrException: Internal Server Error
Internal Server Error
The shard log: http://pastebin.com/exwCx3LX
Suggestion: change the data structure in FacetComponent to send each shard only the doc ids that belong to it, not a concatenation of all doc ids.
Why this is important: scaling. Adding more shards will overflow the POST request buffer at some point anyway.
[jira] [Updated] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lance Norskog updated LUCENE-2899:
----------------------------------
    Attachment: LUCENE-2899-x.patch

> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
>                 Key: LUCENE-2899
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2899
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 4.4
>
>         Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposes capabilities for it. Drew Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under: modules/analysis/opennlp
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676698#comment-13676698 ]

Lance Norskog commented on LUCENE-2899:
---------------------------------------

I found the problem with multiple documents. The API for reusing Tokenizers changed to something more sensible, but I only noticed and implemented part of the change. The result was that when you upload multiple documents, it just re-processes the first document.

File LUCENE-2899-x.patch has this fix. It applies against the 4.x branch and the trunk. It does not apply against Lucene 4.0, 4.1, 4.2 or 4.3. For all released Solr versions you want LUCENE-2899.patch from August 27, 2012. There are no new features since that release.

> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
>                 Key: LUCENE-2899
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2899
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 4.4
>
>         Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposes capabilities for it. Drew Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under: modules/analysis/opennlp
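For context, the Lucene 4.x reuse contract has the consumer give the Tokenizer a new reader and then call reset() before each document; a Tokenizer that buffers state from the first reader and never clears it in reset() keeps replaying the first document. A minimal sketch of the pattern (the sentence-detection internals below are stand-ins, not the actual OpenNLPTokenizer code):

{code}
import java.io.IOException;
import java.io.Reader;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Sketch of a reusable Tokenizer; the "sentences" buffer stands in for
// whatever per-document state the real tokenizer keeps.
public final class SentenceSketchTokenizer extends Tokenizer {
  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
  private String[] sentences; // per-document state
  private int index;

  public SentenceSketchTokenizer(Reader input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (sentences == null) {
      sentences = detectSentences(input); // 'input' is the current reader
      index = 0;
    }
    if (index >= sentences.length) {
      return false;
    }
    clearAttributes();
    termAtt.setEmpty().append(sentences[index++]);
    return true;
  }

  @Override
  public void reset() throws IOException {
    super.reset();
    // Crucial for reuse: forget the previous document's buffered state,
    // otherwise every reset() replays the first document.
    sentences = null;
    index = 0;
  }

  private static String[] detectSentences(Reader reader) throws IOException {
    // Stand-in for real sentence detection (e.g. OpenNLP's SentenceDetector).
    StringBuilder sb = new StringBuilder();
    int c;
    while ((c = reader.read()) != -1) {
      sb.append((char) c);
    }
    return sb.toString().split("(?<=[.!?])\\s+");
  }
}
{code}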
Re: Documentation for Solr/Lucene 4.x, termIndexInterval and limitations of Lucene File format
On Wed, Jun 5, 2013 at 4:21 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
>
> Nice :) That's good news (that nothing blew up!). Thanks for sharing.
>

With such an old JVM and such a large index, I'd say it's a stroke of pure luck that nothing blew up.
[jira] [Commented] (LUCENE-5033) SlowFuzzyQuery appears to fail with edit distance >=3 in some cases
[ https://issues.apache.org/jira/browse/LUCENE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676659#comment-13676659 ]

Robert Muir commented on LUCENE-5033:
-------------------------------------

Doing an explicit levenshtein calculation here sort of defeats the entire purpose of having levenshtein automata at all!

> SlowFuzzyQuery appears to fail with edit distance >=3 in some cases
> --------------------------------------------------------------------
>
>                 Key: LUCENE-5033
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5033
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/other
>    Affects Versions: 4.3
>            Reporter: Tim Allison
>            Priority: Minor
>         Attachments: LUCENE-5033.patch
>
> The Levenshtein edit distance between "monday" and "montugu" should be 4. The following test runs a query with "sim" set to 3, and there is a hit (so the zero-hits assertion fails):
>
>   public void testFuzzinessLong2() throws Exception {
>     Directory directory = newDirectory();
>     RandomIndexWriter writer = new RandomIndexWriter(random(), directory);
>     addDoc("monday", writer);
>
>     IndexReader reader = writer.getReader();
>     IndexSearcher searcher = newSearcher(reader);
>     writer.close();
>
>     SlowFuzzyQuery query = new SlowFuzzyQuery(new Term("field", "montugu"), 3, 0);
>     ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
>     assertEquals(0, hits.length);
>   }
[jira] [Commented] (LUCENE-5033) SlowFuzzyQuery appears to fail with edit distance >=3 in some cases
[ https://issues.apache.org/jira/browse/LUCENE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676587#comment-13676587 ]

Tim Allison commented on LUCENE-5033:
-------------------------------------

Thank you for your quick response! I, too, was hoping to avoid calcSimilarity if raw is true, but I think we need it to calculate the boost. Let me know if I'm missing something.

The bug in the original code was that FilteredTermsEnum sets minSimilarity to 0 when the user-specified minSimilarity is >= 1.0f. So, in SlowFuzzyTermsEnum, similarity (unless it was Float.NEGATIVE_INFINITY) was typically > minSimilarity no matter its value. In other words, when the client code made the call with minSimilarity >= 1.0f, that value was correctly recorded in maxEdits, but maxEdits wasn't the determining factor in whether SlowFuzzyTermsEnum accepted a term.

> SlowFuzzyQuery appears to fail with edit distance >=3 in some cases
> --------------------------------------------------------------------
>
>                 Key: LUCENE-5033
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5033
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/other
>    Affects Versions: 4.3
>            Reporter: Tim Allison
>            Priority: Minor
>         Attachments: LUCENE-5033.patch
>
> The Levenshtein edit distance between "monday" and "montugu" should be 4. The following test runs a query with "sim" set to 3, and there is a hit (so the zero-hits assertion fails):
>
>   public void testFuzzinessLong2() throws Exception {
>     Directory directory = newDirectory();
>     RandomIndexWriter writer = new RandomIndexWriter(random(), directory);
>     addDoc("monday", writer);
>
>     IndexReader reader = writer.getReader();
>     IndexSearcher searcher = newSearcher(reader);
>     writer.close();
>
>     SlowFuzzyQuery query = new SlowFuzzyQuery(new Term("field", "montugu"), 3, 0);
>     ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
>     assertEquals(0, hits.length);
>   }
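A condensed sketch of the acceptance logic being described, with simplified names (this is not the actual SlowFuzzyTermsEnum code): when the caller passed a raw edit distance >= 1, acceptance must be decided by maxEdits, not by comparing a similarity score against a minSimilarity that has been zeroed out.

{code}
// Hypothetical simplification of the accept decision.
class AcceptSketch {
  static boolean accept(int editDistance, int maxEdits,
                        float similarity, float minSimilarity, boolean raw) {
    if (raw) {
      // A raw integer edit distance was requested: maxEdits is the only
      // criterion that matters.
      return editDistance <= maxEdits;
    }
    // A fractional similarity was requested: compare against the threshold.
    // The bug described above: with minSimilarity forced to 0, the test
    // 'similarity > minSimilarity' accepted nearly every term regardless
    // of maxEdits.
    return similarity > minSimilarity;
  }
}
{code}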
[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.6.0) - Build # 521 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/521/
Java: 64bit/jdk1.6.0 -XX:-UseCompressedOops -XX:+UseParallelGC

1 tests failed.

REGRESSION: org.apache.lucene.replicator.http.HttpReplicatorTest.testBasic

Error Message:
Connection to http://localhost:51367 refused

Stack Trace:
org.apache.http.conn.HttpHostConnectException: Connection to http://localhost:51367 refused
        at __randomizedtesting.SeedInfo.seed([BFF64C7ADF67C62C:140C516F00BB4002]:0)
        at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
        at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
        at org.apache.lucene.replicator.http.HttpClientBase.executeGET(HttpClientBase.java:178)
        at org.apache.lucene.replicator.http.HttpReplicator.checkForUpdate(HttpReplicator.java:51)
        at org.apache.lucene.replicator.ReplicationClient.doUpdate(ReplicationClient.java:196)
        at org.apache.lucene.replicator.ReplicationClient.updateNow(ReplicationClient.java:402)
        at org.apache.lucene.replicator.http.HttpReplicatorTest.testBasic(HttpReplicatorTest.java:112)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
        at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
        at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
        at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
        at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
        at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
        at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
        at
[jira] [Commented] (LUCENE-5033) SlowFuzzyQuery appears to fail with edit distance >=3 in some cases
[ https://issues.apache.org/jira/browse/LUCENE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676532#comment-13676532 ]

Michael McCandless commented on LUCENE-5033:
--------------------------------------------

Thanks Tim! This looks like a great improvement: I like factoring out calcDistance from calcSimilarity. And I like that we now take raw into account when figuring out which comparison to make to accept the term or not.

Maybe we could improve it a bit: if raw is true, we don't need to calcSimilarity, right?

For my sanity ... where exactly was the bug in the original code?

> SlowFuzzyQuery appears to fail with edit distance >=3 in some cases
> --------------------------------------------------------------------
>
>                 Key: LUCENE-5033
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5033
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/other
>    Affects Versions: 4.3
>            Reporter: Tim Allison
>            Priority: Minor
>         Attachments: LUCENE-5033.patch
>
> The Levenshtein edit distance between "monday" and "montugu" should be 4. The following test runs a query with "sim" set to 3, and there is a hit (so the zero-hits assertion fails):
>
>   public void testFuzzinessLong2() throws Exception {
>     Directory directory = newDirectory();
>     RandomIndexWriter writer = new RandomIndexWriter(random(), directory);
>     addDoc("monday", writer);
>
>     IndexReader reader = writer.getReader();
>     IndexSearcher searcher = newSearcher(reader);
>     writer.close();
>
>     SlowFuzzyQuery query = new SlowFuzzyQuery(new Term("field", "montugu"), 3, 0);
>     ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
>     assertEquals(0, hits.length);
>   }
[jira] [Updated] (LUCENE-5033) SlowFuzzyQuery appears to fail with edit distance >=3 in some cases
[ https://issues.apache.org/jira/browse/LUCENE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated LUCENE-5033:
--------------------------------
    Attachment: LUCENE-5033.patch

First draft of patch attached. Let me know how this looks. Thank you.

> SlowFuzzyQuery appears to fail with edit distance >=3 in some cases
> --------------------------------------------------------------------
>
>                 Key: LUCENE-5033
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5033
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/other
>    Affects Versions: 4.3
>            Reporter: Tim Allison
>            Priority: Minor
>         Attachments: LUCENE-5033.patch
>
> The Levenshtein edit distance between "monday" and "montugu" should be 4. The following test runs a query with "sim" set to 3, and there is a hit (so the zero-hits assertion fails):
>
>   public void testFuzzinessLong2() throws Exception {
>     Directory directory = newDirectory();
>     RandomIndexWriter writer = new RandomIndexWriter(random(), directory);
>     addDoc("monday", writer);
>
>     IndexReader reader = writer.getReader();
>     IndexSearcher searcher = newSearcher(reader);
>     writer.close();
>
>     SlowFuzzyQuery query = new SlowFuzzyQuery(new Term("field", "montugu"), 3, 0);
>     ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
>     assertEquals(0, hits.length);
>   }
Re: Documentation for Solr/Lucene 4.x, termIndexInterval and limitations of Lucene File format
On Wed, Jun 5, 2013 at 2:47 PM, Tom Burton-West wrote:
> 13 Billion unique terms. (CheckIndex output appended below)

Nice :) That's good news (that nothing blew up!). Thanks for sharing.

Mike McCandless
http://blog.mikemccandless.com
[jira] [Resolved] (LUCENE-5035) FieldCacheImpl.SortedDocValuesImpl should compress addresses to term bytes more efficiently
[ https://issues.apache.org/jira/browse/LUCENE-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-5035.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 4.4
                   5.0

> FieldCacheImpl.SortedDocValuesImpl should compress addresses to term bytes more efficiently
> --------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-5035
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5035
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Robert Muir
>             Fix For: 5.0, 4.4
>         Attachments: LUCENE-5035.patch
>
> Each ordinal in SortedDocValuesImpl has a corresponding address to find its location in the big byte[] to support lookupOrd().
> Today this uses GrowableWriter with absolute addresses.
> But it would be much better to use MonotonicAppendingLongBuffer.
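For background, the win comes from the addresses being non-decreasing: MonotonicAppendingLongBuffer (in org.apache.lucene.util.packed) stores small deltas from a linear approximation instead of absolute values. A minimal sketch of building and reading such an address table, simplified from (and not identical to) what FieldCacheImpl does; the termLengths input is an assumption for illustration:

{code}
import org.apache.lucene.util.packed.MonotonicAppendingLongBuffer;

class AddressTableSketch {

  /** Build a compressed table of start offsets from per-ord term lengths. */
  static MonotonicAppendingLongBuffer build(int[] termLengths) {
    MonotonicAppendingLongBuffer addresses = new MonotonicAppendingLongBuffer();
    long offset = 0;
    for (int length : termLengths) {
      addresses.add(offset); // start offset of this ord's bytes in the big byte[]
      offset += length;
    }
    addresses.add(offset);   // sentinel: end of the last term
    return addresses;
  }

  /** lookupOrd(ord) can slice the big byte[] as [get(ord), get(ord + 1)). */
  static long[] range(MonotonicAppendingLongBuffer addresses, int ord) {
    return new long[] { addresses.get(ord), addresses.get(ord + 1) };
  }
}
{code}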
[jira] [Updated] (SOLR-4902) Confusing field name in example schema - text
[ https://issues.apache.org/jira/browse/SOLR-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-4902:
-------------------------------
    Description:

The following came up in the IRC channel today:

{noformat}
16:34 < sayuke> I can't work this out for the life of me. Is text in text:blah
                some sort of special syntax for searching all text fields?
                Google keywords other than text: appreciated
{noformat}

A better name for this would be something that includes catchall. There is a lot of documentation that mentions this field name that would all have to be updated.

  was:

The following came up in the IRC channel today:

16:34 < sayuke> I can't work this out for the life of me. Is text in text:blah some sort of special syntax for searching all text fields? Google keywords other than text: appreciated

A better name for this would be something that includes catchall. There is a lot of documentation that mentions this field name that would all have to be updated.

> Confusing field name in example schema - text
> ---------------------------------------------
>
>                 Key: SOLR-4902
>                 URL: https://issues.apache.org/jira/browse/SOLR-4902
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> The following came up in the IRC channel today:
> {noformat}
> 16:34 < sayuke> I can't work this out for the life of me. Is text in text:blah
>                 some sort of special syntax for searching all text fields?
>                 Google keywords other than text: appreciated
> {noformat}
> A better name for this would be something that includes catchall. There is a lot of documentation that mentions this field name that would all have to be updated.
[jira] [Created] (SOLR-4902) Confusing field name in example schema - text
Shawn Heisey created SOLR-4902:
----------------------------------

             Summary: Confusing field name in example schema - text
                 Key: SOLR-4902
                 URL: https://issues.apache.org/jira/browse/SOLR-4902
             Project: Solr
          Issue Type: Improvement
            Reporter: Shawn Heisey
            Priority: Minor

The following came up in the IRC channel today:

16:34 < sayuke> I can't work this out for the life of me. Is text in text:blah some sort of special syntax for searching all text fields? Google keywords other than text: appreciated

A better name for this would be something that includes catchall. There is a lot of documentation that mentions this field name that would all have to be updated.
[JENKINS] Lucene-Solr-Tests-4.3-Java6 - Build # 56 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.3-Java6/56/

1 tests failed.

FAILED: org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains {#3 seed=[86987851AA607662:EFD27221455C0D78]}

Error Message:
Shouldn't match I #3:ShapePair(Rect(minX=59.0,maxX=81.0,minY=0.0,maxY=11.0) , Rect(minX=189.0,maxX=190.0,minY=-60.0,maxY=64.0)) Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)

Stack Trace:
java.lang.AssertionError: Shouldn't match I #3:ShapePair(Rect(minX=59.0,maxX=81.0,minY=0.0,maxY=11.0) , Rect(minX=189.0,maxX=190.0,minY=-60.0,maxY=64.0)) Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)
        at __randomizedtesting.SeedInfo.seed([86987851AA607662:EFD27221455C0D78]:0)
        at org.junit.Assert.fail(Assert.java:93)
        at org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.fail(SpatialOpRecursivePrefixTreeTest.java:287)
        at org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.doTest(SpatialOpRecursivePrefixTreeTest.java:273)
        at org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains(SpatialOpRecursivePrefixTreeTest.java:101)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
        at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
        at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
        at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
        at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
        at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
        at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
        at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
        at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
        at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
        at java.l
[jira] [Commented] (SOLR-4744) Version conflict error during shard split test
[ https://issues.apache.org/jira/browse/SOLR-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676370#comment-13676370 ]

Yonik Seeley commented on SOLR-4744:
------------------------------------

Your changes look fine Hoss. It's not clear to me why the forward to the subshard needs to be synchronous in the original committed patch, but I guess that can always be revisited later as an optimization.

> Version conflict error during shard split test
> -----------------------------------------------
>
>                 Key: SOLR-4744
>                 URL: https://issues.apache.org/jira/browse/SOLR-4744
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>            Priority: Minor
>             Fix For: 4.4, 4.3.1
>
>         Attachments: SOLR-4744__no_more_NPE.patch, SOLR-4744.patch, SOLR-4744.patch
>
> ShardSplitTest fails sometimes with the following error:
> {code}
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state invoked for collection: collection1
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1 to inactive
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1_0 to active
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1_1 to active
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.873; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= path=/update params={wt=javabin&version=2} {add=[169 (1432319507166134272)]} 0 2
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.884; org.apache.solr.update.processor.LogUpdateProcessor; [collection1_shard1_1_replica1] webapp= path=/update params={distrib.from=http://127.0.0.1:41028/collection1/&update.distrib=FROMLEADER&wt=javabin&distrib.from.parent=shard1&version=2} {} 0 1
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.885; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= path=/update params={distrib.from=http://127.0.0.1:41028/collection1/&update.distrib=FROMLEADER&wt=javabin&distrib.from.parent=shard1&version=2} {add=[169 (1432319507173474304)]} 0 2
> [junit4:junit4]   1> ERROR - 2013-04-14 19:05:26.885; org.apache.solr.common.SolrException; shard update error StdNode: http://127.0.0.1:41028/collection1_shard1_1_replica1/:org.apache.solr.common.SolrException: version conflict for 169 expected=1432319507173474304 actual=-1
> [junit4:junit4]   1>    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:404)
> [junit4:junit4]   1>    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> [junit4:junit4]   1>    at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332)
> [junit4:junit4]   1>    at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306)
> [junit4:junit4]   1>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> [junit4:ju
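The "version conflict ... expected=... actual=-1" failure in the quoted log is an optimistic-concurrency check: the update carried a _version_ the replica was expected to already have, and actual=-1 indicates no stored version was found for that document. A schematic sketch of that kind of check, simplified and not the actual DistributedUpdateProcessor logic:

{code}
// Schematic version-conflict check (illustrative only).
class VersionCheckSketch {
  static void check(long expectedVersion, Long storedVersion) {
    long actual = (storedVersion == null) ? -1L : storedVersion.longValue();
    if (expectedVersion > 0 && actual != expectedVersion) {
      // actual == -1 means the document (or its version) was not found,
      // which is what the ShardSplitTest failure above reports.
      throw new RuntimeException(
          "version conflict expected=" + expectedVersion + " actual=" + actual);
    }
  }
}
{code}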
[jira] [Comment Edited] (SOLR-4862) Core admin action "CREATE" fails to persist some settings in solr.xml
[ https://issues.apache.org/jira/browse/SOLR-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676325#comment-13676325 ]

Trey Massingill edited comment on SOLR-4862 at 6/5/13 8:49 PM:
---------------------------------------------------------------

Seemingly, I'm running into this issue as well. I'm in the process of upgrading from 3.6.1 to 4.3. The solr log shows that I passed the dataDir option, but it does not show up in solr.xml. I'm not sure why "collection" is showing up in solr.xml either.

Log message:
{noformat}
235705|2013-06-05T20:25:16.774+|qtp875010279-17|INFO|o.a.solr.servlet.SolrDispatchFilter|[admin] webapp=null path=/admin/cores params={schema=schema.xml&loadOnStartup=false&instanceDir=.&transient=true&name=queue-2013060518&action=CREATE&config=solrconfig.xml&dataDir=../../index_data/queue-2013060518&wt=json} status=0 QTime=1635
{noformat}

solr.xml
{noformat}
{noformat}

This doesn't seem to cause issues at first. However, after restarting the service, I end up with this warning:
{noformat}
16764|2013-06-05T20:36:15.289+|qtp1711465251-20|WARN|o.a.solr.handler.ReplicationHandler|Unable to get IndexCommit on startup
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/home/tmassi/Development/svn/mta-blockmon-2012/blockmon-solr/blockmon-solr/master/versions/blockmon-solr-2.0.4-SNAPSHOT/config/solr/data/index/write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:84)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:644)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
        at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
        at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:197)
        at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:110)
        at org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.java:939)
        at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:616)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:816)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
        at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
        at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:1227)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:525)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:365)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
        at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
        at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
        at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
        at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at jav
[jira] [Commented] (SOLR-4862) Core admin action "CREATE" fails to persist some settings in solr.xml
[ https://issues.apache.org/jira/browse/SOLR-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676325#comment-13676325 ]

Trey Massingill commented on SOLR-4862:
---------------------------------------

Seemingly, I'm running into this issue as well. The solr log shows that I passed the dataDir option, but it does not show up in solr.xml. I'm not sure why "collection" is showing up in solr.xml either.

Log message:
{noformat}
235705|2013-06-05T20:25:16.774+|qtp875010279-17|INFO|o.a.solr.servlet.SolrDispatchFilter|[admin] webapp=null path=/admin/cores params={schema=schema.xml&loadOnStartup=false&instanceDir=.&transient=true&name=queue-2013060518&action=CREATE&config=solrconfig.xml&dataDir=../../index_data/queue-2013060518&wt=json} status=0 QTime=1635
{noformat}

solr.xml
{noformat}
{noformat}

This doesn't seem to cause issues at first. However, after restarting the service, I end up with this warning:
{noformat}
16764|2013-06-05T20:36:15.289+|qtp1711465251-20|WARN|o.a.solr.handler.ReplicationHandler|Unable to get IndexCommit on startup
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/home/tmassi/Development/svn/mta-blockmon-2012/blockmon-solr/blockmon-solr/master/versions/blockmon-solr-2.0.4-SNAPSHOT/config/solr/data/index/write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:84)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:644)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
        at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
        at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:197)
        at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:110)
        at org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.java:939)
        at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:616)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:816)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
        at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
        at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:1227)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:525)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:365)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
        at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
        at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
        at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
        at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:679)
{noformat}

... quickly followed by this error:
{noformat}
17212
Re: Documentation for Solr/Lucene 4.x, termIndexInterval and limitations of Lucene File format
Hi Mike,

13 Billion unique terms. (CheckIndex output appended below)

Tom

--------

test: terms, freq, prox...OK [13,068,302,002 terms; 187,284,275,343 terms/docs pairs; 786,014,075,745 tokens]

Segments file=segments_6 numSegments=2 version=4.0.0.2 format= userData={commitTimeMSec=1357596564850}
  1 of 2: name=_uhj docCount=866984
    codec=Lucene40
    compound=false
    numFiles=10
    size (MB)=2,048,537.68
    diagnostics = {os=Linux, os.version=2.6.18-308.24.1.el5, mergeFactor=8, source=merge, lucene.version=4.0.0 1394950 - rmuir - 2012-10-06 03:00:40, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.6.0_16, java.vendor=Sun Microsystems Inc.}
    no deletions
    test: open reader.........OK
    test: fields..............OK [92 fields]
    test: field norms.........OK [46 fields]
    test: terms, freq, prox...OK [13068302002 terms; 187284275343 terms/docs pairs; 786014075745 tokens]
    test: stored fields.......OK [34172522 total field count; avg 39.415 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: DocValues...........OK [0 total doc Count; Num DocValues Fields 0]

On Tue, Jun 4, 2013 at 1:00 PM, Tom Burton-West wrote:
> Thanks Mike.
>
> I'm running CheckIndex on the 2TB index right now. Hopefully it will
> finish running by tomorrow. I'll send you a copy of the output.
>
> Tom
>
> On Mon, Jun 3, 2013 at 9:04 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
>
>> Hi Tom,
>>
>> On Mon, Jun 3, 2013 at 12:11 PM, Tom Burton-West wrote:
>>
>> > What is the current limit?
>>
>> I *think* (but would be nice to hear back how many terms you were able
>> to index into one segment ;) ) there is no hard limit to the max
>> number of terms, now that FSTs can handle more than 2.1 B
>> bytes/nodes/arcs.
>>
>> I'll update those javadocs, thanks!
>>
>> Mike McCandless
>> http://blog.mikemccandless.com
[jira] [Commented] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible
[ https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676083#comment-13676083 ]

Michael McCandless commented on LUCENE-4055:
--------------------------------------------

Hmm, looks like it's package-private in 4.3 but is (will be) public in 4.x/trunk. Just replicate for now :)

> Refactor SegmentInfo / FieldInfo to make them extensible
> ---------------------------------------------------------
>
>                 Key: LUCENE-4055
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4055
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>            Reporter: Andrzej Bialecki
>            Assignee: Robert Muir
>             Fix For: 4.0-ALPHA
>         Attachments: LUCENE-4055.patch
>
> After LUCENE-4050 is done, the resulting SegmentInfo / FieldInfo classes should be made abstract so that they can be extended by Codec-s.
[jira] [Commented] (LUCENE-5035) FieldCacheImpl.SortedDocValuesImpl should compress addresses to term bytes more efficiently
[ https://issues.apache.org/jira/browse/LUCENE-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676080#comment-13676080 ]

Michael McCandless commented on LUCENE-5035:
--------------------------------------------

+1, awesome!

> FieldCacheImpl.SortedDocValuesImpl should compress addresses to term bytes more efficiently
> --------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-5035
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5035
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Robert Muir
>         Attachments: LUCENE-5035.patch
>
> Each ordinal in SortedDocValuesImpl has a corresponding address to find its location in the big byte[] to support lookupOrd().
> Today this uses GrowableWriter with absolute addresses.
> But it would be much better to use MonotonicAppendingLongBuffer.
[jira] [Comment Edited] (SOLR-4891) JsonLoader should preserve field value types from the JSON content stream
[ https://issues.apache.org/jira/browse/SOLR-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676029#comment-13676029 ]

Steve Rowe edited comment on SOLR-4891 at 6/5/13 3:30 PM:
----------------------------------------------------------

Committed:
- trunk: [r1489914|http://svn.apache.org/viewvc?view=rev&rev=1489914]
- branch_4x: [r1489915|http://svn.apache.org/viewvc?view=rev&rev=1489915]

  was (Author: steve_rowe):
Committed:
- trunk: [r1489914|http://svn.apache.org/viewvc?view=rev?rev=1489914]
- branch_4x: [r1489915|http://svn.apache.org/viewvc?view=rev?rev=1489915]

> JsonLoader should preserve field value types from the JSON content stream
> --------------------------------------------------------------------------
>
>                 Key: SOLR-4891
>                 URL: https://issues.apache.org/jira/browse/SOLR-4891
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Minor
>             Fix For: 4.4
>         Attachments: SOLR-4891-BigInteger-bugfix.patch, SOLR-4891.patch
>
> JSON content streams carry some basic type information for their field values, as parsed by Noggit: LONG, NUMBER, BIGNUMBER, and BOOLEAN. {{JsonLoader}} should set field value object types in the {{SolrInputDocument}} according to the content stream's data types. Currently {{JsonLoader}} converts all non-{{String}}-typed field values to {{String}}-s.
> There is a comment in {{JsonLoader.parseSingleFieldValue()}}, where the convert-everything-to-string logic happens, that says "for legacy reasons, single values s are expected to be strings", but other content streams' type information is not flattened like this, e.g. {{JavabinLoader}}.
[jira] [Resolved] (SOLR-4891) JsonLoader should preserve field value types from the JSON content stream
[ https://issues.apache.org/jira/browse/SOLR-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Rowe resolved SOLR-4891.
------------------------------
    Resolution: Fixed

Committed:
- trunk: [r1489914|http://svn.apache.org/viewvc?view=rev?rev=1489914]
- branch_4x: [r1489915|http://svn.apache.org/viewvc?view=rev?rev=1489915]

> JsonLoader should preserve field value types from the JSON content stream
> --------------------------------------------------------------------------
>
>                 Key: SOLR-4891
>                 URL: https://issues.apache.org/jira/browse/SOLR-4891
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Minor
>             Fix For: 4.4
>         Attachments: SOLR-4891-BigInteger-bugfix.patch, SOLR-4891.patch
>
> JSON content streams carry some basic type information for their field values, as parsed by Noggit: LONG, NUMBER, BIGNUMBER, and BOOLEAN. {{JsonLoader}} should set field value object types in the {{SolrInputDocument}} according to the content stream's data types. Currently {{JsonLoader}} converts all non-{{String}}-typed field values to {{String}}-s.
> There is a comment in {{JsonLoader.parseSingleFieldValue()}}, where the convert-everything-to-string logic happens, that says "for legacy reasons, single values s are expected to be strings", but other content streams' type information is not flattened like this, e.g. {{JavabinLoader}}.
[jira] [Updated] (SOLR-4891) JsonLoader should preserve field value types from the JSON content stream
[ https://issues.apache.org/jira/browse/SOLR-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Rowe updated SOLR-4891:
-----------------------------
    Attachment: SOLR-4891-BigInteger-bugfix.patch

Patch - committing shortly.

> JsonLoader should preserve field value types from the JSON content stream
> --------------------------------------------------------------------------
>
>                 Key: SOLR-4891
>                 URL: https://issues.apache.org/jira/browse/SOLR-4891
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Minor
>             Fix For: 4.4
>         Attachments: SOLR-4891-BigInteger-bugfix.patch, SOLR-4891.patch
>
> JSON content streams carry some basic type information for their field values, as parsed by Noggit: LONG, NUMBER, BIGNUMBER, and BOOLEAN. {{JsonLoader}} should set field value object types in the {{SolrInputDocument}} according to the content stream's data types. Currently {{JsonLoader}} converts all non-{{String}}-typed field values to {{String}}-s.
> There is a comment in {{JsonLoader.parseSingleFieldValue()}}, where the convert-everything-to-string logic happens, that says "for legacy reasons, single values s are expected to be strings", but other content streams' type information is not flattened like this, e.g. {{JavabinLoader}}.
Re: SOLR entry point when deployed in web app container
On Wed, Jun 5, 2013 at 11:01 AM, Prathik Puthran wrote:
> I was trying to find the entry point in SOLR web app when it is deployed in
> an application container. Can someone please help me with this?

Check the SolrDispatchFilter class.

-Yonik
http://lucidworks.com
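For orientation: Solr ships as an ordinary servlet webapp, and SolrDispatchFilter is registered in its web.xml as a javax.servlet.Filter, so every HTTP request enters Solr through doFilter(). A stripped-down sketch of that shape, illustrative only and not the real class:

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;

    // Illustrative skeleton of the dispatch-filter pattern Solr uses.
    public class DispatchFilterSketch implements Filter {

      @Override
      public void init(FilterConfig config) throws ServletException {
        // Real Solr: locate solr home, load solr.xml, create the CoreContainer.
      }

      @Override
      public void doFilter(ServletRequest req, ServletResponse rsp, FilterChain chain)
          throws IOException, ServletException {
        String path = ((HttpServletRequest) req).getServletPath();
        // Real Solr: parse the path (core name, request handler), route the
        // request to the right SolrCore/handler, and write the response.
        // Anything it does not recognize is passed down the chain:
        chain.doFilter(req, rsp);
      }

      @Override
      public void destroy() {
        // Real Solr: shut down the CoreContainer and its cores.
      }
    }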
[jira] [Reopened] (SOLR-4891) JsonLoader should preserve field value types from the JSON content stream
[ https://issues.apache.org/jira/browse/SOLR-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Rowe reopened SOLR-4891:
------------------------------

At Hoss's suggestion on #solr IRC last night, I tested whether {{JsonLoader}} behavior has changed around {{BigInteger}} and {{BigDecimal}} values as a result of the changes committed under this issue.

I'm reopening to address an issue with adding JSON {{BIGNUMBER}}-s (returned by the Noggit parser when a number won't fit in either a long or a double) to trie integer or long fields: a {{NumberFormatException}} is no longer triggered, and the values are silently corrupted.

Before committing the patch on this issue, {{BigInteger}}-typed values were not created for {{BIGNUMBER}}-s in {{SolrInputDocument}}; instead, they (along with every other JSON value) were converted to {{String}}-s, and then adding such a value to an integer or long field would cause a {{NumberFormatException}} to be thrown from {{Integer.parseInt()}} or {{Long.parseLong()}}. This was proper and good.

But now, {{BigInteger}}-typed values are converted (in {{TrieField.createField()}}) to int/long using {{BigInteger}}'s {{intValue()}} and {{longValue()}} methods, which return only the low-order 32 and 64 bits, respectively. These values are always corrupted: the truncated high-order bits are guaranteed to be non-zero, since {{BigInteger}} typing only happens when values won't fit into 64 bits. Reverting back to {{String}}-typed {{BIGNUMBER}} values fixes the problem.

By contrast, {{BigDecimal}}'s {{doubleValue()}} and {{floatValue()}} methods truncate the low-order bits, resulting in loss of precision rather than corruption. This is the same behavior used by {{Double.parseDouble()}} and {{Float.parseFloat()}}. Reverting back to {{String}}-typing for decimal {{BIGNUMBER}}-s in addition to integral {{BIGNUMBER}}-s won't be a problem.

Patch forthcoming.

> JsonLoader should preserve field value types from the JSON content stream
> --------------------------------------------------------------------------
>
>                 Key: SOLR-4891
>                 URL: https://issues.apache.org/jira/browse/SOLR-4891
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Minor
>             Fix For: 4.4
>         Attachments: SOLR-4891.patch
>
> JSON content streams carry some basic type information for their field values, as parsed by Noggit: LONG, NUMBER, BIGNUMBER, and BOOLEAN. {{JsonLoader}} should set field value object types in the {{SolrInputDocument}} according to the content stream's data types. Currently {{JsonLoader}} converts all non-{{String}}-typed field values to {{String}}-s.
> There is a comment in {{JsonLoader.parseSingleFieldValue()}}, where the convert-everything-to-string logic happens, that says "for legacy reasons, single values s are expected to be strings", but other content streams' type information is not flattened like this, e.g. {{JavabinLoader}}.
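The truncation is easy to demonstrate with plain JDK calls; the value below is an arbitrary example that needs more than 64 bits:

{code}
import java.math.BigInteger;

public class BigIntegerTruncationDemo {
  public static void main(String[] args) {
    // 2^80 + 7: cannot fit in a long.
    BigInteger big = BigInteger.ONE.shiftLeft(80).add(BigInteger.valueOf(7));

    // longValue() silently keeps only the low-order 64 bits and prints 7,
    // the silent corruption described above.
    System.out.println(big.longValue());

    // Long.parseLong() on the string form fails loudly instead, matching
    // the pre-patch String-typed behavior.
    try {
      Long.parseLong(big.toString());
    } catch (NumberFormatException expected) {
      System.out.println("NumberFormatException, as the old String path gave");
    }
  }
}
{code}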
SOLR entry point when deployed in web app container
Hi, I was trying to find the entry point of the SOLR web app when it is deployed in an application container. Can someone please help me with this? Thanks, Prathik
[jira] [Created] (SOLR-4901) Newcomer Curb Appeal - improve the Out Of Box experience for new users
Shawn Heisey created SOLR-4901: -- Summary: Newcomer Curb Appeal - improve the Out Of Box experience for new users Key: SOLR-4901 URL: https://issues.apache.org/jira/browse/SOLR-4901 Project: Solr Issue Type: Improvement Components: documentation, web gui Reporter: Shawn Heisey This is a master issue to track improvements affecting a new user's experience with Solr. Please link other issues as blockers of this one. Solr is immensely complex. When I first started using it, the initial learning curve was incredibly steep. It's still uphill even now, but I mostly know where the handrails are. The general focus for linked issues: 1) Improving what the user sees when they first download Solr. I think issues for this item will mostly be about the included txt files and the wiki pages referenced there. We want to be sure that the user who downloads Solr and looks at README.txt is able to find information that will give them insight into how the Solr startup works and what information is configured where. Any wiki pages referenced need to be top quality, with introductions that really help a novice user and advanced reference material for users with more experience. The README should tell them how to get into the example's admin UI. Until we improve the UI, IMHO it should set expectations about what kind of features they'll get out of the UI and let them know that they'll probably be accessing API URLs directly and editing config files. Moving from the example in the download to a robust production installation, especially for SolrCloud, should be in our documentation. 2) Improving the UI so the novice doesn't have to edit so many config files or immediately learn how to use arcane HTTP API calls. Experienced users look at these things and have no problem with them, but they are voodoo to the new user. When using the UI to make changes (for example, CoreAdmin), the actual API URL that was called should be available, and if it fails, helpful text and a wiki link should be displayed so that the user can figure out what went wrong. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5035) FieldCacheImpl.SortedDocValuesImpl should compress addresses to term bytes more efficiently
[ https://issues.apache.org/jira/browse/LUCENE-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675979#comment-13675979 ] Adrien Grand commented on LUCENE-5035: -- +1, patch looks good! > FieldCacheImpl.SortedDocValuesImpl should compress addresses to term bytes > more efficiently > --- > > Key: LUCENE-5035 > URL: https://issues.apache.org/jira/browse/LUCENE-5035 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Reporter: Robert Muir > Attachments: LUCENE-5035.patch > > > Each ordinal in SortedDocValuesImpl has a corresponding address to find its > location in the big byte[] to support lookupOrd() > Today this uses GrowableWriter with absolute addresses. > But it would be much better to use MonotonicAppendingLongBuffer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4879) Indexing a field of type solr.SpatialRecursivePrefixTreeFieldType fails when at least two vertexes are more than 180 degrees apart
[ https://issues.apache.org/jira/browse/SOLR-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675961#comment-13675961 ] David Smiley commented on SOLR-4879: No problem. If not 4.4 then 4.5, I think. Who knows when 4.4 will be ready so it's hard to say. There is some WKT work going on in Spatial4j that I want to get done before cutting a new release there. > Indexing a field of type solr.SpatialRecursivePrefixTreeFieldType fails when > at least two vertexes are more than 180 degrees apart > -- > > Key: SOLR-4879 > URL: https://issues.apache.org/jira/browse/SOLR-4879 > Project: Solr > Issue Type: Bug > Environment: Linux, Solr 4.0.0, Solr 4.3.0 >Reporter: Øystein Torget >Assignee: David Smiley > > When trying to index a field of the type > solr.SpatialRecursivePrefixTreeFieldType the indexing will fail if two > vertexes are more than 180 longitudinal degrees apart. > For instance this polygon will fail: > POLYGON((-161 49, 0 49, 20 49, 20 89.1, 0 89.1, -161 89.2,-161 > 49)) > but this will not: > POLYGON((-160 49, 0 49, 20 49, 20 89.1, 0 89.1, -160 89.2,-160 > 49)) > This contradicts the documentation found here: > http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 > The documentation states that each vertex must be less than 180 longitudinal > degrees apart from the previous vertex. > Relevant parts from the schema.xml file: > <fieldType name="..." > class="solr.SpatialRecursivePrefixTreeFieldType" > > spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory" > distErrPct="0.025" > maxDistErr="0.09" > units="degrees" > /> > <field name="..." type="..." ... stored="true" /> -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5035) FieldCacheImpl.SortedDocValuesImpl should compress addresses to term bytes more efficiently
[ https://issues.apache.org/jira/browse/LUCENE-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5035: Attachment: LUCENE-5035.patch patch > FieldCacheImpl.SortedDocValuesImpl should compress addresses to term bytes > more efficiently > --- > > Key: LUCENE-5035 > URL: https://issues.apache.org/jira/browse/LUCENE-5035 > Project: Lucene - Core > Issue Type: Improvement > Components: core/search >Reporter: Robert Muir > Attachments: LUCENE-5035.patch > > > Each ordinal in SortedDocValuesImpl has a corresponding address to find its > location in the big byte[] to support lookupOrd() > Today this uses GrowableWriter with absolute addresses. > But it would be much better to use MonotonicAppendingLongBuffer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5035) FieldCacheImpl.SortedDocValuesImpl should compress addresses to term bytes more efficiently
Robert Muir created LUCENE-5035: --- Summary: FieldCacheImpl.SortedDocValuesImpl should compress addresses to term bytes more efficiently Key: LUCENE-5035 URL: https://issues.apache.org/jira/browse/LUCENE-5035 Project: Lucene - Core Issue Type: Improvement Components: core/search Reporter: Robert Muir Attachments: LUCENE-5035.patch Each ordinal in SortedDocValuesImpl has a corresponding address to find its location in the big byte[] to support lookupOrd() Today this uses GrowableWriter with absolute addresses. But it would be much better to use MonotonicAppendingLongBuffer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
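A minimal sketch of the monotonic-encoding idea behind this suggestion (illustrative only, not the actual MonotonicAppendingLongBuffer internals): since the addresses only ever increase, each one can be stored as a small deviation from a linear model instead of as an absolute value.
{code}
public class MonotonicSketch {
  public static void main(String[] args) {
    long[] addresses = {0, 7, 13, 22, 30}; // strictly increasing offsets into the big byte[]
    long min = addresses[0];
    float avg = (addresses[addresses.length - 1] - min) / (float) (addresses.length - 1);

    // Encode: store only the (small) deviation from the expected straight line.
    long[] deltas = new long[addresses.length];
    for (int i = 0; i < addresses.length; i++) {
      deltas[i] = addresses[i] - min - (long) (avg * i);
    }

    // Decode: reconstruct each absolute address from min, the slope, and the delta.
    for (int i = 0; i < addresses.length; i++) {
      long decoded = min + (long) (avg * i) + deltas[i];
      System.out.println(decoded + " == " + addresses[i]);
    }
  }
}
{code}
The deltas need far fewer bits per entry than the absolute addresses, which is where the savings over a GrowableWriter holding absolute values would come from.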
[jira] [Commented] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible
[ https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675907#comment-13675907 ] Grant Ingersoll commented on LUCENE-4055: - Hmm, Mike, CODEC_FILE_PATTERN is package access only. Easy enough to replicate/fix, any reason not to? > Refactor SegmentInfo / FieldInfo to make them extensible > > > Key: LUCENE-4055 > URL: https://issues.apache.org/jira/browse/LUCENE-4055 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs >Reporter: Andrzej Bialecki >Assignee: Robert Muir > Fix For: 4.0-ALPHA > > Attachments: LUCENE-4055.patch > > > After LUCENE-4050 is done the resulting SegmentInfo / FieldInfo classes > should be made abstract so that they can be extended by Codec-s. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
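If replicating the constant locally is the interim workaround, it might look like the sketch below; note the regex here is an assumption for illustration, not necessarily the actual value of the package-private constant:
{code}
import java.util.regex.Pattern;

public class CodecFilePatternWorkaround {
  // Assumed shape of Lucene's codec file names, e.g. "_0.fdt" or "_0_Lucene41_0.doc";
  // the real package-private CODEC_FILE_PATTERN may differ.
  static final Pattern CODEC_FILE_PATTERN = Pattern.compile("_[a-z0-9]+(_.*)?\\..*");

  public static void main(String[] args) {
    System.out.println(CODEC_FILE_PATTERN.matcher("_0.fdt").matches());       // true
    System.out.println(CODEC_FILE_PATTERN.matcher("segments.gen").matches()); // false
  }
}
{code}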
[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.6.0) - Build # 519 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/519/ Java: 64bit/jdk1.6.0 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.lucene.replicator.http.HttpReplicatorTest.testBasic Error Message: Connection to http://localhost:51371 refused Stack Trace: org.apache.http.conn.HttpHostConnectException: Connection to http://localhost:51371 refused at __randomizedtesting.SeedInfo.seed([B8423D144314D0:AB425F28CB9F92FE]:0) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.lucene.replicator.http.HttpClientBase.executeGET(HttpClientBase.java:178) at org.apache.lucene.replicator.http.HttpReplicator.checkForUpdate(HttpReplicator.java:51) at org.apache.lucene.replicator.ReplicationClient.doUpdate(ReplicationClient.java:196) at org.apache.lucene.replicator.ReplicationClient.updateNow(ReplicationClient.java:402) at org.apache.lucene.replicator.http.HttpReplicatorTest.testBasic(HttpReplicatorTest.java:112) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at
[jira] [Updated] (LUCENE-5034) Make AppendingLongBuffer's page size configurable
[ https://issues.apache.org/jira/browse/LUCENE-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5034: - Attachment: LUCENE-5034.patch > Make AppendingLongBuffer's page size configurable > - > > Key: LUCENE-5034 > URL: https://issues.apache.org/jira/browse/LUCENE-5034 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Assignee: Adrien Grand >Priority: Minor > Attachments: LUCENE-5034.patch > > > Depending on the data, it might be interesting to use smaller or larger page > sizes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5034) Make AppendingLongBuffer's page size configurable
Adrien Grand created LUCENE-5034: Summary: Make AppendingLongBuffer's page size configurable Key: LUCENE-5034 URL: https://issues.apache.org/jira/browse/LUCENE-5034 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Depending on the data, it might be interesting to use smaller or larger page sizes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
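The trade-off is easy to see with some back-of-the-envelope arithmetic (the numbers below are arbitrary examples, not benchmarked defaults):
{code}
public class PageSizeTradeoff {
  public static void main(String[] args) {
    long numValues = 1000000000L;
    for (int pageSize : new int[]{256, 1024, 65536}) {
      long pages = (numValues + pageSize - 1) / pageSize; // pages needed
      long worstWasteBytes = (pageSize - 1) * 8L;         // unused tail of the last page
      System.out.printf("pageSize=%d -> %d pages, worst-case waste %d bytes%n",
          pageSize, pages, worstWasteBytes);
    }
  }
}
{code}
Small pages waste less on the partially filled last page (which matters when there are many tiny buffers), while large pages mean fewer page objects and less per-page bookkeeping for huge buffers.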
[jira] [Resolved] (LUCENE-5026) PagedGrowableWriter
[ https://issues.apache.org/jira/browse/LUCENE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-5026. -- Resolution: Fixed > PagedGrowableWriter > --- > > Key: LUCENE-5026 > URL: https://issues.apache.org/jira/browse/LUCENE-5026 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Assignee: Adrien Grand > Fix For: 5.0, 4.4 > > Attachments: LUCENE-5026.patch, LUCENE-5026.patch > > > We already have packed data structures that support more than 2B values such > as AppendingLongBuffer and MonotonicAppendingLongBuffer but none of them > supports random write-access. > We could write a PagedGrowableWriter for this, which would essentially wrap > an array of GrowableWriters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
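A minimal sketch of the paging scheme described above (assumed structure for illustration; the committed class wraps GrowableWriter pages rather than raw long[] arrays):
{code}
public class PagedWriterSketch {
  static final int PAGE_SHIFT = 20;                   // 2^20 entries per page (arbitrary)
  static final int PAGE_MASK = (1 << PAGE_SHIFT) - 1;
  private final long[][] pages;                       // stand-in for GrowableWriter pages

  PagedWriterSketch(long size) {
    int numPages = (int) ((size + PAGE_MASK) >>> PAGE_SHIFT);
    pages = new long[numPages][1 << PAGE_SHIFT];
  }

  // long indexes allow more than 2^31 values; the page/offset split gives
  // random write access, unlike the appending buffers.
  void set(long index, long value) {
    pages[(int) (index >>> PAGE_SHIFT)][(int) (index & PAGE_MASK)] = value;
  }

  long get(long index) {
    return pages[(int) (index >>> PAGE_SHIFT)][(int) (index & PAGE_MASK)];
  }

  public static void main(String[] args) {
    PagedWriterSketch w = new PagedWriterSketch(1L << 21);
    w.set((1L << 21) - 1, 42);
    System.out.println(w.get((1L << 21) - 1)); // 42
  }
}
{code}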
[JENKINS] Lucene-Solr-4.x-Linux (64bit/ibm-j9-jdk7) - Build # 5928 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/5928/ Java: 64bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} 1 tests failed. REGRESSION: org.apache.solr.core.TestJmxIntegration.testJmxRegistration Error Message: No SolrDynamicMBeans found Stack Trace: java.lang.AssertionError: No SolrDynamicMBeans found at __randomizedtesting.SeedInfo.seed([B49FBFA20F813AEB:3A4EDB9862C0628E]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.core.TestJmxIntegration.testJmxRegistration(TestJmxIntegration.java:94) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:88) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55) at java.lang.reflect.Method.invoke(Method.java:613) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:780) Build Log: [...truncated 9478 lines...] [junit4:junit4] Suite: org.apache.solr.core.TestJmxInte
[jira] [Commented] (SOLR-4805) Calling Collection RELOAD where collection has a single core, leaves collection offline and unusable till reboot
[ https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675724#comment-13675724 ] Alexey Kudinov commented on SOLR-4805: -- It seems that the issue happens because ZkController.preRegister sets the state to 'Down', while in ZkController.register the piece of code setting the state to Active is skipped for a reloaded core. Only recovery should be skipped, not setting the state to Active. > Calling Collection RELOAD where collection has a single core, leaves > collection offline and unusable till reboot > > > Key: SOLR-4805 > URL: https://issues.apache.org/jira/browse/SOLR-4805 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.3 >Reporter: Jared Rodriguez >Assignee: Mark Miller > Fix For: 5.0, 4.4 > > > If you have a collection that is composed of a single core, then calling > reload on that collection leaves the core offline. This happens even if > nothing at all has changed about the collection or its config. This happens > whether you call reload via an http GET or if you directly call reload via > the collections api. > Try a collection with a single core that contains data, change nothing > about the config in ZK, and call reload on the collection. The call > completes, but ZK flags that replica with "state":"down" > Try it where the single core contains no data and the same thing happens, > ZK config updates and broadcasts "state":"down" for the replica. > I did not try this in a multicore or replicated core environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
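If that analysis is right, the fix would decouple the two steps: skip only recovery on reload while still publishing the active state. A hypothetical sketch of that control flow (the names below are illustrative stand-ins, not the actual ZkController methods):
{code}
public class ReloadRegistrationSketch {
  // Illustrative stand-ins for ZkController internals.
  void startRecovery(String core) { System.out.println("recovering " + core); }
  void publishState(String core, String state) { System.out.println(core + " -> " + state); }

  void register(String core, boolean afterCoreReload) {
    if (!afterCoreReload) {
      startRecovery(core);        // fresh registration: recover before serving
    }
    publishState(core, "active"); // a reload must still flip the state back from "down"
  }

  public static void main(String[] args) {
    new ReloadRegistrationSketch().register("collection1_shard1_replica1", true);
  }
}
{code}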
[jira] [Comment Edited] (SOLR-4381) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675702#comment-13675702 ] Hemant Verma edited comment on SOLR-4381 at 6/5/13 8:43 AM: While using this patch I found one scenario in which it does not work properly. My synonyms list contains the entries below:
pepsi,pepsico,pbg
outsourcing,rpo,offshoring
The difference in synonym expansion shows up when any of these words is prefixed with a stopword.
Search Keyword -> Expanded Result
pepsi -> pepsi, pepsico, pbg
pbg -> pepsi, pepsico, pbg
the pepsi -> pepsi, pepsico
the pbg -> pepsi, pbg
outsourcing -> outsourc, offshor, rpo
the outsourcing -> outsourc, offshor
The expansions above show that when a keyword from the synonym list is prefixed with a stopword, the expansion misses some synonyms.
was (Author: hemantverma09): While using this patch I found one scenario in which it does not work properly. My synonyms list contains the entries below:
pepsi,pepsico,pbg
outsourcing,rpo,offshoring
The difference in synonym expansion shows up when any of these words is prefixed with a stopword.
Search Keyword -> Expanded Result
pepsi -> pepsi, pepsico, pbg
pbg -> pepsi, pepsico, pbg
the pepsi -> pepsi, pepsico
the pbg -> pepsi, pbg
outsourcing -> outsourc, offshor, rpo
the outsourcing -> outsourc, offshor
The expansions above show that when a keyword from the synonym list is prefixed with a stopword, the expansion misses some synonyms.
> Query-time multi-word synonym expansion > --- > > Key: SOLR-4381 > URL: https://issues.apache.org/jira/browse/SOLR-4381 > Project: Solr > Issue Type: Improvement > Components: query parsers >Reporter: Nolan Lawson >Priority: Minor > Labels: multi-word, queryparser, synonyms > Fix For: 4.4 > > Attachments: SOLR-4381-2.patch, SOLR-4381.patch > > > This is an issue that seems to come up perennially. > The [Solr > docs|http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory] > caution that index-time synonym expansion should be preferred to query-time > synonym expansion, due to the way multi-word synonyms are treated and how IDF > values can be boosted artificially. But query-time expansion should have huge > benefits, given that changes to the synonyms don't require re-indexing, the > index size stays the same, and the IDF values for the documents don't get > permanently altered. > The proposed solution is to move the synonym expansion logic from the > analysis chain (either query- or index-type) and into a new QueryParser. See > the attached patch for an implementation. > The core Lucene functionality is untouched. Instead, the EDismaxQParser is > extended, and synonym expansion is done on-the-fly. Queries are parsed into > a lattice (i.e. all possible synonym combinations), while individual > components of the query are still handled by the EDismaxQParser itself. > It's not an ideal solution by any stretch. But it's nice and self-contained, > so it invites experimentation and improvement. And I think it fits in well > with the merry band of misfit query parsers, like {{func}} and {{frange}}. > More details about this solution can be found in [this blog > post|http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/] and > [the Github page for the > code|https://github.com/healthonnet/hon-lucene-synonyms]. 
> At the risk of tooting my own horn, I also think this patch sufficiently > fixes SOLR-3390 (highlighting problems with multi-word synonyms) and > LUCENE-4499 (better support for multi-word synonyms). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-4381) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675702#comment-13675702 ] Hemant Verma edited comment on SOLR-4381 at 6/5/13 8:42 AM: While using this patch I found one scenario in which it does not work properly. My synonyms list contains the entries below:
pepsi,pepsico,pbg
outsourcing,rpo,offshoring
The difference in synonym expansion shows up when any of these words is prefixed with a stopword.
Search Keyword -> Expanded Result
pepsi -> pepsi, pepsico, pbg
pbg -> pepsi, pepsico, pbg
the pepsi -> pepsi, pepsico
the pbg -> pepsi, pbg
outsourcing -> outsourc, offshor, rpo
the outsourcing -> outsourc, offshor
The expansions above show that when a keyword from the synonym list is prefixed with a stopword, the expansion misses some synonyms.
was (Author: hemantverma09): While using this patch I found one scenario in which it does not work properly. My synonyms list contains the entries below:
pepsi,pepsico,pbg
outsourcing,rpo,offshoring
The difference in synonym expansion shows up when any of these words is prefixed with a stopword.
Search Keyword    Expanded Result
--------------    ---------------
pepsi             pepsi, pepsico, pbg
pbg               pepsi, pepsico, pbg
the pepsi         pepsi, pepsico
the pbg           pepsi, pbg
outsourcing       outsourc, offshor, rpo
the outsourcing   outsourc, offshor
The expansions above show that when a keyword from the synonym list is prefixed with a stopword, the expansion misses some synonyms.
> Query-time multi-word synonym expansion > --- > > Key: SOLR-4381 > URL: https://issues.apache.org/jira/browse/SOLR-4381 > Project: Solr > Issue Type: Improvement > Components: query parsers >Reporter: Nolan Lawson >Priority: Minor > Labels: multi-word, queryparser, synonyms > Fix For: 4.4 > > Attachments: SOLR-4381-2.patch, SOLR-4381.patch > > > This is an issue that seems to come up perennially. > The [Solr > docs|http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory] > caution that index-time synonym expansion should be preferred to query-time > synonym expansion, due to the way multi-word synonyms are treated and how IDF > values can be boosted artificially. But query-time expansion should have huge > benefits, given that changes to the synonyms don't require re-indexing, the > index size stays the same, and the IDF values for the documents don't get > permanently altered. > The proposed solution is to move the synonym expansion logic from the > analysis chain (either query- or index-type) and into a new QueryParser. See > the attached patch for an implementation. > The core Lucene functionality is untouched. Instead, the EDismaxQParser is > extended, and synonym expansion is done on-the-fly. Queries are parsed into > a lattice (i.e. all possible synonym combinations), while individual > components of the query are still handled by the EDismaxQParser itself. > It's not an ideal solution by any stretch. But it's nice and self-contained, > so it invites experimentation and improvement. And I think it fits in well > with the merry band of misfit query parsers, like {{func}} and {{frange}}. > More details about this solution can be found in [this blog > post|http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/] and > [the Github page for the > code|https://github.com/healthonnet/hon-lucene-synonyms]. 
> At the risk of tooting my own horn, I also think this patch sufficiently > fixes SOLR-3390 (highlighting problems with multi-word synonyms) and > LUCENE-4499 (better support for multi-word synonyms). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4879) Indexing a field of type solr.SpatialRecursivePrefixTreeFieldType fails when at least two vertexes are more than 180 degrees apart
[ https://issues.apache.org/jira/browse/SOLR-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675708#comment-13675708 ] Øystein Torget commented on SOLR-4879: -- I see that you fixed the bug in Spatial4j already, so I tried adding the latest snapshot of Spatial4j to Solr and that fixed the problem. Thanks for your help! Do you know when we can expect a new release of Solr with the next version of Spatial4j? > Indexing a field of type solr.SpatialRecursivePrefixTreeFieldType fails when > at least two vertexes are more than 180 degrees apart > -- > > Key: SOLR-4879 > URL: https://issues.apache.org/jira/browse/SOLR-4879 > Project: Solr > Issue Type: Bug > Environment: Linux, Solr 4.0.0, Solr 4.3.0 >Reporter: Øystein Torget >Assignee: David Smiley > > When trying to index a field of the type > solr.SpatialRecursivePrefixTreeFieldType the indexing will fail if two > vertexes are more than 180 longitudinal degrees apart. > For instance this polygon will fail: > POLYGON((-161 49, 0 49, 20 49, 20 89.1, 0 89.1, -161 89.2,-161 > 49)) > but this will not: > POLYGON((-160 49, 0 49, 20 49, 20 89.1, 0 89.1, -160 89.2,-160 > 49)) > This contradicts the documentation found here: > http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 > The documentation states that each vertex must be less than 180 longitudinal > degrees apart from the previous vertex. > Relevant parts from the schema.xml file: > <fieldType name="..." > class="solr.SpatialRecursivePrefixTreeFieldType" > > spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory" > distErrPct="0.025" > maxDistErr="0.09" > units="degrees" > /> > <field name="..." type="..." ... stored="true" /> -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4381) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675702#comment-13675702 ] Hemant Verma commented on SOLR-4381: While using this patch I found one scenario in which it does not work properly. My synonyms list contains the entries below:
pepsi,pepsico,pbg
outsourcing,rpo,offshoring
The difference in synonym expansion shows up when any of these words is prefixed with a stopword.
Search Keyword    Expanded Result
--------------    ---------------
pepsi             pepsi, pepsico, pbg
pbg               pepsi, pepsico, pbg
the pepsi         pepsi, pepsico
the pbg           pepsi, pbg
outsourcing       outsourc, offshor, rpo
the outsourcing   outsourc, offshor
The expansions above show that when a keyword from the synonym list is prefixed with a stopword, the expansion misses some synonyms.
> Query-time multi-word synonym expansion > --- > > Key: SOLR-4381 > URL: https://issues.apache.org/jira/browse/SOLR-4381 > Project: Solr > Issue Type: Improvement > Components: query parsers >Reporter: Nolan Lawson >Priority: Minor > Labels: multi-word, queryparser, synonyms > Fix For: 4.4 > > Attachments: SOLR-4381-2.patch, SOLR-4381.patch > > > This is an issue that seems to come up perennially. > The [Solr > docs|http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory] > caution that index-time synonym expansion should be preferred to query-time > synonym expansion, due to the way multi-word synonyms are treated and how IDF > values can be boosted artificially. But query-time expansion should have huge > benefits, given that changes to the synonyms don't require re-indexing, the > index size stays the same, and the IDF values for the documents don't get > permanently altered. > The proposed solution is to move the synonym expansion logic from the > analysis chain (either query- or index-type) and into a new QueryParser. See > the attached patch for an implementation. > The core Lucene functionality is untouched. Instead, the EDismaxQParser is > extended, and synonym expansion is done on-the-fly. Queries are parsed into > a lattice (i.e. all possible synonym combinations), while individual > components of the query are still handled by the EDismaxQParser itself. > It's not an ideal solution by any stretch. But it's nice and self-contained, > so it invites experimentation and improvement. And I think it fits in well > with the merry band of misfit query parsers, like {{func}} and {{frange}}. > More details about this solution can be found in [this blog > post|http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/] and > [the Github page for the > code|https://github.com/healthonnet/hon-lucene-synonyms]. > At the risk of tooting my own horn, I also think this patch sufficiently > fixes SOLR-3390 (highlighting problems with multi-word synonyms) and > LUCENE-4499 (better support for multi-word synonyms). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4989) Hanging on DocumentsWriterStallControl.waitIfStalled forever
[ https://issues.apache.org/jira/browse/LUCENE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-4989. - Resolution: Fixed fixed via LUCENE-5002 > Hanging on DocumentsWriterStallControl.waitIfStalled forever > > > Key: LUCENE-4989 > URL: https://issues.apache.org/jira/browse/LUCENE-4989 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: 4.1 > Environment: Linux 2.6.32 >Reporter: Jessica Cheng >Assignee: Simon Willnauer > Labels: hang > Fix For: 5.0, 4.3.1 > > > In an environment where our underlying storage was timing out on various > operations, we find all of our indexing threads eventually stuck in the > following state (so far for 4 days): > "Thread-0" daemon prio=5 Thread id=556 WAITING > at java.lang.Object.wait(Native Method) > at java.lang.Object.wait(Object.java:503) > at > org.apache.lucene.index.DocumentsWriterStallControl.waitIfStalled(DocumentsWriterStallControl.java:74) > at > org.apache.lucene.index.DocumentsWriterFlushControl.waitIfStalled(DocumentsWriterFlushControl.java:676) > at > org.apache.lucene.index.DocumentsWriter.preUpdate(DocumentsWriter.java:301) > at > org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:361) > at > org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1484) > at ... > I have not yet enabled detail logging and tried to reproduce yet, but looking > at the code, I see that DWFC.abortPendingFlushes does > try { > dwpt.abort(); > doAfterFlush(dwpt); > } catch (Throwable ex) { > // ignore - keep on aborting the flush queue > } > (and the same for the blocked ones). Since the throwable is ignored, I can't > say for sure, but I've seen DWPT.abort thrown in other cases, so if it does > throw, we'd fail to call doAfterFlush and properly decrement flushBytes. This > can be a problem, right? Is it possible to do this instead: > try { > dwpt.abort(); > } catch (Throwable ex) { > // ignore - keep on aborting the flush queue > } finally { > try { > doAfterFlush(dwpt); > } catch (Throwable ex2) { > // ignore - keep on aborting the flush queue > } > } > It's ugly but safer. Otherwise, maybe at least add logging for the throwable > just to make sure this is/isn't happening. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
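For reference, the safer pattern proposed in the description compiles cleanly when fleshed out; here is a self-contained rendering (DWPT and doAfterFlush are stubbed, only the exception-handling shape matters):
{code}
public class SafeAbortSketch {
  interface DWPT { void abort(); }

  static void abortPendingFlush(DWPT dwpt) {
    try {
      dwpt.abort();
    } catch (Throwable ex) {
      // ignore - keep on aborting the flush queue
    } finally {
      try {
        doAfterFlush(dwpt); // runs even if abort() threw, so flushBytes is always decremented
      } catch (Throwable ex2) {
        // ignore - keep on aborting the flush queue
      }
    }
  }

  static void doAfterFlush(DWPT dwpt) { /* decrement flushBytes etc. */ }

  public static void main(String[] args) {
    abortPendingFlush(new DWPT() {
      public void abort() { throw new RuntimeException("simulated abort failure"); }
    });
    System.out.println("queue abort continued despite the failure");
  }
}
{code}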
[jira] [Commented] (LUCENE-4989) Hanging on DocumentsWriterStallControl.waitIfStalled forever
[ https://issues.apache.org/jira/browse/LUCENE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675669#comment-13675669 ] Simon Willnauer commented on LUCENE-4989: - jessica, I agree this is not related to LUCENE-5002. I will go ahead and close it! thanks for reporting this! > Hanging on DocumentsWriterStallControl.waitIfStalled forever > > > Key: LUCENE-4989 > URL: https://issues.apache.org/jira/browse/LUCENE-4989 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: 4.1 > Environment: Linux 2.6.32 >Reporter: Jessica Cheng > Labels: hang > Fix For: 5.0, 4.3.1 > > > In an environment where our underlying storage was timing out on various > operations, we find all of our indexing threads eventually stuck in the > following state (so far for 4 days): > "Thread-0" daemon prio=5 Thread id=556 WAITING > at java.lang.Object.wait(Native Method) > at java.lang.Object.wait(Object.java:503) > at > org.apache.lucene.index.DocumentsWriterStallControl.waitIfStalled(DocumentsWriterStallControl.java:74) > at > org.apache.lucene.index.DocumentsWriterFlushControl.waitIfStalled(DocumentsWriterFlushControl.java:676) > at > org.apache.lucene.index.DocumentsWriter.preUpdate(DocumentsWriter.java:301) > at > org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:361) > at > org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1484) > at ... > I have not yet enabled detail logging and tried to reproduce yet, but looking > at the code, I see that DWFC.abortPendingFlushes does > try { > dwpt.abort(); > doAfterFlush(dwpt); > } catch (Throwable ex) { > // ignore - keep on aborting the flush queue > } > (and the same for the blocked ones). Since the throwable is ignored, I can't > say for sure, but I've seen DWPT.abort thrown in other cases, so if it does > throw, we'd fail to call doAfterFlush and properly decrement flushBytes. This > can be a problem, right? Is it possible to do this instead: > try { > dwpt.abort(); > } catch (Throwable ex) { > // ignore - keep on aborting the flush queue > } finally { > try { > doAfterFlush(dwpt); > } catch (Throwable ex2) { > // ignore - keep on aborting the flush queue > } > } > It's ugly but safer. Otherwise, maybe at least add logging for the throwable > just to make sure this is/isn't happening. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-4989) Hanging on DocumentsWriterStallControl.waitIfStalled forever
[ https://issues.apache.org/jira/browse/LUCENE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer reassigned LUCENE-4989: --- Assignee: Simon Willnauer > Hanging on DocumentsWriterStallControl.waitIfStalled forever > > > Key: LUCENE-4989 > URL: https://issues.apache.org/jira/browse/LUCENE-4989 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: 4.1 > Environment: Linux 2.6.32 >Reporter: Jessica Cheng >Assignee: Simon Willnauer > Labels: hang > Fix For: 5.0, 4.3.1 > > > In an environment where our underlying storage was timing out on various > operations, we find all of our indexing threads eventually stuck in the > following state (so far for 4 days): > "Thread-0" daemon prio=5 Thread id=556 WAITING > at java.lang.Object.wait(Native Method) > at java.lang.Object.wait(Object.java:503) > at > org.apache.lucene.index.DocumentsWriterStallControl.waitIfStalled(DocumentsWriterStallControl.java:74) > at > org.apache.lucene.index.DocumentsWriterFlushControl.waitIfStalled(DocumentsWriterFlushControl.java:676) > at > org.apache.lucene.index.DocumentsWriter.preUpdate(DocumentsWriter.java:301) > at > org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:361) > at > org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1484) > at ... > I have not yet enabled detail logging and tried to reproduce yet, but looking > at the code, I see that DWFC.abortPendingFlushes does > try { > dwpt.abort(); > doAfterFlush(dwpt); > } catch (Throwable ex) { > // ignore - keep on aborting the flush queue > } > (and the same for the blocked ones). Since the throwable is ignored, I can't > say for sure, but I've seen DWPT.abort thrown in other cases, so if it does > throw, we'd fail to call doAfterFlush and properly decrement flushBytes. This > can be a problem, right? Is it possible to do this instead: > try { > dwpt.abort(); > } catch (Throwable ex) { > // ignore - keep on aborting the flush queue > } finally { > try { > doAfterFlush(dwpt); > } catch (Throwable ex2) { > // ignore - keep on aborting the flush queue > } > } > It's ugly but safer. Otherwise, maybe at least add logging for the throwable > just to make sure this is/isn't happening. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org