[jira] [Created] (SOLR-5306) can not create collection when have over one config
Liang Tianyu created SOLR-5306:

Summary: can not create collection when have over one config
Key: SOLR-5306
URL: https://issues.apache.org/jira/browse/SOLR-5306
Project: Solr
Issue Type: Bug
Components: clients - java
Affects Versions: 4.5
Environment: win7 jdk 7
Reporter: Liang Tianyu
Priority: Critical

I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection: http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent.show errors: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully.

--
This message was sent by Atlassian JIRA (v6.1#6144)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5023) deleteInstanceDir is added to CoreAdminHandler but can't be passed with solrj
[ https://issues.apache.org/jira/browse/SOLR-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787541#comment-13787541 ]

Lyubov Romanchuk commented on SOLR-5023:

Hi Shalin,
I saw that you had added a test for the patch. Thank you very much. As I understand it, the last released version, 4.5, doesn't include the fix. What should be done in order to commit the patch?
Thank you.
Best regards.

deleteInstanceDir is added to CoreAdminHandler but can't be passed with solrj

Key: SOLR-5023
URL: https://issues.apache.org/jira/browse/SOLR-5023
Project: Solr
Issue Type: Improvement
Components: multicore
Affects Versions: 4.2.1
Reporter: Lyubov Romanchuk
Assignee: Shalin Shekhar Mangar
Fix For: 4.6
Attachments: SOLR-5023.patch, SOLR-5023.patch

deleteInstanceDir is added to CoreAdminHandler but is not supported in Unload CoreAdminRequest
[jira] [Commented] (LUCENE-5250) Provide API to open IR on a specific IndexCommit with ReaderManager
[ https://issues.apache.org/jira/browse/LUCENE-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787543#comment-13787543 ]

Akos Kitta commented on LUCENE-5250:

Agree. This is what I did, but your solution is more elegant and flexible. Thank you for your suggestion.

Provide API to open IR on a specific IndexCommit with ReaderManager

Key: LUCENE-5250
URL: https://issues.apache.org/jira/browse/LUCENE-5250
Project: Lucene - Core
Issue Type: Wish
Components: core/index
Affects Versions: 4.4
Reporter: Akos Kitta
Priority: Trivial

Currently it is not possible to create a ReaderManager instance on a given IndexCommit. Since ReaderManager is a final class, one has to extend ReferenceManager instead when an IR has to be opened on a specified IndexCommit.
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0-ea-b106) - Build # 7764 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7764/
Java: 64bit/jdk1.8.0-ea-b106 -XX:+UseCompressedOops -XX:+UseParallelGC

1 tests failed.
REGRESSION: org.apache.solr.core.TestNonNRTOpen.testReaderIsNotNRT

Error Message:
expected:<3> but was:<2>

Stack Trace:
java.lang.AssertionError: expected:<3> but was:<2>
    at __randomizedtesting.SeedInfo.seed([6C685AEEF429FE2C:D9EE3B694BE84CD8]:0)
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.failNotEquals(Assert.java:647)
    at org.junit.Assert.assertEquals(Assert.java:128)
    at org.junit.Assert.assertEquals(Assert.java:472)
    at org.junit.Assert.assertEquals(Assert.java:456)
    at org.apache.solr.core.TestNonNRTOpen.assertNotNRT(TestNonNRTOpen.java:133)
    at org.apache.solr.core.TestNonNRTOpen.testReaderIsNotNRT(TestNonNRTOpen.java:94)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:491)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at
[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787576#comment-13787576 ]

Shai Erera commented on LUCENE-5189:

Committed a fix to SegmentReader (r1529611) that cleans up unused memory (un-referenced DVPs) and no longer uses an anonymous RefCount class. The anonymous classes caused a memory leak: each one held a reference to its SR, so we always referenced unused DVPs.

Numeric DocValues Updates

Key: LUCENE-5189
URL: https://issues.apache.org/jira/browse/LUCENE-5189
Project: Lucene - Core
Issue Type: New Feature
Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
Attachments: LUCENE-5189-4x.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189_process_events.patch, LUCENE-5189_process_events.patch, LUCENE-5189-updates-order.patch, LUCENE-5189-updates-order.patch

In LUCENE-4258 we started to work on incremental field updates, however the amount of changes is immense and hard to follow/consume. The reason is that we targeted postings, stored fields, DV etc., all from the get go.

I'd like to start afresh here, with numeric-dv-field updates only. There are a couple of reasons for that:
* NumericDV fields should be easier to update, if e.g. we write all the values of all the documents in a segment for the updated field (similar to how livedocs work, and previously norms).
* It's a fairly contained issue, attempting to handle just one data type to update, yet requires many changes to core code which will also be useful for updating other data types.
* It has value in and of itself, and we don't need to allow updating all the data types in Lucene at once ... we can do that gradually.

I have some working patch already which I'll upload next, explaining the changes.
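The leak described in the LUCENE-5189 comment rests on a general Java behaviour: an anonymous class that touches state of its enclosing instance keeps a hidden reference to that instance. The following is a minimal sketch of that behaviour only, not the SegmentReader code; `AnonymousCaptureSketch`, `Owner`, `Releasable`, `Standalone` and `holdsOwner` are hypothetical names introduced for illustration.

```java
import java.lang.reflect.Field;

public class AnonymousCaptureSketch {
    /** Hypothetical stand-in for a ref-counted resource. */
    public interface Releasable {
        void release();
    }

    /** Hypothetical stand-in for the object that must not be kept alive. */
    public static class Owner {
        int released = 0;

        // Anonymous class: it updates an Owner field, so the compiler gives it
        // a hidden field pointing back at the enclosing Owner instance.
        public Releasable anonymous() {
            return new Releasable() {
                @Override
                public void release() {
                    released++;
                }
            };
        }

        // Static nested class: carries no implicit reference to any Owner.
        public Releasable standalone() {
            return new Standalone();
        }
    }

    public static class Standalone implements Releasable {
        int released = 0;

        @Override
        public void release() {
            released++;
        }
    }

    /** True if the object's class declares a field whose type is Owner. */
    public static boolean holdsOwner(Object o) {
        for (Field f : o.getClass().getDeclaredFields()) {
            if (f.getType() == Owner.class) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Owner owner = new Owner();
        System.out.println("anonymous holds owner:  " + holdsOwner(owner.anonymous()));
        System.out.println("standalone holds owner: " + holdsOwner(owner.standalone()));
    }
}
```

As long as the anonymous instance is reachable, so is its `Owner`, together with everything the `Owner` references. Moving the class to a static nested (or top-level) class removes that implicit link.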
[jira] [Updated] (LUCENE-5248) Improve the data structure used in ReaderAndLiveDocs to hold the updates
[ https://issues.apache.org/jira/browse/LUCENE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-5248:

Attachment: LUCENE-5248.patch

Patch adds following changes:
* NumericFieldUpdates abstraction for holding the field updates. For now, I've kept the Map representation because I wanted to get the rest of the changes in place.
* ReaderAndLiveDocs no longer buffers any updates, unless it is merging
* Separated writeFieldUpdates from writeLiveDocs:
** BufferedDeleteStream builds a Map<String,NumericFieldUpdates> from all NumericUpdates and calls RLD.writeFieldUpdates if any
** RLD.writeFieldUpdates writes the new gen'd DV files and buffers the updates if it is merging.
* IW.commitMergedDeletes applies the merging updates to the merged segment by traversing over updates to all fields in parallel and builds a new Map<String,NumericFieldUpdates> (because it re-maps documents), which it then hands over to the new segment's RLD.writeFieldUpdates().

This change basically reverted many of the previous changes to RLD and IW, since not buffering updates in RLD simplifies it a lot. Next I will implement a more RAM-efficient NumericFieldUpdates.

Improve the data structure used in ReaderAndLiveDocs to hold the updates

Key: LUCENE-5248
URL: https://issues.apache.org/jira/browse/LUCENE-5248
Project: Lucene - Core
Issue Type: Improvement
Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
Attachments: LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch

Currently ReaderAndLiveDocs holds the updates in two structures:

+Map<String,Map<Integer,Long>>+
Holds a mapping from each field, to all docs that were updated and their values. This structure is updated when applyDeletes is called, and needs to satisfy several requirements:
# Un-ordered writes: if a field f is updated by two terms, termA and termB, in that order, and termA affects doc=100 and termB doc=2, then the updates are applied in that order, meaning we cannot rely on updates coming in order.
# Same document may be updated multiple times, either by same term (e.g. several calls to IW.updateNDV) or by different terms. Last update wins.
# Sequential read: when writing the updates to the Directory (fieldsConsumer), we iterate on the docs in-order and for each one check if it's updated and if not, pull its value from the current DV.
# A single update may affect several million documents, therefore need to be efficient w.r.t. memory consumption.

+Map<Integer,Map<String,Long>>+
Holds a mapping from a document, to all the fields that it was updated in and the updated value for each field. This is used by IW.commitMergedDeletes to apply the updates that came in while the segment was merging. The requirements this structure needs to satisfy are:
# Access in doc order: this is how commitMergedDeletes works.
# One-pass: we visit a document once (currently) and so if we can, it's better if we know all the fields in which it was updated. The updates are applied to the merged ReaderAndLiveDocs (where they are stored in the first structure mentioned above).

Comments with proposals will follow next.
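To make the requirements on the first structure (field to doc to value) concrete, here is a minimal sketch. It assumes a plain `TreeMap` per field; `FieldUpdatesSketch` and its methods are hypothetical names and are not the NumericFieldUpdates class from the patch. It covers un-ordered writes, last-update-wins and in-order reads, but the boxed keys and values make no attempt at the memory-efficiency requirement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FieldUpdatesSketch {
    // field name -> (docID -> latest value); TreeMap keeps docIDs sorted,
    // so writes may arrive in any order and reads are still sequential.
    private final Map<String, TreeMap<Integer, Long>> updates = new HashMap<>();

    /** Records an update; a later call for the same field and doc overwrites the earlier one. */
    public void update(String field, int doc, long value) {
        updates.computeIfAbsent(field, f -> new TreeMap<>()).put(doc, value);
    }

    /** Value to write for a doc: the updated value if there is one, otherwise the current one. */
    public long resolve(String field, int doc, long currentValue) {
        TreeMap<Integer, Long> perDoc = updates.get(field);
        Long updated = (perDoc == null) ? null : perDoc.get(doc);
        return (updated != null) ? updated : currentValue;
    }

    /** Updated docIDs for a field, in ascending order. */
    public List<Integer> updatedDocs(String field) {
        TreeMap<Integer, Long> perDoc = updates.get(field);
        return (perDoc == null) ? new ArrayList<>() : new ArrayList<>(perDoc.keySet());
    }

    public static void main(String[] args) {
        FieldUpdatesSketch sketch = new FieldUpdatesSketch();
        sketch.update("f", 100, 1L); // termA hits doc=100 first
        sketch.update("f", 2, 2L);   // termB hits doc=2 afterwards
        sketch.update("f", 100, 3L); // doc=100 is updated again; this value wins
        System.out.println(sketch.updatedDocs("f"));
        System.out.println(sketch.resolve("f", 100, 0L));
    }
}
```

A sorted map is the simplest way to reconcile out-of-order writes with in-order reads; a RAM-efficient variant would more likely use packed primitive arrays, which is what the issue goes on to propose exploring.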
[jira] [Commented] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787584#comment-13787584 ]

Erick Erickson commented on SOLR-5306:

Please raise issues on the Solr user's list first to be certain you've really found a problem and are not simply making an error.

In this case, the error message is telling you that the only configurations you have are applicant and patent. You have specified a name of patent.show via configName.
[jira] [Resolved] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson resolved SOLR-5306.

Resolution: Not A Problem
[jira] [Updated] (SOLR-5253) Move _version_ (and other _*_ fields) to right abpve the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-5253:

Attachment: SOLR-5253.patch

Just moves _version_ to the top of the example schema

Move _version_ (and other _*_ fields) to right abpve the id field in the example schema.xml

Key: SOLR-5253
URL: https://issues.apache.org/jira/browse/SOLR-5253
Project: Solr
Issue Type: Improvement
Affects Versions: 4.5, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
Attachments: SOLR-5253.patch

Minor, but it bugs me that it's so easy to try to remove all extraneous fields from schema.xml and shoot yourself in the foot. Now and forever more we should place all the special fields right at the top of the example schema. Trivial to do.

True, we do have a nice note saying _*_ fields are internal and required, but still.
[jira] [Commented] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787605#comment-13787605 ]

Liang Tianyu commented on SOLR-5306:

I am sorry. I am a developer in China and my English is poor. The correct URL is: http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent.

I have read and debugged the Solr 4.5 source code, and I find that 'setParams(SolrParams params)' is never used in the 'CloudDescriptor' class. Although I specify the collection.configName parameter, it has no effect. Can you help me? Thanks!
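A minimal sketch of assembling the CREATE request from the report out of explicit parameters, so that each one is URL-encoded and joined with `&`. The host, port and parameter values are taken from the report; `CreateCollectionUrl` and `build` are hypothetical helper names, and the sketch only builds the string, it does not contact Solr or say anything about how Solr 4.5 resolves `collection.configName`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class CreateCollectionUrl {
    /** Joins the parameters with '&' and URL-encodes every key and value. */
    public static String build(String baseUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return baseUrl + "?" + query;
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", "CREATE");
        params.put("name", "patent_main_1");
        params.put("numShards", "1");
        params.put("collection.configName", "patent");
        System.out.println(build("http://localhost:8080/solr/admin/collections", params));
    }
}
```

Passing the config name as its own map entry also keeps stray text (such as the ".show errors" that ran into the URL in the original report) out of the `collection.configName` value.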
[jira] [Commented] (SOLR-5253) Move _version_ (and other _*_ fields) to right abpve the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787616#comment-13787616 ]

ASF subversion and git services commented on SOLR-5253:

Commit 1529621 from [~erickoerickson] in branch 'dev/trunk'
[ https://svn.apache.org/r1529621 ]

SOLR-5253, rearrange example schema to make it more difficult to remove _version_ and other reserved fields by mistake
[jira] [Updated] (SOLR-5253) Move _version_ (and other _*_ fields) to right abpve the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-5253:

Attachment: SOLR-5253.patch

Re-arranged a couple of things and added comments.
[jira] [Commented] (SOLR-5253) Move _version_ (and other _*_ fields) to right abpve the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787621#comment-13787621 ]

ASF subversion and git services commented on SOLR-5253:

Commit 1529625 from [~erickoerickson] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1529625 ]

SOLR-5253, rearrange example schema to make it more difficult to remove _version_ and other reserved fields by mistake
[jira] [Comment Edited] (SOLR-5253) Move _version_ (and other _*_ fields) to right abpve the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787619#comment-13787619 ]

Erick Erickson edited comment on SOLR-5253 at 10/6/13 2:31 PM:

Re-arranged a couple of things and added comments. This patch accurately reflects the checkin.

was (Author: erickerickson):
Re-arranged a couple of things and added comments.
[jira] [Commented] (SOLR-5253) Move _version_ (and other _*_ fields) to right abpve the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787622#comment-13787622 ]

Robert Muir commented on SOLR-5253:

This doesn't make sense. _version_ isn't actually required at all, for example if you just have a single node.

On the other hand, much much more shit breaks if you have no unique id...

So I think this should be updated to reflect reality.
[jira] [Updated] (SOLR-5253) Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-5253:

Summary: Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml (was: Move _version_ (and other _*_ fields) to right abpve the id field in the example schema.xml)
[jira] [Commented] (SOLR-5253) Move _version_ (and other _*_ fields) to right abpve the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787625#comment-13787625 ]

Erick Erickson commented on SOLR-5253:

Reality is that if you try to start without _version_ in your schema, Solr doesn't start. Stack traces all over the place and a non-functioning installation: can't query, can't get to localhost:8983/solr, etc. At least that's what happened when I just tried it on trunk. Top-level error below. Or I screwed up again when I tested before committing this, that's been known to happen.

So the current comments _do_ reflect reality. Whether it should behave this way is a different question; I was a bit surprised by this behavior as well. Perhaps another JIRA about making _version_ not required in non-cloud mode is in order?

As for id, as far as I know it's not required. Personally I'd like to _make_ it required since, as you say, so much breaks if you don't have it. But that's another JIRA too.

{msg=SolrCore 'collection1' is not available due to init failure: Unable to use updateLog: _version_ field must exist in schema, using indexed=true stored=true and multiValued=false (_version_ does not exist),trace=org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Unable to use updateLog: _version_ field must exist in schema, using indexed=true stored=true and multiValued=false (_version_ does not exist)
    at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:785)
[jira] [Comment Edited] (SOLR-5253) Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787625#comment-13787625 ]

Erick Erickson edited comment on SOLR-5253 at 10/6/13 2:47 PM:

Reality is that if you try to start without _version_ in your schema, Solr doesn't start. Stack traces all over the place and a non-functioning installation: can't query, can't get to localhost:8983/solr, etc. At least that's what happened when I just tried it on trunk. Top-level error below. Or I screwed up again when I tested before committing this, that's been known to happen.

So the current comments _do_ reflect reality. Whether it should behave this way is a different question; I was a bit surprised by this behavior as well. Perhaps another JIRA about making _version_ not required in non-cloud mode is in order? Hmmm, or I need to disable update log. Let me try that.

As for id, as far as I know it's not required. Personally I'd like to _make_ it required since, as you say, so much breaks if you don't have it. But that's another JIRA too.

{msg=SolrCore 'collection1' is not available due to init failure: Unable to use updateLog: _version_ field must exist in schema, using indexed=true stored=true and multiValued=false (_version_ does not exist),trace=org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Unable to use updateLog: _version_ field must exist in schema, using indexed=true stored=true and multiValued=false (_version_ does not exist)
    at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:785)
[jira] [Commented] (SOLR-5253) Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787626#comment-13787626 ]

Erick Erickson commented on SOLR-5253:
--

Ahhh, crap. It's the update log. I'll change the comments. But see how easy it is to shoot oneself in the foot?
[jira] [Commented] (SOLR-5253) Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787627#comment-13787627 ]

Erick Erickson commented on SOLR-5253:
--

How about this?

<!-- If you remove this field, you must _also_ disable the update log in solrconfig.xml
     or Solr won't start. _version_ and update log are required for SolrCloud
-->
<field name="_version_" type="long" indexed="true" stored="true"/>

<!-- points to the root document of a block of nested documents. Required for nested
     document support, may be removed otherwise
-->
<field name="_root_" type="string" indexed="true" stored="false"/>

<!-- While not absolutely required, a uniqueKey is present in almost all Solr
     installations, only remove the id field if you have very good reason to.
     See the uniqueKey declaration below. It is _highly_ recommended that you
     leave this field in.
-->
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13787630#comment-13787630 ] Robert Muir commented on LUCENE-5189: - If someone isn't doing numeric updates, and just using (e.g. memory docvalues), are we really re-loading docvalues for each segmentreader now versus holding it in the segment core? Numeric DocValues Updates - Key: LUCENE-5189 URL: https://issues.apache.org/jira/browse/LUCENE-5189 Project: Lucene - Core Issue Type: New Feature Components: core/index Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5189-4x.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189_process_events.patch, LUCENE-5189_process_events.patch, LUCENE-5189-updates-order.patch, LUCENE-5189-updates-order.patch In LUCENE-4258 we started to work on incremental field updates, however the amount of changes are immense and hard to follow/consume. The reason is that we targeted postings, stored fields, DV etc., all from the get go. I'd like to start afresh here, with numeric-dv-field updates only. There are a couple of reasons to that: * NumericDV fields should be easier to update, if e.g. we write all the values of all the documents in a segment for the updated field (similar to how livedocs work, and previously norms). * It's a fairly contained issue, attempting to handle just one data type to update, yet requires many changes to core code which will also be useful for updating other data types. * It has value in and on itself, and we don't need to allow updating all the data types in Lucene at once ... we can do that gradually. I have some working patch already which I'll upload next, explaining the changes. 
[jira] [Commented] (SOLR-5023) deleteInstanceDir is added to CoreAdminHandler but can't be passed with solrj
[ https://issues.apache.org/jira/browse/SOLR-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787638#comment-13787638 ]

Shalin Shekhar Mangar commented on SOLR-5023:
-

I forgot to link this issue to SOLR-4817. The attached test fails because the copySolrHomeToTemp methods don't work well with our test scripts. See https://issues.apache.org/jira/browse/SOLR-4817?focusedCommentId=13760008&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13760008

Until SOLR-4817 is fixed, we can separate the test into its own issue and commit the solrj changes.
[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787642#comment-13787642 ]

Robert Muir commented on LUCENE-5189:
-

To get NRT working for docvalues again, I feel like the logic should be something like this in SegmentReader:

fooDV(field X) {
  final Producer producer;
  if (gen(X) != core.gen(X)) {
    producer = this.producer;
  } else {
    producer = core.producer;
  }
  ...
}

This way, SR only opens up docvalues for ones that have been updated since the SCR was open, but otherwise uses the shared producer as before, and NRT works with docvalues again.
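The selection rule in Robert's pseudocode above can be sketched in plain Java. This is only an illustration under stated assumptions: DvProducer, CoreSketch and ReaderSketch are hypothetical stand-ins invented here, not Lucene classes, and a missing gen entry is treated as -1 (no updates).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a doc values producer; not a Lucene class.
final class DvProducer {
    final String owner;
    DvProducer(String owner) { this.owner = owner; }
}

// Hypothetical stand-in for the shared per-segment core (SCR in the thread).
final class CoreSketch {
    final Map<String, Long> gens = new HashMap<>();      // field -> gen when the core was opened
    final DvProducer producer = new DvProducer("core");  // producer shared by all readers
}

// Hypothetical stand-in for a per-reopen reader (SR in the thread).
final class ReaderSketch {
    final CoreSketch core;
    final Map<String, Long> gens;                          // field -> gen seen by this reader
    final DvProducer producer = new DvProducer("reader");  // reader-private producer

    ReaderSketch(CoreSketch core, Map<String, Long> gens) {
        this.core = core;
        this.gens = gens;
    }

    // Use the private producer only for fields whose gen moved since the core was opened.
    DvProducer producerFor(String field) {
        long mine = gens.getOrDefault(field, -1L);
        long shared = core.gens.getOrDefault(field, -1L);
        return mine != shared ? producer : core.producer;
    }
}
```

With this rule a field that was never updated keeps resolving to the shared producer, which is the behaviour the comment asks for.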
[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787648#comment-13787648 ]

Robert Muir commented on LUCENE-5189:
-

Or maybe it's a simple oversight that the RefCount thing is in SR versus SCR? I don't understand why an SR would need refcounting, since it's unmodifiable?
[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787650#comment-13787650 ]

Uwe Schindler commented on LUCENE-5189:
---

Hi, to me the whole code in SegmentReader looks too complicated and contains things that should *not* be in SegmentReader: SegmentReaders are unmodifiable and final. Why does SegmentReader need to use refcounting? The refcounting should be in SegmentCoreReaders; SegmentReaders should only have final fields and unmodifiable data structures, please no refcounts! If a DirectoryReader is reopened you get a new instance of SegmentReader with the same core cache key, so why does it then have modifiable stuff?

Maybe it is just a matter of moving stuff to SegmentCoreReaders, but I did so much hard work to keep this class simple when we removed modifications from IndexReaders in 4.x, thanks to Robert for the help! But now it's as complicated as, or more complicated than, before (see number of code lines in Lucene 3.6!).
[jira] [Commented] (SOLR-5253) Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787661#comment-13787661 ]

Robert Muir commented on SOLR-5253:
---

This is better, thanks! Maybe we can reorder the phrases of the uniqueKey piece, to emphasize more that you should keep it versus emphasizing that it's not absolutely necessary, something like:

<!-- Only remove the id field if you have a very good reason to. While not strictly required, it is highly recommended -->
[jira] [Resolved] (SOLR-5253) Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-5253. -- Resolution: Fixed Committing in a second. Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml --- Key: SOLR-5253 URL: https://issues.apache.org/jira/browse/SOLR-5253 Project: Solr Issue Type: Improvement Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5253.patch, SOLR-5253.patch, SOLR-5253.patch Minor, but it bugs me that it's so easy to try to remove all extraneous fields from schema.xml and shoot yourself in the foot. Now and forever more we should place all the special fields right at the top of the example schema. Trivial to do. True, we say a nice note _*_ fields are internal and required, but still. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5253) Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-5253: - Attachment: SOLR-5253.patch Changing the comments as per the conversation with Robert. Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml --- Key: SOLR-5253 URL: https://issues.apache.org/jira/browse/SOLR-5253 Project: Solr Issue Type: Improvement Affects Versions: 4.5, 5.0 Reporter: Erick Erickson Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5253.patch, SOLR-5253.patch, SOLR-5253.patch Minor, but it bugs me that it's so easy to try to remove all extraneous fields from schema.xml and shoot yourself in the foot. Now and forever more we should place all the special fields right at the top of the example schema. Trivial to do. True, we say a nice note _*_ fields are internal and required, but still. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5253) Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787669#comment-13787669 ]

ASF subversion and git services commented on SOLR-5253:
---

Commit 1529638 from [~erickoerickson] in branch 'dev/trunk'
[ https://svn.apache.org/r1529638 ]

SOLR-5253, rearrange example schema to make it more difficult to remove _version_ and other reserved fields by mistake
[jira] [Commented] (SOLR-5253) Move _version_ (and other _*_ fields) to right above the id field in the example schema.xml
[ https://issues.apache.org/jira/browse/SOLR-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787674#comment-13787674 ]

ASF subversion and git services commented on SOLR-5253:
---

Commit 1529641 from [~erickoerickson] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1529641 ]

SOLR-5253, rearrange example schema to make it more difficult to remove _version_ and other reserved fields by mistake
[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787683#comment-13787683 ]

Shai Erera commented on LUCENE-5189:

Let me explain: SCR is still used for all the objects shared between _all_ SRs, objects that are closed when the last of the SRs using this 'core' is closed. With gen'd DVPs, they no longer belong in SCR, because there isn't a single DVP that _all_ SRs share. For example, suppose that you start with an index without updates; then the DV fields' gen is -1, so a DVP for field 'f' will be created. Next you update a field and reopen: the gen for 'f' is incremented to 1 and a new DVP is needed. The DVP for gen=-1 is no longer needed.

Rob, if we did what you propose (store the initial DVPs in SCR and the updated ones in SRs, as they are updated and reopened), soon the DVPs in SCR will not be used by any SR and will just hang there (possibly consuming expensive RAM, e.g. MemoryDVF) until the last SR is closed.

Rather, DVPs *are shared* between SRs. RefCount is a simple object which keeps a ref-count for an object. When an SR is opened anew (e.g. initializes a new SCR), the DVPs it initializes all have RefCount(1). When an SR is opened by sharing another SR (e.g. NRT, DirReader.openIfChanged), it lists all the fields with DV. If the other SR has a DVP for their dvGen, it reuses it and inc-refs it; otherwise it opens a new DVP with RC(1).

SR itself doesn't need any ref-counting, it's the DVPs that need them. Putting them in SCR I think only complicates things (or at least doesn't simplify). For example, currently when SR.doClose is called, it dec-refs all DVPs that it uses, and DocValuesRefCount.release() closes the DVP when its ref-count reaches 0. If we moved all the DVPs to SCR, then SR.doClose would need to go and dec-ref all DVPs that it uses in SCR ... but how does it know which DVP it uses if all DVPs just sit there in SCR, even ones that it doesn't use? I think that keeping them in SR actually simplifies things and keeps the code clear.

Rob, if you don't use DV updates, all DV fields have dvGen=-1 and they are shared between all SRs.
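The sharing scheme Shai describes in the LUCENE-5189 comment above (a holder starts at RC(1), a reopened reader inc-refs holders whose gen is unchanged, and the producer is released when the count reaches 0) can be sketched roughly as follows. All names here (RefCountSketch, SharingReaderSketch, the "dvp-gen" labels) are hypothetical and only stand in for the patch's RefCount and SegmentReader; a plain String plays the role of the producer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical ref-counting holder: starts at 1, runs a release hook at 0.
final class RefCountSketch<T> {
    private final T resource;
    private final Runnable onRelease;
    private int refs = 1;

    RefCountSketch(T resource, Runnable onRelease) {
        this.resource = resource;
        this.onRelease = onRelease;
    }

    synchronized void incRef() {
        if (refs <= 0) throw new IllegalStateException("already released");
        refs++;
    }

    synchronized void decRef() {
        if (refs <= 0) throw new IllegalStateException("already released");
        if (--refs == 0) onRelease.run();   // last user gone: close the resource
    }

    synchronized int refCount() { return refs; }

    T get() { return resource; }
}

// Hypothetical reader that shares producers with the reader it was reopened from.
final class SharingReaderSketch {
    final Map<Long, RefCountSketch<String>> byGen = new HashMap<>();

    // 'other' may be null for a fresh open; 'closed' records released producers.
    static SharingReaderSketch open(long[] gens, SharingReaderSketch other, List<String> closed) {
        SharingReaderSketch reader = new SharingReaderSketch();
        for (long gen : gens) {
            RefCountSketch<String> shared = other == null ? null : other.byGen.get(gen);
            if (shared != null) {
                shared.incRef();                 // same gen: reuse and inc-ref
                reader.byGen.put(gen, shared);
            } else {
                String name = "dvp-gen" + gen;   // new gen: open a new producer with RC(1)
                reader.byGen.put(gen, new RefCountSketch<>(name, () -> closed.add(name)));
            }
        }
        return reader;
    }

    // Mirrors the described doClose: dec-ref every producer this reader uses.
    void close() {
        for (RefCountSketch<String> rc : byGen.values()) rc.decRef();
        byGen.clear();
    }
}
```

In this sketch a producer for gen=-1 is released as soon as the last reader that still uses it closes, rather than lingering until every reader on the segment is gone.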
[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787688#comment-13787688 ]

Shai Erera commented on LUCENE-5189:

bq. because there isn't a single DVP that all SRs share

Let me clarify that: if you don't use DV updates, you can say that the DVPs for gen=-1 _are_ shared between all the readers. But with updates, gen=-1 may soon be unused by any reader. I prefer for all DVP management to be in one place rather than split between SCR and SR.

Nevertheless, if you think this can be simplified somehow, I'm open to suggestions. I don't want to move them to SCR just for the sake of saying they are in SCR. The code which decides whether to reuse a certain DVP or create a new one will still be in SR, and so it makes sense to me that SR manages them.
[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787691#comment-13787691 ]

Robert Muir commented on LUCENE-5189:
-

Thanks Shai... I think for me, this is the main one:

{quote}
SR itself doesn't need any ref-counting, it's the DVPs that need them. Putting them in SCR I think only complicates things (or at least doesn't simplify).
{quote}

If SR itself doesn't need ref-counting, perhaps we can pull this out of SR then? (rote-refactor into DV-thingy or something).
[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates
[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787730#comment-13787730 ]

Shai Erera commented on LUCENE-5189:

bq. If SR itself doesnt need ref-counting, perhaps we can pull this out of SR then? (rote-refactor into DV-thingy or something).

You mean something like SegmentDocValues? It's doable I guess. SR would need to keep track of the DV.gens it uses though, so that in SR.doClose it can call segDV.decRef(gens) and the latter can decRef all the DVPs that are used for these gens. If it also removes a gen from the map when it's no longer referenced by any SR, we don't need to take care of clearing the genDVP map when all SRs are closed (otherwise I think we'll need to refCount SegDV too, like SCR). I'll give it a shot.
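The "SegmentDocValues" idea floated in the comment above (a per-segment map from gen to a ref-counted producer, with decRef(gens) dropping entries nobody references) might look something like the sketch below. It is a guess at the shape of the proposal, not the code that was later written: the class name, method signatures and the String producer are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a per-segment holder of gen -> ref-counted producer.
final class SegmentDocValuesSketch {
    private static final class Entry {
        final String producer;   // stands in for a real doc values producer
        int refs;
        Entry(String producer) { this.producer = producer; }
    }

    private final Map<Long, Entry> genToProducer = new HashMap<>();

    // Called when a reader needs the producer for a gen: create it or reuse it, then inc-ref.
    synchronized String getProducer(long gen) {
        Entry entry = genToProducer.get(gen);
        if (entry == null) {
            entry = new Entry("dvp-gen" + gen);
            genToProducer.put(gen, entry);
        }
        entry.refs++;
        return entry.producer;
    }

    // Called from a reader's close with the gens it used.
    synchronized void decRef(List<Long> gens) {
        for (Long gen : gens) {
            Entry entry = genToProducer.get(gen);
            if (entry == null) throw new IllegalStateException("gen not referenced: " + gen);
            if (--entry.refs == 0) {
                genToProducer.remove(gen);   // a real implementation would also close the producer
            }
        }
    }

    synchronized int openProducers() { return genToProducer.size(); }
}
```

Because an entry is removed the moment its count reaches zero, the map empties itself once every reader has closed, which is the property the comment relies on to avoid ref-counting the holder itself.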
[jira] [Commented] (SOLR-4685) JSON response write modification to support RAW JSON
[ https://issues.apache.org/jira/browse/SOLR-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787766#comment-13787766 ]

Bill Bell commented on SOLR-4685:
-

Can we please commit this? I have been using it in PROD for the last few months and it works great.

Jack: XML, PHP, Ruby don't have this issue, since if the field is XML and you use wt=xml you get XML normally out of it. But when you set wt=json and have a field that is a JSON string, it escapes everything.

There is no harm in this. It just stops the escaping of any fields that end with json.fsuffix=_json, basically ending with _json. I need all the other features of wt=json, but I also need the ability to NOT escape a JSON string field.

If someone could figure out a simple way, one that does not waste resources, of figuring out which fields are already JSON when you use wt=json, that would be preferable for turning off escaping of that field. But until we have that I am proposing this feature, which has NO harm and is a simple feature to maintain: turn off escaping of a field when using wt=json.

Can we vote on it?

JSON response write modification to support RAW JSON
Key: SOLR-4685
URL: https://issues.apache.org/jira/browse/SOLR-4685
Project: Solr
Issue Type: Improvement
Reporter: Bill Bell
Assignee: Erik Hatcher
Priority: Minor
Attachments: SOLR-4685.1.patch

If the field ends with _json allow the field to return raw JSON. For example the field, office_json -- string I already put into the field raw JSON already escaped. I want it to come with no double quotes and not escaped.
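To make the requested behaviour concrete, here is a minimal sketch of a field writer that emits values verbatim when the field name ends with a configurable suffix and escapes them otherwise. It is not the attached patch and not Solr's JSON response writer: RawJsonFieldSketch and its methods are hypothetical, and the only detail taken from the comment above is the `_json` suffix convention.

```java
// Hypothetical illustration of suffix-based raw JSON output; not Solr code.
final class RawJsonFieldSketch {

    // Writes one "name":value pair. Values of fields ending in rawSuffix are
    // emitted as-is and are trusted to be valid JSON already.
    static String writeField(String name, String value, String rawSuffix) {
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(escape(name)).append("\":");
        boolean raw = rawSuffix != null && !rawSuffix.isEmpty() && name.endsWith(rawSuffix);
        if (raw) {
            sb.append(value);
        } else {
            sb.append('"').append(escape(value)).append('"');
        }
        return sb.toString();
    }

    // Minimal JSON string escaping for quotes, backslashes and control characters.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }
}
```

The trade-off is visible in the raw branch: nothing checks that the stored string is well-formed JSON, so a bad value in a `_json` field would corrupt the whole response.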
[jira] [Commented] (SOLR-4692) JSON Field transformer for DIH
[ https://issues.apache.org/jira/browse/SOLR-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13787767#comment-13787767 ] Bill Bell commented on SOLR-4692: - This is more of a contrib to DIH. Would love to see it added as a feature since it is simple. Take XML and convert to JSON and store it. IF not - I'll keep using it on my projects... No harm for me. JSON Field transformer for DIH -- Key: SOLR-4692 URL: https://issues.apache.org/jira/browse/SOLR-4692 Project: Solr Issue Type: Bug Reporter: Bill Bell Attachments: JSONTransformer.java, JSONTransform.jar, xml.jar This works in conjunction with SOLR-4685. Takes an XML field from SQL / manually and adds it as a JSON field. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5258) add distance function to expressions/
[ https://issues.apache.org/jira/browse/LUCENE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13787768#comment-13787768 ] ASF subversion and git services commented on LUCENE-5258: - Commit 1529679 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1529679 ] LUCENE-5258: add distance function to expressions/ add distance function to expressions/ - Key: LUCENE-5258 URL: https://issues.apache.org/jira/browse/LUCENE-5258 Project: Lucene - Core Issue Type: New Feature Components: modules/other Reporter: Robert Muir Attachments: LUCENE-5258.patch Adding this static function makes it really easy to incorporate distance with the score or other signals in arbitrary ways, e.g. score / (1 + log(distance)) or whatever. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5258) add distance function to expressions/
[ https://issues.apache.org/jira/browse/LUCENE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13787771#comment-13787771 ] ASF subversion and git services commented on LUCENE-5258: - Commit 1529685 from [~rcmuir] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1529685 ] LUCENE-5258: add distance function to expressions/ add distance function to expressions/ - Key: LUCENE-5258 URL: https://issues.apache.org/jira/browse/LUCENE-5258 Project: Lucene - Core Issue Type: New Feature Components: modules/other Reporter: Robert Muir Fix For: 5.0, 4.6 Attachments: LUCENE-5258.patch Adding this static function makes it really easy to incorporate distance with the score or other signals in arbitrary ways, e.g. score / (1 + log(distance)) or whatever. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5258) add distance function to expressions/
[ https://issues.apache.org/jira/browse/LUCENE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5258. - Resolution: Fixed Fix Version/s: 4.6 5.0 add distance function to expressions/ - Key: LUCENE-5258 URL: https://issues.apache.org/jira/browse/LUCENE-5258 Project: Lucene - Core Issue Type: New Feature Components: modules/other Reporter: Robert Muir Fix For: 5.0, 4.6 Attachments: LUCENE-5258.patch Adding this static function makes it really easy to incorporate distance with the score or other signals in arbitrary ways, e.g. score / (1 + log(distance)) or whatever. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
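As an illustration of the "score / (1 + log(distance))" idea in the issue description, here is a small self-contained Java sketch. It does not use the expressions/ module or the function added by the patch. `DistanceScoreSketch`, `haversineKm` and `combine` are hypothetical names, the haversine formula with a 6371 km Earth radius stands in for whatever distance implementation the patch uses, and clamping the distance to at least 1 is an added assumption so the denominator never drops below 1.

```java
// Hypothetical sketch, not the LUCENE-5258 code: combines a relevance score
// with a great-circle distance as score / (1 + log(distance)).
class DistanceScoreSketch {

    static final double EARTH_RADIUS_KM = 6371.0; // mean radius, an approximation

    // Standard haversine great-circle distance between two lat/lon points in degrees.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // score / (1 + ln(distance)). The clamp to >= 1 km is an assumption of this
    // sketch: without it, distances below 1 would make the logarithm negative.
    static double combine(double score, double distanceKm) {
        return score / (1 + Math.log(Math.max(1.0, distanceKm)));
    }

    public static void main(String[] args) {
        double d = haversineKm(40.7128, -74.0060, 51.5074, -0.1278); // New York to London
        System.out.println("distance km: " + d);
        System.out.println("combined:    " + combine(2.0, d));
    }
}
```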
[jira] [Updated] (SOLR-4685) JSON response write modification to support RAW JSON
[ https://issues.apache.org/jira/browse/SOLR-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-4685: Attachment: SOLR-4685.SOLR_4_5.patch For recent release of SOLR 4.5 JSON response write modification to support RAW JSON Key: SOLR-4685 URL: https://issues.apache.org/jira/browse/SOLR-4685 Project: Solr Issue Type: Improvement Reporter: Bill Bell Assignee: Erik Hatcher Priority: Minor Attachments: SOLR-4685.1.patch, SOLR-4685.SOLR_4_5.patch If the field ends with _json allow the field to return raw JSON. For example the field, office_json -- string I already put into the field raw JSON already escaped. I want it to come with no double quotes and not escaped. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
Nathan Neulinger created SOLR-5307: -- Summary: Solr 4.5 collection api ignores collection.configName when used in cloud Key: SOLR-5307 URL: https://issues.apache.org/jira/browse/SOLR-5307 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger This worked properly in 4.4, but on 4.5, specifying collection.configName when creating a collection doesn't work - it gets the default regardless of what has been uploaded into zk. Explicitly linking config name to collection ahead of time with zkcli.sh is a workaround I'm using for the moment, but that did not appear to be necessary with 4.4 unless I was doing something wrong and not realizing it. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4465) Configurable Collectors
[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787784#comment-13787784 ] Joel Bernstein commented on SOLR-4465: -- Prabha, This patch is no longer under development. Part of the functionality has been split into smaller tickets, SOLR-5027 and SOLR-5047, which are still in development. Take a look at Solr grouping (http://wiki.apache.org/solr/FieldCollapsing) and the stats component (http://wiki.apache.org/solr/StatsComponent) which are available now and sound like they may meet your needs. Configurable Collectors --- Key: SOLR-4465 URL: https://issues.apache.org/jira/browse/SOLR-4465 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.1 Reporter: Joel Bernstein Fix For: 4.6 Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch
This ticket provides a patch to add pluggable collectors to Solr. This patch was generated and tested with Solr 4.1. This is how the patch functions: Collectors are plugged into Solr in the solrconfig.xml using the new collectorFactory element. For example:
<collectorFactory name="default" class="solr.CollectorFactory"/>
<collectorFactory name="sum" class="solr.SumCollectorFactory"/>
The elements above define two collector factories. The first one is the default collectorFactory. The class attribute points to org.apache.solr.handler.component.CollectorFactory, which implements logic that returns the default TopScoreDocCollector and TopFieldCollector. To create your own collectorFactory you must subclass the default CollectorFactory and at a minimum override the getCollector method to return your new collector.
The parameter cl turns on pluggable collectors: cl=true If cl is not in the parameters, Solr will automatically use the default collectorFactory.
*Pluggable Doclist Sorting With the Docs Collector*
You can specify two types of pluggable collectors. The first type is the docs collector. For example: cl.docs=name The above param points to a named collectorFactory in the solrconfig.xml to construct the collector. The docs collectorFactories must return a collector that extends the TopDocsCollector base class. Docs collectors are responsible for collecting the doclist. You can specify only one docs collector per query. You can pass parameters to the docs collector using local params syntax. For example: cl.docs={! sort=mycustomesort}mycollector If cl=true and a docs collector is not specified, Solr will use the default collectorFactory to create the docs collector.
*Pluggable Custom Analytics With Delegating Collectors*
You can also specify any number of custom analytic collectors with the cl.analytic parameter. Analytic collectors are designed to collect something other than the doclist. Typically this would be some type of custom analytic. For example: cl.analytic=sum The parameter above specifies an analytic collector named sum. Like the docs collectors, sum points to a named collectorFactory in the solrconfig.xml. You can specify any number of analytic collectors by adding additional cl.analytic parameters. Analytic collector factories must return Collector instances that extend DelegatingCollector. A sample analytic collector is provided in the patch through the org.apache.solr.handler.component.SumCollectorFactory. This collectorFactory provides a very simple DelegatingCollector that groups by a field and sums a column of floats. The sum collector is not designed to be a fully functional sum function but to be a proof of concept for pluggable analytics through delegating collectors. You can send parameters to analytic collectors with Solr local param syntax.
For example: cl.analytic={! id=1 groupby=field1 column=field2}sum The id parameter is mandatory for analytic collectors and is used to identify the output from the collector. In this example the groupby and column params tell the sum collector which field to group by and which to sum. Analytic collectors are passed a reference to the ResponseBuilder and can place maps with analytic output directly into the SolrQueryResponse with the add() method. Maps that are placed in the SolrQueryResponse are automatically added to the outgoing response. The response will include a list named cl.analytic.id, where id is specified in the local param.
*Distributed Search*
The CollectorFactory also has a method called merge(). This method aggregates the results from each of the shards during distributed
[jira] [Created] (LUCENE-5259) convert analysis consumers to try-with-resources
Robert Muir created LUCENE-5259: --- Summary: convert analysis consumers to try-with-resources Key: LUCENE-5259 URL: https://issues.apache.org/jira/browse/LUCENE-5259 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5259.patch Most of these consumers' exception handling is questionable at best. If you do it with try-with-resources, then it's obvious it's correct:
{code}
try (TokenStream stream = analyzer.tokenStream("body", "testing")) {
  stream.reset();
  while (stream.incrementToken()) {
    ...
  }
  stream.end();
}
{code}
For trunk we should change the consumers to work this way. For 4.x we can simulate it with IOUtils.
[jira] [Updated] (LUCENE-5259) convert analysis consumers to try-with-resources
[ https://issues.apache.org/jira/browse/LUCENE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5259: Attachment: LUCENE-5259.patch Here's a patch (still running tests). I think most of these consumers (except IndexWriter) are technically buggy today. But this also makes the code simpler; a lot of them had complicated AND buggy exception handling. convert analysis consumers to try-with-resources Key: LUCENE-5259 URL: https://issues.apache.org/jira/browse/LUCENE-5259 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5259.patch Most of these consumers' exception handling is questionable at best. If you do it with try-with-resources, then it's obvious it's correct:
{code}
try (TokenStream stream = analyzer.tokenStream("body", "testing")) {
  stream.reset();
  while (stream.incrementToken()) {
    ...
  }
  stream.end();
}
{code}
For trunk we should change the consumers to work this way. For 4.x we can simulate it with IOUtils.
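To make the point about exception handling concrete, the following self-contained sketch compares the Java 7 try-with-resources form with a pre-Java 7 try/finally form. It does not depend on Lucene: `FakeStream` is a hypothetical stand-in for a TokenStream that only mimics the reset / incrementToken / end / close workflow, and a null token is used as an artificial way to trigger an IOException mid-stream. What both variants show is that close() runs even when incrementToken() throws.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Lucene code: FakeStream only imitates the
// TokenStream consumer workflow so the example runs on a plain JDK.
class TokenConsumerSketch {

    static class FakeStream implements Closeable {
        private final String[] tokens;
        private int pos = -1;
        boolean closed = false;

        FakeStream(String... tokens) { this.tokens = tokens; }

        void reset() { pos = -1; }

        // Advances to the next token; a null entry simulates an I/O failure.
        boolean incrementToken() throws IOException {
            if (pos + 1 < tokens.length && tokens[pos + 1] == null) {
                throw new IOException("simulated failure while reading token");
            }
            return ++pos < tokens.length;
        }

        String term() { return tokens[pos]; }

        void end() { /* nothing to finalize in this stand-in */ }

        @Override
        public void close() { closed = true; }
    }

    // Java 7 style: the stream is closed on every exit path.
    static List<String> consume(FakeStream source) throws IOException {
        List<String> out = new ArrayList<String>();
        try (FakeStream stream = source) {
            stream.reset();
            while (stream.incrementToken()) {
                out.add(stream.term());
            }
            stream.end();
        }
        return out;
    }

    // Pre-Java 7 style: same guarantee, written out by hand with finally.
    static List<String> consumeLegacy(FakeStream stream) throws IOException {
        List<String> out = new ArrayList<String>();
        try {
            stream.reset();
            while (stream.incrementToken()) {
                out.add(stream.term());
            }
            stream.end();
        } finally {
            stream.close();
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(consume(new FakeStream("hello", "world")));
        FakeStream bad = new FakeStream("hello", null);
        try {
            consumeLegacy(bad);
        } catch (IOException e) {
            System.out.println("failed, closed=" + bad.closed);
        }
    }
}
```

One difference the stand-in cannot show: if close() itself throws, a plain finally block lets that exception replace the original one, whereas try-with-resources attaches it as a suppressed exception. Helpers such as Lucene's IOUtils.closeWhileHandlingException address this on Java 6, which is presumably what the issue means by simulating it with IOUtils.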
[jira] [Reopened] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Tianyu reopened SOLR-5306: Please see https://issues.apache.org/jira/browse/SOLR-5307, which is the same question. The correct URL is: http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. I have read and debugged the Solr 4.5 source code, and I find that 'setParams(SolrParams params)' in the 'CloudDescriptor' class is never called. So although I specify the collection.configName parameter, it has no effect. can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Priority: Critical I have uploaded two configs to ZooKeeper: patent and applicant. I can not create a collection with: http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. It shows the error: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully.
[jira] [Commented] (LUCENE-5258) add distance function to expressions/
[ https://issues.apache.org/jira/browse/LUCENE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13787832#comment-13787832 ] David Smiley commented on LUCENE-5258: -- Really cool Rob! It wasn't obvious in the issue description but after looking at your patch, I see that you bring in here some other open-source implementations of cosine and sine that are not quite as accurate but run much faster. I might find a way to use these routines in Lucene-Spatial / Spatial4j. add distance function to expressions/ - Key: LUCENE-5258 URL: https://issues.apache.org/jira/browse/LUCENE-5258 Project: Lucene - Core Issue Type: New Feature Components: modules/other Reporter: Robert Muir Fix For: 5.0, 4.6 Attachments: LUCENE-5258.patch Adding this static function makes it really easy to incorporate distance with the score or other signals in arbitrary ways, e.g. score / (1 + log(distance)) or whatever. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787852#comment-13787852 ] Liang Tianyu commented on SOLR-5306: I added some code at line 453 in class CoreAdminHandler:
{code}
String opts = params.get(CoreAdminParams.CONFIG);
CloudDescriptor cd = dcore.getCloudDescriptor();
if (cd != null) {
  // pass the request params (including collection.configName) to the CloudDescriptor
  cd.setParams(req.getParams());
  opts = params.get(CoreAdminParams.COLLECTION);
  if (opts != null) cd.setCollectionName(opts);
  opts = params.get(CoreAdminParams.SHARD);
  if (opts != null) cd.setShardId(opts);
  opts = params.get(CoreAdminParams.SHARD_RANGE);
  if (opts != null) cd.setShardRange(opts);
  opts = params.get(CoreAdminParams.SHARD_STATE);
  if (opts != null) cd.setShardState(opts);
  opts = params.get(CoreAdminParams.ROLES);
  if (opts != null) cd.setRoles(opts);
  opts = params.get(CoreAdminParams.CORE_NODE_NAME);
  if (opts != null) cd.setCoreNodeName(opts);
  Integer numShards = params.getInt(ZkStateReader.NUM_SHARDS_PROP);
  if (numShards != null) cd.setNumShards(numShards);
}
{code}
With this change the test passed. can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Priority: Critical I have uploaded two configs to ZooKeeper: patent and applicant. I can not create a collection with: http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. It shows the error: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully.
[jira] [Created] (LUCENE-5260) Make older Suggesters more accepting of TermFreqPayloadIterator
Areek Zillur created LUCENE-5260: Summary: Make older Suggesters more accepting of TermFreqPayloadIterator Key: LUCENE-5260 URL: https://issues.apache.org/jira/browse/LUCENE-5260 Project: Lucene - Core Issue Type: Improvement Components: core/search Reporter: Areek Zillur As discussed in https://issues.apache.org/jira/browse/LUCENE-5251, it would be nice to make the older suggesters accepting of TermFreqPayloadIterator and throw an exception if payload is found (if it cannot be used). This will also allow us to nuke most of the other interfaces for BytesRefIterator. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5251) New Dictionary Implementation for Suggester consumption
[ https://issues.apache.org/jira/browse/LUCENE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Areek Zillur updated LUCENE-5251: - Attachment: LUCENE-5251.patch Uploaded new patch (diff rather than git patch):
- field is expected to have stringValue(); payload is expected to have binaryValue()
- throw IAE if any of the fields do not have the desired value (including weight)
- Use RandomIndexWriter for tests!
- Added documentation that DocumentDictionary will not work with older suggesters
I also opened another issue (https://issues.apache.org/jira/browse/LUCENE-5260) to make the new Dictionary work for the older suggesters! New Dictionary Implementation for Suggester consumption --- Key: LUCENE-5251 URL: https://issues.apache.org/jira/browse/LUCENE-5251 Project: Lucene - Core Issue Type: New Feature Components: core/search Reporter: Areek Zillur Attachments: LUCENE-5251.patch, LUCENE-5251.patch, LUCENE-5251.patch With the vast array of new suggesters, it would be nice to have a dictionary implementation that could feed the suggesters terms, weights and (optionally) payloads from the Lucene index. The idea of this dictionary implementation is to grab stored documents from the index and use user-configured fields for terms, weights and payloads. Use case: if you have a document with three fields
- product_id
- product_name
- product_popularity_score
then this implementation would let you build a suggester for product_name, weighted by product_popularity_score, that returns product_id as the payload, on which you can do further processing (for example, construct a URL).
[jira] [Commented] (SOLR-5027) Result Set Collapse and Expand Plugins
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787909#comment-13787909 ] Prabha Satya commented on SOLR-5027: Hi Joel, From the comments above I gather that the collapse plugin would allow us to do aggregations. But I am not sure whether the collapse plugin would help me achieve something like the following, which I will express in SQL for clarity. Schema: Student (id, subject, marks) Query: select subject, max(marks) from Student group by subject. Result Set Collapse and Expand Plugins -- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. Collapse based on the highest scoring document: {code} fq={!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies:
ignore : removes docs with a null value in the collapse field (default).
expand : treats each doc with a null value in the collapse field as a separate group.
collapse : collapses all docs with a null value into a single group using either highest score, or min/max.
The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax:
expand=true - Turns on the expand component.
expand.field=field - Expands results for this field.
expand.limit=5 - Limits the documents for each expanded group.
expand.sort=sort spec - The sort spec for the expanded documents. Default is score.
expand.rows=500 - The max number of expanded results to bring back. Default is 500.
*Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possibly moved to its own ticket.
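For readers comparing the SQL in the comment with the collapse syntax in the ticket, the sketch below shows what "select subject, max(marks) ... group by subject" computes, in plain Java. It is not Solr code and says nothing about how CollapsingQParserPlugin is implemented; `CollapseMaxSketch` and `Row` are hypothetical names. Going by the syntax quoted in the description, the corresponding filter would presumably look like fq={!collapse field=subject max=marks}, which is an assumption and not something confirmed in this thread.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Solr code: keeps one value per group (the maximum),
// which is the effect the SQL "group by subject" with max(marks) asks for.
class CollapseMaxSketch {

    static class Row {
        final String subject;
        final int marks;

        Row(String subject, int marks) {
            this.subject = subject;
            this.marks = marks;
        }
    }

    // Returns subject -> highest marks, in order of first appearance.
    static Map<String, Integer> maxBySubject(List<Row> rows) {
        Map<String, Integer> max = new LinkedHashMap<String, Integer>();
        for (Row r : rows) {
            Integer best = max.get(r.subject);
            if (best == null || r.marks > best) {
                max.put(r.subject, r.marks);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("math", 70), new Row("physics", 80), new Row("math", 90));
        System.out.println(maxBySubject(rows));
    }
}
```

Collapsing keeps the whole document with the max value for each group rather than just the aggregated number, so it answers the question of which student record has the top marks per subject, but it is not a general aggregation facility.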