[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394194#comment-14394194 ]

Linbin Chen commented on SOLR-5374:

good feature. extremely useful

> Support user configured doc-centric versioning rules
>
> Key: SOLR-5374
> URL: https://issues.apache.org/jira/browse/SOLR-5374
> Project: Solr
> Issue Type: Improvement
> Reporter: Hoss Man
> Assignee: Hoss Man
> Fix For: 4.6, Trunk
> Attachments: SOLR-5374.patch, SOLR-5374.patch, SOLR-5374.patch,
> SOLR-5374.patch, SOLR-5374.patch, SOLR-5374.patch
>
> The existing optimistic concurrency features of Solr can be very handy for
> ensuring that you are only updating/replacing the version of the doc you
> think you are updating/replacing, w/o the risk of someone else
> adding/removing the doc in the mean time -- but I've recently encountered
> some situations where I really wanted to be able to let the client specify an
> arbitrary version, on a per document basis, (ie: generated by an external
> system, or perhaps a timestamp of when a file was last modified) and ensure
> that the corresponding document update was processed only if the "new"
> version is greater than the "old" version -- w/o needing to check exactly
> which version is currently in Solr. (ie: If a client wants to index version
> 101 of a doc, that update should fail if version 102 is already in the index,
> but succeed if the currently indexed version is 99 -- w/o the client needing
> to ask Solr what the current version is)
>
> The idea Yonik brought up in SOLR-5298 (letting the client specify a
> {{\_new\_version\_}} that would be used by the existing optimistic
> concurrency code to control the assignment of the {{\_version\_}} field for
> documents) looked like a good direction to go -- but after digging into the
> way {{\_version\_}} is used internally I realized it requires a uniqueness
> constraint across all update commands, that would make it impossible to allow
> multiple independent documents to have the same {{\_version\_}}.
>
> So instead I've tackled the problem in a different way, using an
> UpdateProcessor that is configured with a user defined field to track a
> "DocBasedVersion" and uses the RTG logic to figure out if the update is
> allowed.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
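The acceptance rule in the issue description (apply an update only when the client-supplied version is strictly greater than the one already indexed) can be illustrated with a small standalone sketch. This is not Solr code: the class `DocVersionRule`, its methods, and the in-memory map standing in for the index are all hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the doc-based version rule; not part of Solr. */
final class DocVersionRule {
    // Stands in for the index: uniqueKey -> user-supplied version currently stored.
    private final Map<String, Long> indexed = new HashMap<>();

    /**
     * Applies the update only if there is no existing doc for this id,
     * or the new version is strictly greater than the stored one.
     * Returns true if applied, false if rejected as stale.
     */
    boolean tryUpdate(String id, long newVersion) {
        Long oldVersion = indexed.get(id);
        if (oldVersion != null && newVersion <= oldVersion) {
            return false; // e.g. client sends 101 but 102 is already indexed
        }
        indexed.put(id, newVersion);
        return true;
    }

    /** Returns the stored version for the id, or null if the doc is absent. */
    Long currentVersion(String id) {
        return indexed.get(id);
    }
}
```

The version is tracked per document, so two independent documents may carry the same version value. That is the property the description says could not be achieved by reusing {{\_version\_}}.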
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818637#comment-13818637 ]

Yonik Seeley commented on SOLR-5374:

bq. should we change the commented logging to log.debug?

I only left them there (commented out) in case I needed to try and debug again in the short term. They are not of the quality one would want for long term. I'd rather they be deleted than changed to logs.
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818410#comment-13818410 ]

Anshum Gupta commented on SOLR-5374:

Just a thought, should we change the commented logging to log.debug? I'm assuming that's the intention behind leaving it in there.
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818193#comment-13818193 ]

ASF subversion and git services commented on SOLR-5374:

Commit 1540341 from [~yo...@apache.org] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1540341 ]

SOLR-5374: missing returns in user versioning processor
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818178#comment-13818178 ]

ASF subversion and git services commented on SOLR-5374:

Commit 1540336 from [~yo...@apache.org] in branch 'dev/trunk'
[ https://svn.apache.org/r1540336 ]

SOLR-5374: missing returns in user versioning processor
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810747#comment-13810747 ]

ASF subversion and git services commented on SOLR-5374:

Commit 1537706 from [~yo...@apache.org] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537706 ]

SOLR-5374: fix unnamed thread pool
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810745#comment-13810745 ]

ASF subversion and git services commented on SOLR-5374:

Commit 1537704 from [~yo...@apache.org] in branch 'dev/trunk'
[ https://svn.apache.org/r1537704 ]

SOLR-5374: fix unnamed thread pool
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810602#comment-13810602 ]

ASF subversion and git services commented on SOLR-5374:

Commit 1537597 from [~yo...@apache.org] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537597 ]

SOLR-5374: user version update processor
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810572#comment-13810572 ]

ASF subversion and git services commented on SOLR-5374:

Commit 1537587 from [~yo...@apache.org] in branch 'dev/trunk'
[ https://svn.apache.org/r1537587 ]

SOLR-5374: user version update processor
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809714#comment-13809714 ]

Yonik Seeley commented on SOLR-5374:

Linking to SOLR-5406, which hopefully is the only issue stopping this from fully working.
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809559#comment-13809559 ]

Yonik Seeley commented on SOLR-5374:

hmmm, in SolrCloud mode, somewhere in the mix del_version is being dropped. Not sure where yet...
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806484#comment-13806484 ]

Yonik Seeley commented on SOLR-5374:

Ideally, this code would run on the leader for the shard. I've opened SOLR-5395 as one step to allow that.
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806370#comment-13806370 ]

Yonik Seeley commented on SOLR-5374:

Hmmm, testConcurrentAdds fails even if I change the executor size to 1 thread... not sure why at this point.
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13804391#comment-13804391 ] Hoss Man commented on SOLR-5374:

bq. It seems like concurrency is not yet handled? Under concurrent updates, the patch won't guarantee the correct ordering.

You're right ... I hadn't considered that.

bq. Also, it looks like the current code assumes it's running on the leader? The realtime-get done is local only, ...

It is?!?! ... I didn't realize that (but I also hadn't had a chance to add a test for it). That must just be because of the convenience method I used, correct? Obviously the RTG Component has a way to fetch the document even if you don't hit the correct shard (I hope! for bigger reasons than this patch).

The only way I can think of to address your concurrency concern is by forcing this logic to run on the leader (not sure if you have an alternative idea: I'm not following your "or optimistic concurrency." suggestion), in which case, if we solve that problem, we should automatically solve the "current code assumes it's running on the leader?" problem, correct?

Unless I'm missing something, we still don't have an easy generic way to say "run this code _only_ on the leader" -- not w/o modifying DistributedUpdateProcessor, I don't think -- but IIUC the distributed update code first ensures that the update succeeds on the leader before forwarding to the replicas, correct?

Perhaps we could tweak the logic of DocBasedVersionConstraintsProcessor so it's configured to run _after_ DistributedUpdateProcessor. On the leader it would use uniqueKey based locking around the existing logic, and throw an error if the constraint wasn't satisfied, preventing the leader from ever forwarding to the replicas. On the replicas it would just be a no-op.

The "ignoreOldUpdates" option would have to be ripped out, but it could easily be refactored into a little convenience processor that could run _before_ DistributedUpdateProcessor so that, if enabled, it would catch all 409 errors and swallow them (which could be generally re-usable with the existing optimistic concurrency feature as well, if people want to ignore those conflicts too).

The only thing I'm not sure how to deal with if we go this direction is supporting the DeleteUpdateCommand -> AddUpdateCommand logic. Because if that happens on the leader _after_ DistributedUpdateProcessor, I don't think it will affect the commands that get forwarded to the replicas. (will it?)
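The leader-side flow proposed in the comment above (per-uniqueKey locking around the version check, an error when the constraint fails, and a separate wrapper that swallows 409 conflicts) could look roughly like the following. This is only a sketch in plain Java, not Solr code: `LeaderVersionCheck`, `VersionConflictException`, and `IgnoreOldUpdates` are hypothetical names invented for illustration, and an in-memory map stands in for the index and the RTG lookup.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical exception carrying an HTTP-style 409 status; not a Solr class. */
final class VersionConflictException extends RuntimeException {
    final int code = 409;

    VersionConflictException(String message) {
        super(message);
    }
}

/** Hypothetical stand-in for the check that would run on the leader only. */
final class LeaderVersionCheck {
    // Assumption: an in-memory map replaces the index and the realtime-get lookup.
    private final ConcurrentMap<String, Long> indexed = new ConcurrentHashMap<>();
    // One lock object per uniqueKey, so unrelated documents do not block each other.
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

    /** Applies the add only if the supplied version is strictly greater than the indexed one. */
    void add(String id, long version) {
        Object lock = locks.computeIfAbsent(id, key -> new Object());
        synchronized (lock) {
            Long current = indexed.get(id);
            if (current != null && current >= version) {
                // Throwing here is what would stop the leader from forwarding to replicas.
                throw new VersionConflictException(
                    "version conflict for " + id + ": " + version + " <= " + current);
            }
            indexed.put(id, version);
        }
    }

    Long indexedVersion(String id) {
        return indexed.get(id);
    }
}

/** Hypothetical wrapper playing the role of the "ignoreOldUpdates" convenience processor. */
final class IgnoreOldUpdates {
    private IgnoreOldUpdates() {}

    /** Returns true if the add was applied, false if a conflict was swallowed. */
    static boolean add(LeaderVersionCheck leader, String id, long version) {
        try {
            leader.add(id, version);
            return true;
        } catch (VersionConflictException e) {
            return false;
        }
    }
}
```

Keeping the conflict-swallowing step in its own wrapper mirrors the idea that it could be reused for any 409 conflict, not just the doc-based version check.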
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803636#comment-13803636 ] Yonik Seeley commented on SOLR-5374:

It seems like concurrency is not yet handled? Under concurrent updates, the patch won't guarantee the correct ordering.

{code}
Thread 1: update with version=10, check version on doc A, returns 5
Thread 2: update with version=11, check version on doc A, returns 5
Thread 2: update completes with version 11
Thread 1: update completes with version 10
{code}

There's going to need to be either some sort of synchronization or optimistic concurrency.

Also, it looks like the current code assumes it's running on the leader? The realtime-get done is local only, and if you hit the wrong shard with the request, it will look like the doc doesn't exist yet.
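The interleaving described above is a classic check-then-act race. As an illustration only (plain Java, not the patch's code), the sketch below contrasts a racy check-then-write with an atomic per-key compare-and-write; `VersionedStore` and its methods are hypothetical names, and a `ConcurrentHashMap` stands in for the index.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory stand-in for "doc id -> user-supplied version"; not Solr code. */
final class VersionedStore {
    private final ConcurrentMap<String, Long> versions = new ConcurrentHashMap<>();

    /**
     * Racy variant: the version check and the write are two separate steps,
     * so two threads can both pass the check and the older write can land last.
     */
    boolean checkThenAct(String id, long newVersion) {
        Long current = versions.get(id);
        if (current != null && current >= newVersion) {
            return false;
        }
        versions.put(id, newVersion); // another thread may have written in between
        return true;
    }

    /**
     * Atomic variant: ConcurrentHashMap.compute runs the comparison and the
     * write as one atomic step for the given key.
     */
    boolean updateIfNewer(String id, long newVersion) {
        boolean[] applied = {false};
        versions.compute(id, (key, current) -> {
            if (current == null || current < newVersion) {
                applied[0] = true;
                return newVersion;
            }
            return current; // keep the existing, newer-or-equal version
        });
        return applied[0];
    }

    Long currentVersion(String id) {
        return versions.get(id);
    }
}
```

With `updateIfNewer`, the update carrying version 10 is rejected whenever version 11 has already been written, regardless of which thread read the old value first.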
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802613#comment-13802613 ] Ramkumar Aiyengar commented on SOLR-5374:
My bad, I didn't quite get the Optimistic Concurrency feature, it would indeed do what I was describing. Thanks for the link.
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802028#comment-13802028 ] Hoss Man commented on SOLR-5374:

bq. it might not always be practical in a distributed system to have a version ordering across updates as it's likely to involve a single point of coordination

Ramkumar: I'm not sure I understand your comment. The specific use case I'm targeting here is precisely the situation where there is already an externally generated, per-document version that we want to use to enforce that only "new" updates are processed. See the issue description:

{panel}I've recently encountered some situations where I really wanted to be able to let the client specify an arbitrary version, on a per document basis, (ie: generated by an external system, or perhaps a timestamp of when a file was last modified) ...{panel}

bq. In such cases, it might just suffice for the system if the versions were just equality comparable rather than having a strict ordering – i.e. update if the previous version equals what I expect, else reject the update

What you are describing sounds like what is already possible using Solr's existing optimistic concurrency features...

https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents#UpdatingPartsofDocuments-OptimisticConcurrency

I'm trying to address use cases I've seen come up recently where the client app doesn't want to have to check, or keep track of, what version is in the _index_ (in several cases because they are already keeping track in an independent authoritative data store); they just want to add/replace a document only if it's "newer" than whatever version is currently in the index.
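The two acceptance rules contrasted in this exchange can be written side by side. This is an illustrative sketch only: `UpdateRules` and its method names are hypothetical, and the real features involve more than a single comparison.

```java
/** Hypothetical helper contrasting the two acceptance rules; not Solr code. */
final class UpdateRules {
    private UpdateRules() {}

    /**
     * Equality rule (the style of the existing optimistic concurrency feature):
     * the client must know exactly which version is currently indexed.
     */
    static boolean acceptIfExpected(long indexedVersion, long expectedVersion) {
        return indexedVersion == expectedVersion;
    }

    /**
     * Ordering rule (the doc-based versioning discussed in this issue):
     * the client only supplies its own version, which must be strictly newer.
     */
    static boolean acceptIfNewer(long indexedVersion, long newVersion) {
        return newVersion > indexedVersion;
    }
}
```

Using the numbers from the issue description, `acceptIfNewer(102, 101)` rejects the update and `acceptIfNewer(99, 101)` accepts it, without the client ever asking what the indexed version is.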
[jira] [Commented] (SOLR-5374) Support user configured doc-centric versioning rules
[ https://issues.apache.org/jira/browse/SOLR-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801632#comment-13801632 ] Ramkumar Aiyengar commented on SOLR-5374:
This feature certainly is helpful as is, but it might not always be practical in a distributed system to have a version ordering across updates as it's likely to involve a single point of coordination if you need one. In such cases, it might just suffice for the system if the versions were just equality comparable rather than having a strict ordering -- i.e. update if the previous version equals what I expect, else reject the update. In some sense, if some external coordinator is able to guarantee a version ordering amongst updates, couldn't the same system be able to order the queue of updates to Solr?