[ 
https://issues.apache.org/jira/browse/SOLR-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257682#comment-13257682
 ] 

Yonik Seeley edited comment on SOLR-3178 at 4/19/12 7:14 PM:
-------------------------------------------------------------

bq. it is hard to push too many small partly-done features.

As committers it's the opposite - it's sometimes hard to consume patches that 
do a lot of different things.

bq. It works with multi-document update request, providing fully detailed 
feedback to the client about which document-updates failed (and why) and which 
succeeded.

That's great!  But it's also a separate feature that we've needed for a while 
(and I think there has been a JIRA issue open for it for a while).

bq. Looking shortly at you patch, I belive, that if two threads in the server 
JVM overlaps in a way that is unfortunate enough, then it is possible that data 
will not be stored or will be overwritten without an exception being thrown to 
indicate that to the client.

I'm confident in the synchronization/concurrency part - it's just reusing what 
was put in place for SolrCloud to handle reordering of updates to replicas and 
is very well tested (see TestRealTimeGet).  But please let us know if you see a 
problem with it, as that would mean a problem with SolrCloud today (even 
without this patch).

bq. Feedback to client is even possible if using ConcurrentSolrServer, because 
it has been changed to provide a Future from "request" - a Future that will 
eventually provide the calling client with the result from the request.

This sounds like another great feature!  It really deserves it's own issue so 
people can more easily provide review/feedback and not have it coupled to this 
issue.

Some other additions that can be handled later that I see:
- SolrJ support for easier passing of _version_ on a delete, constants for 
_version_, etc
- Use of "1" as a generic "exists" version (i.e. update document only if it 
already exists)
- If one document in a batch fails, don't automatically abort, and provide info 
back about which docs succeeded and which failed (your first point).

That last one in particular needs some good discussion and design work and 
really deserves it's own issue.

Down to the specifics of this patch... it's very non-invasive to core, 
consisting of the following code block (once for add, once for delete):
{code}
            if (versionOnUpdate != 0) {
              Long lastVersion = vinfo.lookupVersion(cmd.getIndexedId());
              long foundVersion = lastVersion == null ? -1 : lastVersion;
              if ( versionOnUpdate == foundVersion || (versionOnUpdate < 0 && 
foundVersion < 0) ) {
                // we're ok if versions match, or if both are negative (all 
missing docs are equal)
              } else {
                throw new SolrException(ErrorCode.CONFLICT, "version conflict 
for " + cmd.getPrintableId() + " expected=" + versionOnUpdate + " actual=" + 
foundVersion);
              }
            }
{code}

Having this current improvement committed in no way blocks any future 
improvements you may come up with (including deleting the code quoted above and 
using whatever method you have come up with for checking the versions), and it 
even uses the same API (or a subset of it) via _version_ and 409.  Progress, 
not perfection!

                
      was (Author: ysee...@gmail.com):
    bq. it is hard to push too many small partly-done features.

As committers it's the opposite - it's sometimes hard to consume patches that 
do a lot of different things.

bq. It works with multi-document update request, providing fully detailed 
feedback to the client about which document-updates failed (and why) and which 
succeeded.

That's great!  But it's also a separate feature that we've needed for a while 
(and I think there has been a JIRA issue open for it for a while).

bq. Looking shortly at you patch, I belive, that if two threads in the server 
JVM overlaps in a way that is unfortunate enough, then it is possible that data 
will not be stored or will be overwritten without an exception being thrown to 
indicate that to the client.

I'm confident in the synchronization/concurrency part - it's just reusing what 
was put in place for SolrCloud to handle reordering of updates to replicas and 
is very well tested (see TestRealTimeGet).  But please let us know if you see a 
problem with it, as that would mean a problem with SolrCloud today (even 
without this patch).

Some other additions that can be handled later that I see:
- SolrJ support for easier passing of _version_ on a delete, constants for 
_version_, etc
- Use of "1" as a generic "exists" version (i.e. update document only if it 
already exists)
- If one document in a batch fails, don't automatically abort, and provide info 
back about which docs succeeded and which failed (your first point).

That last one in particular needs some good discussion and design work and 
really deserves it's own issue.

Down to the specifics of this patch... it's very non-invasive to core, 
consisting of the following code block (once for add, once for delete):
{code}
            if (versionOnUpdate != 0) {
              Long lastVersion = vinfo.lookupVersion(cmd.getIndexedId());
              long foundVersion = lastVersion == null ? -1 : lastVersion;
              if ( versionOnUpdate == foundVersion || (versionOnUpdate < 0 && 
foundVersion < 0) ) {
                // we're ok if versions match, or if both are negative (all 
missing docs are equal)
              } else {
                throw new SolrException(ErrorCode.CONFLICT, "version conflict 
for " + cmd.getPrintableId() + " expected=" + versionOnUpdate + " actual=" + 
foundVersion);
              }
            }
{code}

Having this current improvement committed in no way blocks any future 
improvements you may come up with (including deleting the code quoted above and 
using whatever method you have come up with for checking the versions), and it 
even uses the same API (or a subset of it) via _version_ and 409.  Progress, 
not perfection!


                  
> Versioning - optimistic locking
> -------------------------------
>
>                 Key: SOLR-3178
>                 URL: https://issues.apache.org/jira/browse/SOLR-3178
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 3.5
>         Environment: All
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: RDBMS, insert, locking, nosql, optimistic, uniqueKey, 
> update, versioning
>             Fix For: 4.0
>
>         Attachments: SOLR-3178.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In order increase the ability of Solr to be used as a NoSql database (lots of 
> concurrent inserts, updates, deletes and queries in the entire lifetime of 
> the index) instead of just a search index (first: everything indexed (in one 
> thread), after: only queries), I would like Solr to support versioning to be 
> used for optimistic locking.
> When my intent (see SOLR-3173) is to update an existing document, I will need 
> to provide a version-number equal to "the version number I got when I fetched 
> the existing document for update" plus one. If this provided version-number 
> does not correspond to "the newest version-number of that document at the 
> time of update" plus one, I will get a VersionConflict error. If it does 
> correspond the document will be updated with the new one, so that "the newest 
> version-number of that document" is NOW one higher than before the update. 
> Correct but efficient concurrency handling.
> When my intent (see SOLR-3173) is to insert a new document, the version 
> number provided will not be used - instead a version-number 0 will be used. 
> According to SOLR-3173 insert will only succeed if a document with the same 
> value on uniqueKey-field does not already exist.
> In general when talking about different versions of "the same document", of 
> course we need to be able to identify when a document "is the same" - that, 
> per definition, is when the values of the uniqueKey-fields are equal. 
> The functionality provided by this issue is only really meaningfull when you 
> run with "updateLog" activated.
> This issue might be solved more or less at the same time as SOLR-3173, and 
> only one single SVN patch might be given to cover both issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to