Hello, Alexey. Great mail, by the way. I think it would be great to have this feature in Ignite.
> I haven't removed thread id completely from code.

Can we remove thread id completely from the code?
Can you estimate how much effort that would take?

As far as I can see from the parent task [1], some complex tests need to be implemented.
Are they present in the prototype?

[1] https://issues.apache.org/jira/browse/IGNITE-4887?focusedCommentId=16069655&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16069655

On Mon, 26/02/2018 at 10:59 +0000, Aleksey Kuznetsov wrote:
> Hello Igniters!
>
> Currently we have suspension/resuming implemented for optimistic
> transactions [1].
> Until suspend/resume is supported for pessimistic transactions, JTA isn't
> fully supported [4].
>
> I’m working on a ticket "Suspension/resuming for pessimistic transactions"
> [2].
> The goal of the ticket is to support suspend/resume for pessimistic
> transactions.
>
> # Benefits of suspending/resuming transaction.
>
> 1. Full JTA standard support.
> 2. Increase of throughput in high-load scenarios.
>    A suspend operation would allow releasing Ignite threads and
>    optionally performing some other logic.
>
> Note, the current API has suspend/resume only for optimistic transactions,
> which confuses users.
>
> # Real life example.
>
> Consider the following scenario:
>
> 1. The application starts an Ignite transaction.
> 2. Business logic is executed inside the transaction.
> 3. For commit/rollback, the application needs an approval message from an
>    external agent.
> 4. Currently, the thread inside Ignite is idle until approval is received.
> 4a. When suspend/resume support is implemented, the application can
>    suspend the transaction and release the thread inside Ignite.
>
> # How pessimistic transaction works.
>
> When we perform put/get operations in pessimistic transactions, a lock
> request (`GridNearLockRequest`) is sent to remote nodes.
> The request contains the id of the thread in which the operation was
> performed (`IgniteTxAdapter#threadId`).
> In pessimistic mode, multiple transaction objects are created - on
> primary, on backup nodes, and on originating node:
> `GridNearTxLocal`, `GridDhtTxLocal`, `GridNearTxRemote`, `GridDhtTxRemote`.
>
> Thread id is used in logic on these nodes.
> For instance, to check whether a thread has successfully locked the key
> after a lock acquisition attempt.
> Or to check whether an active transaction exists.
>
> # Main challenge for implementation.
>
> I've analysed implementation approaches and see the core issue:
>
> The essential problem with suspending/resuming lies in the thread id field
> transferred to remote nodes during put/get operations.
>
> Imagine we want to suspend a transaction and resume it in another thread.
> See the code snippet below:
>
> ```
> tx = ignite.transactions().txStart(PESSIMISTIC, SERIALIZABLE);
>
> cache.put(1, 1);
>
> tx.suspend();
> ....
>
> // In another thread.
> tx.resume(); // Thread id will be changed in transaction instance.
> ```
>
> The original thread id is transferred to and saved on the remote node.
> After resuming, the thread id on the local node differs from the one on
> the remote node.
> I want to avoid one more network round trip to change the thread id on the
> remote node after the transaction is resumed.
>
> # Design proposal.
>
> Transaction id (`xid`) can be used instead of thread id on remote nodes.
> The following solution is possible for the ticket:
>
> Replace thread id with transaction id when sending to remote nodes.
> Thread id will be removed from the following classes:
> `IgniteTxAdapter`, `GridDistributedTxPrepareRequest`,
> `GridDistributedTxFinishRequest`, `GridDistributedTxFinishResponse`.
>
> I haven't removed thread id completely from code. Thread id is moved to
> `GridNearTxLocal`.
> We still need it in the near local transaction for many reasons, for
> example to ensure that only the thread that started the transaction can
> suspend it in `GridNearTxLocal#suspend()`.
> In future we can remove thread id completely.
> I propose to study this question in another ticket.
>
> Also, thread id remains in `GridDistributedLockRequest`.
> The lock request is used by cache locks, and it needs to transfer thread
> id to remote nodes.
> For example, to use cache locks along with cache put/get operations, see
> `GridNearTxLocal#updateExplicitVersion`.
> As for pessimistic transactions, thread id in `GridDistributedLockRequest`
> is set to `UNDEFINED_THREAD_ID`, which means we must not use it remotely.
>
> Note that if a user suspends a transaction and forgets to resume it, the
> transaction will be rolled back once the timeout has occurred.
>
> In my design, when a transaction is suspended, all locked keys remain
> locked.
>
> Please see my prototype of the proposed implementation [3].
>
> The proposed changes are relatively small.
> They ensure consistency of information about locks if the thread id
> changes within one transaction (by suspend/resume).
> The correct id will be used for locks on remote nodes. It also requires
> painstaking work, but the changes will not affect the logic of other
> components.
>
> Please tell me what you think. Any suggestions and comments will be
> helpful.
>
> If you agree with my design, I will also do benchmarking.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-5712
> [2] https://issues.apache.org/jira/browse/IGNITE-5714
> [3] https://github.com/apache/ignite/pull/2789
> [4] Section 3.2.3
> http://download.oracle.com/otn-pub/jcp/jta-1.1-spec-oth-JSpec/jta-1_1-spec.pdf
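
To make the thread id problem from the quoted mail concrete, here is a minimal sketch in plain Java. It is not Ignite code: `LockOwnerSketch`, `ThreadIdLockTable` and `XidLockTable` are hypothetical names invented for this illustration, and a `UUID` merely stands in for the transaction id (`xid`). It only shows why an ownership check keyed by thread id stops matching once the transaction continues in another thread, while a check keyed by transaction id keeps matching.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only; none of these classes exist in Ignite. */
public class LockOwnerSketch {
    /** Hypothetical "remote node" lock table that remembers the locking thread id. */
    static class ThreadIdLockTable {
        private final Map<Integer, Long> owners = new ConcurrentHashMap<>();

        boolean lock(int key, long threadId) {
            Long prev = owners.putIfAbsent(key, threadId);
            return prev == null || prev == threadId;
        }

        boolean isOwner(int key, long threadId) {
            Long owner = owners.get(key);
            return owner != null && owner == threadId;
        }
    }

    /** Hypothetical lock table that remembers the transaction id instead. */
    static class XidLockTable {
        private final Map<Integer, UUID> owners = new ConcurrentHashMap<>();

        boolean lock(int key, UUID xid) {
            UUID prev = owners.putIfAbsent(key, xid);
            return prev == null || prev.equals(xid);
        }

        boolean isOwner(int key, UUID xid) {
            return xid.equals(owners.get(key));
        }
    }

    /**
     * Locks key 1 in the calling thread, then checks ownership from a second
     * thread, which plays the role of the thread that resumes the transaction.
     *
     * @return {ownerByThreadId, ownerByXid} as seen from the second thread.
     */
    static boolean[] ownershipSeenFromAnotherThread() {
        ThreadIdLockTable byThread = new ThreadIdLockTable();
        XidLockTable byXid = new XidLockTable();
        UUID xid = UUID.randomUUID(); // Stand-in for the transaction id.

        byThread.lock(1, Thread.currentThread().getId());
        byXid.lock(1, xid);

        boolean[] res = new boolean[2];

        Thread other = new Thread(() -> {
            // The "resumed" transaction now runs with a different thread id.
            res[0] = byThread.isOwner(1, Thread.currentThread().getId());
            res[1] = byXid.isOwner(1, xid);
        });

        other.start();

        try {
            other.join();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        return res;
    }

    public static void main(String[] args) {
        boolean[] res = ownershipSeenFromAnotherThread();

        System.out.println("owner by thread id: " + res[0]); // false
        System.out.println("owner by xid: " + res[1]);       // true
    }
}
```

In this toy model no extra message is needed to keep the ownership check valid after the thread changes, which matches the motivation given in the design proposal for sending `xid` instead of thread id.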