Re: [os-libsynthesis] temporary local ID + FinalizeLocalID
On Wed, 2013-06-26 at 16:10 +0200, Patrick Ohly wrote:
> On Mon, 2013-06-10 at 12:35 +0200, Lukas Zeller wrote:
> > On 10.06.2013, at 11:57, Patrick Ohly <patrick.o...@intel.com> wrote:
> > > I'll probably try something else: if commands were delayed in a message
> > > which is marked Final, then force execution of the commands at the end
> > > of processing that message and issue their Status commands, just as
> > > syncing normally would.
> > Sounds like a good compromise to me.
> I've implemented that. I've run this through a full test against different
> peers, without problems. Or rather, there were problems earlier, which
> shows that the tests covered relevant corner cases ;-) These problems have
> been addressed.

After adding more tests, I found one more problem: the combination of
MoreData and queuing of preceding items failed. Patch attached. I intend to
squash it into the "preserve status order" patch before including it in the
FDO master branch.

-- 
Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although I am an
employee of Intel, the statements I make here in no way represent Intel's
position on the issue, nor am I authorized to speak on behalf of Intel on
this matter.

From 1f01b69ee1aa597924f90c48a299db254a0cd5d9 Mon Sep 17 00:00:00 2001
From: Patrick Ohly <patrick.o...@intel.com>
Date: Mon, 1 Jul 2013 08:52:47 +0200
Subject: [PATCH 1/4] merge: ordered status

Add/Replace commands which contain incomplete items are special: they must
trigger a Status 213 response in the same message, which cannot be done
without processing the previously queued commands first. We could try to
queue the incomplete command without processing it at all, but that would
change the response in a way that might be unexpected by the peer (although
legal). Better play it safe and finish all queued commands, then process
the command.
---
 src/sysync/synccommand.cpp | 18 ++++++++++++++++++
 src/sysync/synccommand.h   |  4 ++++
 src/sysync/syncsession.cpp |  7 +++++++
 3 files changed, 29 insertions(+)

diff --git a/src/sysync/synccommand.cpp b/src/sysync/synccommand.cpp
index ef363d1..3b27d63 100755
--- a/src/sysync/synccommand.cpp
+++ b/src/sysync/synccommand.cpp
@@ -2289,6 +2289,24 @@ missingeoc:
 } // TSyncOpCommand::AddNextChunk
 
+// SyncOp commands can execute out of order except when they
+// contain chunked items, because then we would have to issue
+// a 213 Status immediately, which would violate the ordering
+// of Status replies.
+bool TSyncOpCommand::canExecuteOutOfOrder()
+{
+  SmlItemListPtr_t *itemnodePP = &(fSyncOpElementP->itemList);
+  while (*itemnodePP) {
+    SmlItemListPtr_t thisitemnode = *itemnodePP;
+    if (thisitemnode->item &&
+        (thisitemnode->item->flags & SmlMoreData_f)) {
+      return false;
+    }
+    itemnodePP = &(thisitemnode->next);
+  }
+  return true;
+}
+
 // execute command (perform real actions, generate status)
 // returns true if command has executed and can be deleted
 bool TSyncOpCommand::execute(void)
diff --git a/src/sysync/synccommand.h b/src/sysync/synccommand.h
index bc5869c..c070905 100755
--- a/src/sysync/synccommand.h
+++ b/src/sysync/synccommand.h
@@ -90,6 +90,9 @@ public:
 #endif
   // - analyze command (but do not yet execute)
   virtual bool analyze(TPackageStates aPackageState) { fPackageState=aPackageState; return true; }; // returns false if command is bad and cannot be executed
+  // - execute() can be called even if there are other already queued commands.
+  //   True by default, exceptions must be defined explicitly.
+  virtual bool canExecuteOutOfOrder() { return true; }
   // - execute command (perform real actions, generate status)
   virtual bool execute(void); // returns true if command could execute, false if it must be queued for later finishing (next message)
   // - get number of bytes that will be still available in the workspace after
@@ -413,6 +416,7 @@ public:
   virtual ~TSyncOpCommand();
   virtual bool isSyncOp() { return true; };
   virtual bool analyze(TPackageStates aPackageState);
+  virtual bool canExecuteOutOfOrder();
   virtual bool execute(void);
 #ifndef USE_SML_EVALUATION
   // - get (approximated) message size required for sending it
diff --git a/src/sysync/syncsession.cpp b/src/sysync/syncsession.cpp
index 870342a..a394bed 100644
--- a/src/sysync/syncsession.cpp
+++ b/src/sysync/syncsession.cpp
@@ -2549,6 +2549,13 @@ Ret_t TSyncSession::process(TSmlCommand *aSyncCommandP)
       delayExecUntilNextRequest(aSyncCommandP);
     }
     else {
+      if (fDelayedExecutionCommands.size()>0 &&
+          !aSyncCommandP->canExecuteOutOfOrder()) {
+        PDEBUGPRINTFX(DBG_SESSION,("%s: command cannot be executed with other commands already delayed - flush queue",aSyncCommandP->getName()));
+        fCmdIncoming = NULL;
+        tryDelayedExecutionCommands();
+      }
+
       // command is ok, execute it
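The ordering rule the patch implements can be sketched in isolation. Everything below is a hypothetical stand-in (the `Command` struct and `process()` function are not libsynthesis types); the sketch only models the rule: a command carrying a chunked (MoreData) item forces the delayed-command queue to be flushed first, so its immediate Status 213 cannot overtake the Status replies of earlier commands.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical model of an incoming SyncML command (not a libsynthesis type).
struct Command {
  std::string name;
  bool chunked;    // item carries the MoreData flag -> needs Status 213 now
  bool wantsDelay; // backend asked to postpone execution (e.g. not ready yet)
};

// Mirrors the rule of TSyncOpCommand::canExecuteOutOfOrder(): only commands
// without chunked items may be reordered relative to delayed commands.
bool canExecuteOutOfOrder(const Command &c) { return !c.chunked; }

// Simulates processing one message; returns the order of Status replies.
std::vector<std::string> process(const std::vector<Command> &incoming) {
  std::deque<Command> delayed;
  std::vector<std::string> statusOrder;
  for (const Command &c : incoming) {
    if (!delayed.empty() && !canExecuteOutOfOrder(c)) {
      // Flush the queue first, so the immediate 213 for the chunked
      // command cannot overtake Status replies of earlier commands.
      for (const Command &d : delayed) statusOrder.push_back(d.name);
      delayed.clear();
    }
    if (c.wantsDelay && canExecuteOutOfOrder(c))
      delayed.push_back(c);          // postpone, Status comes later
    else
      statusOrder.push_back(c.name); // execute now, Status goes out in order
  }
  // end of message: anything still delayed gets its Status last
  for (const Command &d : delayed) statusOrder.push_back(d.name);
  return statusOrder;
}
```

Either way the peer sees Status replies in command order; the chunked command merely forces the flush to happen earlier.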
Re: [os-libsynthesis] temporary local ID + FinalizeLocalID
Hello Patrick,

On 10.06.2013, at 11:57, Patrick Ohly <patrick.o...@intel.com> wrote:
> Looking at it like that, the question becomes: how does the server decide
> that it still needs an answer from the client? Perhaps the problem is on
> the client side after all: in message msgID=2, the client sent the Add
> command with a Final flag. In msgID=3, it acknowledges the server's
> SyncCmd with a Status, *without* a Final flag. The server therefore (in
> its MessageEnded checks) continues with the session, whereas the client
> quits it.
>
> Should the client continue to set the Final flag as long as it is waiting
> for the server to finish?

This is one of the things the specs did not nail down, and actual
implementations varied. Some severely misbehaved when receiving extra
<Final/> elements (more than one per phase), and others were tolerant.
Things are complicated further because <Final/> is session scope, whereas
the actual phase is datastore scope. This can lead to really strange
overlaps when multiple datastores are involved.

> I didn't do that because the server didn't set it either. My feeling is
> that untangling all these state changes is going to be tricky.

Undoubtedly. Untangling the code, which hasn't become prettier over time,
would be one thing. However, the real problem is that there are no clear
specs from which to even draw a state diagram in the first place.

> I'll probably try something else: if commands were delayed in a message
> which is marked Final, then force execution of the commands at the end of
> processing that message and issue their Status commands, just as syncing
> normally would.

Sounds like a good compromise to me.

> The effect will be that commands only get delayed across messages if
> there will be more of them coming in anyway. The hope is that this will
> avoid or at least minimize state change issues. Note that the problem
> that I am running into also exists in the unmodified code: I expect
> similar issues when the server delays processing because of timing.

Probably not, because the server can only delay processing at the very
beginning of the sync, while it loads the sync set in the background. So
far, this was the only (but well tested, with very long lasting sync set
loads) case of delaying command execution. Delaying execution after sync
has actually started is a new case.

Best Regards, Lukas Zeller

___
os-libsynthesis mailing list
os-libsynthesis@synthesis.ch
http://lists.synthesis.ch/mailman/listinfo/os-libsynthesis
Re: [os-libsynthesis] temporary local ID + FinalizeLocalID
On Mon, 2013-06-10 at 12:35 +0200, Lukas Zeller wrote:
> On 10.06.2013, at 11:57, Patrick Ohly <patrick.o...@intel.com> wrote:
> > I'll probably try something else: if commands were delayed in a message
> > which is marked Final, then force execution of the commands at the end
> > of processing that message and issue their Status commands, just as
> > syncing normally would.
> Sounds like a good compromise to me.

It seems to be working, too :-)

> > The effect will be that commands only get delayed across messages if
> > there will be more of them coming in anyway. The hope is that this will
> > avoid or at least minimize state change issues. Note that the problem
> > that I am running into also exists in the unmodified code: I expect
> > similar issues when the server delays processing because of timing.
> Probably not, because the server can only delay processing at the very
> beginning of the sync, while it loads the sync set in the background. So
> far, this was the only (but well tested, with very long lasting sync set
> loads) case of delaying command execution. Delaying execution after sync
> has actually started is a new case.

Ah, I see. I had skipped over the fLocalSyncDatastoreP->engIsStarted(false)
check in the relevant code:

TSyncSession::processSyncOpItem()
  ...
  // check if we can process it now
  // Note: request time limit is active in server only.
  if (!fLocalSyncDatastoreP->engIsStarted(false) || RemainingRequestTime()<0) {
    aQueueForLater=true; // re-execute later...
    return true; // ...but otherwise ok
  }

-- 
Best Regards, Patrick Ohly
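The contract of that check can be sketched as a hypothetical free function (the real code is a member of TSyncSession querying the datastore and request timers): returning true means "no error", while setting aQueueForLater asks the engine to re-execute the same item in a later processing round.

```cpp
#include <cassert>

// Hypothetical free-function version of the decision quoted above; the
// parameter names mirror the snippet, the signature is illustrative only.
bool processSyncOpItem(bool engineStarted, long remainingRequestTime,
                       bool &aQueueForLater) {
  if (!engineStarted || remainingRequestTime < 0) {
    aQueueForLater = true; // re-execute later...
    return true;           // ...but otherwise ok
  }
  aQueueForLater = false;
  // ... actual processing of the sync operation would happen here ...
  return true;
}
```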
Re: [os-libsynthesis] temporary local ID + FinalizeLocalID
On Mon, 2013-06-03 at 22:17 +0200, Lukas Zeller wrote:
> > How about yet another approach: the store is allowed to return a new
> > error code, LOCERR_AGAIN, for add/update/delete operations. The engine
> > then remembers all operations with that return code and calls them
> > again (with the exact same parameters and without freeing resources!)
> > at a suitable time, for example the end of the current message. In the
> > second call, the store must execute or finish the operation. The engine
> > must not call an operation on an item for which it has already
> > something pending. If that happens, it has to complete the first
> > operation first. That way the store can keep a list of pending
> > operations with the ItemID as key.
> Sounds like a good solution indeed. I like it better than late-handling
> return codes, especially because it avoids any risk of breaking existing
> datastores (and possibly unsafe assumptions in their implementations),
> because for those nothing would change outside *and* inside the engine.
> It's also somewhat similar to the finalisationscript mechanism, which
> also keeps items (or parts of them) in memory for using them again at the
> end of the session, usually to resolve cross references (parent/child
> tasks for example). Maybe some of the mechanisms can be re-used for
> LOCERR_AGAIN.

I found a different mechanism: TSyncSession::processSyncOpItem() can decide
to delay execution of the command by setting a flag. Currently this is done
when processing the message already took too long. I have changed the
command and item processing call chain and my backend so that it can kick
off the operation and continue when the chain is invoked again. For that, I
am setting aQueueForLater=true to use the existing queuing mechanism for
SyncML commands.

Now I found one problem with that: after the first of several Add or Update
commands got queued, all following commands are also queued. What I'd like
to see instead is that they all get processed.

Then in the level above the engine, right before sending the response, I
would gather all pending operations and combine them into a batched,
asynchronous add or update operation. The batching is expected to be much
more efficient with EDS. It also allows overlapping local processing in the
PIM storage with network IO. But right now, I always only get one item to
be batched, because everything else is still in the queue for later
processing.

Is "a command received after other commands that needed to be delayed must
be delayed, too" something which is imposed by SyncML? I tried to think of
situations where the engine needs to enforce completion of the pending
operation before triggering another one, but couldn't think of such a
situation. My expectation is that either the backend will properly
serialize the item changes, or the item changes are independent (update
foo, delete bar).

But what could happen is that "update foo" gets started, does not complete,
and then "delete bar" gets processed right away. That would re-order the
status messages such that the status for the later command gets sent first.
That's because my backend currently can only do insert/update
asynchronously, but not deletes.

-- 
Best Regards, Patrick Ohly
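On the store side, the LOCERR_AGAIN idea discussed above could look roughly like this. Everything here is a hypothetical sketch (the `Backend` class and the constant values are illustrative, not the libsynthesis API): the first call on an item queues it and reports "again", and a flush at a suitable time, e.g. the end of the message, executes the whole batch at once and yields a final status per item, keyed by ItemID as Patrick proposed.

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustrative status codes; the real libsynthesis error codes differ.
const int LOCERR_OK = 0;
const int LOCERR_AGAIN = 1; // "not done yet, call again with the same item"

class Backend {
  std::map<std::string, std::string> fPending; // itemID -> data, the batch
  std::map<std::string, std::string> fStore;   // stand-in for the database
public:
  // First call queues the item and reports LOCERR_AGAIN; per the proposal,
  // the engine must not start a second operation on an item that is
  // already pending, so a repeated call just keeps reporting "again".
  int insertItem(const std::string &id, const std::string &data) {
    if (!fPending.count(id))
      fPending[id] = data; // defer the real write
    return LOCERR_AGAIN;
  }
  // Called at the end of the current message: one batched write instead of
  // N round trips, with a final status code per item.
  std::map<std::string, int> flush() {
    std::map<std::string, int> statusPerItem;
    for (const auto &p : fPending) {
      fStore[p.first] = p.second;
      statusPerItem[p.first] = LOCERR_OK;
    }
    fPending.clear();
    return statusPerItem;
  }
  bool has(const std::string &id) const { return fStore.count(id) > 0; }
};
```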
[os-libsynthesis] temporary local ID + FinalizeLocalID
Hello!

I'm trying to batch local database adds together to increase performance in
SyncEvolution (https://bugs.freedesktop.org/show_bug.cgi?id=52669 and
https://bugs.freedesktop.org/show_bug.cgi?id=64838#c5). My hope was that I
can postpone the actual modification and return a temporary local ID, then
later (or in parallel) combine multiple items together in a batched add.
Finally, the temporary ID was meant to be replaced by the final one in
FinalizeLocalID.

But I hit the first snag pretty early: FinalizeLocalID() doesn't seem to
get called on the server side, where I need this feature primarily (for
PBAP, CalDAV, ActiveSync the server side is the one with the local
database). dsFinalizeLocalID, the wrapper around the store's
FinalizeLocalID, gets called in three functions:

(gdb) b sysync::TBinfileImplDS::changeLogPostflight
Breakpoint 2 at 0x106f956: file /home/pohly/syncevolution/libsynthesis/src/sysync/binfileimplds.cpp, line 462.
(gdb) b sysync::TBinfileImplDS::SaveAdminData
Breakpoint 3 at 0x10739fd: file /home/pohly/syncevolution/libsynthesis/src/sysync/binfileimplds.cpp, line 2389.
(gdb) b sysync::TLocalEngineDS::engGenerateMapItems
Breakpoint 4 at 0x109c79c: file /home/pohly/syncevolution/libsynthesis/src/sysync/localengineds.cpp, line 6635.

None of these get called on the server side during a refresh-from-client,
slow, or normal two-way sync. That makes sense: map items are only relevant
on the client, and the server is not using the binfile class. But shouldn't
the server also call FinalizeLocalID somewhere? Where would be the right
place for it?

Regarding FinalizeLocalID() in the client, I see some conceptual issues
with it, at least for what I am trying to achieve. My main concern is about
error handling. FinalizeLocalID() can return an error code, but
dsFinalizeLocalID does not check for errors; it merely checks for "ID
modified". So in case of an error, the local store itself has to remember
that writing failed and do something.

What that "something" could be is not clear. It could return an error in
EndDataWrite(), in which case I would expect the engine to enforce a slow
sync in the next sync session. I can't think of other, realistic options
right now.

I think a better solution would be to delay handling the result of a
database write until the entire SyncML message has been processed. Then
give the store a chance to finalize the ID of each item that was touched as
part of that SyncML message. The store can use that opportunity to flush
the pending operations and return error codes for each item, just as it
would normally do in the synchronous write operations. Then the engine
continues with the same error handling that it already does.

The advantage, besides better error handling, is that the store has a
natural place to flush pending operations. Without the intermediate
FinalizeLocalID() calls, a new "max pending operations" setting independent
of the SyncML message size would be needed.

Comments?

-- 
Best Regards, Patrick Ohly
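The temporary-ID scheme described above can be sketched like this. Class and method names are hypothetical; only `finalizeLocalID` mirrors the contract of the engine's FinalizeLocalID, including being safe to call on an ID that is already final.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical store: adds return a temporary ID immediately, the real
// database write is deferred into a batch (not the libsynthesis API).
class TempIdStore {
  int fNextTemp = 0;
  int fNextFinal = 0;
  std::map<std::string, std::string> fPending;     // tempID -> deferred data
  std::map<std::string, std::string> fTempToFinal; // tempID -> finalID
public:
  std::string addItem(const std::string &data) {
    std::string tmp = "#tmp-" + std::to_string(fNextTemp++);
    fPending[tmp] = data; // defer the actual write
    return tmp;
  }
  // End of the SyncML message: one batched insert; the database assigns
  // the final IDs, which we record for later finalization.
  void commitBatch() {
    for (const auto &p : fPending)
      fTempToFinal[p.first] = "final-" + std::to_string(fNextFinal++);
    fPending.clear();
  }
  // FinalizeLocalID-style lookup: maps a temporary ID to its final one,
  // and returns already-final IDs unchanged (idempotent).
  std::string finalizeLocalID(const std::string &id) const {
    auto it = fTempToFinal.find(id);
    return it == fTempToFinal.end() ? id : it->second;
  }
};
```

Per-item error reporting at flush time, as proposed above, would extend `commitBatch()` to return a status code per temporary ID.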
Re: [os-libsynthesis] temporary local ID + FinalizeLocalID
Hello Patrick,

FinalizeLocalID came into existence as a workaround for the iOS client
datastores, in particular the address book, which did not generate proper
IDs until a very expensive "save all" method was called. I tried to
implement it as generically as possible, but avoiding any extra risk (and
development time I didn't have then) beyond what was needed to fix that
problem. Which was for binfile based clients only.

For the same reason (less risk to break something) the FinalizeLocalID
implementation must be safe against being called with an already finalized
ID. The idea is that the calls to FinalizeLocalID can be placed wherever a
final ID is required, without the need to track if the same item will get
finalized again later. So making the FinalizeLocalID mechanism work for
non-binfile and server cases should be not much more than placing the call
at a few more places. Probably, for the server, it's only when processing
Map items, and before persisting them at the end of the request.

> How about yet another approach: the store is allowed to return a new
> error code, LOCERR_AGAIN, for add/update/delete operations. The engine
> then remembers all operations with that return code and calls them again
> (with the exact same parameters and without freeing resources!) at a
> suitable time, for example the end of the current message. In the second
> call, the store must execute or finish the operation. The engine must not
> call an operation on an item for which it has already something pending.
> If that happens, it has to complete the first operation first. That way
> the store can keep a list of pending operations with the ItemID as key.

Sounds like a good solution indeed. I like it better than late-handling
return codes, especially because it avoids any risk of breaking existing
datastores (and possibly unsafe assumptions in their implementations),
because for those nothing would change outside *and* inside the engine.
It's also somewhat similar to the finalisationscript mechanism, which also
keeps items (or parts of them) in memory for using them again at the end of
the session, usually to resolve cross references (parent/child tasks for
example). Maybe some of the mechanisms can be re-used for LOCERR_AGAIN.

So, from my side: just go ahead :-)

Best Regards, Lukas

On 03.06.2013, at 15:12, Patrick Ohly <patrick.o...@intel.com> wrote:
> Hello!
>
> I'm trying to batch local database adds together to increase performance
> in SyncEvolution (https://bugs.freedesktop.org/show_bug.cgi?id=52669 and
> https://bugs.freedesktop.org/show_bug.cgi?id=64838#c5). My hope was that
> I can postpone the actual modification and return a temporary local ID.
> Then later or in parallel, combine multiple items together in a batch
> add. Finally, the temporary ID was meant to be replaced by the final one
> in FinalizeLocalID.
>
> But I hit the first snag pretty early: FinalizeLocalID() doesn't seem to
> get called on the server side where I need this feature primarily (for
> PBAP, CalDAV, ActiveSync the server side is the one with the local
> database). dsFinalizeLocalID, the wrapper around the store's
> FinalizeLocalID, gets called in three functions:
>
> (gdb) b sysync::TBinfileImplDS::changeLogPostflight
> Breakpoint 2 at 0x106f956: file /home/pohly/syncevolution/libsynthesis/src/sysync/binfileimplds.cpp, line 462.
> (gdb) b sysync::TBinfileImplDS::SaveAdminData
> Breakpoint 3 at 0x10739fd: file /home/pohly/syncevolution/libsynthesis/src/sysync/binfileimplds.cpp, line 2389.
> (gdb) b sysync::TLocalEngineDS::engGenerateMapItems
> Breakpoint 4 at 0x109c79c: file /home/pohly/syncevolution/libsynthesis/src/sysync/localengineds.cpp, line 6635.
>
> None of these get called on the server side during a refresh-from-client,
> slow or normal two-way sync. That makes sense, map items are only
> relevant on the client, and the server is not using the binfile class.
> But shouldn't the server also call FinalizeLocalID somewhere? Where would
> be the right place for it?
>
> Regarding FinalizeLocalID() in the client, I see some conceptual issues
> with it, at least for what I am trying to achieve. My main concern is
> about error handling. FinalizeLocalID() can return an error code, but
> dsFinalizeLocalID does not check for errors. It merely checks for "ID
> modified". So in case of an error, the local store itself has to remember
> that writing failed and do something. What that "something" could be is
> not clear. It could return an error in EndDataWrite(), in which case I
> would expect the engine to enforce a slow sync in the next sync session.
> I can't think of other, realistic options right now.
>
> I think a better solution would be to delay handling the result of a
> database write until the entire SyncML message has been processed. Then
> give the store a chance to finalize the ID of each item that was touched
> as part of that SyncML message. The store can use that opportunity to
> flush the pending operations and return error codes