Re: [os-libsynthesis] detecting slow syncs

Lukas Zeller Fri, 05 Feb 2010 02:40:56 -0800

Hi Patrick,

On Feb 4, 2010, at 22:16 , Patrick Ohly wrote:

> On Di, 2010-02-02 at 11:23 +0000, Lukas Zeller wrote:
>> We could check at syncagent.cpp:1171 if the abort reason status is
>> zero (i.e. ABORTDATASTORE(0) called), and if so behave differently,
>> i.e. just muting that datastore, but leaving the session running. But as
>> said, I'm not sure that will help with many servers.
> 
> ABORTDATASTORE(0) would lead to the engine picking some other, generic
> status (514).
> I solved this by introducing a specific status code that
> tells the engine "abort this store, but continue with the rest". The
> SyncEvolution takes care of actually aborting the session, as before.
> 
> Not pushed to git yet. Patch attached. If that approach looks right, I'll
> merge the Synthesis master branch with the fix for ABORTDATASTORE() and
> add my own patch on top (or you include it).

Looks fine. I'll push the extended remote rule stuff we discussed today, so 
once the XML fragment merge is done (see other mail) that can be added as well.

> Now, for the server side things are a bit more tricky. First of all, is
> there any way how the caller of the engine can determine the status of
> each source, as it can for a client via the progress events? The lack of
> progress events in the server is already a big problem for interactive
> monitoring of a running server session, but not getting information
> about per-source failures affects our error handling and reporting.

Ok, I see that the progress events for the server must get more priority on my 
todo list!

> Second, datastoreinitscript is apparently called when the server
> processes the "Sync" command. There is only one Sync command in that
> message, so the client-side trick of processing the whole message to
> determine the sync mode for all stores doesn't work. Hooking into the
> processing of the Status response of the client for the server's Alert
> won't work either, because somehow client and server also only send
> Alert and Status for one store per message.
> 
> When the client alerts the server, there are two Alerts in the message.
> Then the server replies with an Alert for only one of the stores. Why is
> that? Couldn't it include two Alerts in the same reply?

Somehow I lost track now. In what situation do you see that happen?

Basically however, the *message* is not a relevant unit, only the *package* is. 
Implementations are free to split up any package into as many messages as they 
like. For example I've seen implementations which split the "updates from 
server" package into one message per data item, probably to be able to detect 
data items that make the client crash precisely.
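
To illustrate the package-versus-message point, here is a minimal sketch in Python (chosen for brevity; it is not libsynthesis code). It assumes a simplified model in which each message carries zero or more Alerts and a flag for <Final/>, which marks the last message of a package. The names Message and group_into_packages are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical, simplified view of one SyncML message."""
    alerts: dict          # datastore name -> alert code seen in this message
    final: bool = False   # True if the message carries <Final/>


def group_into_packages(messages):
    """Merge per-message Alerts into per-package dictionaries.

    A package is only complete once a message with <Final/> arrives, so
    decisions that need all datastores must wait for that point.
    Returns (completed_packages, alerts_of_still_open_package).
    """
    packages = []
    current = {}
    for msg in messages:
        current.update(msg.alerts)
        if msg.final:
            packages.append(current)
            current = {}
    return packages, current


# One Alert per message, as observed in the thread: the package still
# contains both datastores once <Final/> has been seen.
msgs = [
    Message({"addressbook": 200}),
    Message({"calendar": 201}, final=True),
]
packages, pending = group_into_packages(msgs)
```

With this model, a check that looks at only one message sees one datastore at a time, while a check that runs at the end of the package sees all of them.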

> Finally, on the server side we typically don't know whether the user has
> chosen a slow sync mode intentionally. He might have. So we can only
> enable the slow sync detection for server-initiated syncs where we
> control the sync mode. This problem can be solved, for example by only
> enabling the slow sync detection in configs where the server initiates
> the connection or by making it conditional on having sent a SAN.

Makes sense.

However, generally thinking about the whole topic, I'm not sure if trying to 
continue some syncs and aborting others is not asking for potentially massive 
trouble with confused peers (222 loops etc.), while gaining not much.

What is the cost of aborting the other datastores in case of an unexpected slow 
sync? A bit of additional traffic, but that's it. As the abort happens before 
any actual data is exchanged, there's very little danger of impact on the sync 
in the other datastores.

On the other hand, my experience is that to get the SyncML state machine 
completely right and stable enough to survive things like partial syncs is very 
hard. Even after 10 years with ours, there were still recent cases where it 
behaved incorrectly, and issues with <final/> and 222 loops are certainly the 
most common problems on the SyncML protocol level with any new or immature 
implementation I've seen. The outcome of trying to run a partial sync is IMHO 
very much unpredictable.

I think it would be safer to just abort the entire session on an unexpected slow 
sync, in both client and server cases.
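
As a rough sketch of that policy (again in Python for brevity, and not the engine's API), the decision could look like the following. It assumes alert code 201 means slow sync, as in SyncML, and that the caller knows which mode it expected per datastore. The function name and the we_control_mode flag are hypothetical; the flag models Patrick's condition that detection only applies when we chose the sync mode ourselves.

```python
SLOW_SYNC = 201  # SyncML alert code for a slow sync


def should_abort_session(expected, alerted, we_control_mode=True):
    """Return True if the whole session should be aborted.

    expected: datastore name -> alert code we asked for
    alerted:  datastore name -> alert code the peer actually requested
    we_control_mode: False when the user may have picked slow sync on
        purpose (for example a client-initiated sync seen by the server).
    """
    if not we_control_mode:
        return False
    # One unexpected slow sync in any datastore aborts everything,
    # instead of trying to continue with the remaining datastores.
    return any(
        code == SLOW_SYNC and expected.get(store) != SLOW_SYNC
        for store, code in alerted.items()
    )


abort = should_abort_session(
    expected={"addressbook": 200, "calendar": 200},
    alerted={"addressbook": 200, "calendar": SLOW_SYNC},
)
```

Because the check runs before any data is exchanged, aborting all datastores costs only a little extra traffic.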

Lukas Zeller (l...@synthesis.ch)
- 
Synthesis AG, SyncML Solutions  & Sustainable Software Concepts
i...@synthesis.ch, http://www.synthesis.ch
