On Mon, Jan 13, 2014 at 3:28 PM, Guido Trotter <[email protected]> wrote:
> On Mon, Jan 13, 2014 at 10:07 AM, Petr Pudlák <[email protected]> wrote:
> > Good idea. This would be another good improvement. We could update
> > _last_written_ssconf only if all RPC calls that distribute it succeed. This
> > way, if some fails, we'll try to redistribute it again on a next config
> > update. But I'm a bit worried that this would slow ConfigWriter considerably
> > if a node failed, especially if there were a network problem and trying to
> > upload would end up with a network timeout. So it's a trade-off between
> > trying to be consistent as much as possible and performance. What do you
> > think?
>
> Well, if a node is down and not *marked* offline things are horribly
> slow anyway (try to check).
> Also of course we need to consider nodes that are marked offline and
> consider that normal.
>
> > I guess this could be solved by uploading ssconf asynchronously, which would
> > also speed up all configuration updates considerably, but I would rather
> > focus on this in the WConfD daemon (ATM I'm not completely sure if ssconf
> > falling a bit behind the master configuration would be OK, I haven't worked
> > with ssconf very much). The same idea could be used for distributing the
> > configuration to master candidates.
>
> Yes, indeed, further improvements could go in the new daemon. I'm just
> worried now not to make the situation worse. :)

Ok, what about this: If any of ssconf uploads fails, empty
_last_written_ssconf so that the whole upload operation is forced at a next
config change?
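
Roughly what I have in mind, as a standalone sketch rather than the actual
ConfigWriter code. Only _last_written_ssconf is a real name here; the class,
the upload_fn callback and the offline parameter are made up for illustration,
and I'm assuming real ssconf values are never an empty dict:

```python
class SsconfDistributor:
    """Toy stand-in for the ssconf part of ConfigWriter (hypothetical)."""

    def __init__(self, upload_fn):
        # upload_fn(node, values) -> bool is a hypothetical stand-in for the
        # RPC call that writes ssconf files on one node.
        self._upload_fn = upload_fn
        self._last_written_ssconf = {}

    def distribute(self, nodes, ssconf_values, offline=()):
        """Upload ssconf to all online nodes unless it is unchanged."""
        if ssconf_values == self._last_written_ssconf:
            return True  # nothing changed since the last full success

        all_ok = True
        for node in nodes:
            if node in offline:
                continue  # nodes marked offline are expected to be skipped
            if not self._upload_fn(node, ssconf_values):
                all_ok = False

        if all_ok:
            self._last_written_ssconf = dict(ssconf_values)
        else:
            # Empty the cache so the next config change retries the upload
            # (assumes real ssconf values are never an empty dict).
            self._last_written_ssconf = {}
        return all_ok
```

With this, a failed upload costs nothing extra at the time of the failure; the
retry happens only when the next config update calls distribute() again, and
nodes marked offline never count as failures.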
