Re: [osgi-dev] Waiting in deactivate/activate()?

Peter Kriens Mon, 15 Aug 2016 00:12:02 -0700

> On 9 aug. 2016, at 11:53, list+org.o...@io7m.com wrote:
> 'Lo.
> On 2016-08-09T09:32:57 +0200
> Peter Kriens <peter.kri...@aqute.biz> wrote:
>> 
>>> On 8 aug. 2016, at 17:02, list+org.o...@io7m.com wrote:
>>> It is in fact what I ended up putting into the example:
>>> https://github.com/io7m/osgi-example-reverser/blob/master/tcp-server/src/main/java/com/io7m/reverser/tcp_server/TCPServerService.java#L47
>>>   
>> a) You will get an automatic retry by DS but then your component is
>> dead when it runs into a problem.
> Hm! This is news to me. Could you elaborate a bit on this? I've not
> yet experienced/learned about automatic retries for DS components.
> Is there some specific way I should be writing components to take
> advantage of this?
The DS retry is not specced anywhere but I noticed Felix DS doing this on a 
failure. So you can’t and should not rely on it.


>> b) At certain network problems you will overload the log, at least
>> insert ’some’ delay
> I'm not sure which commit you saw, but there's an exponential backoff
> now. It retries up to n times (where n = 5 at the moment), delaying each
> retry for 2^n seconds. After that, it simply rethrows whatever the last
> exception was.
The backoff is on the initialization, not on the normal control flow. If you 
get errors in the normal flow after initialization, you can end up in a 
situation where the socket throws an exception, you log it, and then it 
immediately throws an exception again. Not all exceptions (at least I think) 
close the socket.

Again, the active state of a bundle should be seen as a command to keep your 
server running. On any failure you should NEVER give up; instead, retry 
without overloading the system.
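
A minimal sketch of this "never give up, but do not flood the log" loop, 
assuming plain Java with no DS annotations. The class KeepRunning and its 
methods are hypothetical names made up for illustration; they are not part of 
OSGi, Felix, or the linked example. The delay doubles after each consecutive 
failure, is capped, and resets once the work succeeds again.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper, not part of OSGi or of the linked example. */
final class KeepRunning {
  private final AtomicBoolean stopped = new AtomicBoolean();
  private final long baseMillis;
  private final long maxMillis;

  KeepRunning(long baseMillis, long maxMillis) {
    this.baseMillis = baseMillis;
    this.maxMillis = maxMillis;
  }

  /** Delay after N consecutive failures: doubles each time, capped at maxMillis. */
  long delayMillis(int failures) {
    int shift = Math.min(Math.max(failures - 1, 0), 20);
    return Math.min(baseMillis << shift, maxMillis);
  }

  /** Intended to be called when the component is deactivated; ends the loop. */
  void stop() {
    stopped.set(true);
  }

  /** Runs the work repeatedly until stop() is called; a failure only causes a delay. */
  void run(Callable<Void> work) {
    int failures = 0;
    while (!stopped.get()) {
      try {
        work.call();
        failures = 0; // success: the next failure starts at the base delay again
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        return;
      } catch (Exception e) {
        failures++;
        long delay = delayMillis(failures);
        System.err.println("failure #" + failures + " (" + e + "), retrying in " + delay + " ms");
        try {
          Thread.sleep(delay);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }
  }
}
```

The work passed to run() would typically be one iteration of the normal 
control flow (for example, accept and handle one connection), so an exception 
that repeats immediately costs at most one log line per delay period.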

> 
>> c) the server dies at certain network problems and will never recover
>> because you create the server socket in the constructor and not in a
>> loop
>> 
>> OSGi is a server model, you should write your servers to run forever.
>> There is no reason to bail out ever before you’re stopped. This was
>> the primary pain point of Blueprint where the application turned into a
>> zombie state if you were not quick enough.
> 
> I see what you're saying, but there's a flip side to this approach too.
> The issue as I see it is that the network functions (bind() in
> particular) can fail for reasons that are effectively permanent. In
> other words, in a lot of server configurations, if bind() fails due
> to EADDRINUSE or EADDRNOTAVAIL, then that's something that's likely
> due to a configuration problem and is just not going to change however
> many times the operation is retried.
That is not true. For example, in many cases during development you still have 
your old server running from a debug session. Just killing it then allows the 
newly started server to continue. In OSGi, you should think in ‘continuous’ 
terms, not in command/event-like models. If your bind fails, make sure you log 
it or even print it to System.out so that an operator knows about the problem 
and can fix it. In a cluster environment, it is extremely hard to get the 
timing right when the cluster starts, so your other side might not be available 
yet. Blueprint is a disaster because it did not have this model: it gives up 
after a certain time.
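
To illustrate the old-debug-session case, here is a minimal sketch that keeps 
retrying the bind until the port becomes free or the caller asks it to stop. 
The class Binder, the method bindUntilAvailable, and its parameters are 
hypothetical names chosen for this example; only java.net.ServerSocket is a 
real API. The fixed retry interval is an assumption made for brevity.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: retry bind until it works or we are stopped. */
final class Binder {
  /**
   * Tries to bind the port until it succeeds or `stopped` reports true.
   * Returns the bound socket, or null if stopped before a bind succeeded.
   */
  static ServerSocket bindUntilAvailable(int port, long retryMillis, BooleanSupplier stopped)
      throws InterruptedException {
    while (!stopped.getAsBoolean()) {
      try {
        return new ServerSocket(port);
      } catch (IOException e) {
        // Make the problem visible to an operator, then try again after a pause.
        System.err.println(
            "cannot bind port " + port + ": " + e + "; retrying in " + retryMillis + " ms");
        Thread.sleep(retryMillis);
      }
    }
    return null;
  }
}
```

With this shape, killing the process that still holds the port lets the next 
attempt succeed without restarting the new server.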

> It seems like it'd be better to
> crash and let the administrator actually know about it rather than
> looping and writing error messages in the hope that somebody notices.
But in many cases it is not just that the port is occupied (which, as I 
remember, could once be transient on Windows); the startup can also fail 
because the remote side is not (yet) available.

> I admit this philosophy applies more to systems that can't be
> dynamically reconfigured, unlike OSGi.
I rest my case :-)

> 
>> Ericsson
> I can see a lot of parallels between OSGi and Erlang's actor model
> (expect failure, recover, otherwise crash and escalate failure and
> assume that a supervisor component will restart me).
Yes, it definitely was an inspiration for me. At the time I started to work on 
OSGi in 1998, I was working for Ericsson Research, and Joe Armstrong, Bjarne 
Deckers, and I shared a manager. I never worked with them but Erlang was in 
heavy use in our department. I was hired for my Java knowledge and I have to 
admit I tried to get people on Java :-) The dynamic configuration and dynamic 
software updates were also very much inspired by Ericsson since these were 
common in telephone switches.

>> So yes, you’re pedantic :-), but that is the only way to make
>> progress because we should always challenge the way we work to
>> improve it. I therefore do appreciate these questions. I hope you
>> found my answers just as useful.
> 
> I have. Thanks for taking the time!
> 
> M
Kind regards,

        Peter Kriens

> _______________________________________________
> OSGi Developer Mail List
> osgi-dev@mail.osgi.org
> https://mail.osgi.org/mailman/listinfo/osgi-dev
