I put this review on my list and will really try to go through the code. This
is very important work you are doing, Christian.
Kind regards,
Peter Kriens
> On 15 feb. 2015, at 19:07, Christian Schneider <[email protected]>
> wrote:
>
> If you do not implement something special for clean shutdown of in-flight
> exchanges, then the normal error handling should take effect, as you
> mentioned.
> So, for example, a db transaction should roll back. One issue may be that
> e.g. a service call cannot be rolled back.
>
> On the other hand I think implementing clean shutdown will add a lot of
> complexity. The special code will only be executed in quite rare cases.
> These two effects increase the chance of programming errors in the code.
> So I am with you that in most cases you can just implement normal error
> handling and just live with the fact that in-flight calls might run into
> errors.
>
> What I have seen on production systems is that they mark a machine to be
> updated as inactive on a front end load balancer. So no new requests come in
> and after some time you can quite safely update the bundles.
> This is a quite low tech solution but I think exactly for this reason it
> works so well.
>
> So while I wanted to understand clean shutdown better for the discussion on
> aries dev I do not think it should always be done.
>
> Btw. for my current redesign of JPA I have one problem on which I would like
> to get some feedback / ideas.
> I am providing a so-called EMSupplier:
> https://github.com/cschneider/jpa-experiments/blob/master/jpa-support/src/main/java/net/lr/jpa/impl/EMSupplierImpl.java
>
> This class will be offered as a service per persistence unit and should help
> to work with JPA. There is a precall method that creates an EM on the
> thread. Then there is a get() to retrieve the thread-local EM and a postcall
> that closes the EM again. As discussed, a bundle should have stopped all
> work when the stop method is done. Here this applies when the PU bundle is
> stopped. So the EntityManagerFactory will also be deregistered and closed.
> As the EMSupplier depends on the EMF it will also have to be closed.
>
> Now the problem is that there might still be threads working on their per
> thread EMs. The really safe way is to wait until all these threads have
> closed their EMs. This is what I am doing now. To make it a little more
> predictable I added a timeout and close the remaining EMs after the timeout.
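
A minimal sketch of the "wait for the per-thread EMs, then force-close after a timeout" idea described above. This is not the code from EMSupplierImpl: the class name, the method names and the generic resource type R (standing in for EntityManager so the example runs without JPA on the classpath) are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch, not the real EMSupplierImpl. R stands in for EntityManager.
class PerThreadSupplierSketch<R extends AutoCloseable> {
    // Stands in for EntityManagerFactory.createEntityManager().
    private final Supplier<R> factory;
    private final ThreadLocal<R> current = new ThreadLocal<>();
    // All resources handed out and not yet closed; guarded by "this".
    private final Set<R> open = Collections.newSetFromMap(new IdentityHashMap<>());
    private boolean closing; // guarded by "this"

    PerThreadSupplierSketch(Supplier<R> factory) {
        this.factory = factory;
    }

    // Creates a resource for the calling thread; refused once close() has started.
    synchronized void preCall() {
        if (closing) throw new IllegalStateException("supplier is closing");
        R resource = factory.get();
        open.add(resource);
        current.set(resource);
    }

    // Returns the resource bound to the calling thread.
    R get() {
        R resource = current.get();
        if (resource == null) throw new IllegalStateException("no preCall() on this thread");
        return resource;
    }

    // Closes the calling thread's resource unless close() already force-closed it.
    void postCall() {
        R resource = current.get();
        current.remove();
        boolean stillOpen;
        synchronized (this) {
            stillOpen = resource != null && open.remove(resource);
            notifyAll(); // wake a close() that is waiting for the set to drain
        }
        if (stillOpen) closeQuietly(resource);
    }

    // Waits up to timeoutMillis for in-flight calls, then force-closes the rest.
    // Returns true if every resource was closed by its own thread in time.
    synchronized boolean close(long timeoutMillis) {
        closing = true;
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (!open.isEmpty()) {
                long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remaining <= 0) break;
                wait(remaining); // releases the monitor so postCall() can proceed
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop waiting, fall through to force-close
        }
        boolean drained = open.isEmpty();
        for (R straggler : open) closeQuietly(straggler);
        open.clear();
        return drained;
    }

    private static void closeQuietly(AutoCloseable resource) {
        try {
            resource.close();
        } catch (Exception ignored) {
            // Nothing sensible to do during shutdown in this sketch.
        }
    }
}
```

The trade-off from the question is visible in close(): the stop path blocks for at most the timeout, and a thread whose resource was force-closed will most likely see an error on its next use of it, which is then handled by the normal error handling.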
>
> So the question is: Is this a best practice? The clear disadvantage is that
> stopping a PU bundle could take quite long (depending on the timeout). Would
> it be better to just let the threads close the EMs asynchronously and ignore
> the fact that this might go wrong if the bundle is uninstalled in the
> meantime?
>
> Christian
>
> On 15.02.2015 at 18:38, Peter Kriens wrote:
>> As always with design, it is about trade offs. As indicated in my mail, the
>> recovery time can be shortened if you can do a controlled shutdown. I know
>> this was a big issue with mainframes, however, I doubt that with today’s
>> highly distributed systems this is still very relevant. In general, when I
>> have the choice in these circumstances I would rather focus on reducing
>> startup time instead of trying to manage shutdown more nicely.
>>
>> I think the complexity of the additional recovery part is also dangerous,
>> especially since you will have a common path and one that only gets executed
>> when the shit really hits the fan. I think that is worth some additional
>> startup time in one of the many machines in the cluster.
>>
>> That said, every case is special. Just sharing my long experience in seeing
>> overly complicated solutions that looked good close up but provided no real
>> gain when you looked at the overall picture.
>>
>> Kind regards,
>>
>> Peter Kriens
>>
>>> On 15 feb. 2015, at 13:18, Graham Charters <[email protected]> wrote:
>>>
>>> Hi Peter,
>>>
>>> I think you and I see different customer use cases. As I mentioned at the
>>> last OSGi f2f, we have customers whose applications take a significant
>>> amount of time to start and they have many instances. Rolling updates can
>>> therefore take a long time if a full application restart is necessary, so
>>> these customers want to minimise application update time and disruption.
>>> These are transactional deployments with failover, so they can be recovered
>>> if someone trips over the power cord, but that doesn't mean they want to
>>> use this during normal maintenance.
>>>
>>> Regards, Graham.
>>>
>>> Graham Charters PhD CEng MBCS PhD
>>>
>>> STSM, WebSphere OSGi Applications & Liberty
>>> Repository Lead Architect, Master Inventor
>>>
>>> IBM United Kingdom Limited, MP 146, Hursley
>>> Park, Winchester, SO21 2JN, UK
>>>
>>> Tel: +44 1962 816527 Email: [email protected]
>>>
>>> Peter Kriens --- Re: [osgi-dev] How to cleanly update/uninstall bundles ---
>>>
>>> From: "Peter Kriens" <[email protected]>
>>> To: "OSGi Developer Mail List" <[email protected]>
>>> Date: Sun, 15 Feb 2015 11:48
>>> Subject: Re: [osgi-dev] How to cleanly update/uninstall bundles
>>>
>>> I am not sure I agree with your conclusion. :-)
>>>
>>> Since it is theoretically impossible to protect against hard failure
>>> (power, kernel panic, kill -9, distributed call when the cable is unplugged,
>>> etc) any valuable application must have protection against an unexpected
>>> exit at any moment in time. Idempotency, consensus, and transactionality
>>> are your friends in these cases. So if you are protected against these bad
>>> failures, how bad can an in-flight shutdown be? Best case you can shorten
>>> the recovery time at restart but this often requires additional complexity
>>> that can then also fail. Since the chance that things go wrong in-flight is
>>> quite small I would take the recovery cost in the unlikely event you got
>>> caught.
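
A small sketch of why idempotency makes an abrupt stop tolerable: if each request carries an id and the result is recorded only after the work has completed, the caller can simply retry after a failure. The class name, the request-id scheme and the in-memory map are assumptions for illustration; a real system would keep the record in durable storage, ideally in the same transaction as the work.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of an idempotent request handler keyed by request id.
class IdempotentHandlerSketch<Q, A> {
    // Results of requests that completed. In-memory only in this sketch.
    private final Map<String, A> completed = new ConcurrentHashMap<>();
    // The actual work; it must return a non-null result to be recorded.
    private final Function<Q, A> work;

    IdempotentHandlerSketch(Function<Q, A> work) {
        this.work = work;
    }

    // Runs the work once per request id. If the work throws, nothing is
    // recorded, so a retry with the same id runs it again. If it already
    // completed, the recorded result is returned without re-running it.
    A handle(String requestId, Q request) {
        return completed.computeIfAbsent(requestId, id -> work.apply(request));
    }
}
```

With this shape a call that dies in flight looks to the caller like any other lost request: it retries with the same id and either gets the recorded result or triggers the work again.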
>>>
>>> Related is my very old opposition to an update or uninstall callback to the
>>> bundle. Though it is an awfully attractive idea with lots of good stuff, the
>>> party is spoiled because you cannot guarantee such a call under all
>>> circumstances.
>>>
>>> Bill Joy (Sun co-founder) once told us a story about the development of the
>>> Internet, in which he took part. Initially they tried to make every router
>>> perfect but this made the routers incredibly expensive and there were
>>> still failure scenarios that even a perfect router could not handle (power,
>>> cable cuts). Then someone proposed to assume the routers were very
>>> imperfect and that the end points should correct the problems in the net.
>>> This changed a very large number of very hard to handle failure scenarios
>>> into one problem: how to handle a missing packet. If a router panicked,
>>> lost power, had a cable cut, was too busy, was out of memory, or had no
>>> clue: discard the packet.
>>>
>>> It is a pervasive problem in the enterprise software world that we want to
>>> ignore failure because it is so hard. For example, Blueprint has this awful
>>> service damping that looks so attractive to the developer (Look Ma, no
>>> dynamics!) but by hiding the reality you get caught in lots of unexpected
>>> places.
>>>
>>> Bad software expects an unchanging perfect world; good software is more
>>> realistic. Embrace failure! :-)
>>>
>>> Kind regards,
>>>
>>> Peter Kriens
>>>
>>>
>>>> On 15 feb. 2015, at 11:09, Christian Schneider <[email protected]> wrote:
>>>>
>>>> Thanks to all of you for the insights.
>>>>
>>>> From the responses I take that clean shutdown is not in scope of OSGi
>>>> itself.
>>>> I agree that it is best solved on the application level. On the other hand
>>>> I see that the Quiesce API can at least cover some cases and so it has its
>>>> value.
>>>>
>>>> Christian
>>>>
>>>> On 13.02.2015 at 17:55, Raymond Auge wrote:
>>>>> To my knowledge what you are speaking of is not intentionally supported
>>>>> by the dynamics of OSGi. This topic comes up all the time, it's funny.
>>>>>
>>>>> If you must support "in flight" changes, then you have to implement this
>>>>> support in your code using concurrency constructs.
>>>>>
>>>>> Note that unregistering a service is a synchronous operation during
>>>>> "shutdown" of a bundle, and so with proper concurrency measures in place,
>>>>> a bundle could both be shutting down (meaning it's not reachable by other
>>>>> bundles) and also finishing any ongoing work.
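
One way to read "not reachable by other bundles, but still finishing ongoing work" is the ordering sketched below: unregister first, then drain. The class and method names are hypothetical, and the Runnable is only a stand-in for calling ServiceRegistration.unregister(), so that the example runs without the OSGi framework on the classpath.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a component that drains in-flight work on stop.
class DrainOnStopSketch {
    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    // Stand-in for ServiceRegistration.unregister().
    private final Runnable unregister;

    DrainOnStopSketch(Runnable unregister) {
        this.unregister = unregister;
    }

    // Called by service consumers while the service is registered.
    void submit(Runnable task) {
        workers.execute(task);
    }

    // Intended to be called from the bundle's stop path.
    // Returns true if all in-flight work finished within the timeout.
    boolean stop(long timeoutMillis) {
        unregister.run();   // 1. no new callers can look the service up
        workers.shutdown(); // 2. reject new tasks, let running ones finish
        try {
            if (workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        workers.shutdownNow(); // 3. timeout or interrupt: interrupt stragglers
        return false;
    }
}
```

Consumers that obtained the service before it was unregistered can still call submit() afterwards; once shutdown() has run, the executor rejects such calls with a RejectedExecutionException, which again falls back to normal error handling.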
>>>>>
>>>>> Anyone feel free to correct me but this is what I've learned in my short
>>>>> experience.
>>>>>
>>>>> - Ray
>>
>> _______________________________________________
>> OSGi Developer Mail List
>> [email protected] <mailto:[email protected]>
>> https://mail.osgi.org/mailman/listinfo/osgi-dev
>> <https://mail.osgi.org/mailman/listinfo/osgi-dev>
>
> --
>
> Christian Schneider
> http://www.liquid-reality.de
>
> Open Source Architect
> Talend Application Integration Division http://www.talend.com