Thanks Rob! That clarifies a lot!

On Mon, Aug 22, 2016 at 4:02 PM, Rob Cernich <rcern...@redhat.com> wrote:
> ------------------------------

> Hey Rob!

> Thanks a lot for the clarification! More comments inlined.

> Thanks
> Sebastian

> On Sat, Aug 20, 2016 at 12:04 AM, Rob Cernich <rcern...@redhat.com> wrote:

>> A couple of things...

>> re. volumes:
>> We also need to consider the mounting behavior for scale-down scenarios and for overage scenarios when doing upgrades. For the latter, OpenShift can spin up pods of the new version before the older version's pods have terminated. This may mean that some volumes from the old pods are orphaned. We saw this when testing A-MQ during upgrades. With a single pod, the upgrade process gave the new version a new mount and left the original mount orphaned (another upgrade would cause the newer pod to pick up the orphaned mount, leaving the new mount orphaned). I believe we worked around this by specifying an overage of 0% during upgrades, which ensured the new pods would pick up the volumes left behind by the old pods. (Actually, we were using subdirectories in the mount, since all pods shared the same volume.)

> I think PetSets try to address this kind of problem. According to the manual page [11], the storage is linked to the Pod ordinal and hostname and should be stable.

> [11] http://kubernetes.io/docs/user-guide/petset/#when-to-use-pet-set

>> re. dns:
>> DNS should work fine as-is, but there are a couple of things you need to consider.
>> 1. Service endpoints are only available in DNS after the pod becomes ready (SRV records on the service name). Because Infinispan attaches itself to the cluster, this meant pods all started as a cluster of one, then merged once they noticed the other pods. This had a significant impact on startup. Since then, OpenShift has added the ability to query the endpoints associated with a service as soon as the pod is created, which would allow initialization to work correctly. To make this work, we'd have to change the form of the DNS query to pick up the service endpoints (I forget the naming scheme).

> Yes, I agree. Adding nodes one after another will have a significant impact on cluster startup time. However, it should be safe to query the cluster (and even put data into it) during a rebalance. So I would say: if a node is up and the cluster is not damaged, we should treat it as ready.

> To be clear, the issue was around the lifecycle and the interaction with the readiness probe. OpenShift/Kubernetes only adds a pod to the service once it's "ready," and our readiness probe defines ready as "startup is complete" (i.e. x of y services started). The problem is that pods coming up only see other pods that are ready when they initialize, so if you start with an initial cluster size of five, the new nodes don't see any of the other nodes until they finish startup and refresh the list of nodes, after which they all have to merge with each other. This had a significant impact on performance. Since then, a feature has been added which allows you to query DNS for all endpoints regardless of ready state. This lets nodes see the other nodes before they become ready, which allows for a more natural cluster formation. To reiterate, the impact on startup was significant, especially when scaling up under load.
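> If I remember correctly, you opt into that endpoint visibility on the (headless) service itself via an alpha annotation. A minimal sketch - the service name, label, and port are illustrative, not from our templates, and I may be misremembering the exact annotation name:

>   apiVersion: v1
>   kind: Service
>   metadata:
>     name: infinispan
>     annotations:
>       # alpha: publish endpoints even before the pods report ready
>       service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
>   spec:
>     clusterIP: None          # headless; DNS returns the pod IPs directly
>     selector:
>       app: infinispan
>     ports:
>     - name: hotrod
>       port: 11222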
> NB - I proposed a HealthCheck API for Infinispan 9 (currently under development) [12][13]. The overall cluster health can be in one of three statuses: GREEN (everything is fine), YELLOW (a rebalance is in progress), and RED (the cluster is not healthy). The Kubernetes/OpenShift readiness probe should check whether the status is GREEN or YELLOW. The HealthCheck API is attached to the WildFly management API, so you can query it with curl or with the ispn_cli.sh script.

> [12] https://github.com/infinispan/infinispan/wiki/Health-check-API
> [13] https://github.com/infinispan/infinispan/pull/4499
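> The readiness probe could then be a plain exec check along these lines. This is only a sketch - the health attribute is not merged yet, so the CLI operation below is a placeholder for whatever [12] ends up exposing:

>   readinessProbe:
>     exec:
>       command:
>       - /bin/sh
>       - -c
>       # placeholder: read the health status through the management API
>       # and succeed only for GREEN or YELLOW
>       - bin/ispn_cli.sh -c --command=':read-attribute(name=health)' | grep -E 'GREEN|YELLOW'
>     initialDelaySeconds: 10
>     timeoutSeconds: 5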
>> Another thing to keep in mind is that looking up pods by labels allows any pod with the specified label to be added to the cluster. I'm not sure of a use case for this, but it would allow other deployments to be included in the cluster. (You could also argue that the service is the authority for this and any pod with said label would be added as a service endpoint, thus achieving the same behavior... probably more simply, too.)

> I think this is a scenario where someone might try to attach Infinispan in library mode (as a dependency in a WAR file, for example) to the Hot Rod cluster. Gustavo answered a question like this a while ago [14].

> [14] https://developer.jboss.org/message/961568

>> Lastly, DNS was a little flaky when we first implemented this, which was part of the reason we went straight to the Kubernetes API. Users were using dnsmasq with wildcards, which worked well for routes but ended up routing services to the router IP instead of the pod IP. Needless to say, there were a lot of complications trying to use DNS and debug user problems with service resolution.

> I think a governing headless service [15] is required here (PetSets require a service, but considering how Infinispan works, it should be a headless service in my opinion).

> [15] http://kubernetes.io/docs/user-guide/services/#headless-services

>> Hope that helps,
>> Rob

>> ------------------------------

>> Hey Bela!

>> No no, the resolution can be done with the pure JDK.

>> Thanks
>> Sebastian

>> On Fri, Aug 19, 2016 at 11:18 AM, Bela Ban <b...@redhat.com> wrote:

>>> Hi Sebastian

>>> the usual restrictions apply: if DNS discovery depends on external libs, it should be hosted in jgroups-extras; otherwise we can add it to JGroups itself.

>>> On 19/08/16 11:00, Sebastian Laskawiec wrote:

>>>> Hey!

>>>> I've been playing with Kubernetes PetSets [1] for a while and I'd like to share some thoughts. Before I dig in, let me give you some PetSets highlights:

>>>> * PetSets are alpha resources for managing stateful apps in Kubernetes 1.3 (and OpenShift Origin 1.3).
>>>> * Since this is an alpha resource, there are no guarantees about backwards compatibility. Alpha resources can also be disabled in some public cloud providers (you can control which API versions are accessible [2]).
>>>> * PetSets allow starting pods in sequence (not relevant for us, but this is a killer feature for master-slave systems).
>>>> * Each Pod has its own unique entry in DNS, which makes discovery very simple (I'll dig into that a bit later).
>>>> * Volumes are always mounted to the same Pods, which is very important in Cache Store scenarios when we restart pods (e.g. Rolling Upgrades [3]).

>>>> Thoughts and ideas after spending some time playing with this feature:

>>>> * PetSets make discovery a lot easier. It's a combination of two things: Headless Services [4], which create multiple A records in DNS, and predictable host names. Each Pod has its own unique DNS entry following the pattern {PetSetName}-{PodIndex}.{ServiceName} [5]. Here's an example of an Infinispan PetSet deployed on my local cluster [6]. As you can see, we get all domain names and IPs from a single DNS query.
>>>> * Maybe we could perform discovery using this mechanism? I'm aware of the DNS discovery implemented in KUBE_PING [7][8], but the code looks trivial [9], so maybe it should be implemented inside JGroups? @Bela - WDYT?
>>>> * PetSets do not integrate well with the OpenShift 'new-app' command. In other words, our users will need to use the provided yaml (or json) files to create an Infinispan cluster. It's not a show-stopper, but it's a bit less convenient than 'oc new-app'.
>>>> * Since PetSets are alpha resources, they need to be considered a secondary way to deploy Infinispan on Kubernetes and OpenShift.
>>>> * Finally, the persistent volumes - since a Pod always gets the same volume, it would be safe to use any file-based cache store.

>>>> If you'd like to play with PetSets in your local environment, here are the necessary yaml files [10]; the core of the PetSet is sketched below.
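>>>> A trimmed sketch of the PetSet itself, assuming the alpha API from Kubernetes 1.3 (the image, paths, sizes and storage-class annotation are illustrative - see [10] for the full, working files):

>>>>   apiVersion: apps/v1alpha1
>>>>   kind: PetSet
>>>>   metadata:
>>>>     name: infinispan
>>>>   spec:
>>>>     serviceName: infinispan      # the governing headless service [4]
>>>>     replicas: 3
>>>>     template:
>>>>       metadata:
>>>>         labels:
>>>>           app: infinispan
>>>>         annotations:
>>>>           pod.alpha.kubernetes.io/initialized: "true"
>>>>       spec:
>>>>         containers:
>>>>         - name: infinispan
>>>>           image: jboss/infinispan-server   # illustrative
>>>>           ports:
>>>>           - containerPort: 11222
>>>>             name: hotrod
>>>>           volumeMounts:
>>>>           - name: data
>>>>             mountPath: /opt/jboss/infinispan-server/standalone/data
>>>>     volumeClaimTemplates:        # each pet keeps its own volume across restarts
>>>>     - metadata:
>>>>         name: data
>>>>         annotations:
>>>>           volume.alpha.kubernetes.io/storage-class: anything
>>>>       spec:
>>>>         accessModes: [ "ReadWriteOnce" ]
>>>>         resources:
>>>>           requests:
>>>>             storage: 1Gi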
>>>> Thanks
>>>> Sebastian

>>>> [1] http://kubernetes.io/docs/user-guide/petset/
>>>> [2] For checking which APIs are accessible, use 'kubectl api-versions'
>>>> [3] http://infinispan.org/docs/stable/user_guide/user_guide.html#_Rolling_chapter
>>>> [4] http://kubernetes.io/docs/user-guide/services/#headless-services
>>>> [5] http://kubernetes.io/docs/user-guide/petset/#peer-discovery
>>>> [6] https://gist.github.com/slaskawi/0866e63a39276f8ab66376229716a676
>>>> [7] https://github.com/jboss-openshift/openshift-ping/tree/master/dns
>>>> [8] https://github.com/jgroups-extras/jgroups-kubernetes/tree/master/dns
>>>> [9] http://stackoverflow.com/a/12405896/562699
>>>> [10] You might need to adjust the ImageStream. https://gist.github.com/slaskawi/7cffb5588dabb770f654557579c5f2d0

>>> --
>>> Bela Ban, JGroups lead (http://www.jgroups.org)

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev