Looks very interesting!

Sorry for breaking into the discussion; I'm not a committer, just yet another
user, but...

As you wrote, Docker doesn't fit this well.
The problem is that you tried to push all the components into one container,
and you lost the immutability of the image.
I fully agree with and understand that approach for production, for tighter
local connectivity, but for Docker containers there is no big difference
between running Hive, Spark and HDFS in one container or in many separate
ones. Either way they use the network to talk to each other, and you are
comparing connectivity inside one container with connectivity between many
containers on one local machine.

They all run on one single machine anyway, and if you create a separate
container for each component (HDFS, Hive, Spark, HBase, YARN), it fits the
Docker model well.

Further, you can create an environment using docker-compose to mix these base
layers [hdfs, hive, spark, hbase, ignite] as you wish.
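
For example, a minimal docker-compose.yml along these lines (the bigtop/*
image names and the NAMENODE_HOST variable are only placeholders for whatever
the real per-component images would expect):

    version: "2"
    services:
      namenode:
        image: bigtop/hdfs-namenode     # hypothetical per-component image
      datanode:
        image: bigtop/hdfs-datanode
        environment:
          - NAMENODE_HOST=namenode      # services resolve each other by name
      spark:
        image: bigtop/spark
        environment:
          - NAMENODE_HOST=namenode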

Just create a set of base images and a templating script that generates a
docker-compose.yml to connect them.
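
The templating part could be as simple as a small script that emits one
service stanza per requested component (a rough sketch; again, the bigtop/*
image names are made up):

    # rough sketch: generate docker-compose.yml for a chosen set of components
    components="hdfs spark ignite"
    {
      echo 'version: "2"'
      echo 'services:'
      for c in $components; do
        printf '  %s:\n    image: bigtop/%s\n' "$c" "$c"
      done
    } > docker-compose.yml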

Further, if you want to simulate a multi-node cluster -- you can do it just by
writing a new docker-compose.yaml. You can test High Availability, HDFS
decommissioning, or anything else you want, simply by writing your own
docker-compose.yaml.
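
With the sketch above, a quick multi-node simulation could be as simple as
scaling one service (with a reasonably recent docker-compose):

    docker-compose up -d --scale datanode=3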

-- 
Mikhail Epikhin

On Thu, Oct 11, 2018, at 20:25, Konstantin Boudnik wrote:
> Well, finally I came around and started working on the long-awaited feature
> for Bigtop, where one would be able to quickly build a container with an
> arbitrary set of components in it for further orchestration.
> 
> The idea was to have components in different layers, so they could be
> combined together for the desired effect. Say there are layers with:
>   1 hdfs
>   2 hive
>   3 spark
>   4 hbase
>   5 ignite
>   6 yarn
> and so on....
> 
> If one wants to assemble a Spark-only cluster there would be a way to layer up
> 3 and 1 (ideally, 3's dependency on 1 would be automatically calculated) and
> boom - there's an image, which would be put to use. The number of combinations
> might be greater, of course. E.g. 3-6-1, or 4-2-1-6 and so forth.
> 
> Turned out that I can't "prebuild" those layers, as Docker won't allow you to
> combine separate images into one ;( However, there's still a way to achieve a
> similar effect. All I need to do is to create a set of tar-balls containing
> all the bits of particular components, i.e. all the bits of Spark or Hive.
> When an image needs to be built, these tarballs would be used to layer the
> software on top of the base image and each other. In the above example, the
> Dockerfile would look something like
> 
>     FROM ubuntu:16.04
>     # COPY keeps the tarball as-is; ADD would auto-extract a local tar archive
>     COPY hdfs-all.tar /tmp/
>     RUN tar xf /tmp/hdfs-all.tar -C /
>     COPY spark-all.tar /tmp/
>     RUN tar xf /tmp/spark-all.tar -C /
> 
> Once the image is generated, the orchestration and configuration phases will
> kick in, at which point a Docker-based cluster would be all ready to go.
> 
> Do you guys see any value in this approach compared to the current
> package-based way of managing things? 
> 
> Appreciate any thoughts!
> --
>   Cos
> 
> P.S. BTW, I guess I have a decent answer to all those asking for tar-ball
> installation artifacts. It is as easy as running 
>     dpkg-deb -xv
> on all packages and then tar'ing up the resulting set of files.
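> 
> As a rough sketch of that (the spark-* package globs and paths are just
> placeholders):
> 
>     # unpack the .debs of one component into a staging dir, then tar it up
>     mkdir -p /tmp/spark-root
>     for pkg in spark-core*.deb spark-python*.deb; do
>         dpkg-deb -x "$pkg" /tmp/spark-root
>     done
>     tar cf spark-all.tar -C /tmp/spark-root .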
> 
