If you guys finally decide to move to k8s, I can get involved again and help. I'm a k8s maintainer, and my current company needs a Hadoop deployment that supports containers.
> On Jun 21, 2017, at 1:02 PM, Olaf Flebbe wrote:
>
> Hi Andrew,
>
> You surely are making jokes when you are saying TAR is an improvement with
> respect to RPM/DEB. You surely know that you can unpack every RPM straight to
> the filesystem (DEB requires two steps), in case you'd like to.
>
> You surely know that one can easily host a complete docker based hadoop
> cluster on a developer machine in the current git of bigtop. And that docker
> toolbox, docker engine, docker for mac integrate really well with Windows,
> Linux and MacOSX, working right out of the box (at least on MacOSX and Linux)
> as it is right now within bigtop without manually tweaking config files.
>
> I see no point in reproducing hive, hbase, ... hadoop tests -- most of them
> single machine, fake cluster environment -- when we can have the real thing,
> a cluster where we use docker for isolating nodes. When the tests do not
> really work portably, that's a problem of other projects, not ours. Let's fix
> it there.
>
> IMHO, if we could orchestrate k8s (kubernetes) (or docker-swarm, my favorite)
> we could even choose to use a single host with some docker instances or scale
> out to a cloud environment and have a reproducible system without tweaking
> files. Of course there is much work to do to port tests to the cloud
> environment, but this would be tremendous added value.
>
> Olaf
>
>
>> On Jun 20, 2017, at 23:12, Andrew Purtell wrote:
>>
>> Yeah, we can build from git repos. Instead of archive URL you can specify
>> for each component a repo and reference by git-URL and branch, tag, or SHA.
>>
>> Regarding tarball build targets, I was thinking of it as a packaging
>> improvement, an additional packaging target. It could make integration
>> testing more convenient too if you are not using containers or bare metal
>> systems where you own the whole filesystem.
>>
>>> On Jun 19, 2017, at 6:13 AM, Evans Ye wrote:
>>>
>>> Hi Andy,
>>>
>>> Is it easier to have multiple tarballs to set up a cluster for integration
>>> tests?
>>> I'm not on the Hadoop/HBase developer side so I have zero context. I was
>>> just assuming that deploying a cluster for integration tests would be a
>>> beneficial feature for them.
>>>
>>> Bringing up my discussion with Hadoop and HBase guys at Cloudera, they
>>> mentioned two things specifically for Bigtop:
>>>
>>> a). build from git (which I think you've already contributed to Bigtop)
>>> b). easy to run integration test framework
>>>
>>> I'm happy to have b). because either way we need to have it in our CI.
>>>
>>>
>>> 2017-06-19 5:04 GMT+08:00 Andrew Purtell:
>>>
>>>> IMHO, the easiest and fastest way to get the distribution aspect to be more
>>>> useful to more folks is to add a build target that generates plain tarballs
>>>> instead of distro-specific Linux packaging. People like us can take the
>>>> tarballs and unpack them to environments where for various reasons we don't
>>>> want to do RPM management. Vendors like Cloudera can convert tarballs to
>>>> parcels, or whatever proprietary format is desired.
>>>>
>>>>
>>>>> On Sun, Jun 18, 2017 at 12:13 PM, Evans Ye wrote:
>>>>>
>>>>> Hi folks,
>>>>>
>>>>> Many things happened during DataWorks Summit San Jose 2017. Some of the
>>>>> folks (Cos, Roman, Amir, Nate, Mike, etc) gathered together to discuss
>>>>> 1.2.1 and the future 1.3 release of Bigtop. I'd like to bring those
>>>>> discussions back to the mailing list so that those who couldn't make it
>>>>> there can still be with us for further discussions:
>>>>>
>>>>> * 1.2.1 release
>>>>> a). Some of the folks expect Docker on YARN to be backported to 2.7.4
>>>>> and included in the release
>>>>> b). Get rotted code out of our code base: packaging, deployment, testing,
>>>>> etc
>>>>> c). Get integration test to work in CI
>>>>>
>>>>> * 1.3.0 release
>>>>> a). More machine learning integrations
>>>>> b). K8S integration will be an interesting topic
>>>>>
>>>>> Please help me to complete the list if I miss something. :)
>>>>>
>>>>>
>>>>> OTOH, for me specifically, I visited Cloudera to give a tech talk. I met
>>>>> Sean Mackrory and their Hadoop and HBase lead. The pain point they've
>>>>> been having for a long time is not having an integration test framework
>>>>> for their work on the bleeding edge. For example, does a specific patch
>>>>> from Hadoop break HBase or Hive?
>>>>>
>>>>> My thinking is that this is what Bigtop tried to solve at the very
>>>>> beginning. We were supposed to have folks from multiple projects work
>>>>> with us to upgrade packages, and use our frameworks to properly
>>>>> integrate and test their code with other components.
>>>>>
>>>>> So, the future of Bigtop. I think working closely with the other
>>>>> communities is a better way for us to move forward. But that means
>>>>> something needs to change. For example, our distribution is, from a
>>>>> developer's perspective, somewhat old, so it cannot support integration
>>>>> and testing on the bleeding edge. If we would still like to release
>>>>> something suggested for Production only, one solution is to have both
>>>>> dev and stable releases in Bigtop, so developers can work on the dev
>>>>> branch and test against the newest components. In that case, people from
>>>>> other communities might be able to help us upgrade the packages to newer
>>>>> versions, which makes things easier.
>>>>>
>>>>> What do you guys think? Please join me for the discussion.
>>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>>
>>>> - Andy
>>>>
>>>> If you are given a choice, you believe you have acted freely. - Raymond
>>>> Teller (via Peter Watts)
>>>>
