Re: Define the future of Apache Bigtop

2019-01-08 Thread Olaf Flebbe
Hi,

Thanks all for sharing your views.

Since I am no longer able to invest time in Bigtop, I will not give
recommendations.

However, I would like to share the results of my previous work on topics (1)
and (3).

(1a)  If you see that a particular piece of software does not fit your needs,
it is best to work with the respective project. As a downstream user of these
projects, Bigtop cannot maintain forks.

(1b)  In my previous life I worked with many HPC and Big Data sysadmins.
They all loved to automate things. There are a lot of tools that fit the need,
and one surely does not need a UI to bootstrap or maintain a
Linux/Hadoop/HPC/Big Data cluster.
(A UI might be helpful for beginners, and for software concepts that run on
PowerPoint.)

All of these tools (Chef/SaltStack/Puppet/Ansible/whatever) are already able to
do the job, so there is no need to reinvent the wheel and create a UI to
bootstrap a Hadoop cluster.
Part of the job is to install Kerberos, configure name and directory services,
you name it ... and ready-made, battle-tested recipes are already available for
each of the named tools.
Bigtop has Puppet recipes which can be tuned to real, relevant environments.
(Some of my colleagues rewrote them with the tool they use at their site, for
their own needs.)

I am referring in particular to my talk at a 2015 Big Data conference:
http://oflebbe.de/presentations/2015/kerberos_ldap_puppet_bigtop.pdf

There is no easy way to run a production-ready Big Data cluster. It is a
complex environment where you need to know how to plan your hardware,
networking, enterprise integration, etc.
A beginner will fail. Better to let them fail fast.

This comment does not necessarily extend to monitoring, reporting, proxying,
and failure detection.

(3)   k8s cannot handle any standard configuration of Kerberos servers (as of
2017).

Without Kerberos you will not get a secure Hadoop.

Without security, it is not production ready.

The argument is as follows: standard Kerberos needs the reverse IP address
lookup of a server to match the server's FQDN.

In k8s you do not have control over the reverse name lookup of a service.
(This argument is valid for Docker as well. That is the reason I chose to
provision to LXC in my 2015 talk.)
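
As an illustration of the reverse-lookup requirement described above, a check
of this kind could be sketched in Python roughly as follows. This is only a
minimal sketch and is not taken from the talk or from Bigtop: the function name
reverse_matches_fqdn, the host names and the IP address are all made up, and
the stub resolvers stand in for real DNS so that the example runs without a
network.

```python
import socket
from typing import Callable


def _system_reverse(ip: str) -> str:
    """Return the primary host name the system resolver reports for an IP."""
    return socket.gethostbyaddr(ip)[0]


def reverse_matches_fqdn(
    fqdn: str,
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], str] = _system_reverse,
) -> bool:
    """Check that fqdn -> IP -> name leads back to the same FQDN."""
    try:
        ip = forward(fqdn)    # forward lookup (name to address)
        name = reverse(ip)    # reverse lookup (address to name)
    except OSError:           # covers socket.gaierror and socket.herror
        return False
    # DNS names are case-insensitive and may carry a trailing dot.
    return name.rstrip(".").lower() == fqdn.rstrip(".").lower()


if __name__ == "__main__":
    # Hypothetical data: stub resolvers instead of real DNS.
    forward = {"nn1.example.com": "10.0.0.5"}.__getitem__

    # Reverse lookup returns the FQDN: the condition holds.
    print(reverse_matches_fqdn("nn1.example.com", forward,
                               lambda ip: "nn1.example.com"))

    # Reverse lookup returns some platform-assigned name: the condition fails.
    print(reverse_matches_fqdn("nn1.example.com", forward,
                               lambda ip: "10-0-0-5.default.pod.cluster.local"))
```

With the default arguments the function uses the system resolver, so calling
reverse_matches_fqdn(socket.getfqdn()) on a node is one way to see whether that
node would meet the requirement.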

(3a) There might be other, very elaborate ways that avoid reverse name lookups
in Kerberos, but that is way beyond my insight into Kerberos.

(3b) This is based on k8s as of ~2017. If k8s is now able to support
canonical hostnames, this argument is void.

(3c) Hadoop 3 may support other ways to create a secure environment.

Best Regards,
Olaf




Re: Define the future of Apache Bigtop

2019-01-07 Thread Jun HE
Thanks for the summary, Evans.

From Linaro's perspective, it would be great to see:
1. Simple and unified deployment/management for the Bigtop SW stack across
   various distros and architectures.
   We've seen people who like the Bigtop stack (open and easy to customize)
   but fail to deploy and manage it in an easy way. While Ambari provides such
   capabilities, it is tightly coupled with HDP and x86, and this prevents
   wider adoption of Bigtop. As some academic and industry players are
   bringing certain Big Data workloads to non-x86 platforms, it would help
   them greatly if we could enable Bigtop SW stack support in Ambari. I also
   think the community can work with ODPi (https://www.odpi.org/) on this to
   move things forward more effectively.
2. Add and improve tests (smoke tests/package tests) to improve Bigtop
   stack quality.
   Quite a few components are enabled in the stack but lack tests, and some
   existing smoke tests need improvement to align with new versions/settings.
   Better test quality and results would give users the confidence to use
   Bigtop in their daily work.
3. Native K8s support, as Jay has mentioned.
4. Hadoop 3.x support.
   This could take time and effort (I think most of it comes from ecosystem
   dependencies), but it could be our primary but not prioritized work. :)

Those are the things that have come to mind so far. Please feel free to comment.

Thanks.

Jun



Re: Define the future of Apache Bigtop

2019-01-03 Thread Evans Ye
To sum up the discussion in this thread, two ideas have been proposed:
* Native K8S packages - by Jay
* Deploy Bigtop via Ambari - by Jun

I think both are very good ideas. However, I'd like to gather more info from
the requirements point of view. For example, Jun, what do you see from the
Linaro perspective as user requirements for Big Data? I think that's
valuable input to the community, since you are also becoming significant
contributors to the community.

- Evans




Re: Define the future of Apache Bigtop

2018-12-16 Thread Jun HE
+1 for this!

Another thing: I'm not clear about the status of the Ambari mpack in
Bigtop. Can we deploy the Bigtop SW stack using Ambari now? If the answer is
no, maybe this is what end users would like to see.

Jay Vyas wrote on Sat, Dec 15, 2018 at 1:11 AM:

> How about a Kubernetes native distribution :).


Re: Define the future of Apache Bigtop

2018-12-15 Thread Jay Vyas
Retooling and reenvisioning the platform, more like. Leaner and focused on an
opinionated workflow.




Re: Define the future of Apache Bigtop

2018-12-15 Thread Evans Ye
Yeah. A long-standing user story :)
I heard there's a package management system for k8s called Helm. Is producing
Helm packages what you are proposing?



Re: Define the future of Apache Bigtop

2018-12-14 Thread Jay Vyas
How about a Kubernetes native distribution :).



Define the future of Apache Bigtop

2018-12-14 Thread Evans Ye
Hi all,

We released 1.3.0 back in Nov. 2018 with great help from RM Jun He and the
community. Now it's time to look forward and set up a goal for the next
stage. We have a doc, established back in 2017, that records ideas for
Bigtop. See:

https://docs.google.com/document/d/1F2Gxu8GARQDZXgqHn12LKkQ5wCV_AF4b_tVmjYB6YfA/edit#

In line with Project Frontier's goal, I'll work on the testing part and
improve our testing, deployment, and CI pipelines as a whole. You are welcome
to work on this together with me!
But since time has passed, I'd also like to call for your thoughts: what is
the most valuable feature to add to Bigtop, one that ultimately helps you,
your company, and big data users around the world? Please join the
discussion and shape the future of our project.

Best,
Evans Ye