Decision making is taking more time than I expected, and
I don't think this should be a blocker for 0.7.0.

We can take more time to decide which interpreters should be included or
excluded.
Until then, I am just going to go with our current packages:
zeppelin-bin-all and zeppelin-bin-netinst.

Moon's suggestion looks good too.
Here is a summary of the interpreters that would be included under each option:
 a. Min package includes interpreters whose binary size is less than 10MB
      > angular, bigquery, hdfs, kylin, livy, md, postgresql, python, sh
 b. Min package includes interpreters with 5 or more JIRA issues created per
month.
      > Need to track. This could add overhead to the release process.
 c. Min package includes/excludes interpreters that the community decides on
via formal vote.
     > md, jdbc, spark (based on this mailing thread)
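
For what it's worth, option (a) looks like something a small script could
check during the release process. Below is a rough sketch only. It assumes
each interpreter is a directory under some interpreter root folder of the
built distribution; the function names and the directory layout are
hypothetical, and the 10MB limit is just the placeholder number from this
thread.

```python
import os

# Placeholder threshold from the thread ("10MB"); not an agreed value.
SIZE_LIMIT_BYTES = 10 * 1024 * 1024


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def small_interpreters(interpreter_root, limit=SIZE_LIMIT_BYTES):
    """Return sorted names of interpreter directories smaller than limit.

    interpreter_root is assumed to contain one directory per interpreter.
    """
    names = []
    for entry in sorted(os.listdir(interpreter_root)):
        path = os.path.join(interpreter_root, entry)
        if os.path.isdir(path) and dir_size_bytes(path) < limit:
            names.append(entry)
    return names
```

Calling small_interpreters() on the interpreter folder of a build would give
a candidate list for the min package under option (a). Options (b) and (c)
would still need issue tracking or a vote.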



On Fri, Jan 20, 2017 at 5:57 PM moon soo Lee <m...@apache.org> wrote:

> Hi,
>
> I think we need to have some policy to decide which interpreters go into
> the zeppelin-bin-min package, and make applying that policy a part of the
> release process.
> I cannot see any consistent rule except for "it seems" or "I
> guess". And I have no idea how to explain if somebody asks 'why is python
> not in the min package?' or 'why is xxx not in the min package?'.
>
> If we really want a min package, we must have a policy that gives
> everyone the same expectation of what goes into the min package and what
> does not.
> Once we agree on a policy we can make it part of the release process.
>
> So, why don't we try to define a policy together? Here are some ideas I can
> throw out.
>
>  a. Min package includes interpreters whose binary size is less than 10MB
>  b. Min package includes interpreters with 5 or more JIRA issues created per
> month.
>  c. Min package includes/excludes interpreters that the community decides on
> via formal vote.
>
> "10MB" and "5 or more" are numbers I just made up. We can change them to
> more reasonable numbers.
> Also, a, b, and c are only possible examples. We can refine them, we can use
> only one, we can use all three, or we can add more.
>
> My point is, we need to give everyone the same expectation of what goes into
> the min package and what does not.
> What do you think?
>
> Thanks,
> moon
>
> On Thu, Jan 19, 2017 at 12:47 AM Mina Lee <mina...@apache.org> wrote:
>
> Thank you for sharing your opinion guys.
>
> I like Eric's approach.
> We are planning to provide an official Docker image managed by the community.
> There is ongoing work [1] around it, and I can focus on this after the 0.7.0
> release.
>
> It seems that the majority prefers a binary package with the most used
> interpreters, such as spark, md, and jdbc.
> I think we can gradually move to providing only the netinst package once
> Docker is ready.
> For upcoming 0.7.0 release, I'd like to distribute two binary packages:
>   - zeppelin-bin-min(spark, jdbc, md)
>   - zeppelin-bin-netinst(spark only)
>
> [1] https://github.com/apache/zeppelin/pull/1761
>
> Thanks,
> Mina
>
> On Thu, Jan 19, 2017 at 1:57 AM Jongyoul Lee <jongy...@gmail.com> wrote:
>
> I'd like to deploy netinst only. And it's a good idea for Apache Zeppelin
> to support an official Docker image with all possible interpreters.
>
> On Wed, Jan 18, 2017 at 7:42 PM, Eric Pugh <
> ep...@opensourceconnections.com> wrote:
>
> Can I throw out an alternate approach?   I feel like the key value of the
> “-all” option is to simplify the life of someone who is new to Zeppelin.
>  If you’re a sophisticated Zeppelin user, then picking and choosing
> interpreters is easy, and you grok why you want to do that….
>
> However, for myself, when I want to demo Zeppelin, I go straight to one of
> the Docker images, specifically
> https://github.com/dylanmei/docker-zeppelin because it bundles in
> everything.
>
> Would providing a similar Docker image on the “Get Zeppelin” page that
> bundles in all the dependencies and interpreters solve the “how do I try
> Zeppelin in 5 minutes” challenge?  The “Get Zeppelin” page is a rather
> daunting page!
>
> Eric
>
>
> On Jan 18, 2017, at 12:00 AM, Mohit Jaggi <mohitja...@gmail.com> wrote:
>
>  Including ALL interpreters is not feasible, not due to download size as
> that is easily increased but because we wouldn't want to couple the release
> cycles as pointed out by Jeff. IMHO a few of the most popular ones should
> be included. Yes it is just one extra step but if a computer can do it why
> make a human suffer? :-)
> Re: spark-packages, Spark does include important and mature functionality
> in its assembly e.g. Csv parser was merged into core spark when it matured.
> I believe Z should do the same.
>
> Sent from my iPhone
>
> On Jan 17, 2017, at 8:05 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
>
> Another thing I'd like to discuss is whether we should move most of the
> interpreters out of the zeppelin project to somewhere else, just like spark
> does with spark-packages. There are 2 benefits:
>
> 1. Keep the zeppelin project much smaller
> 2. Each interpreter's improvements won't be blocked by the release of
> zeppelin. Each interpreter can have its own release cycle as long as
> zeppelin-interpreter doesn't break compatibility.
>
> If it makes sense, I can open another thread to discuss it.
>
>
>
>
> On Wed, Jan 18, 2017 at 11:55 AM Jun Kim <i2r....@gmail.com> wrote:
>
> +1 for Jeff's idea! I also use the three interpreters mainly :)
>
> On Wed, Jan 18, 2017 at 12:52 PM Jeff Zhang <zjf...@gmail.com> wrote:
>
>
> How about also including the markdown and jdbc interpreters, if this won't
> make the binary distribution much bigger? I guess spark, markdown, and jdbc
> are the top 3 interpreters in zeppelin.
>
>
>
> On Wed, Jan 18, 2017 at 11:33 AM Ahyoung Ryu <ahyoung...@apache.org> wrote:
>
> Thanks Mina always!
> +1 for releasing only netinst package.
>
> On Wed, Jan 18, 2017 at 12:29 PM, Prabhjyot Singh <
> prabhjyotsi...@apache.org> wrote:
>
> +1
>
> I don't think it's a problem now, but if it keeps increasing then in
> subsequent releases we can ship Zeppelin with a few interpreters, and mark
> the others as plugins that can be downloaded later, with instructions on how
> to configure them.
>
> On Jan 18, 2017 8:54 AM, "Jun Kim" <i2r....@gmail.com> wrote:
>
> +1
>
> I think it won't be a problem if we announce it clearly.
> Maybe we can do that next to the download button here (
> http://zeppelin.apache.org/download.html)
> A message may be "NOTE: only spark interpreter included since 0.7.0. If
> you want other interpreters, please see interpreter installation guide"
>
> On Wed, Jan 18, 2017 at 12:14 PM Jeff Zhang <zjf...@gmail.com> wrote:
>
>
> +1, we should also mention it in the release notes and in the 0.7 doc
>
>
>
> On Wed, Jan 18, 2017 at 11:12 AM Mina Lee <mina...@apache.org> wrote:
>
> Hi all,
>
> Zeppelin is about to start the 0.7.0 release process, and I would like to
> discuss binary package distribution.
>
> Every time we distribute a new binary package, the size of the
> zeppelin-0.x.x-bin-all.tgz package gets bigger:
>    - zeppelin-0.6.0-bin-all.tgz: 506M
>    - zeppelin-0.6.1-bin-all.tgz: 517M
>    - zeppelin-0.6.2-bin-all.tgz: 547M
>    - zeppelin-0.7.0-bin-all.tgz: 720M (Expected)
>
> This is mostly because the number of interpreters supported by zeppelin
> keeps growing,
> and there is a high chance that we will support more interpreters in the
> near future.
> So instead of asking the Apache infra team to increase the limit,
> I would like to suggest shipping only zeppelin-0.7.0-bin-netinst.tgz, which
> includes only the spark interpreter, starting from the 0.7.0 release.
> One concern is that users need one more step to install the interpreters
> they use,
> but I believe it can be done easily with a single command [1].
>
> FYI, attaching the link of similar discussion [2] we had last June in
> mailing list.
>
> Regards,
> Mina
>
> [1]
> http://zeppelin.apache.org/docs/0.6.2/manual/interpreterinstallation.html#install-specific-interpreters
> <http://zeppelin.apache.org/docs/0.6.2/manual/interpreterinstallation.html>
> [2]
> https://lists.apache.org/thread.html/4b54c034cf8d691655156e0cb647243180c57a6829d97aa3c085b63c@%3Cusers.zeppelin.apache.org%3E
>
> --
> Taejun Kim
>
> Data Mining Lab.
> School of Electrical and Computer Engineering
> University of Seoul
>
>
>
>
> _______________________
> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
> | http://www.opensourceconnections.com | My Free/Busy
> <http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
>
>
>
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net
>
>
