Decision making is taking more time than I expected, and I think this shouldn't be a blocker for 0.7.0.
We can take more time to decide which interpreters should be included or excluded. Until then, I am just going to go with our current packages: zeppelin-bin-all and zeppelin-bin-netinst.
Moon's suggestion looks good too. Here is a summary of the interpreters that could be included under each option:

a. Min package includes interpreters with a binary size of less than 10MB
   > angular, bigquery, hdfs, kylin, livy, md, postgresql, python, sh
b. Min package includes interpreters with 5 or more JIRA issues created per month
   > Need to track. This could be an overload for the release process.
c. Min package includes/excludes interpreters that the community decides on via formal vote
   > md, jdbc, spark (based on this mailing thread)

On Fri, Jan 20, 2017 at 5:57 PM moon soo Lee <m...@apache.org> wrote:

> Hi,
>
> I think we need to have some policy to decide which interpreters go into
> the zeppelin-bin-min package, and we should make applying that policy a
> part of the release process.
> I cannot see any consistent rule except for "it seems" or "I guess", and
> I have no idea how to explain it if somebody asks 'why is python not in
> the min package?' or 'why is xxx not in the min package?'.
>
> If we really want a min package, we must have a policy that gives
> everyone the same expectation of what goes into the min package and what
> does not.
> Once we agree on a policy, we can make it part of the release process.
>
> So, why don't we try to define the policy together? Here are some ideas I
> can throw out.
>
> a. Min package includes interpreters with a binary size of less than 10MB
> b. Min package includes interpreters with 5 or more JIRA issues created
>    per month.
> c. Min package includes/excludes interpreters that the community decides
>    on via formal vote.
>
> "10MB" and "5 or more" are numbers I just made up. We can change them to
> more reasonable numbers.
> Also, a, b, and c are only possible examples. We can refine them, we can
> use only one, we can use all three, or we can add more.
>
> My point is, we need to give everyone the same expectation of what goes
> into the min package and what does not.
> What do you think?
>
> Thanks,
> moon
>
> On Thu, Jan 19, 2017 at 12:47 AM Mina Lee <mina...@apache.org> wrote:
>
> Thank you for sharing your opinions, guys.
>
> I like Eric's approach.
> We are planning to provide an official docker image managed by the
> community.
> There is ongoing work [1] around it, and I can focus on this after the
> 0.7.0 release.
>
> It seems that the majority prefers a binary package with the most used
> interpreters, such as spark, md, and jdbc.
> I think we can gradually move to providing only the netinst package once
> docker is ready.
> For the upcoming 0.7.0 release, I'd like to distribute two binary packages:
> - zeppelin-bin-min (spark, jdbc, md)
> - zeppelin-bin-netinst (spark only)
>
> [1] https://github.com/apache/zeppelin/pull/1761
>
> Thanks,
> Mina
>
> On Thu, Jan 19, 2017 at 1:57 AM Jongyoul Lee <jongy...@gmail.com> wrote:
>
> I'd like to deploy netinst only. And it's a good idea for Apache Zeppelin
> to support an official docker image with all possible interpreters.
>
> On Wed, Jan 18, 2017 at 7:42 PM, Eric Pugh <ep...@opensourceconnections.com> wrote:
>
> Can I throw out an alternate approach? I feel like the key value of the
> “-all” option is to simplify the life of someone who is new to Zeppelin.
> If you’re a sophisticated Zeppelin user, then picking and choosing
> interpreters is easy, and you grok why you want to do that….
>
> However, for myself, when I want to demo Zeppelin, I go straight to one of
> the Docker images, specifically
> https://github.com/dylanmei/docker-zeppelin because it bundles in
> everything.
>
> Would providing a similar Docker image on the “Get Zeppelin” page that
> bundles in all the dependencies and interpreters solve the “how do I try
> Zeppelin in 5 minutes” challenge? The “Get Zeppelin” page is a rather
> daunting page!
>
> Eric
>
> On Jan 18, 2017, at 12:00 AM, Mohit Jaggi <mohitja...@gmail.com> wrote:
>
> Including ALL interpreters is not feasible, not due to download size, as
> that limit is easily increased, but because we wouldn't want to couple
> the release cycles, as pointed out by Jeff. IMHO a few of the most popular
> ones should be included. Yes, it is just one extra step, but if a computer
> can do it, why make a human suffer? :-)
> Re: spark-packages, Spark does include important and mature functionality
> in its assembly, e.g. the CSV parser was merged into core Spark when it
> matured.
> I believe Z should do the same.
>
> On Jan 17, 2017, at 8:05 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
> Another thing I'd like to discuss is whether we should move most of the
> interpreters out of the zeppelin project to somewhere else, just like
> spark does with spark-packages. There are 2 benefits:
>
> 1. Keep the zeppelin project much smaller
> 2. Each interpreter's improvements won't be blocked by the release of
>    zeppelin. Interpreters can have their own release cycles as long as
>    zeppelin-interpreter doesn't break compatibility.
>
> If it makes sense, I can open another thread to discuss it.
>
> On Wed, Jan 18, 2017 at 11:55 AM, Jun Kim <i2r....@gmail.com> wrote:
>
> +1 for Jeff's idea! I also use the three interpreters mainly :)
>
> On Wed, Jan 18, 2017 at 12:52 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
> How about also including the markdown and jdbc interpreters, if this won't
> make the binary distribution much bigger? I guess spark, markdown, and
> jdbc are the top 3 interpreters in zeppelin.
>
> On Wed, Jan 18, 2017 at 11:33 AM, Ahyoung Ryu <ahyoung...@apache.org> wrote:
>
> Thanks Mina, always!
> +1 for releasing only the netinst package.
>
> On Wed, Jan 18, 2017 at 12:29 PM, Prabhjyot Singh <prabhjyotsi...@apache.org> wrote:
>
> +1
>
> I don't think it's a problem now, but if it keeps increasing, then in
> subsequent releases we can ship Zeppelin with a few interpreters and mark
> the others as plugins that can be downloaded later, with instructions on
> how to configure them.
>
> On Jan 18, 2017 8:54 AM, "Jun Kim" <i2r....@gmail.com> wrote:
>
> +1
>
> I think it won't be a problem if we give clear notice.
> Maybe we can do that next to the download button here
> (http://zeppelin.apache.org/download.html).
> A message could be "NOTE: only spark interpreter included since 0.7.0. If
> you want other interpreters, please see interpreter installation guide"
>
> On Wed, Jan 18, 2017 at 12:14 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
> +1, we should also mention it in the release notes and in the 0.7 docs
>
> On Wed, Jan 18, 2017 at 11:12 AM, Mina Lee <mina...@apache.org> wrote:
>
> Hi all,
>
> Zeppelin is about to start the 0.7.0 release process, and I would like to
> discuss binary package distribution.
>
> Every time we distribute a new binary package, the size of the
> zeppelin-0.x.x-bin-all.tgz package gets bigger:
> - zeppelin-0.6.0-bin-all.tgz: 506M
> - zeppelin-0.6.1-bin-all.tgz: 517M
> - zeppelin-0.6.2-bin-all.tgz: 547M
> - zeppelin-0.7.0-bin-all.tgz: 720M (Expected)
>
> Mostly this is because the number of interpreters supported by zeppelin
> keeps growing, and there is a high chance that we will support more
> interpreters in the near future.
> So instead of asking the apache infra team to increase the limit, I would
> like to suggest that, from the 0.7.0 release, we have only
> zeppelin-0.7.0-bin-netinst.tgz, which includes only the spark interpreter.
> One concern is that users need one more step to install the interpreters
> they use, but I believe it can be done easily with a single command [1].
>
> FYI, attaching the link to a similar discussion [2] we had last June on
> the mailing list.
>
> Regards,
> Mina
>
> [1] http://zeppelin.apache.org/docs/0.6.2/manual/interpreterinstallation.html#install-specific-interpreters
> [2] https://lists.apache.org/thread.html/4b54c034cf8d691655156e0cb647243180c57a6829d97aa3c085b63c@%3Cusers.zeppelin.apache.org%3E
>
> --
> Taejun Kim
>
> Data Mining Lab.
> School of Electrical and Computer Engineering
> University of Seoul
>
> _______________________
> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467
> | http://www.opensourceconnections.com | My Free/Busy
> <http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net
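
For reference, the bin-all package sizes quoted in the original message of this thread can be turned into release-over-release growth figures. The following is only a rough sketch: the names SIZES_MB and growth_between_releases are made up for illustration, the "M" figures are assumed to be megabytes, and the 0.7.0 value is the expected size from the thread, not a measured one.

```python
# Sizes of zeppelin-0.x.x-bin-all.tgz as quoted in the thread.
# Assumption: "M" means megabytes. The 0.7.0 figure is the *expected* size.
SIZES_MB = [
    ("0.6.0", 506),
    ("0.6.1", 517),
    ("0.6.2", 547),
    ("0.7.0", 720),
]


def growth_between_releases(sizes):
    """Return (previous, current, added_mb, percent) for each consecutive pair."""
    rows = []
    for (prev_ver, prev_mb), (cur_ver, cur_mb) in zip(sizes, sizes[1:]):
        added = cur_mb - prev_mb
        # Growth relative to the previous release, rounded to one decimal.
        percent = round(100.0 * added / prev_mb, 1)
        rows.append((prev_ver, cur_ver, added, percent))
    return rows


if __name__ == "__main__":
    for prev_ver, cur_ver, added, percent in growth_between_releases(SIZES_MB):
        print(f"{prev_ver} -> {cur_ver}: +{added}M ({percent}%)")
```

On these numbers, the expected jump from 0.6.2 to 0.7.0 (173M) is several times larger than the two earlier increases combined (41M), which is consistent with the concern raised at the start of the thread.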