Re: [DISCUSS] Update Roadmap

2016-02-29 Thread moon soo Lee
@Vinayak, @Eran, @Benjamin, @Guilherme, @Sourav, @Rick
Now I can see a lot of demand around enterprise-level job scheduling.
Whether external or built-in, I completely agree with having enterprise-level
job scheduling support on the roadmap.
ZEPPELIN-137 and ZEPPELIN-531 are the
related issues I can find in our JIRA.

@Vinayak
Regarding importing notebooks from GitHub: Zeppelin has a pluggable notebook
storage layer (see the related package).
So GitHub notebook sync can be implemented easily.
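
As a sketch of how a GitHub-backed implementation could be wired in: the
zeppelin.notebook.storage property selects the storage implementation, and
the class below is hypothetical (an implementation that does not exist yet):

  <!-- conf/zeppelin-site.xml: select the notebook storage implementation -->
  <property>
    <name>zeppelin.notebook.storage</name>
    <value>org.apache.zeppelin.notebook.repo.GitHubNotebookRepo</value>
  </property>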

@Shabeel
Right, we need better memory management to prevent such OOMs.
And I think tables are one of the most frequently used ways of displaying
data, so we'll definitely need more features like filtering, sorting, etc.
After this roadmap discussion, a discussion for the next release will follow;
then we'll get an idea of when those features will be available.

@Prasad
Thanks for mentioning HA and DR. They're really important subjects for
enterprise use. Zeppelin will definitely need to address them.
And displaying meta information about notebooks on the top-level page is a
good idea.

It's really great to hear so many opinions and ideas.
And thanks @Rick for sharing a valuable view of the Zeppelin project.

Thanks,
moon



Re: [DISCUSS] Update Roadmap

2016-02-29 Thread Rick Moritz
Hi,

For one, I know that there is rudimentary scheduling built into Zeppelin
already (at least I fixed a bug in the test for a scheduling feature a few
months ago).
But another point is that Zeppelin should also focus on quality,
reproducibility and portability.
Although this doesn't offer exciting new features, it would make
development much easier.

Cross-platform testability, tests that pass when run sequentially,
compatibility with Firefox, and many more open issues that make it so much
harder to enhance Zeppelin and add features should be addressed soon,
preferably before more features are added. Zeppelin is already suffering,
in my opinion, from quite a lot of feature creep, and we should avoid
putting in the kitchen sink at the cost of quality and maintainability.
Instead, modularity (ZEPPELIN-533 in particular) should be targeted.

Oozie, in my opinion, is a dead end. It may de facto still be in use on
many clusters, but it's not getting the love it needs, and I wouldn't bet
on it when it comes to integrating scheduling. Instead, any external tool
should be able to use the REST API to trigger executions, if you want
external scheduling.
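
To illustrate, an external scheduler would only need HTTP calls like these
(a sketch: the host, port, note ID and paragraph ID are hypothetical, and
the endpoint paths should be checked against the REST API docs for your
Zeppelin version):

  # run every paragraph in a note
  curl -X POST http://zeppelin-host:8080/api/notebook/job/2A94M5J1Z
  # run a single paragraph of that note
  curl -X POST http://zeppelin-host:8080/api/notebook/job/2A94M5J1Z/20160229-000000_1234567890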

So, in conclusion, if we take Moon's list as a list of descending
priorities, I fully agree, under the condition that code quality is
included as a subset of enterprise-readiness. Auth* is paramount (Kerberos
SPNEGO SSO support is what we really want), with user and group rights
assignment at the notebook level. We probably also need Knox integration
(ODP members looking at integrating Zeppelin should consider contributing
this), and integration of something like Spree (
https://github.com/hammerlab/spree) to be able to profile jobs.

I'm hopeful that soon I can resume contributing some quality-oriented code,
to drive this "necessary evil" forward ;)


Re: OOM error when run all paragraphs

2016-02-29 Thread Skanda
Hi,

I'm also hitting this OOM issue for a couple of notebooks with about 6 to 7
paragraphs running Hive queries and caching 1000 records of output.

Regards,
Skanda
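
A common stopgap while the underlying issue is investigated is to raise the
Zeppelin JVM heap in conf/zeppelin-env.sh (a sketch; the sizes are
illustrative, and this does not fix a genuine leak):

  export ZEPPELIN_MEM="-Xms1024m -Xmx4096m -XX:MaxPermSize=512m"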


Math formula on %md model

2016-02-29 Thread Jun Chen
Hi,

I found that both Jupyter and GitBook (using KaTeX or MathJax) support math
formulas in markdown mode, so how about Zeppelin? The $$ math formula $$
syntax does not work in Zeppelin.
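
For reference, this is the kind of MathJax/KaTeX-style paragraph being asked
about, i.e. the syntax Jupyter and GitBook render (shown only to illustrate
the request, not as something Zeppelin currently supports):

  %md
  Inline math: $e^{i\pi} + 1 = 0$

  Display math:
  $$
  \int_0^1 x^2 \, dx = \frac{1}{3}
  $$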

BR
mufeng


Re: [DISCUSS] Update Roadmap

2016-02-29 Thread Sourav Mazumder
I do agree with Vinayak. It need not be coupled with Oozie.

Rather, one should be able to call it from any scheduler typically used at
the enterprise level. Maybe with support for BPML.

I believe the existing ability to call/execute a Zeppelin notebook or a
specific paragraph within a notebook using the REST API should take care of
this requirement to some extent.
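
For example, a plain cron entry could drive such a run through the REST API
(a sketch; the host and note ID are hypothetical):

  # crontab: trigger a notebook run every night at 02:00
  0 2 * * * curl -s -X POST http://zeppelin-host:8080/api/notebook/job/2A94M5J1Z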

Regards,
Sourav


Re: [DISCUSS] Update Roadmap

2016-02-29 Thread Vinayak Agrawal
@Eran Witkon,
Thanks for the suggestion, Eran. I concur with your thought.
If Zeppelin can be integrated with Oozie, that would be wonderful. Users
will also be able to leverage their Oozie skills.
This would be promising for now.
However, in the future Hadoop might not necessarily be installed on a Spark
cluster, and Oozie (since it installs with the Hadoop distribution) might not
be available.
So perhaps we should give this feature some thought for the future:
should it depend on Oozie, or should Zeppelin have its own scheduling?

As Benjamin has reiterated, the Databricks notebook has this as a core
feature.


Also, would anybody give any suggestions regarding the "sync with GitHub"
feature?
- Exporting notebooks to GitHub
- Importing notebooks from GitHub

Thanks
Vinayak
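
On the GitHub sync question above: until a built-in sync exists, one minimal
workaround is to version the notebook storage directory itself with git,
since each note lives as a note.json file under it (a sketch; the remote is
hypothetical):

  cd $ZEPPELIN_HOME/notebook
  git init
  git remote add origin git@github.com:example/zeppelin-notebooks.git
  git add . && git commit -m "snapshot notebooks"
  git push -u origin master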



Re: [DISCUSS] Update Roadmap

2016-02-29 Thread Guilherme Silveira
I agree. Job scheduling should be a core feature.
On 29 Feb 2016 at 12:15, "Benjamin Kim" wrote:

> I concur with this suggestion. In the enterprise, management would like to
> see scheduled runs be tracked, monitored, and given SLA constraints for
> the mission-critical ones. Alerts and notifications are crucial for DevOps
> to respond with error clarification. If Zeppelin notebooks can
> be executed by a third-party scheduling application, such as Oozie, then
> this requirement can be satisfied if there are no immediate plans for a
> built-in one.
>
> On Feb 29, 2016, at 1:17 AM, Eran Witkon  wrote:
>
> @Vinayak Agrawal I would suggest adding the ability to connect Zeppelin
> to existing scheduling/workflow tools such as
> https://oozie.apache.org/. This requires better hooks and status
> reporting, but doesn't make Zeppelin an ETL/scheduler tool by itself.
>
>
> On Mon, Feb 29, 2016 at 10:21 AM Vinayak Agrawal <
> vinayakagrawa...@gmail.com> wrote:
>
>> Moon,
>> The new roadmap looks very promising. I am very happy to see security in
>> the list.
>> I have some suggestions regarding Enterprise Ready features:
>>
>> 1. Job Scheduler - Can this be improved?
>> Currently the scheduler can be used with a cron expression or a pre-set
>> time. But in an enterprise solution, a notebook might be one piece of a
>> workflow. Can we look towards scheduling notebooks based on other
>> notebooks finishing their jobs successfully?
>> This requirement would arise in any ETL workflow, where all the
>> downstream users wait for the ETL notebook to finish successfully. Only
>> after that can other business-oriented notebooks be executed.
>>
>> 2. Importing a notebook - Is there a current requirement or future plan
>> to implement a feature that allows import-notebook-from-GitHub? This
>> would allow users to share notebooks seamlessly.
>>
>> Thanks
>> Vinayak
>>
>> On Sun, Feb 28, 2016 at 11:22 PM, moon soo Lee  wrote:
>>
>>> Zhong Wang,
>>> Right, folder support would be quite useful. Thanks for the opinion.
>>> Hope I can finish the work in pr-190.
>>>
>>> Sourav,
>>> Regarding concurrent running, Zeppelin doesn't have a limitation on
>>> running paragraphs/queries concurrently. Each interpreter can implement
>>> its own scheduling policy. For example, the SparkSQL interpreter and
>>> ShellInterpreter can already run paragraphs/queries concurrently.
>>>
>>> SparkInterpreter is implemented with a FIFO scheduler, considering the
>>> nature of the Scala compiler. That's why users cannot run multiple
>>> paragraphs concurrently when they work with SparkInterpreter.
>>> But as Zhong Wang mentioned, pr-703 gives each notebook a separate Scala
>>> compiler, so paragraphs run concurrently as long as they're in different
>>> notebooks.
>>> Thanks for the feedback!
>>>
>>> Best,
>>> moon
>>>
>> On Sat, Feb 27, 2016 at 8:59 PM Zhong Wang 
>>> wrote:
>>>
>> Sourav: I think this newly merged PR can help you:
>> https://github.com/apache/incubator-zeppelin/pull/703#issuecomment-185582537

 On Sat, Feb 27, 2016 at 1:46 PM, Sourav Mazumder <
 sourav.mazumde...@gmail.com> wrote:

> Hi Moon,
>
> This looks great.
>
> My only suggestion would be to include a PR/feature: support for
> running concurrent paragraphs/queries in Zeppelin.
>
> Right now, if more than one user tries to run paragraphs in multiple
> notebooks concurrently through a single Zeppelin instance (and single
> interpreter instance), the performance is very slow. It is obvious that
> the queue gets built up within the Zeppelin process and interpreter
> process in that scenario, as the time taken to move the status from start
> to pending and from pending to running is very high compared to the
> actual running time of a paragraph.
>
> Without this, the multi-tenancy support would be meaningless, as no one
> can practically use it in a situation where multiple users are trying to
> connect to the same instance of Zeppelin (and the related interpreter). A
> possible solution would be to spawn a separate instance of the same
> interpreter at every notebook/user level.
>
> Regards,
> Sourav
>
 On Sat, Feb 27, 2016 at 12:48 PM, moon soo Lee  wrote:
>
>> Hi Zeppelin users and developers,
>>
>> The roadmap we have published at
>> https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Roadmap
>> is almost 9 months old and no longer reflects where the community is
>> going. It's time to update.
>>
>> Based on the mailing list, JIRA issues, pull requests, and feedback from
>> users, conferences, and meetings, I could summarize the major interests
>> of users and developers in 7 categories: Enterprise ready, Usability
>> improvement, Pluggability,
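
On the notebook-dependency point in the quoted thread: until Zeppelin grows
a dependency-aware scheduler, chaining runs through the REST API is one
stopgap (a sketch; the host and note IDs are hypothetical, and since the run
call may return before all paragraphs finish, a real chain would also poll
the job-status endpoint):

  #!/bin/sh
  # run the ETL note, then the reporting note only if the first request succeeded
  ETL_NOTE=2A94M5J1Z       # hypothetical note IDs
  REPORT_NOTE=2B21X9ZZZ
  if curl -sf -X POST "http://zeppelin-host:8080/api/notebook/job/$ETL_NOTE"; then
    curl -sf -X POST "http://zeppelin-host:8080/api/notebook/job/$REPORT_NOTE"
  fi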

Re: OOM error when run all paragraphs

2016-02-29 Thread moon soo Lee
Thanks for creating an issue.
Let me look into it more.

Best,
moon

On Sun, Feb 28, 2016 at 10:14 PM Dafeng Wang  wrote:

> Thanks, Moon, for your effort to repro this bug. I have created a Jira,
> https://issues.apache.org/jira/browse/ZEPPELIN-706, for it; please let me
> know if you need anything else.
>
>
>
> Regards,
>
> Dafeng
>
>
>
> From: moon soo Lee [mailto:m...@apache.org]
> Sent: Saturday, February 27, 2016 2:25 AM
> To: users@zeppelin.incubator.apache.org
> Subject: Re: OOM error when run all paragraphs
>
>
>
> Thanks for sharing your use case and the memory usage table.
> I was able to reproduce the problem. Do you mind creating an issue for it
> on our JIRA?
> I might have some time next week to dig into this problem.
>
> Thanks,
> moon
>
>
>
> On Thu, Feb 25, 2016 at 5:49 PM Dafeng Wang  wrote:
>
> Hi Moon,
>
>
>
> Thanks for your reply. As for my case: my Zeppelin server has only one
> notebook, in which almost all queries are Spark SQL queries (30
> paragraphs); the result limitation I set is 1, and all queries reach the
> limitation of 1.
>
> The strange thing is: running them one by one won't cause the OOM error,
> but if I click on run notebook, it OOMs quickly.
>
> I checked the memory usage; the fastest-increasing parts are [B and [C.
> However, that just tells me we do have memory growth and doesn't hint at
> anything else. I put the table below for reference (the second column is
> from the latest run); you can see that only scala.reflect.io.VirtualFile
> increased when I ran it again and again, so there is no obvious clue of a
> memory leak.
>
>
>
> Class (heap histogram)                                  earlier run   latest run
> scala.reflect.io.FileZipArchive$FileEntry$1                  224886       224886
> scala.reflect.io.ZipArchive$DirEntry                          10123        10123
> sun.nio.cs.UTF_8$Encoder                                       1090         1091
> java.util.zip.Inflater                                          960          960
> org.apache.derby.iapi.services.io.FormatIdInputStream           884          884
> sun.security.util.DerInputBuffer                                745          745
> org.apache.derby.iapi.services.io.FormatIdOutputStream          479          479
> sun.security.util.ObjectIdentifier                              466          466
> org.apache.derby.iapi.services.io.ArrayInputStream              442          442
> org.apache.derby.iapi.services.io.ArrayOutputStream             442          442
> org.apache.derby.iapi.services.io.FormatableBitSet              364          364
> scala.reflect.internal.pickling.UnPickler$Scan                  282          282
> org.apache.derby.impl.store.raw.data.StoredPage                 279          279
> java.util.jar.JarFile$JarFileEntry                              276          276
> sun.security.x509.X509CertImpl                                  166          166
> org.apache.derby.impl.store.raw.data.AllocPage                  162

RE: problem with start H2OContent

2016-02-29 Thread Silvio Fiorito

Can you try running it from just a Spark shell to confirm it works that way (no 
other conflict)?

bin/spark-shell --master local[*] --packages ai.h2o:sparkling-water-core_2.10:1.5.10

Also, are you able to run the Spark interpreter without the h2o package?

Thanks,
Silvio


Re: problem with start H2OContent

2016-02-29 Thread Aleksandr Modestov
I use Spark 1.5.
That problem is with the external Spark; with the internal Spark I can not
launch H2OContext :)
The error is:

"ERROR [2016-02-29 19:28:16,609] ({pool-1-thread-3}
NotebookServer.java[afterStatusChange]:766) - Error
org.apache.zeppelin.interpreter.InterpreterException:
org.apache.zeppelin.interpreter.InterpreterException:
org.apache.thrift.transport.TTransportException: java.net.ConnectException:
Connection refused
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:268)
at
org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:104)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:198)
at org.apache.zeppelin.scheduler.Job.run(Job.java:169)
at
org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:322)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.zeppelin.interpreter.InterpreterException:
org.apache.thrift.transport.TTransportException: java.net.ConnectException:
Connection refused
at
org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:53)
at
org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
at
org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
at
org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
at
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
at
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:139)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:266)
... 11 more
Caused by: org.apache.thrift.transport.TTransportException:
java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
at
org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
... 18 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
... 19 more"


RE: problem with start H2OContent

2016-02-29 Thread Silvio Fiorito
In your zeppelin-env, did you set SPARK_HOME and SPARK_SUBMIT_OPTIONS?
Anything in the logs? It looks like the interpreter failed to start.

Also, Sparkling Water currently supports Spark up to 1.5 only, last I checked.
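
For reference, a minimal conf/zeppelin-env.sh along those lines might look
like this (a sketch; the SPARK_HOME path is hypothetical):

  export SPARK_HOME=/opt/spark-1.5.2   # hypothetical local Spark 1.5 install
  export SPARK_SUBMIT_OPTIONS="--packages ai.h2o:sparkling-water-core_2.10:1.5.10"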

Thanks,
Silvio




Re: [DISCUSS] Update Roadmap

2016-02-29 Thread Prasad Wagle
This is a great list.

In the enterprise ready section, what do you think about adding "High
Availability and Disaster Recovery"? We can start by updating the
documentation with best practices and scripts for a cold-standby solution
and work towards an active-active solution.

Another suggestion is to store metadata for notes, like creator, last
updated (time and user), and number of views. We can show this information
on the top-level page in a table format, with the ability to sort by any
column.


Re: problem with start H2OContent

2016-02-29 Thread Aleksandr Modestov
When I use external Spark I get an exception:

java.net.ConnectException: Connection refused
  at java.net.PlainSocketImpl.socketConnect(Native Method)
  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
  at java.net.Socket.connect(Socket.java:579)
  at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
  at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
  at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
  at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
  at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
  at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
  at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:139)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:129)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:257)
  at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:104)
  at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:198)
  at org.apache.zeppelin.scheduler.Job.run(Job.java:169)
  at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:322)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
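
"Connection refused" from the Thrift client usually means the interpreter
JVM never came up (or died right at startup), so the root cause is normally
in the interpreter log rather than in this trace (a sketch; the exact file
name depends on the installation and user):

  tail -n 100 $ZEPPELIN_HOME/logs/zeppelin-interpreter-spark-*.log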


On Mon, Feb 29, 2016 at 5:43 PM, Silvio Fiorito <
silvio.fior...@granturing.com> wrote:

> It doesn’t seem to be loading transitive dependencies properly. When I was
> helping someone else set this up recently, I had to use
> SPARK_SUBMIT_OPTIONS=“--packages ai.h2o:sparkling-water-core_2.10:1.5.10”
> with an external Spark installation (vs using bundled Spark in Zeppelin).
>
> From: Aleksandr Modestov 
> Reply-To: "users@zeppelin.incubator.apache.org" <
> users@zeppelin.incubator.apache.org>
> Date: Monday, February 29, 2016 at 9:30 AM
> To: "users@zeppelin.incubator.apache.org" <
> users@zeppelin.incubator.apache.org>
> Subject: Re: problem with start H2OContent
>
> In the conf file I wrote the package, but it doesn't work, and I use
> z.load("...").
>
> On Mon, Feb 29, 2016 at 5:25 PM, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> your H2O jar is not loaded in the Spark classpath. Maybe retry loading it
>> with z.load("...") or add the spark.jars parameter in the Spark
>> interpreter configuration.
>>
>> 2016-02-29 15:23 GMT+01:00 Aleksandr Modestov <
>> aleksandrmodes...@gmail.com>:
>>
>>> I did the import: "import org.apache.spark.h2o._"
>>> What do you mean by "it's probably a problem with your classpath"?
>>>
>>>
>>> On Mon, Feb 29, 2016 at 5:19 PM, vincent gromakowski <
>>> vincent.gromakow...@gmail.com> wrote:
>>>
Don't forget to do the import. If that's done, it's probably a problem with
your classpath...

 2016-02-29 15:03 GMT+01:00 Aleksandr Modestov <
 aleksandrmodes...@gmail.com>:

> Hello all,
> There is a problem when I start to initialize H2OContext.
> Does anybody know the answer?
>
> java.lang.NoClassDefFoundError: water/api/HandlerFactory
>   at org.apache.spark.h2o.H2OContext.start(H2OContext.scala:107)
>   at $iwC$$iwC$... (Spark REPL wrapper frames)

Re: [DISCUSS] Update Roadmap

2016-02-29 Thread Benjamin Kim
I concur with this suggestion. In the enterprise, management would like to
see scheduled runs be tracked, monitored, and given SLA constraints for
mission-critical ones. Alerts and notifications are crucial for DevOps to
respond with error clarification. If Zeppelin notebooks can be executed by a
third-party scheduling application, such as Oozie, then this requirement can
be satisfied if there are no immediate plans for a built-in one.

> On Feb 29, 2016, at 1:17 AM, Eran Witkon  wrote:
> 
> @Vinayak Agrawal I would suggest adding the ability to connect zeppelin to 
> existing scheduling tools\workflow tools such as  https://oozie.apache.org/ 
> . this requires betters hooks and status reporting 
> but doesn't make zeppeling and ETL\scheduler tool by itself/
> 
> 
> On Mon, Feb 29, 2016 at 10:21 AM Vinayak Agrawal  > wrote:
> Moon,
> The new roadmap looks very promising. I am very happy to see security in the 
> list.
> I have some suggestions regarding Enterprise Ready features:
> 
> 1. Job Scheduler - Can this be improved? 
> Currently the scheduler can be used with Cron expression or a pre-set time. 
> But in an enterprise solution, a notebook might be one piece of the workflow. 
> Can we look towards the functionality of scheduling notebook's based on other 
> notebooks finishing their job successfully?
> This requirement would arise in any ETL workflow, where all the downstream 
> users wait for the ETL notebook to finish successfully. Only after that, 
> other business oriented notebooks can be executed.  
> 
> 2. Importing a notebook - Is there a current requirement or future plan to 
> implement a feature that allows import-notebook-from-github? This would allow 
> users to share notebooks seamlessly. 
> 
> Thanks 
> Vinayak
> 
> On Sun, Feb 28, 2016 at 11:22 PM, moon soo Lee  > wrote:
> Zhong Wang, 
> Right, folder support would be quite useful. Thanks for the opinion. 
> Hope I can finish the work in pr-190 
> .
> 
> Sourav,
> Regarding concurrent running, Zeppelin doesn't have a limitation on running 
> paragraphs/queries concurrently. An interpreter can implement its own scheduling 
> policy. For example, the SparkSQL interpreter and ShellInterpreter can already 
> run paragraphs/queries concurrently.
> 
> SparkInterpreter is implemented with a FIFO scheduler, considering the nature of 
> the scala compiler. That's why users cannot run multiple paragraphs concurrently 
> when they work with SparkInterpreter.
> But as Zhong Wang mentioned, pr-703 gives each notebook a separate scala 
> compiler, so paragraphs run concurrently as long as they're in different 
> notebooks.
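
A sketch of the scheduler hook described above, based on the Zeppelin
interpreter API (the FIFOScheduler class is the one visible in the stack traces
elsewhere in this digest; treat the exact method names as approximate):

    // inside an Interpreter implementation: choose the scheduling policy
    @Override
    public Scheduler getScheduler() {
      // FIFO: one paragraph at a time, as SparkInterpreter does because
      // the scala compiler instance is shared
      return SchedulerFactory.singleton().createOrGetFIFOScheduler(
          "interpreter_" + this.hashCode());
      // a parallel scheduler (as in SparkSqlInterpreter) lets paragraphs
      // run concurrently
    }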
> Thanks for the feedback!
> 
> Best,
> moon
> On Sat, Feb 27, 2016 at 8:59 PM Zhong Wang wrote:
> Sourav: I think this newly merged PR can help you 
> https://github.com/apache/incubator-zeppelin/pull/703#issuecomment-185582537 
> 
> 
> On Sat, Feb 27, 2016 at 1:46 PM, Sourav Mazumder wrote:
> Hi Moon,
> 
> This looks great.
> 
> My only suggestion would be to include a PR/feature - Support for Running 
> Concurrent paragraphs/queries in Zeppelin. 
> 
> Right now if more than one user tries to run paragraphs in multiple notebooks 
> concurrently through a single Zeppelin instance (and single interpreter 
> instance) the performance is very slow. It is obvious that the queue gets 
> built up within the zeppelin process and interpreter process in that scenario 
> as the time taken to move the status from start to pending and pending to 
> running is very high compared to the actual running time of a paragraph.
> 
> Without this, the multi-tenancy support would be meaningless, as no one can 
> practically use it in a situation where multiple users are trying to connect 
> to the same instance of Zeppelin (and the related interpreter). A possible 
> solution would be to spawn a separate instance of the same interpreter at every 
> notebook/user level.
> 
> Regards,
> Sourav
> On Sat, Feb 27, 2016 at 12:48 PM, moon soo Lee wrote:
> Hi Zeppelin users and developers,
> 
> The roadmap we have published at
> https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Roadmap 
> 
> is almost 9 months old, and it doesn't reflect where the community is going 
> anymore. It's time to update.
> 
> Based on the mailing list, jira issues, pull requests, and feedback from users, 
> conferences and meetings, I could summarize the major interests of users and 
> developers in 7 categories: Enterprise ready, Usability improvement, 
> 

Re: problem with start H2OContent

2016-02-29 Thread Silvio Fiorito
It doesn't seem to be loading transitive dependencies properly. When I was 
helping someone else set this up recently, I had to use 
SPARK_SUBMIT_OPTIONS="--packages ai.h2o:sparkling-water-core_2.10:1.5.10" with 
an external Spark installation (vs using bundled Spark in Zeppelin).
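
Concretely, that looks something like this in conf/zeppelin-env.sh (the
SPARK_HOME path is illustrative; the package version must match your
Spark/Scala build):

    # conf/zeppelin-env.sh
    export SPARK_HOME=/opt/spark
    export SPARK_SUBMIT_OPTIONS="--packages ai.h2o:sparkling-water-core_2.10:1.5.10"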

From: Aleksandr Modestov
Reply-To: "users@zeppelin.incubator.apache.org"
Date: Monday, February 29, 2016 at 9:30 AM
To: "users@zeppelin.incubator.apache.org"
Subject: Re: problem with start H2OContent

In the conf file I wrote the package, but it doesn't work, so I use 'z.load("...")'.

On Mon, Feb 29, 2016 at 5:25 PM, vincent gromakowski wrote:
your H2O jar is not loaded in spark classpath. Maybe retry to load with 
z.load("...") or add spark.jars parameter in spark interpreter configuration

2016-02-29 15:23 GMT+01:00 Aleksandr Modestov 
>:
I did import "import org.apache.spark.h2o._"
What do you mean " it's probably a problem with your classpath."?


On Mon, Feb 29, 2016 at 5:19 PM, vincent gromakowski 
> wrote:
Don't forget to do the import. If done it's probably a problem with your 
classpath...

2016-02-29 15:03 GMT+01:00 Aleksandr Modestov 
>:
Hello all,
There is a problem when I start to initialize H2OContent.
Does anybody know the answer?

java.lang.NoClassDefFoundError: water/api/HandlerFactory at 
org.apache.spark.h2o.H2OContext.start(H2OContext.scala:107) at 
[... identical stack trace elided; the full copy is preserved later in this thread ...]

Re: problem with start H2OContent

2016-02-29 Thread Aleksandr Modestov
In the conf file I wrote the package, but it doesn't work, so I use
'z.load("...")'.

On Mon, Feb 29, 2016 at 5:25 PM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> your H2O jar is not loaded in spark classpath. Maybe retry to load with
> z.load("...") or add spark.jars parameter in spark interpreter configuration
>
> 2016-02-29 15:23 GMT+01:00 Aleksandr Modestov  >:
>
>> I did import "import org.apache.spark.h2o._"
>> What do you mean " it's probably a problem with your classpath."?
>>
>>
>> On Mon, Feb 29, 2016 at 5:19 PM, vincent gromakowski <
>> vincent.gromakow...@gmail.com> wrote:
>>
>>> Don't forget to do the import. If done it's probably a problem with your
>>> classpath...
>>>
>>> 2016-02-29 15:03 GMT+01:00 Aleksandr Modestov <
>>> aleksandrmodes...@gmail.com>:
>>>
 Hello all,
 There is a problem when I start to initialize H2OContent.
 Does anybody know the answer?

 java.lang.NoClassDefFoundError: water/api/HandlerFactory at
 org.apache.spark.h2o.H2OContext.start(H2OContext.scala:107) at
 [... identical stack trace elided; the full copy is preserved later in this thread ...]

Re: problem with start H2OContent

2016-02-29 Thread vincent gromakowski
your H2O jar is not loaded in spark classpath. Maybe retry to load with
z.load("...") or add spark.jars parameter in spark interpreter configuration

2016-02-29 15:23 GMT+01:00 Aleksandr Modestov :

> I did import "import org.apache.spark.h2o._"
> What do you mean " it's probably a problem with your classpath."?
>
>
> On Mon, Feb 29, 2016 at 5:19 PM, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> Don't forget to do the import. If done it's probably a problem with your
>> classpath...
>>
>> 2016-02-29 15:03 GMT+01:00 Aleksandr Modestov <
>> aleksandrmodes...@gmail.com>:
>>
>>> Hello all,
>>> There is a problem when I start to initialize H2OContent.
>>> Does anybody know the answer?
>>>
>>> java.lang.NoClassDefFoundError: water/api/HandlerFactory at
>>> org.apache.spark.h2o.H2OContext.start(H2OContext.scala:107) at
>>> [... identical stack trace elided; the full copy is preserved later in this thread ...]

Re: problem with start H2OContent

2016-02-29 Thread Aleksandr Modestov
I did the import "import org.apache.spark.h2o._".
What do you mean by "it's probably a problem with your classpath"?


On Mon, Feb 29, 2016 at 5:19 PM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> Don't forget to do the import. If done it's probably a problem with your
> classpath...
>
> 2016-02-29 15:03 GMT+01:00 Aleksandr Modestov  >:
>
>> Hello all,
>> There is a problem when I start to initialize H2OContent.
>> Does anybody know the answer?
>>
>> java.lang.NoClassDefFoundError: water/api/HandlerFactory at
>> org.apache.spark.h2o.H2OContext.start(H2OContext.scala:107) at
>> [... identical stack trace elided; the full copy is preserved in the next message ...]

Re: problem with start H2OContent

2016-02-29 Thread vincent gromakowski
Don't forget to do the import. If done it's probably a problem with your
classpath...

2016-02-29 15:03 GMT+01:00 Aleksandr Modestov :

> Hello all,
> There is a problem when I start to initialize H2OContext.
> Does anybody know the answer?
>
> java.lang.NoClassDefFoundError: water/api/HandlerFactory
>         at org.apache.spark.h2o.H2OContext.start(H2OContext.scala:107)
>         at $iwC$$iwC$...$$iwC.<init>(<console>:65)
>         [... 27 nested REPL $iwC wrapper frames, <console>:70 through <console>:124 ...]
>         at .<init>(<console>:128)
>         at .<clinit>(<console>)
>         at .<init>(<console>:7)
>         at .<clinit>(<console>)
>         at $print(<console>)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>         at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
>         at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>         at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>         at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>         at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:709)
>         at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:674)
>         at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:667)
>         at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
>         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
>         at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:300)
>         at org.apache.zeppelin.scheduler.Job.run(Job.java:169)
>         at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:134)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: water.api.HandlerFactory
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at

Re: h2o from zeppelin notebook

2016-02-29 Thread Aleksandr Modestov
Thank you!

On Mon, Feb 29, 2016 at 4:17 PM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> if you want to auto load, go to Settings in the Zeppelin UI, edit the spark
> interpreter config and add a line in the dependencies artifact field...
>
> 2016-02-29 14:16 GMT+01:00 Aleksandr Modestov  >:
>
>> Thank you!
>> It does work!
>> "%dep
>> z.load("ai.h2o:sparkling-water-core_2.10:1.3.7")"
>>
>> On Mon, Feb 29, 2016 at 3:43 PM, vincent gromakowski <
>> vincent.gromakow...@gmail.com> wrote:
>>
>>> Try to use the dependency loader in Spark interpreter configuration
>>> page. I have encountered strange behaviors with spark.jars options...
>>>
>>> 2016-02-29 13:35 GMT+01:00 Aleksandr Modestov <
>>> aleksandrmodes...@gmail.com>:
>>>
 Hello!
 Excuse me, but it doesn't work...
 I open an interpreter window and create several additional lines.
 spark.jars .../sparkling-water-assembly-1.5.10-all.jar
 spark.jars.packages ai.h2o:sparkling-water-core_2.10
 Inside the notebook I try to add a h2o-lib: import
 org.apache.spark.h2o._
 But I have a problem: :39: error: object h2o is not a member
 of package org.apache.spark
 import org.apache.spark.h2o._
 Thank you



 On Sun, Feb 21, 2016 at 9:21 AM, moon soo Lee  wrote:

> As Felix mentioned,
>
> Loading ai.h2o:sparkling-water-core_2.10 package [1] in
> SparkInterpreter [2] would let H2O work in Zeppelin.
>
> Let me know if it does not work for you.
>
> Thanks,
> moon
>
> [1]
> https://github.com/h2oai/sparkling-water#sparkling-water-as-spark-package
> [2]
> http://zeppelin.incubator.apache.org/docs/latest/interpreter/spark.html 
> Dependency
> Management section
>
> On Sat, Feb 20, 2016 at 9:01 AM Felix Cheung <
> felixcheun...@hotmail.com> wrote:
>
>> According to this
>>
>> https://github.com/h2oai/sparkling-water
>>
>> It can be loaded as a spark package into a spark shell - the same way
>> should work with Zeppelin Spark interpreter (which is running the spark
>> shell).
>>
>>
>>
>>
>> On Sat, Feb 20, 2016 at 12:58 AM -0800, "Aleksandr Modestov" <
>> aleksandrmodes...@gmail.com> wrote:
>>
>> "H2o works in Python, Java, Scala or with Spark (Sparkling Water) as
>> well."
>> Thank you:)
>> I know I work with H2O from Jupyter or from shells...
>> But I hope that I can use Scala (for instance) from a zeppelin notebook;
>> it's better than using the shell...
>> I cannot find where to tell zeppelin how to work with the h2o
>> algorithms.
>> It sounds very good that I can work from Zeppelin notebook with Spark
>> and H2O algorithms inside one workplace.
>>
>> On Sat, Feb 20, 2016 at 8:44 AM, Felix Cheung <
>> felixcheun...@hotmail.com> wrote:
>>
>> H2o works in Python, Java, Scala or with Spark (Sparkling Water) as
>> well.
>>
>>
>>
>>
>>
>> On Fri, Feb 19, 2016 at 10:11 AM -0800, "Girish Reddy" <
>> gir...@springml.com> wrote:
>>
>> You'll need an R interpreter -
>> https://github.com/elbamos/Zeppelin-With-R
>>
>> You can then load the H2O libraries just as you would from RStudio.
>>
>>
>> On Fri, Feb 19, 2016 at 8:41 AM, Aleksandr Modestov <
>> aleksandrmodes...@gmail.com> wrote:
>>
>> If I want to use h2o libraries from a notebook, what should I do?
>>
>>
>>
>>

>>>
>>
>


Re: h2o from zeppelin notebook

2016-02-29 Thread vincent gromakowski
if you want to auto load, go to Settings in the Zeppelin UI, edit the spark
interpreter config and add a line in the dependencies artifact field...
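
On the interpreter edit form this is a single dependency row, along these lines
(field names approximate; the version is the one reported working earlier in
this digest):

    artifact: ai.h2o:sparkling-water-core_2.10:1.3.7
    exclude:  (leave empty)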

2016-02-29 14:16 GMT+01:00 Aleksandr Modestov :

> Thank you!
> It does work!
> "%dep
> z.load("ai.h2o:sparkling-water-core_2.10:1.3.7")"
>
> On Mon, Feb 29, 2016 at 3:43 PM, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> Try to use the dependency loader in Spark interpreter configuration page.
>> I have encountered strange behaviors with spark.jars options...
>>
>> 2016-02-29 13:35 GMT+01:00 Aleksandr Modestov <
>> aleksandrmodes...@gmail.com>:
>>
>>> Hello!
>>> Excuse me, but it doesn't work...
>>> I open an interpreter window and create several additional lines.
>>> spark.jars .../sparkling-water-assembly-1.5.10-all.jar
>>> spark.jars.packages ai.h2o:sparkling-water-core_2.10
>>> Inside the notebook I try to add a h2o-lib: import org.apache.spark.h2o._
>>> But I have a problem: :39: error: object h2o is not a member
>>> of package org.apache.spark
>>> import org.apache.spark.h2o._
>>> Thank you
>>>
>>>
>>>
>>> On Sun, Feb 21, 2016 at 9:21 AM, moon soo Lee  wrote:
>>>
 As Felix mentioned,

 Loading ai.h2o:sparkling-water-core_2.10 package [1] in
 SparkInterpreter [2] would let H2O work in Zeppelin.

 Let me know if it does not work for you.

 Thanks,
 moon

 [1]
 https://github.com/h2oai/sparkling-water#sparkling-water-as-spark-package
 [2]
 http://zeppelin.incubator.apache.org/docs/latest/interpreter/spark.html 
 Dependency
 Management section

 On Sat, Feb 20, 2016 at 9:01 AM Felix Cheung 
 wrote:

> According to this
>
> https://github.com/h2oai/sparkling-water
>
> It can be loaded as a spark package into a spark shell - the same way
> should work with Zeppelin Spark interpreter (which is running the spark
> shell).
>
>
>
>
> On Sat, Feb 20, 2016 at 12:58 AM -0800, "Aleksandr Modestov" <
> aleksandrmodes...@gmail.com> wrote:
>
> "H2o works in Python, Java, Scala or with Spark (Sparkling Water) as
> well."
> Thank you:)
> I know I work with H2O from Jupyter or from shells...
> But I hope that I can use Scala (for instance) from a zeppelin notebook;
> it's better than using the shell...
> I cannot find where to tell zeppelin how to work with the h2o
> algorithms.
> It sounds very good that I can work from Zeppelin notebook with Spark
> and H2O algorithms inside one workplace.
>
> On Sat, Feb 20, 2016 at 8:44 AM, Felix Cheung <
> felixcheun...@hotmail.com> wrote:
>
> H2o works in Python, Java, Scala or with Spark (Sparkling Water) as
> well.
>
>
>
>
>
> On Fri, Feb 19, 2016 at 10:11 AM -0800, "Girish Reddy" <
> gir...@springml.com> wrote:
>
> You'll need an R interpreter -
> https://github.com/elbamos/Zeppelin-With-R
>
> You can then load the H2O libraries just as you would from RStudio.
>
>
> On Fri, Feb 19, 2016 at 8:41 AM, Aleksandr Modestov <
> aleksandrmodes...@gmail.com> wrote:
>
> If I want to use h2o libraries from a notebook, what should I do?
>
>
>
>
>>>
>>
>


Re: h2o from zeppelin notebook

2016-02-29 Thread Aleksandr Modestov
Thank you!
It does work!
"%dep
z.load("ai.h2o:sparkling-water-core_2.10:1.3.7")"

On Mon, Feb 29, 2016 at 3:43 PM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> Try to use the dependency loader in Spark interpreter configuration page.
> I have encountered strange behaviors with spark.jars options...
>
> 2016-02-29 13:35 GMT+01:00 Aleksandr Modestov  >:
>
>> Hello!
>> Excuse me, but it doesn't work...
>> I open an interpreter window and create several additional lines.
>> spark.jars .../sparkling-water-assembly-1.5.10-all.jar
>> spark.jars.packages ai.h2o:sparkling-water-core_2.10
>> Inside the notebook I try to add a h2o-lib: import org.apache.spark.h2o._
>> But I have a problem: :39: error: object h2o is not a member of
>> package org.apache.spark
>> import org.apache.spark.h2o._
>> Thank you
>>
>>
>>
>> On Sun, Feb 21, 2016 at 9:21 AM, moon soo Lee  wrote:
>>
>>> As Felix mentioned,
>>>
>>> Loading ai.h2o:sparkling-water-core_2.10 package [1] in
>>> SparkInterpreter [2] would let H2O work in Zeppelin.
>>>
>>> Let me know if it does not work for you.
>>>
>>> Thanks,
>>> moon
>>>
>>> [1]
>>> https://github.com/h2oai/sparkling-water#sparkling-water-as-spark-package
>>> [2]
>>> http://zeppelin.incubator.apache.org/docs/latest/interpreter/spark.html 
>>> Dependency
>>> Management section
>>>
>>> On Sat, Feb 20, 2016 at 9:01 AM Felix Cheung 
>>> wrote:
>>>
 According to this

 https://github.com/h2oai/sparkling-water

 It can be loaded as a spark package into a spark shell - the same way
 should work with Zeppelin Spark interpreter (which is running the spark
 shell).




 On Sat, Feb 20, 2016 at 12:58 AM -0800, "Aleksandr Modestov" <
 aleksandrmodes...@gmail.com> wrote:

 "H2o works in Python, Java, Scala or with Spark (Sparkling Water) as
 well."
 Thank you:)
 I know I work with H2O from Jupyter or from shells...
 But I hope that I can use Scala (for instance) from a zeppelin notebook;
 it's better than using the shell...
 I cannot find where to tell zeppelin how to work with the h2o
 algorithms.
 It sounds very good that I can work from Zeppelin notebook with Spark
 and H2O algorithms inside one workplace.

 On Sat, Feb 20, 2016 at 8:44 AM, Felix Cheung <
 felixcheun...@hotmail.com> wrote:

 H2o works in Python, Java, Scala or with Spark (Sparkling Water) as
 well.





 On Fri, Feb 19, 2016 at 10:11 AM -0800, "Girish Reddy" <
 gir...@springml.com> wrote:

 You'll need an R interpreter -
 https://github.com/elbamos/Zeppelin-With-R

 You can then load the H2O libraries just as you would from RStudio.


 On Fri, Feb 19, 2016 at 8:41 AM, Aleksandr Modestov <
 aleksandrmodes...@gmail.com> wrote:

 If I want to use h2o libraries from a notebook, what should I do?




>>
>


Re: h2o from zeppelin notebook

2016-02-29 Thread vincent gromakowski
Try to use the dependency loader in Spark interpreter configuration page. I
have encountered strange behaviors with spark.jars options...

2016-02-29 13:35 GMT+01:00 Aleksandr Modestov :

> Hello!
> Excuse me, but it doesn't work...
> I open an interpreter window and create several additional lines.
> spark.jars .../sparkling-water-assembly-1.5.10-all.jar
> spark.jars.packages ai.h2o:sparkling-water-core_2.10
> Inside the notebook I try to add a h2o-lib: import org.apache.spark.h2o._
> But I have a problem: :39: error: object h2o is not a member of
> package org.apache.spark
> import org.apache.spark.h2o._
> Thank you
>
>
>
> On Sun, Feb 21, 2016 at 9:21 AM, moon soo Lee  wrote:
>
>> As Felix mentioned,
>>
>> Loading ai.h2o:sparkling-water-core_2.10 package [1] in SparkInterpreter
>> [2] would let H2O work in Zeppelin.
>>
>> Let me know if it does not work for you.
>>
>> Thanks,
>> moon
>>
>> [1]
>> https://github.com/h2oai/sparkling-water#sparkling-water-as-spark-package
>> [2]
>> http://zeppelin.incubator.apache.org/docs/latest/interpreter/spark.html 
>> Dependency
>> Management section
>>
>> On Sat, Feb 20, 2016 at 9:01 AM Felix Cheung 
>> wrote:
>>
>>> According to this
>>>
>>> https://github.com/h2oai/sparkling-water
>>>
>>> It can be loaded as a spark package into a spark shell - the same way
>>> should work with Zeppelin Spark interpreter (which is running the spark
>>> shell).
>>>
>>>
>>>
>>>
>>> On Sat, Feb 20, 2016 at 12:58 AM -0800, "Aleksandr Modestov" <
>>> aleksandrmodes...@gmail.com> wrote:
>>>
>>> "H2o works in Python, Java, Scala or with Spark (Sparkling Water) as
>>> well."
>>> Thank you:)
>>> I know I work with H2O from Jupyter or from shells...
>>> But I hope that I can use Scala (for instance) from a zeppelin notebook;
>>> it's better than using the shell...
>>> I cannot find where to tell zeppelin how to work with the h2o
>>> algorithms.
>>> It sounds very good that I can work from Zeppelin notebook with Spark
>>> and H2O algorithms inside one workplace.
>>>
>>> On Sat, Feb 20, 2016 at 8:44 AM, Felix Cheung >> > wrote:
>>>
>>> H2o works in Python, Java, Scala or with Spark (Sparkling Water) as well.
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Feb 19, 2016 at 10:11 AM -0800, "Girish Reddy" <
>>> gir...@springml.com> wrote:
>>>
>>> You'll need an R interpreter -
>>> https://github.com/elbamos/Zeppelin-With-R
>>>
>>> You can then load the H2O libraries just as you would from RStudio.
>>>
>>>
>>> On Fri, Feb 19, 2016 at 8:41 AM, Aleksandr Modestov <
>>> aleksandrmodes...@gmail.com> wrote:
>>>
>>> If I want to use h2o libraries from a notebook, what should I do?
>>>
>>>
>>>
>>>
>


Re: h2o from zeppelin notebook

2016-02-29 Thread Aleksandr Modestov
Hello!
Excuse me, but it doesn't work...
I open the interpreter settings and add several lines:
spark.jars .../sparkling-water-assembly-1.5.10-all.jar
spark.jars.packages ai.h2o:sparkling-water-core_2.10
Inside the notebook I try to import the h2o lib: import org.apache.spark.h2o._
But I get a problem: :39: error: object h2o is not a member of
package org.apache.spark
import org.apache.spark.h2o._
Thank you



On Sun, Feb 21, 2016 at 9:21 AM, moon soo Lee  wrote:

> As Felix mentioned,
>
> Loading ai.h2o:sparkling-water-core_2.10 package [1] in SparkInterpreter
> [2] would let H2O work in Zeppelin.
>
> Let me know if it does not work for you.
>
> Thanks,
> moon
>
> [1]
> https://github.com/h2oai/sparkling-water#sparkling-water-as-spark-package
> [2]
> http://zeppelin.incubator.apache.org/docs/latest/interpreter/spark.html 
> Dependency
> Management section
>
> On Sat, Feb 20, 2016 at 9:01 AM Felix Cheung 
> wrote:
>
>> According to this
>>
>> https://github.com/h2oai/sparkling-water
>>
>> It can be loaded as a spark package into a spark shell - the same way
>> should work with Zeppelin Spark interpreter (which is running the spark
>> shell).
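
For reference, the plain spark-shell equivalent of that package load (same
coordinates as the %dep example earlier in this digest; --packages also pulls
transitive dependencies) would be:

    spark-shell --packages ai.h2o:sparkling-water-core_2.10:1.3.7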
>>
>>
>>
>>
>> On Sat, Feb 20, 2016 at 12:58 AM -0800, "Aleksandr Modestov" <
>> aleksandrmodes...@gmail.com> wrote:
>>
>> "H2o works in Python, Java, Scala or with Spark (Sparkling Water) as
>> well."
>> Thank you:)
>> I know I work with H2O from Jupyter or from shells...
>> But I hope that I can use Scala (for instance) from a zeppelin notebook;
>> it's better than using the shell...
>> I cannot find where to tell zeppelin how to work with the h2o
>> algorithms.
>> It sounds very good that I can work from Zeppelin notebook with Spark and
>> H2O algorithms inside one workplace.
>>
>> On Sat, Feb 20, 2016 at 8:44 AM, Felix Cheung 
>> wrote:
>>
>> H2o works in Python, Java, Scala or with Spark (Sparkling Water) as well.
>>
>>
>>
>>
>>
>> On Fri, Feb 19, 2016 at 10:11 AM -0800, "Girish Reddy" <
>> gir...@springml.com> wrote:
>>
>> You'll need an R interpreter - https://github.com/elbamos/Zeppelin-With-R
>>
>> You can then load the H2O libraries just as you would from RStudio.
>>
>>
>> On Fri, Feb 19, 2016 at 8:41 AM, Aleksandr Modestov <
>> aleksandrmodes...@gmail.com> wrote:
>>
>> If I want to use h2o libraries from a notebook, what should I do?
>>
>>
>>
>>


AW: AW: -Dspark.jars is ignored when running in yarn-client mode, also when adding the jar with sc.addJars

2016-02-29 Thread Rabe, Jens
Once the library is ready for release, I am going to put it on our 
company-internal Nexus server, but as of now, it is still work in progress.

From: Felipe Almeida [mailto:falmeida1...@gmail.com]
Sent: Sunday, 28 February 2016 00:23
To: users@zeppelin.incubator.apache.org
Subject: Re: AW: -Dspark.jars is ignored when running in yarn-client mode, also 
when adding the jar with sc.addJars


You can also add Maven packages and Spark will download them (along with any 
dependencies); just use the --packages directive. There's a little example at 
the end of this post, but I'm still working on it: 
http://queirozf.com/entries/apache-zeppelin-spark-streaming-and-amazon-kinesis-simple-guide-and-examples

FA
On 26.02.2016 05:57, "Rabe, Jens" wrote:
Hello,

I found out how to add the library. Since I run Spark with spark-submit, I 
have to add the option to the SPARK_SUBMIT_OPTIONS variable, so I added:
export SPARK_SUBMIT_OPTIONS="--jars /home/zeppelin/jars/mylib.jar"

Now it works.

This should be added to the documentation though.

From: Rabe, Jens [mailto:jens.r...@iwes.fraunhofer.de]
Sent: Friday, 26 February 2016 09:26
To: users@zeppelin.incubator.apache.org
Subject: -Dspark.jars is ignored when running in yarn-client mode, also when 
adding the jar with sc.addJars

Hello,

I have a library I want to embed in Zeppelin.

I am using a build from Git yesterday, and Spark 1.6.

Here is my conf/zeppelin-env.sh:

export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export MASTER=yarn-client
export HADOOP_CONF_DIR=/etc/hadoop/conf
export ZEPPELIN_PORT=10080
export SPARK_HOME=/opt/spark
export ZEPPELIN_JAVA_OPTS="-Dhdp.version=current 
-Dspark.jars=/home/zeppelin/jars/mylib.jar"

Here is my /opt/spark/conf/spark-defaults.conf:

spark.master yarn-client
spark.dynamicAllocation.enabled true
spark.shuffle.service.enabled true
spark.driver.extraJavaOptions -Dhdp.version=current
spark.yarn.am.extraJavaOptions -Dhdp.version=current

Now, I try to run Zeppelin normally.

When I then try to import something from my lib:

import com.example._

I get:

:27: error: not found: value com

I also tried with "--conf jars=..." and "--jars", to no avail; Zeppelin then 
won't start because of an "unrecognized option".

When I do a “ps ax |grep java”, the command line option seems to be passed 
correctly:
  481 ?Sl 0:07 /usr/lib/jvm/java-7-oracle/bin/java 
-Dhdp.version=current -Dspark.jars=/home/zeppelin/jars/mylib.jar 
-Dfile.encoding=UTF-8 -Xms1024m -Xmx1024m -XX:MaxPermSize=512m 
-Dzeppelin.log.file=/home/zeppelin/incubator-zeppelin/logs/zeppelin--hadoop-frontend.log
 -cp 
::/home/zeppelin/incubator-zeppelin/zeppelin-server/target/lib/*:/home/zeppelin/incubator-zeppelin/zeppelin-zengine/target/lib/*:/home/zeppelin/incubator-zeppelin/zeppelin-interpreter/target/lib/*:/home/zeppelin/incubator-zeppelin/lib/*:/home/zeppelin/incubator-zeppelin/*::/home/zeppelin/incubator-zeppelin/conf:/home/zeppelin/incubator-zeppelin/zeppelin-interpreter/target/classes:/home/zeppelin/incubator-zeppelin/zeppelin-zengine/target/classes:/home/zeppelin/incubator-zeppelin/zeppelin-server/target/classes
 org.apache.zeppelin.server.ZeppelinServer

Even when I upload the mylib.jar to HDFS and use "sc.addJar", I cannot use it.

What am I missing?


Re: NoClassDefFoundError: Lorg/apache/zeppelin/spark/ZeppelinContext

2016-02-29 Thread vincent gromakowski
Sorry guys, I have made a mistake in zeppelin configuration that removed
zeppelin-spark.jar from the classpath...

2016-02-29 12:07 GMT+01:00 vincent gromakowski <
vincent.gromakow...@gmail.com>:

> Hi all,
> I am getting this strange error
>
> java.lang.NoClassDefFoundError: Lorg/apache/zeppelin/spark/ZeppelinContext;
>
> [... Java deserialization stack trace elided; the full trace appears in the original message below ...]

NoClassDefFoundError: Lorg/apache/zeppelin/spark/ZeppelinContext

2016-02-29 Thread vincent gromakowski
Hi all,
I am getting this strange error

java.lang.NoClassDefFoundError: Lorg/apache/zeppelin/spark/ZeppelinContext;
        at java.lang.Class.getDeclaredFields0(Native Method)
        at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
        at java.lang.Class.getDeclaredField(Class.java:2068)
        at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1703)
        at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
        at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:484)
        at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
        at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
        at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:598)
        at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
        at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
        [... the preceding four ObjectInputStream frames repeat for each nested field ...]
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
        at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at

Re: [DISCUSS] Update Roadmap

2016-02-29 Thread Eran Witkon
@Vinayak Agrawal I would suggest adding the ability to connect Zeppelin to
existing scheduling/workflow tools such as https://oozie.apache.org/.
This requires better hooks and status reporting, but doesn't make Zeppelin
an ETL/scheduler tool by itself.
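
The hook that already exists for this is the notebook REST API: an external
scheduler can trigger a run over HTTP. A minimal sketch (host, port and note ID
are illustrative; check the REST API docs of your build for the exact path):

    # run all paragraphs of a note from an external scheduler
    curl -X POST http://zeppelin-host:8080/api/notebook/job/2A94M5J1Z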


On Mon, Feb 29, 2016 at 10:21 AM Vinayak Agrawal 
wrote:

> Moon,
> The new roadmap looks very promising. I am very happy to see security in
> the list.
> I have some suggestions regarding Enterprise Ready features:
>
> 1. Job Scheduler - Can this be improved?
> Currently the scheduler can be used with Cron expression or a pre-set
> time. But in an enterprise solution, a notebook might be one piece of the
> workflow. Can we look towards the functionality of scheduling notebooks
> based on other notebooks finishing their job successfully?
> This requirement would arise in any ETL workflow, where all the downstream
> users wait for the ETL notebook to finish successfully. Only after that,
> other business oriented notebooks can be executed.
>
> 2. Importing a notebook - Is there a current requirement or future plan to
> implement a feature that allows import-notebook-from-github? This would
> allow users to share notebooks seamlessly.
>
> Thanks
> Vinayak
>
> On Sun, Feb 28, 2016 at 11:22 PM, moon soo Lee  wrote:
>
>> Zhong Wang,
>> Right, folder support would be quite useful. Thanks for the opinion.
>>
>> Hope I can finish the work in pr-190.
>>
>> Sourav,
>> Regarding concurrent running, Zeppelin doesn't have a limitation on running
>> paragraphs/queries concurrently. An interpreter can implement its own scheduling
>> policy. For example, the SparkSQL interpreter and ShellInterpreter can already
>> run paragraphs/queries concurrently.
>>
>> SparkInterpreter is implemented with a FIFO scheduler, considering the nature of
>> the scala compiler. That's why users cannot run multiple paragraphs concurrently
>> when they work with SparkInterpreter.
>> But as Zhong Wang mentioned, pr-703 gives each notebook a separate
>> scala compiler, so paragraphs run concurrently as long as they're in
>> different notebooks.
>> Thanks for the feedback!
>>
>> Best,
>> moon
>>
> On Sat, Feb 27, 2016 at 8:59 PM Zhong Wang 
>> wrote:
>>
> Sourav: I think this newly merged PR can help you
>>> https://github.com/apache/incubator-zeppelin/pull/703#issuecomment-185582537
>>>
>>> On Sat, Feb 27, 2016 at 1:46 PM, Sourav Mazumder <
>>> sourav.mazumde...@gmail.com> wrote:
>>>
>> Hi Moon,

 This looks great.

 My only suggestion would be to include a PR/feature - Support for
 Running Concurrent paragraphs/queries in Zeppelin.

 Right now if more than one user tries to run paragraphs in multiple
 notebooks concurrently through a single Zeppelin instance (and single
 interpreter instance) the performance is very slow. It is obvious that the
 queue gets built up within the zeppelin process and interpreter process in
 that scenario as the time taken to move the status from start to pending
 and pending to running is very high compared to the actual running time of
 a paragraph.

 Without this, the multi-tenancy support would be meaningless, as no one
 can practically use it in a situation where multiple users are trying to
 connect to the same instance of Zeppelin (and the related interpreter). A
 possible solution would be to spawn a separate instance of the same
 interpreter at every notebook/user level.

 Regards,
 Sourav

>>>> On Sat, Feb 27, 2016 at 12:48 PM, moon soo Lee  wrote:
>>>>
>>>>> Hi Zeppelin users and developers,
>>>>>
>>>>> The roadmap we have published at
>>>>> https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Roadmap
>>>>> is almost 9 months old, and it no longer reflects where the community
>>>>> is going. It's time to update.
>>>>>
>>>>> Based on the mailing list, jira issues, pull requests, and feedback from
>>>>> users, conferences, and meetings, I could summarize the major interests
>>>>> of users and developers in 7 categories: Enterprise ready, Usability
>>>>> improvement, Pluggability, Documentation, Backend integration, Notebook
>>>>> storage, and Visualization.
>>>>>
>>>>> And I could list related subjects under each category.
>>>>>
>>>>>    - Enterprise ready
>>>>>       - Authentication
>>>>>          - Shiro authentication ZEPPELIN-548
>>>>>       - Authorization
>>>>>          - Notebook authorization PR-681
>>>>>       - Security
>>>>>       - Multi-tenancy
>>>>>       - Stability
>>>>>    - Usability Improvement
>>>>>       - UX improvement
>>>>>       - Better Table data support
>>>>>          - Download data as csv, etc PR-725
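
The scheduler sketch referenced above: a minimal illustration of the two
styles moon describes, assuming the SchedulerFactory API in Zeppelin's
org.apache.zeppelin.scheduler package; the scheduler names and the
concurrency limit of 10 are arbitrary choices here. An interpreter picks its
policy by returning one of these from getScheduler().

    import org.apache.zeppelin.scheduler.{Scheduler, SchedulerFactory}

    object SchedulerChoice {
      // One job at a time: SparkInterpreter's choice, since a single scala
      // compiler instance cannot safely serve concurrent paragraphs.
      val fifo: Scheduler =
        SchedulerFactory.singleton().createOrGetFIFOScheduler("fifo-demo")

      // Up to 10 jobs at once: the style ShellInterpreter and the SparkSQL
      // interpreter use, so their paragraphs can run side by side.
      val parallel: Scheduler =
        SchedulerFactory.singleton().createOrGetParallelScheduler("parallel-demo", 10)
    }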

Re: sharing angular variable to scala interpreter in zeppelin

2016-02-29 Thread Balachandar R.A.
Thanks, it helped.

Regards
Bala
On 24-Feb-2016 10:19 pm, "moon soo Lee"  wrote:

> Hi,
>
> Once an AngularObject is created, an updated value from the front-end side
> (angular interpreter) is automatically propagated to the back-end side, so
> SparkInterpreter can read it using z.angular().
>
> However, creating a new AngularObject is possible only from the backend
> side at the moment.
>
> So, you'll need to call z.angularBind() from the backend side for all your
> variables (a short sketch follows at the end of this message).
>
> Thanks,
> moon
>
> On Tue, Feb 23, 2016 at 9:14 PM Balachandar R.A. 
> wrote:
>
>>
>>
>> Hi
>>
>>
>> In one of my use cases, I need to pass a variable defined inside the angular
>> interpreter to another paragraph which uses the scala interpreter. I could do
>> it the other way around using z.angularBind(), but could not figure out how
>> to pass a variable from angular to scala. Any clue will be helpful here.
>> regards
>> Bala
>>
>
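
A minimal sketch of the pattern moon describes, written as notebook
paragraphs; z is the ZeppelinContext available in spark paragraphs, and
"maxAge" is a made-up variable name.

    // Paragraph 1 (%spark): create the AngularObject from the backend.
    z.angularBind("maxAge", 35)

    // Paragraph 2 (%angular): the front-end can now read and update it, e.g.
    //   <input type="number" ng-model="maxAge"></input>

    // Paragraph 3 (%spark): read the (possibly updated) value back.
    val maxAge = z.angular("maxAge")
    println(s"front-end value: $maxAge")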


Re: [DISCUSS] Update Roadmap

2016-02-29 Thread Vinayak Agrawal
Moon,
The new roadmap looks very promising. I am very happy to see security in
the list.
I have some suggestions regarding Enterprise Ready features:

1. Job Scheduler - Can this be improved?
Currently the scheduler can be used with a Cron expression or a pre-set time
(a small sketch of the expression format follows after this message). But in
an enterprise solution, a notebook might be one piece of the workflow. Can we
look towards the functionality of scheduling notebooks based on other
notebooks finishing their job successfully?
This requirement would arise in any ETL workflow, where all the downstream
users wait for the ETL notebook to finish successfully. Only after that can
other business-oriented notebooks be executed.

2. Importing a notebook - Is there a current requirement or future plan to
implement a feature that allows import-notebook-from-github? This would
allow users to share notebooks seamlessly.

Thanks
Vinayak
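
For context on point 1: the notebook scheduler field takes a Quartz-style cron
expression. A small sketch of checking one, assuming the quartz library (which
Zeppelin's cron scheduler builds on) is available on the classpath:

    import java.util.Date
    import org.quartz.CronExpression

    object CronCheck {
      def main(args: Array[String]): Unit = {
        val nightly = new CronExpression("0 0 1 * * ?") // every day at 01:00
        println(s"next run: ${nightly.getNextValidTimeAfter(new Date())}")
      }
    }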

On Sun, Feb 28, 2016 at 11:22 PM, moon soo Lee  wrote:

> Zhong Wang,
> Right, Folder support would be quite useful. Thanks for the opinion.
> Hope I can finish the work pr-190
> .
>
> Sourav,
> Regarding concurrent running, Zeppelin doesn't have a limitation on running
> paragraphs/queries concurrently. Each interpreter can implement its own
> scheduling policy. For example, the SparkSQL interpreter and ShellInterpreter
> can already run paragraphs/queries concurrently.
>
> SparkInterpreter is implemented with a FIFO scheduler, considering the
> nature of the scala compiler. That's why users cannot run multiple
> paragraphs concurrently when they work with SparkInterpreter.
> But as Zhong Wang mentioned, pr-703 gives each notebook a separate scala
> compiler, so paragraphs can run concurrently as long as they're in
> different notebooks.
> Thanks for the feedback!
>
> Best,
> moon
>
> On Sat, Feb 27, 2016 at 8:59 PM Zhong Wang 
> wrote:
>
>> Sourav: I think this newly merged PR can help you
>> https://github.com/apache/incubator-zeppelin/pull/703#issuecomment-185582537
>>
>> On Sat, Feb 27, 2016 at 1:46 PM, Sourav Mazumder <
>> sourav.mazumde...@gmail.com> wrote:
>>
>>> Hi Moon,
>>>
>>> This looks great.
>>>
>>> My only suggestion would be to include a PR/feature - Support for
>>> Running Concurrent paragraphs/queries in Zeppelin.
>>>
>>> Right now, if more than one user tries to run paragraphs in multiple
>>> notebooks concurrently through a single Zeppelin instance (and a single
>>> interpreter instance), the performance is very slow. It is obvious that the
>>> queue gets built up within the zeppelin process and interpreter process in
>>> that scenario, as the time taken to move the status from start to pending
>>> and pending to running is very high compared to the actual running time of
>>> a paragraph.
>>>
>>> Without this, the multi-tenancy support would be meaningless, as no one
>>> can practically use it in a situation where multiple users are trying to
>>> connect to the same instance of Zeppelin (and the related interpreter). A
>>> possible solution would be to spawn a separate instance of the same
>>> interpreter at every notebook/user level.
>>>
>>> Regards,
>>> Sourav
>>>
>>> On Sat, Feb 27, 2016 at 12:48 PM, moon soo Lee  wrote:
>>>
>>>> Hi Zeppelin users and developers,
>>>>
>>>> The roadmap we have published at
>>>> https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Roadmap
>>>> is almost 9 months old, and it no longer reflects where the community
>>>> is going. It's time to update.
>>>>
>>>> Based on the mailing list, jira issues, pull requests, and feedback from
>>>> users, conferences, and meetings, I could summarize the major interests
>>>> of users and developers in 7 categories: Enterprise ready, Usability
>>>> improvement, Pluggability, Documentation, Backend integration, Notebook
>>>> storage, and Visualization.
>>>>
>>>> And I could list related subjects under each category.
>>>>
>>>>    - Enterprise ready
>>>>       - Authentication
>>>>          - Shiro authentication ZEPPELIN-548
>>>>       - Authorization
>>>>          - Notebook authorization PR-681
>>>>       - Security
>>>>       - Multi-tenancy
>>>>       - Stability
>>>>    - Usability Improvement
>>>>       - UX improvement
>>>>       - Better Table data support
>>>>          - Download data as csv, etc PR-725, PR-714, PR-6, PR-89
>>>>          - Featureful table data display (pagination, etc)
>>>>    - Pluggability ZEPPELIN-533
>>>>       - Pluggable visualization
>>>>       - Dynamic Interpreter, notebook,