Re: Interpreter maintenance

2022-07-31 Thread Andrea Santurbano
Hi Jongyoul,
I created the ksql and Neo4j interpreters. In particular, the latter is
widely used by Neo4j users. Do you plan to cut it out?

On Sun, Jul 31, 2022 at 12:34 Jongyoul Lee wrote:

> Hello,
>
> I'm currently working with several contributors on the question of which
> interpreters we will focus on.
>
> My plan is to minimize the number of interpreters and manage the remaining
> ones very actively, so the first step is to remove outdated and less active
> interpreters. Please check the status and let us know if you really want to
> keep certain interpreters, and please keep their version and usage
> information up to date. We will maintain them as much as possible.
>
> Here is the reference:
>
> https://cwiki.apache.org/confluence/display/ZEPPELIN/Interpreter+Maintenance
>
> Regards,
> Jongyoul Lee
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net
>


Re: Slack channel for Zeppelin community

2021-02-15 Thread Andrea Santurbano
Hi Jeff, my email is sant...@gmail.com; can you please add me in?

On Wed, Feb 10, 2021 at 4:18 PM Jeff Zhang wrote:

> Hi Folks,
>
> We have an Apache Slack channel for Zeppelin, but we haven't had much
> discussion there. I think it is suitable for many kinds of discussion,
> especially the regular community sync-up meeting we recently talked about. So
> if you are interested, please leave your Slack account in this thread and I
> will invite you to join it.
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()

2018-07-06 Thread Andrea Santurbano
Thanks Adamantios,
I created a Dockerfile in order to automate the process; feel free to use
it:

https://gist.github.com/conker84/4ffc9a2f0125c808b4dfcf3b7d70b043
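
Roughly, it automates the same steps as the workaround quoted below; here is
a sketch of the equivalent shell script (the gist itself is the source of
truth, and the Spark download URL is my assumption):

#!/bin/bash
# Fetch Zeppelin 0.8.0 and a standalone Spark 2.3.1 (hadoop 2.7 build).
wget http://www-eu.apache.org/dist/zeppelin/zeppelin-0.8.0/zeppelin-0.8.0-bin-all.tgz
wget https://archive.apache.org/dist/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz
tar zxvf zeppelin-0.8.0-bin-all.tgz
tar zxvf spark-2.3.1-bin-hadoop2.7.tgz

# Point Zeppelin at the external Spark instead of its embedded one.
cd zeppelin-0.8.0-bin-all
cp conf/zeppelin-env.sh.template conf/zeppelin-env.sh
echo "export SPARK_HOME=$(pwd)/../spark-2.3.1-bin-hadoop2.7" >> conf/zeppelin-env.sh

bash ./bin/zeppelin.sh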



On Thu, Jul 5, 2018 at 1:00 PM Adamantios Corais <
adamantios.cor...@gmail.com> wrote:

> Hi Andrea,
>
> The following workaround works for me (but maybe there are other
> alternatives too):
>
> - downloaded spark-2.3.1-bin-hadoop2.7
> - renamed the zeppelin-env.sh.template to zeppelin-env.sh
> - appended the following line to the above file: export
> SPARK_HOME=../../spark-2.3.1-bin-hadoop2.7/
>
> Hope this helps,
>
>
>
>
> // Adamantios Corais
>
> On Thu, Jul 5, 2018 at 1:51 PM, Andrea Santurbano 
> wrote:
>
>> Thanks Jeff,
>> is there a workaround in order to make it work now?
>>
>> On Thu, Jul 5, 2018 at 12:42 PM Jeff Zhang wrote:
>>
>>>
>>> This is because the hadoop version used in the embedded spark is 2.3, which
>>> is too low. I created https://issues.apache.org/jira/browse/ZEPPELIN-3586 for
>>> this issue. I suppose it will be fixed in 0.8.1.
>>>
>>>
>>>
>>> On Thu, Jul 5, 2018 at 3:35 PM, Andrea Santurbano wrote:
>>>
>>>> I agree that it's not for production, but if you want to do a simple blog
>>>> post (and that's what I'm doing) I think it's a well-suited solution.
>>>> Is it possible to fix this?
>>>> Thanks
>>>> Andrea
>>>>
>>>> On Thu, Jul 5, 2018 at 2:29 AM Jeff Zhang wrote:
>>>>
>>>>>
>>>>> This might be due to the embedded spark version. I would recommend
>>>>> specifying SPARK_HOME instead of using the embedded spark; the embedded
>>>>> spark is not for production.
>>>>>
>>>>>
>>>>> On Thu, Jul 5, 2018 at 12:07 AM, Andrea Santurbano wrote:
>>>>>
>>>>>> I have the same issue...
>>>>>> On Tue, Jul 3, 2018 at 11:18 PM Adamantios Corais <
>>>>>> adamantios.cor...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Jeff, I am using the embedded Spark.
>>>>>>>
>>>>>>> FYI, this is how I start the dockerized (yet old) version of
>>>>>>> Zeppelin that works as expected.
>>>>>>>
>>>>>>> #!/bin/bash
>>>>>>>> docker run --rm \
>>>>>>>> --name zepelin \
>>>>>>>> -p 127.0.0.1:9090:8080 \
>>>>>>>> -p 127.0.0.1:5050:4040 \
>>>>>>>> -v $(pwd):/zeppelin/notebook \
>>>>>>>> apache/zeppelin:0.7.3
>>>>>>>
>>>>>>>
>>>>>>> And this is how I start the binary (yet stable) version of
>>>>>>> Zeppelin that is supposed to work (but it doesn't).
>>>>>>>
>>>>>>> #!/bin/bash
>>>>>>>> wget
>>>>>>>> http://www-eu.apache.org/dist/zeppelin/zeppelin-0.8.0/zeppelin-0.8.0-bin-all.tgz
>>>>>>>> tar  zxvf zeppelin-0.8.0-bin-all.tgz
>>>>>>>> cd   ./zeppelin-0.8.0-bin-all/
>>>>>>>> bash ./bin/zeppelin.sh
>>>>>>>
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> // Adamantios Corais
>>>>>>>
>>>>>>> On Tue, Jul 3, 2018 at 2:24 AM, Jeff Zhang  wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Do you use the embedded spark or specify SPARK_HOME? If you set
>>>>>>>> SPARK_HOME, which spark version and hadoop version do you use?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 3, 2018 at 12:32 AM, Adamantios Corais wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have downloaded the latest binary package of Zeppelin (ver.
>>>>>>>>> 0.8.0), extracted it, and started it as follows: `./bin/zeppelin.sh`
>>>>>>>>>
>>>>>>>>> Next, I tried a very simple example:
>>>>>>>>>
>>>>>>>>> `spark.read.parquet("./bin/userdata1.parquet").show()`

Re: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()

2018-07-05 Thread Andrea Santurbano
I agree that it's not for production, but if you want to do a simple blog post
(and that's what I'm doing) I think it's a well-suited solution.
Is it possible to fix this?
Thanks
Andrea

On Thu, Jul 5, 2018 at 2:29 AM Jeff Zhang wrote:

>
> This might be due to the embedded spark version. I would recommend
> specifying SPARK_HOME instead of using the embedded spark; the embedded spark
> is not for production.
>
>
> On Thu, Jul 5, 2018 at 12:07 AM, Andrea Santurbano wrote:
>
>> I have the same issue...
>> On Tue, Jul 3, 2018 at 11:18 PM Adamantios Corais <
>> adamantios.cor...@gmail.com> wrote:
>>
>>> Hi Jeff, I am using the embedded Spark.
>>>
>>> FYI, this is how I start the dockerized (yet old) version of Zeppelin
>>> that works as expected.
>>>
>>> #!/bin/bash
>>>> docker run --rm \
>>>> --name zepelin \
>>>> -p 127.0.0.1:9090:8080 \
>>>> -p 127.0.0.1:5050:4040 \
>>>> -v $(pwd):/zeppelin/notebook \
>>>> apache/zeppelin:0.7.3
>>>
>>>
>>> And this is how I start the binary (yet stable) version of Zeppelin that
>>> is supposed to work (but it doesn't).
>>>
>>> #!/bin/bash
>>>> wget
>>>> http://www-eu.apache.org/dist/zeppelin/zeppelin-0.8.0/zeppelin-0.8.0-bin-all.tgz
>>>> tar  zxvf zeppelin-0.8.0-bin-all.tgz
>>>> cd   ./zeppelin-0.8.0-bin-all/
>>>> bash ./bin/zeppelin.sh
>>>
>>>
>>> Thanks.
>>>
>>>
>>>
>>>
>>> // Adamantios Corais
>>>
>>> On Tue, Jul 3, 2018 at 2:24 AM, Jeff Zhang  wrote:
>>>
>>>>
>>>> Do you use the embedded spark or specify SPARK_HOME? If you set
>>>> SPARK_HOME, which spark version and hadoop version do you use?
>>>>
>>>>
>>>>
>>>> On Tue, Jul 3, 2018 at 12:32 AM, Adamantios Corais wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have downloaded the latest binary package of Zeppelin (ver. 0.8.0),
>>>>> extracted it, and started it as follows: `./bin/zeppelin.sh`
>>>>>
>>>>> Next, I tried a very simple example:
>>>>>
>>>>> `spark.read.parquet("./bin/userdata1.parquet").show()`
>>>>>
>>>>> Which unfortunately returns the following error. Note that the same
>>>>> example works fine with the official docker version of Zeppelin (ver.
>>>>> 0.7.3). Any ideas?
>>>>>
>>>>> org.apache.spark.SparkException: Job aborted due to stage failure:
>>>>>> Task 0 in stage 7.0 failed 1 times, most recent failure: Lost task 0.0 in
>>>>>> stage 7.0 (TID 7, localhost, executor driver): 
>>>>>> java.lang.NoSuchMethodError:
>>>>>> org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()Lorg/apache/hadoop/fs/FileSystem$Statistics$StatisticsData;
>>>>>> at
>>>>>> org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
>>>>>> at
>>>>>> org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
>>>>>> at
>>>>>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>>>>>> at
>>>>>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>>>>>> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>>>>>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>>>>>> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>>>>>> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>>>>>> at
>>>>>> scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>>>>>> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>>>>>> at
>>>>>> org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1.apply$mcJ$sp(SparkHadoopUtil.scala:149)
>>>>>> at
>>>>>> org.apache.spark.deploy.SparkHadoopUtil.getFSBytesReadOnThreadCallback(SparkHadoopUtil.scala:150)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.(FileScanRDD.scala:78)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.FileScanRDD.compute(FileScanRDD.scala:71)

Re: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()

2018-07-04 Thread Andrea Santurbano
I have the same issue...
On Tue, Jul 3, 2018 at 11:18 PM Adamantios Corais <
adamantios.cor...@gmail.com> wrote:

> Hi Jeff, I am using the embedded Spark.
>
> FYI, this is how I start the dockerized (yet old) version of Zeppelin that
> works as expected.
>
> #!/bin/bash
>> docker run --rm \
>> --name zepelin \
>> -p 127.0.0.1:9090:8080 \
>> -p 127.0.0.1:5050:4040 \
>> -v $(pwd):/zeppelin/notebook \
>> apache/zeppelin:0.7.3
>
>
> And this is how I start the binary (yet stable) version of Zeppelin that
> is supposed to work (but it doesn't).
>
> #!/bin/bash
>> wget
>> http://www-eu.apache.org/dist/zeppelin/zeppelin-0.8.0/zeppelin-0.8.0-bin-all.tgz
>> tar  zxvf zeppelin-0.8.0-bin-all.tgz
>> cd   ./zeppelin-0.8.0-bin-all/
>> bash ./bin/zeppelin.sh
>
>
> Thanks.
>
>
>
>
> // Adamantios Corais
>
> On Tue, Jul 3, 2018 at 2:24 AM, Jeff Zhang  wrote:
>
>>
>> Do you use the embedded spark or specify SPARK_HOME? If you set
>> SPARK_HOME, which spark version and hadoop version do you use?
>>
>>
>>
>> On Tue, Jul 3, 2018 at 12:32 AM, Adamantios Corais wrote:
>>
>>> Hi,
>>>
>>> I have downloaded the latest binary package of Zeppelin (ver. 0.8.0),
>>> extracted it, and started it as follows: `./bin/zeppelin.sh`
>>>
>>> Next, I tried a very simple example:
>>>
>>> `spark.read.parquet("./bin/userdata1.parquet").show()`
>>>
>>> Which unfortunately returns the following error. Note that the same
>>> example works fine with the official docker version of Zeppelin (ver.
>>> 0.7.3). Any ideas?
>>>
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
 0 in stage 7.0 failed 1 times, most recent failure: Lost task 0.0 in stage
 7.0 (TID 7, localhost, executor driver): java.lang.NoSuchMethodError:
 org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()Lorg/apache/hadoop/fs/FileSystem$Statistics$StatisticsData;
 at
 org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
 at
 org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
 at
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
 at
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
 at scala.collection.Iterator$class.foreach(Iterator.scala:893)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
 at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
 at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
 at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
 at scala.collection.AbstractTraversable.map(Traversable.scala:104)
 at
 org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1.apply$mcJ$sp(SparkHadoopUtil.scala:149)
 at
 org.apache.spark.deploy.SparkHadoopUtil.getFSBytesReadOnThreadCallback(SparkHadoopUtil.scala:150)
 at
 org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.(FileScanRDD.scala:78)
 at
 org.apache.spark.sql.execution.datasources.FileScanRDD.compute(FileScanRDD.scala:71)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
 at
 org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
 at
 org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
 at org.apache.spark.scheduler.Task.run(Task.scala:108)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
 Driver stacktrace:
   at org.apache.spark.scheduler.DAGScheduler.org
 $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
   at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
   at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
   at
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
   at
 org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
   at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)

Upload file to Filesystem

2018-01-02 Thread Andrea Santurbano
Hi guys,
I was wondering if there already exists a snippet that lets you import a file
from a form directly to the filesystem.
Thanks
Andrea
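
PS: until something built-in exists, a possible workaround is to pull the
file in with the %sh shell interpreter from a paragraph; a minimal sketch
(the URL and target path are placeholders):

%sh
# download a file onto the filesystem of the Zeppelin host
wget -O /tmp/mydata.csv https://example.com/mydata.csv
ls -l /tmp/mydata.csv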


Re: [DISCUSSION] Extending TableData API

2017-06-12 Thread Andrea Santurbano
Hi guys,
this is great! I think this can also enable some drop-down features between
tables in the UI...
Do you think these enhancements can also include the graph part?

Andrea

On Mon, Jun 12, 2017 at 5:47 AM Jun Kim wrote:

> All of the enhancements look great to me!
>
> And I wish for a feature that could upload a small CSV file (maybe about
> 20MB..?) and play with it directly.
> It would be great if I could drag a file into Zeppelin and register it as a
> table.
>
> Thanks :)
>
> On Mon, Jun 12, 2017 at 11:40 AM, Park Hoon <1am...@gmail.com> wrote:
>
>> Hi All,
>>
>> Recently, ZEPPELIN-753 (Tabledata abstraction) and ZEPPELIN-2020 (Remote
>> method invocation for resources) were resolved.
>> Based on this work, we can improve Zeppelin with the following
>> enhancements:
>>
>> * register the table result as a shared resource
>> * list all available (registered) tables
>> * preview tables including their meta information (e.g. columns, types, ..)
>> * download registered tables as CSV and other formats
>> * pivoting/filtering in the backend to transform larger data
>> * cross join tables in different interpreters (e.g. the Spark interpreter
>> uses a table result generated from the JDBC interpreter)
>>
>> You can find the full proposal in the Extending Table Data API document,
>> which was contributed by @1ambda, @khalidhuseynov, @Leemoonsoo.
>>
>> Any questions, feedback, or discussion are welcome.
>>
>>
>> Thanks.
>>
> --
> Taejun Kim
>
> Data Mining Lab.
> School of Electrical and Computer Engineering
> University of Seoul
>


Re: Stanford Core NLP & Databricks Wrapper

2016-08-01 Thread Andrea Santurbano
Hi Jeff, after some analysis of the logs I have achieved the goal!
What I still can't solve is the problem explained in the thread "Import jars
from spark packages"; do you have any experience with that?


On Sun, Jul 31, 2016 at 3:43 PM Jeff Zhang <zjf...@gmail.com> wrote:

> What kind of issue do you have?
>
> On Sun, Jul 31, 2016 at 8:27 PM, Andrea Santurbano <sant...@gmail.com>
> wrote:
>
>> Hi all,
>> has someone successfully imported this library in association with the
>> databricks core-nlp wrapper in Zeppelin?
>>
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: Import jars from spark packages

2016-07-30 Thread Andrea Santurbano
Here is the log [1].
I have a standard Zeppelin configuration:
zeppelin.dep.additionalRemoteRepository=spark-packages,http://dl.bintray.com/spark-packages/maven,false;


[1] https://gist.github.com/conker84/d2ad350850f39022e594825b6fda980e
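
PS: an alternative I may try is passing the package through spark-submit
options in conf/zeppelin-env.sh instead of the artifact section; a sketch
(untested, and it only applies when an external SPARK_HOME is configured):

# conf/zeppelin-env.sh
# --packages pulls the artifact, --repositories adds the spark-packages repo
export SPARK_SUBMIT_OPTIONS="--packages databricks:spark-corenlp:0.1 \
  --repositories http://dl.bintray.com/spark-packages/maven"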

On Sat, Jul 30, 2016 at 3:14 PM DuyHai Doan <doanduy...@gmail.com> wrote:

> What exactly is the error message you have in the logs?
>
> On Sat, Jul 30, 2016 at 2:58 PM, Andrea Santurbano <sant...@gmail.com>
> wrote:
>
>> Hi all,
>> I want to import this library:
>> https://github.com/databricks/spark-corenlp
>> which is under spark packages:
>> https://spark-packages.org/package/databricks/spark-corenlp
>> If, in the artifact section of my interpreter settings, I insert:
>> databricks:spark-corenlp:0.1
>> or
>> com.databricks:spark-corenlp:0.1
>>
>> no package is found.
>> Where am I going wrong?
>>
>> Thanks
>> Andrea
>>
>
>


Import jars from spark packages

2016-07-30 Thread Andrea Santurbano
Hi all,
I want to import this library:
https://github.com/databricks/spark-corenlp
which is under spark packages:
https://spark-packages.org/package/databricks/spark-corenlp
If, in the artifact section of my interpreter settings, I insert:
databricks:spark-corenlp:0.1
or
com.databricks:spark-corenlp:0.1

no package is found.
Where am I going wrong?

Thanks
Andrea


"Private paragraph session"

2016-06-23 Thread Andrea Santurbano
Hi to all,
Zeppelin makes heavy use of websockets, so if a paragraph with an input
value (in a %jdbc interpreter, for instance) is manipulated by one user, the
paragraph changes for every user who is viewing it.
Is there something like a "private" session, where every input inserted in a
paragraph changes only the current user's view?
Thanks in advance


Get stack Tooltip Value

2016-06-08 Thread Andrea Santurbano
Hi,
I'm making a dashboard with Zeppelin and Spark. The tool is fantastic, but I
have a little problem: I have to get some values from the d3 visualization (on
click on a stack, for instance) in paragraph "A" and use these values to
start paragraph "B" (via the REST API). Is it possible to do this?
Thanks
Andrea
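
PS: the call I have in mind for starting paragraph "B" is the Notebook REST
API's job endpoint; a minimal sketch (host, note id, and paragraph id below
are placeholders):

# run a paragraph asynchronously: POST /api/notebook/job/{noteId}/{paragraphId}
curl -X POST http://localhost:8080/api/notebook/job/2A94M5J1Z/20160608-010101_123456789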