Re: csv dependencies loaded in %spark but not %sql in spark 1.6/zeppelin 0.5.6

2016-02-03 Thread Benjamin Kim
Same here. I want to know the answer too.



Re: csv dependencies loaded in %spark but not %sql in spark 1.6/zeppelin 0.5.6

2016-02-02 Thread Jonathan Kelly
BTW, this sounds very similar to
https://issues.apache.org/jira/browse/ZEPPELIN-297, which affects %pyspark
and was fixed in Zeppelin 0.5.5.


csv dependencies loaded in %spark but not %sql in spark 1.6/zeppelin 0.5.6

2016-02-02 Thread Lin, Yunfeng
Hi guys,

I load the spark-csv dependencies and they are picked up by %spark but not by 
%sql, using Apache Zeppelin 0.5.6 with Spark 1.6.0. Everything works fine in 
Zeppelin 0.5.5 with Spark 1.5, though.

Do you have similar problems?

I am loading the spark-csv dependencies (https://github.com/databricks/spark-csv)

Using:
%dep
z.load("PATH/commons-csv-1.1.jar")
z.load("PATH/spark-csv_2.10-1.3.0.jar")
z.load("PATH/univocity-parsers-1.5.1.jar")
z.load("PATH/scala-library-2.10.5.jar")

I am able to load a CSV from HDFS using the DataFrame API in Spark. It runs 
perfectly fine.
%spark
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "false")     // "false": do not treat the first line as a header
  .option("inferSchema", "true") // automatically infer column types
  .load("hdfs://sd-6f48-7fe6:8020/tmp/people.txt") // a file in HDFS
df.registerTempTable("people")
df.show()

This also works:
%spark
val df2 = sqlContext.sql("select * from people")
df2.show()

But this doesn't work:
%sql
select * from people

java.lang.ClassNotFoundException: com.databricks.spark.csv.CsvRelation$$anonfun$1$$anonfun$2
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:270)
	at org.apache.spark.util.InnerClosureFinder$$anon$4.visitMethodInsn(ClosureCleaner.scala:435)
	at org.apache.xbean.asm5.ClassReader.a(Unknown Source)
	at org.apache.xbean.asm5.ClassReader.b(Unknown Source)
	at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
	at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
	at org.apache.spark.util.ClosureCleaner$.getInnerClosureClasses(ClosureCleaner.scala:84)
	at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:187)
	at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
	at org.apache.spark.SparkContext.clean(SparkContext.scala:2055)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:707)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:706)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
	at org.apache.spark.rdd.RDD.mapPartitions(RDD.scala:706)
	at com.databricks.spark.csv.CsvRelation.tokenRdd(CsvRelation.scala:90)
	at com.databricks.spark.csv.CsvRelation.buildScan(CsvRelation.scala:104)
	at com.databricks.spark.csv.CsvRelation.buildScan(CsvRelation.scala:152)
	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$4.apply(DataSourceStrategy.scala:64)
	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$4.apply(DataSourceStrategy.scala:64)
	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:274)
	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:273)
	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.pruneFilterProjectRaw(DataSourceStrategy.scala:352)
	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.pruneFilterProject(DataSourceStrategy.scala:269)
	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.apply(DataSourceStrategy.scala:60)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:59)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner.planLater(QueryPlanner.scala:54)
	at org.apache.spark.sql.execution.SparkStrategies$BasicOperators$.apply(SparkStrategies.scala:349)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:59)
	at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:47)
	at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:45)
	at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:52)
	at ...

Re: csv dependencies loaded in %spark but not %sql in spark 1.6/zeppelin 0.5.6

2016-02-02 Thread Jonathan Kelly
Hey, I just ran into that same exact issue yesterday and wasn't sure if I
was doing something wrong or what. Glad to know it's not just me!
Unfortunately I have not yet had the time to look any deeper into it. Would
you mind filing a JIRA if there isn't already one?


Re: csv dependencies loaded in %spark but not %sql in spark 1.6/zeppelin 0.5.6

2016-02-02 Thread mina lee
This issue was fixed a few days ago in the master branch.

Here is the PR
https://github.com/apache/incubator-zeppelin/pull/673

And the related issues previously filed in JIRA:
https://issues.apache.org/jira/browse/ZEPPELIN-194
https://issues.apache.org/jira/browse/ZEPPELIN-381

With the latest master branch, we recommend loading dependencies via the
interpreter setting menu instead of the %dep interpreter.
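
As a rough sketch, assuming the interpreter edit screen in the current master
(the exact field names may differ): open the Interpreter menu, edit the spark
interpreter, and add one row per artifact in its Dependencies table, e.g.

  artifact: com.databricks:spark-csv_2.10:1.3.0
  exclude:  (leave empty)

then save, which restarts the interpreter and resolves the artifact.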

If you want to know how to set dependencies with the latest master branch,
please check the doc
(https://zeppelin.incubator.apache.org/docs/0.6.0-incubating-SNAPSHOT/manual/dependencymanagement.html)
and let me know if it works.

Cheers,
Mina


Re: csv dependencies loaded in %spark but not %sql in spark 1.6/zeppelin 0.5.6

2016-02-02 Thread Jonathan Kelly
Awesome, thank you! BTW, I know that the Zeppelin 0.5.6 release was only
very recent, but do you happen to know yet when you plan on releasing
0.6.0?


RE: csv dependencies loaded in %spark but not %sql in spark 1.6/zeppelin 0.5.6

2016-02-02 Thread Lin, Yunfeng
Thanks! Is there a possible workaround for 0.5.6 before 0.6.0 is released?


RE: csv dependencies loaded in %spark but not %sql in spark 1.6/zeppelin 0.5.6

2016-02-02 Thread Lin, Yunfeng
I've created an issue in JIRA:

https://issues.apache.org/jira/browse/ZEPPELIN-648


Re: csv dependencies loaded in %spark but not %sql in spark 1.6/zeppelin 0.5.6

2016-02-02 Thread Mina Lee
@Jonathan The Zeppelin community plans to release every 3 months, so I expect
the next release will be around the end of April.

@Lin The workaround I can think of right now is adding the libraries to
ZEPPELIN_CLASSPATH in bin/interpreter.sh. To do this:
  1. Place all the libraries you need (commons-csv-1.1.jar,
spark-csv_2.10-1.3.0.jar, univocity-parsers-1.5.1.jar) under one specific
directory. Note that it should contain only files, not subdirectories. I
will call it "/my/path/spark-csv" for ease of explanation.
  2. Open bin/interpreter.sh.
  3. Add the line `addJarInDir "/my/path/spark-csv"` at line 130 (I assume
you are using zeppelin-0.5.6-incubating); see the sketch after this list.
  4. Restart Zeppelin.
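
A minimal sketch of step 3, assuming the zeppelin-0.5.6-incubating layout
(addJarInDir is a shell helper Zeppelin's scripts already define and use;
the exact line number may differ in your copy):

# bin/interpreter.sh -- near the other classpath setup, around line 130
addJarInDir "/my/path/spark-csv"   # appends every jar in this directory to ZEPPELIN_CLASSPATH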

Hope this helps



RE: csv dependencies loaded in %spark but not %sql in spark 1.6/zeppelin 0.5.6

2016-02-02 Thread Lin, Yunfeng
Thanks, Mina, I confirm that the workaround works!
