Help needed with Py4J

2015-05-20 Thread Addanki, Santosh Kumar
Hi Colleagues, we need to call a Scala class from PySpark in an IPython notebook. We tried something like the below:

    from py4j.java_gateway import java_import
    java_import(sparkContext._jvm, 'mynamespace')
    myScalaClass = sparkContext._jvm.SimpleScalaClass()
    myScalaClass.sayHello("World")

Works fine …
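
For readers of the archive, here is a self-contained sketch of the same call pattern. It assumes a Scala class mynamespace.SimpleScalaClass with a sayHello(s: String) method compiled into a jar on the driver's classpath; those names are placeholders taken from the snippet above:

    # Sketch: calling a Scala class through PySpark's Py4J gateway.
    from pyspark import SparkContext
    from py4j.java_gateway import java_import

    sc = SparkContext(appName="py4j-example")

    # Import the package contents into the gateway's JVM view so the
    # class can be referenced without its full package prefix.
    java_import(sc._jvm, 'mynamespace.*')

    # Py4J converts Python strings to java.lang.String automatically.
    greeter = sc._jvm.SimpleScalaClass()
    print(greeter.sayHello("World"))

Note that sc._jvm is an internal attribute, so this pattern can break across Spark versions.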

Re: Help needed with Py4J

2015-05-20 Thread Addanki, Santosh Kumar
? On Wednesday, May 20, 2015, Addanki, Santosh Kumar santosh.kumar.adda...@sap.com wrote: Hi Colleagues, we need to call a Scala class from PySpark in an IPython notebook. We tried something like the below: from py4j.java_gateway import java_import java_import(sparkContext …

External Data Source in Spark

2015-03-02 Thread Addanki, Santosh Kumar
Hi Colleagues, we have currently implemented the External Data Source API and are able to push down filters and projections. Could you provide some info on how joins could perhaps be pushed down to the original data source if both data sources are from the same database? Briefly looked at …
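
In the 1.x external data source API, Catalyst only pushes filters and column pruning down to a relation, so a join between two relations runs in Spark itself. One common workaround, sketched below under the assumption that both tables live in the same JDBC-accessible database and that the JDBC data source available from Spark 1.3 is in use (the connection URL and table names are illustrative), is to register the pre-joined result as a single relation so the database executes the join:

    # Sketch: push a join into the source database via a subquery.
    sqlContext.sql("""
        CREATE TEMPORARY TABLE joined
        USING org.apache.spark.sql.jdbc
        OPTIONS (
            url 'jdbc:postgresql://dbhost/mydb',
            dbtable '(SELECT a.id, a.x, b.y
                      FROM a JOIN b ON a.id = b.id) AS j'
        )
    """)
    rows = sqlContext.sql("SELECT * FROM joined WHERE x > 10").collect()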

External Data Source in SPARK

2015-02-09 Thread Addanki, Santosh Kumar
Hi, we implemented an external data source by extending TableScan, and we added the classes to the classpath. The data source works fine when run in the Spark shell, but currently we are unable to use this same data source in the Python environment. So when we execute the following below in an …
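
Since the data source API has no direct Python binding, a Scala-implemented source is normally reached from PySpark through SQL. A sketch, assuming a hypothetical provider class mynamespace.MySource that is on both the driver and executor classpath:

    # Sketch: register a Scala-implemented data source from Python.
    sqlContext.sql("""
        CREATE TEMPORARY TABLE my_table
        USING mynamespace.MySource
        OPTIONS (path '/data/input')
    """)
    rows = sqlContext.sql("SELECT * FROM my_table").take(5)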

saveAsParquetFile and DirectFileOutputCommitter Class not found Error

2014-12-07 Thread Addanki, Santosh Kumar
Hi, when we try to call saveAsParquetFile on a SchemaRDD we get the following error: Py4JJavaError: An error occurred while calling o384.saveAsParquetFile. : java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/lib/output/DirectFileOutputCommitter at …
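
A NoClassDefFoundError at this point usually means the Hadoop configuration in effect names an output committer that is not on Spark's classpath; this particular DirectFileOutputCommitter ships with MapR's patched Hadoop rather than stock Hadoop. A sketch of the usual remedy, where the jar location is an assumption about a typical MapR client install:

    # Sketch: make the distribution's Hadoop jars visible to the driver
    # and executors before calling saveAsParquetFile, e.g.
    #
    #   pyspark --driver-class-path '/opt/mapr/hadoop/lib/*' \
    #           --conf spark.executor.extraClassPath='/opt/mapr/hadoop/lib/*'
    #
    # With the committer class resolvable, the save should proceed:
    schemaRDD.saveAsParquetFile("/user/mapr/transactions.parquet")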

Hive Context and Mapr

2014-11-03 Thread Addanki, Santosh Kumar
Hi, we are currently using the MapR distribution. To read files from the file system we specify the following: test = sc.textFile("mapr/mycluster/user/mapr/test.csv"). This works fine from the SparkContext. But ... currently we are trying to create a table in Hive using the HiveContext from Spark. …
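
For the archive, a minimal sketch of the HiveContext side, assuming Spark was built with -Phive and a hive-site.xml pointing at the cluster's metastore (table and path names are placeholders):

    from pyspark.sql import HiveContext

    hc = HiveContext(sc)
    hc.sql("CREATE TABLE IF NOT EXISTS test_csv (line STRING)")
    hc.sql("LOAD DATA INPATH '/user/mapr/test.csv' INTO TABLE test_csv")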

Schema RDD and saveAsTable in hive

2014-11-03 Thread Addanki, Santosh Kumar
Hi, I have a SchemaRDD created like below: schemaTransactions = sqlContext.applySchema(transactions, schema). When I try to save the SchemaRDD as a table using schemaTransactions.saveAsTable("transactions"), I get the error below: Py4JJavaError: An error occurred while calling …
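
In Spark 1.x, saveAsTable writes the table's metadata to the Hive metastore, so it generally fails on a SchemaRDD built from a plain SQLContext. A sketch of the pattern that is expected to work, with an illustrative one-column schema (the original schema is not shown in the thread; imports use the Spark 1.1-era module locations):

    from pyspark.sql import HiveContext, StructType, StructField, StringType

    hc = HiveContext(sc)
    schema = StructType([StructField("tx_id", StringType(), True)])
    schemaTransactions = hc.applySchema(transactions, schema)
    schemaTransactions.saveAsTable("transactions")  # recorded in the metastore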

Spark And Mapr

2014-10-01 Thread Addanki, Santosh Kumar
Hi, we were using Hortonworks 2.4.1 as our Hadoop distribution and have now switched to MapR. Previously, to read a text file we would use: test = sc.textFile("hdfs://10.48.101.111:8020/user/hdfs/test"). What would be the equivalent of the same for MapR? Best Regards, Santosh
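
For later readers: MapR-FS is typically addressed either with the maprfs URI scheme or through the cluster's NFS mount. Both paths below are illustrative:

    # maprfs:// URI; the cluster is resolved from the MapR client config
    test = sc.textFile("maprfs:///user/hdfs/test")

    # or via the NFS mount point, if one is mounted on the nodes
    test = sc.textFile("/mapr/mycluster/user/hdfs/test")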

RE: Spark And Mapr

2014-10-01 Thread Addanki, Santosh Kumar
Regards, Santosh. From: Vladimir Rodionov [mailto:vrodio...@splicemachine.com] Sent: Wednesday, October 01, 2014 3:59 PM To: Addanki, Santosh Kumar Cc: user@spark.apache.org Subject: Re: Spark And Mapr There is a doc on MapR: http://doc.mapr.com/display/MapR/Accessing+MapR-FS+in+Java+Applications

RE: SchemaRDD and RegisterAsTable

2014-09-18 Thread Addanki, Santosh Kumar
… @gmail.com] Sent: Wednesday, September 17, 2014 10:14 PM To: user@spark.apache.org; Addanki, Santosh Kumar Subject: Re: SchemaRDD and RegisterAsTable The registered table is stored within the Spark context itself. To have the table available for the Thrift server to get access to, you can save …
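
The truncated advice presumably continues with saveAsTable. A sketch of the distinction, assuming the Thrift server is configured against the same Hive metastore (table names are placeholders):

    from pyspark.sql import HiveContext

    hc = HiveContext(sc)
    srdd = hc.sql("SELECT * FROM some_source")
    srdd.registerTempTable("t_local")  # lives only in this context's catalog
    srdd.saveAsTable("t_shared")       # persisted to the Hive metastore, so
                                       # the Thrift server can query it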

SchemaRDD and RegisterAsTable

2014-09-17 Thread Addanki, Santosh Kumar
Hi, we built out Spark 1.1.0 with Maven using mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive clean package, and the Thrift server has been configured to use the Hive metastore. When a SchemaRDD is registered as a table, where does the metadata of this table get stored? Can it be …
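
To the question in the snippet: a table registered with registerAsTable / registerTempTable lives in the in-memory catalog of the SQLContext or HiveContext that created it, not in the Hive metastore. A quick way to observe this (names illustrative):

    srdd.registerTempTable("tx_temp")
    hc.sql("SELECT COUNT(*) FROM tx_temp").collect()  # resolves in this context
    # A separately started Thrift server reads the shared Hive metastore and
    # cannot resolve tx_temp; only tables persisted with saveAsTable (or
    # created directly in Hive) are visible to it.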