Spark SQL Parallelism - While reading from Oracle

2016-08-10 Thread Siva A
Hi Team, how do we increase the parallelism in Spark SQL? In Spark Core, we can repartition or pass extra arguments as part of the transformation. I am trying the example below: val df1 = sqlContext.read.format("jdbc").options(Map(...)).load val df2 = df1.cache df2.count Here count operation u
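
The JDBC source parallelizes the read only when partitioning options are supplied. A minimal sketch in Scala, assuming the sqlContext from spark-shell, the Oracle JDBC driver on the classpath, and an illustrative numeric column EMP_ID (the URL, table name, bounds and partition count below are made up, not taken from the thread):

    // Partitioning options: Spark issues numPartitions parallel JDBC queries,
    // each covering a range of EMP_ID between lowerBound and upperBound.
    val df1 = sqlContext.read.format("jdbc").options(Map(
      "url"             -> "jdbc:oracle:thin:@//dbhost:1521/ORCL",  // illustrative
      "driver"          -> "oracle.jdbc.OracleDriver",
      "dbtable"         -> "EMPLOYEES",                              // illustrative
      "partitionColumn" -> "EMP_ID",                                 // numeric column
      "lowerBound"      -> "1",
      "upperBound"      -> "1000000",
      "numPartitions"   -> "16"
    )).load()

    val df2 = df1.cache()
    df2.rdd.partitions.length   // 16 partitions, so the count runs as 16 tasks
    df2.count()

Without the partitioning options the whole table is read through a single JDBC connection, which is why the count in the question runs with no parallelism.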

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
1.6.1 > org.apache.spark : spark-sql_2.10 : 1.6.1 > com.databricks : spark-xml_2.10 : 0.2.0 > org.scala-lang : scala-library : 2.10.6 > Thanks > VG > On Fri, Jun 17, 2016 at 4:16 PM, Siva A wrote: > >>

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
> hth > > On Fri, Jun 17, 2016 at 11:32 AM, VG wrote: > >> nopes. eclipse. >> >> >> On Fri, Jun 17, 2016 at 3:58 PM, Siva A wrote: > >>> If you are running from an IDE, are you using IntelliJ? >>> >>> On Fri, Jun 17, 2016 at

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
Try to import the class and see if you get a compilation error: import com.databricks.spark.xml Siva On Fri, Jun 17, 2016 at 4:02 PM, VG wrote: > nopes. eclipse. > > > On Fri, Jun 17, 2016 at 3:58 PM, Siva A wrote: > >> If you are running from an IDE, are you using I
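
A small sketch in Scala of the compile-time check being suggested here, assuming spark-xml is declared as a project dependency (the object name is made up):

    // If the spark-xml artifact is actually on the compile classpath, this
    // wildcard import compiles; a compile error on this line means the
    // dependency is not being resolved by the build or the IDE.
    import com.databricks.spark.xml._

    object ImportCheck {
      def main(args: Array[String]): Unit =
        println("spark-xml classes resolved at compile time")
    }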

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
If you are running from an IDE, are you using IntelliJ? On Fri, Jun 17, 2016 at 3:20 PM, Siva A wrote: > Can you try to package it as a jar and run it using spark-submit? > > Siva > > On Fri, Jun 17, 2016 at 3:17 PM, VG wrote: > >> I am trying to run from the IDE and everything el

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
Can you try to package it as a jar and run it using spark-submit? Siva On Fri, Jun 17, 2016 at 3:17 PM, VG wrote: > I am trying to run from the IDE and everything else is working fine. > I added the spark-xml jar and now I ended up with this dependency > > 16/06/17 15:15:57 INFO BlockManagerMaster: Registered

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
If it's not working, add the package list while executing spark-submit/spark-shell, like below: $SPARK_HOME/bin/spark-shell --packages com.databricks:spark-xml_2.10:0.3.3 $SPARK_HOME/bin/spark-submit --packages com.databricks:spark-xml_2.10:0.3.3 On Fri, Jun 17, 2016 at 2:56 PM, Siva A wrote

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
Just try to use "xml" as the format, like below: SQLContext sqlContext = new SQLContext(sc); DataFrame df = sqlContext.read().format("xml").option("rowTag", "row").load("A.xml"); FYR: https://github.com/databricks/spark-xml --Siva On Fri, Jun 17
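
For reference, a sketch of the same read in Scala, assuming a spark-shell session started with the spark-xml package (see the --packages commands above) and an A.xml file whose repeating record element is <row>:

    // "xml" is the short name registered by spark-xml; the fully qualified
    // name com.databricks.spark.xml works as the format string as well.
    val df = sqlContext.read
      .format("xml")
      .option("rowTag", "row")   // element that marks one record in A.xml
      .load("A.xml")

    df.printSchema()
    df.show()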