Hi Team,
How do we increase the parallelism in Spark SQL?
In Spark Core, we can repartition or pass extra arguments as part of the
transformation.
I am trying the example below:
val df1 = sqlContext.read.format("jdbc").options(Map(...)).load
val df2 = df1.cache
df2.count
Here count operation u
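
For a JDBC source, a read with no partitioning options typically comes back as a single partition, so the count runs as one task. Below is a minimal sketch, assuming the table has a numeric column to split on. The option keys (partitionColumn, lowerBound, upperBound, numPartitions) are the Spark JDBC data source's own; the URL, table name, column name and bounds are hypothetical placeholders, and the range check is this helper's own. It is plain Java so it runs without Spark, and the Spark call itself appears only in a comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class JdbcPartitionOptions {
    // Builds the option map for a partitioned JDBC read.
    // The bounds only decide how the column range is split into strides;
    // they do not filter rows out of the result.
    static Map<String, String> build(String url, String table, String column,
                                     long lower, long upper, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        if (lower >= upper) {
            // Sanity check of this helper, not a documented Spark rule.
            throw new IllegalArgumentException("lower must be < upper");
        }
        Map<String, String> options = new LinkedHashMap<>();
        options.put("url", url);
        options.put("dbtable", table);
        options.put("partitionColumn", column);
        options.put("lowerBound", Long.toString(lower));
        options.put("upperBound", Long.toString(upper));
        options.put("numPartitions", Integer.toString(numPartitions));
        return options;
    }

    public static void main(String[] args) {
        // Hypothetical connection details, for illustration only.
        Map<String, String> options = build(
            "jdbc:postgresql://dbhost:5432/sales", "orders", "order_id",
            1L, 1000000L, 8);
        options.forEach((k, v) -> System.out.println(k + " = " + v));
        // With Spark on the classpath, the map would be passed as:
        //   sqlContext.read().format("jdbc").options(options).load();
    }
}
```

Alternatively, calling df1.repartition(n) after the load spreads later stages across n tasks, though the JDBC fetch itself still runs as a single task.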
1.6.1
>
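> <dependency>
>   <groupId>org.apache.spark</groupId>
>   <artifactId>spark-sql_2.10</artifactId>
>   <version>1.6.1</version>
> </dependency>
> <dependency>
>   <groupId>com.databricks</groupId>
>   <artifactId>spark-xml_2.10</artifactId>
>   <version>0.2.0</version>
> </dependency>
> <dependency>
>   <groupId>org.scala-lang</groupId>
>   <artifactId>scala-library</artifactId>
>   <version>2.10.6</version>
> </dependency>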
>
> Thanks
> VG
>
>
> On Fri, Jun 17, 2016 at 4:16 PM, Siva A wrote:
>
>>
> hth
>
> On Fri, Jun 17, 2016 at 11:32 AM, VG wrote:
>
>> nopes. eclipse.
>>
>>
>> On Fri, Jun 17, 2016 at 3:58 PM, Siva A wrote:
>>
>>> If you are running from an IDE, are you using IntelliJ?
>>>
>>> On Fri, Jun 17, 2016 at
Try importing the class and see whether you get a compilation error:
import com.databricks.spark.xml
Siva
On Fri, Jun 17, 2016 at 4:02 PM, VG wrote:
> nopes. eclipse.
>
>
> On Fri, Jun 17, 2016 at 3:58 PM, Siva A wrote:
>
>> If you are running from IDE, Are you using I
If you are running from an IDE, are you using IntelliJ?
On Fri, Jun 17, 2016 at 3:20 PM, Siva A wrote:
> Can you try to package as a jar and run using spark-submit
>
> Siva
>
> On Fri, Jun 17, 2016 at 3:17 PM, VG wrote:
>
>> I am trying to run from IDE and everything el
Can you try packaging it as a jar and running it with spark-submit?
Siva
On Fri, Jun 17, 2016 at 3:17 PM, VG wrote:
> I am trying to run from IDE and everything else is working fine.
> I added the spark-xml jar and now I have run into this dependency issue:
>
> 16/06/17 15:15:57 INFO BlockManagerMaster: Registered
If it's not working, add the package list when running
spark-submit/spark-shell, as below:
$SPARK_HOME/bin/spark-shell --packages com.databricks:spark-xml_2.10:0.3.3
$SPARK_HOME/bin/spark-submit --packages com.databricks:spark-xml_2.10:0.3.3
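
The value passed to --packages is a Maven coordinate in groupId:artifactId:version form, the same triple as a pom dependency, and several coordinates can be given as a comma-separated list. A small sketch of assembling that argument in plain Java; the class and method names are hypothetical helpers, not part of Spark.

```java
import java.util.Arrays;
import java.util.List;

class SparkPackages {
    // Joins a Maven triple into the groupId:artifactId:version form.
    static String coordinate(String groupId, String artifactId, String version) {
        return groupId + ":" + artifactId + ":" + version;
    }

    // Builds the --packages argument from one or more coordinates,
    // separated by commas with no spaces.
    static String packagesArg(List<String> coordinates) {
        return "--packages " + String.join(",", coordinates);
    }

    public static void main(String[] args) {
        String xml = coordinate("com.databricks", "spark-xml_2.10", "0.3.3");
        System.out.println(packagesArg(Arrays.asList(xml)));
    }
}
```

Note that the _2.10 suffix in the artifactId names the Scala binary version the package was built for.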
On Fri, Jun 17, 2016 at 2:56 PM, Siva A wrote:
Just try using "xml" as the format, like below:
SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read()
    .format("xml")
    .option("rowTag", "row")
    .load("A.xml");
FYR: https://github.com/databricks/spark-xml
--Siva
On Fri, Jun 17