Hi Vamsi,
No, it's not single-threaded. The DataFrame has an underlying number of
partitions associated with it, and the save runs one task per partition,
spread across however many workers you have in your Spark cluster.

You can check the number of partitions with:

df.rdd.getNumPartitions()

(in Scala it's df.rdd.partitions.size), and you can change it with:

df = df.repartition(numPartitions)

Note that repartition returns a new DataFrame rather than modifying the
existing one, so reassign the result before saving.
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrame
Good luck,
Josh
On Tue, Jul 5, 2016 at 12:01 PM, Vamsi Krishna
wrote:
> Team,
>
> In Phoenix-Spark plugin is DataFrame save operation single threaded?
>
> df.write \
> .format("org.apache.phoenix.spark") \
> .mode("overwrite") \
> .option("table", "TABLE1") \
> .option("zkUrl", "localhost:2181") \
> .save()
>
>
> Thanks,
> Vamsi Attluri