Re: Phoenix-Spark: is DataFrame saving a single threaded operation?

2016-07-05 Thread Josh Mahonin

Hi Vamsi,

The DataFrame has an underlying number of partitions associated with it,
and those partitions are processed in parallel by however many workers you
have in your Spark cluster, so the save is not limited to a single thread.

You can check the number of partitions with:
df.rdd.partitions.size

And you can change the number of partitions with the following, which
returns a new DataFrame rather than modifying df in place:
df.repartition(numPartitions)

http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrame
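
The two snippets above are in Scala, while the save call in the question is
written in PySpark style. As a rough sketch of the same idea in Python,
assuming df is a pyspark.sql.DataFrame: the helper name save_to_phoenix and
its parameters are made up for illustration and are not part of the
Phoenix-Spark plugin. The format, mode and option values are the ones from
the question. Each partition is written by its own Spark task, so the
partition count caps how many writers can run at once (the cluster's
available cores cap it as well).

```python
def save_to_phoenix(df, table, zk_url, num_partitions=None):
    """Write a Spark DataFrame to Phoenix, optionally repartitioning first.

    Hypothetical helper for illustration only. `df` is assumed to be a
    pyspark.sql.DataFrame. Returns the number of partitions used for the
    write.
    """
    # PySpark counterpart of the Scala `df.rdd.partitions.size`.
    current = df.rdd.getNumPartitions()

    to_write = df
    used = current
    if num_partitions is not None and num_partitions != current:
        # repartition() returns a new DataFrame; the original is unchanged.
        to_write = df.repartition(num_partitions)
        used = num_partitions

    # Same write call as in the original question.
    (to_write.write
        .format("org.apache.phoenix.spark")
        .mode("overwrite")
        .option("table", table)
        .option("zkUrl", zk_url)
        .save())
    return used
```

For example, save_to_phoenix(df, "TABLE1", "localhost:2181", num_partitions=8)
would write with 8 partitions. A suitable value depends on your data size and
cluster, so treat 8 as a placeholder. Keep in mind that repartition triggers
a shuffle.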

Good luck,

Josh

On Tue, Jul 5, 2016 at 12:01 PM, Vamsi Krishna wrote:

> Team,
>
> In the Phoenix-Spark plugin, is the DataFrame save operation single-threaded?
>
> df.write \
>   .format("org.apache.phoenix.spark") \
>   .mode("overwrite") \
>   .option("table", "TABLE1") \
>   .option("zkUrl", "localhost:2181") \
>   .save()
>
>
> Thanks,
> Vamsi Attluri

Phoenix-Spark: is DataFrame saving a single threaded operation?

2016-07-05 Thread Vamsi Krishna

Team,

In the Phoenix-Spark plugin, is the DataFrame save operation single-threaded?

df.write \
  .format("org.apache.phoenix.spark") \
  .mode("overwrite") \
  .option("table", "TABLE1") \
  .option("zkUrl", "localhost:2181") \
  .save()


Thanks,
Vamsi Attluri
