There is no such thing as a primary key in the Hive metastore, but Spark SQL
does support partitioned Hive tables.

DataFrameWriter also has a partitionBy method.
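
A rough sketch of both options in PySpark, in case it helps. The helper
names, the column name some_col, and the output path are made up for
illustration. Passing a column to repartition assumes Spark 1.6 or later;
older releases only accept a partition count. repartition changes how the
DataFrame is split in memory after the read, while partitionBy changes the
directory layout of what gets written.

```python
def repartition_by_column(df, column, num_partitions=None):
    """Return a DataFrame hash-partitioned in memory by `column`.

    Hypothetical helper. Assumes Spark 1.6+, where DataFrame.repartition
    accepts column names in addition to a partition count.
    """
    if num_partitions is None:
        return df.repartition(column)
    return df.repartition(num_partitions, column)


def write_partitioned(df, column, path):
    """Write `df` as Parquet with one sub-directory per `column` value.

    Hypothetical helper around DataFrameWriter.partitionBy; "overwrite"
    replaces any existing data at `path`.
    """
    df.write.partitionBy(column).mode("overwrite").parquet(path)


# Example use with the DataFrame from the question (needs a live SQLContext;
# "some_col" and the path are placeholders):
#   test_df = sqlContext.sql("select * from hivetable1")
#   print(test_df.rdd.getNumPartitions())   # partition count chosen on read
#   by_col = repartition_by_column(test_df, "some_col")
#   write_partitioned(by_col, "some_col", "/tmp/hivetable1_by_col")
```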

On Thu, Aug 20, 2015 at 7:29 AM, VIJAYAKUMAR JAWAHARLAL wrote:

> Hi
> I have a question regarding DataFrame partitioning. I read a Hive table from
> Spark, and the following Spark API call converts it to a DataFrame:
> test_df = sqlContext.sql("select * from hivetable1")
> How does Spark decide the partitioning of test_df? Is there a way to partition
> test_df based on some column while reading the Hive table? My second question:
> if that Hive table has a primary key declared, does Spark honor the PK in the
> Hive table and partition based on it?
> Thanks
> Vijay
