Hi, we made the change because the partition discovery logic was too
flexible, and that flexibility introduced problems that were very confusing
to users. To make your case work, we have introduced a new data source
option called basePath. You can use:

DataFrame df = hiveContext.read()
    .format("orc")
    .option("basePath", "path/to/table/")
    .load("path/to/table/entity=xyz");

This way, the partition discovery logic will understand that the base path
is path/to/table/, and your DataFrame will have the column "entity".
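
The same option also lets you load more than one partition at a time while
keeping the partition column. A minimal sketch, assuming a second
hypothetical partition entity=abc exists and using the varargs
load(String...) overload of DataFrameReader:

    // Hypothetical layout under the base path:
    //   path/to/table/entity=xyz/...
    //   path/to/table/entity=abc/...   (assumed for illustration)
    DataFrame df = hiveContext.read()
        .format("orc")
        .option("basePath", "path/to/table/")
        .load("path/to/table/entity=xyz", "path/to/table/entity=abc");

    df.printSchema();                   // schema includes the "entity" column
    df.filter("entity = 'xyz'").show(); // only rows from the xyz partition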

You can find the doc at the end of the Partition Discovery section of the
SQL programming guide
(http://spark.apache.org/docs/latest/sql-programming-guide.html#partition-discovery).
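
For contrast, the 1.5-style workaround you describe below, loading the table
root and filtering afterwards, has to discover every partition under the
root first. A minimal sketch of that pattern:

    // Partition discovery lists every entity=... directory under the root
    // before any filter is applied; with thousands of partitions that
    // listing is itself the expensive step.
    DataFrame all = hiveContext.read().format("orc").load("path/to/table/");
    DataFrame xyz = all.filter("entity = 'xyz'");

Pointing load() at the partition directory with basePath set, as above, only
scans the paths you pass in.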

Thanks,

Yin

On Thu, Jan 7, 2016 at 7:34 AM, unk1102 <umesh.ka...@gmail.com> wrote:

> Hi, from Spark 1.6 onwards, as per this doc
> (http://spark.apache.org/docs/latest/sql-programming-guide.html#partition-discovery),
> we can't add specific Hive partitions to a DataFrame.
>
> In Spark 1.5 the following used to work, and the resulting DataFrame had
> the entity column:
>
> DataFrame df =
>     hiveContext.read().format("orc").load("path/to/table/entity=xyz");
>
> But in Spark 1.6 the above does not work, and I have to give the base path
> like the following, but then the DataFrame does not contain the entity
> column which I want:
>
> DataFrame df = hiveContext.read().format("orc").load("path/to/table/");
>
> How do I load a specific Hive partition into a DataFrame? What was the
> driver behind removing this feature, which I believe was efficient? Now the
> above Spark 1.6 code loads all partitions, and if I filter for specific
> partitions it is not efficient: it hits memory and throws GC errors because
> thousands of partitions get loaded into memory instead of the specific one.
> Please guide.
