[Spark Dataframe] How can I write a correct filter so the Hive table partitions are pruned correctly

2017-09-13 Thread Patrick Duin
Hi Spark users, I've got an issue where I wrote a filter on a Hive table using dataframes, and despite setting spark.sql.hive.metastorePartitionPruning=true, no partitions are being pruned. In short, doing this: table.filter("partition=x or partition=y") will result in Spark fetching all
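A minimal sketch of the distinction at play, assuming a Hive table partitioned by a string column (the table and column names below are illustrative, not from the thread): predicates built from simple column comparisons are candidates for push-down to the metastore, whereas opaque filter strings may not be translated into partition predicates, forcing Spark to list every partition.

```scala
// Sketch only: assumes a HiveContext/SparkSession with
// spark.sql.hive.metastorePartitionPruning=true, and a Hive table
// partitioned by a column named "part" (names are hypothetical).
import org.apache.spark.sql.functions.col

val table = spark.table("my_db.my_table")

// Simple equality predicates on the partition column can be pushed
// down to the metastore, so only matching partitions are listed:
val pruned = table.filter(col("part") === "x" || col("part") === "y")

// An isin() predicate is equivalent and also a pruning candidate:
val prunedIsin = table.filter(col("part").isin("x", "y"))
```

Whether a given predicate is actually pushed down depends on the Spark version; inspecting the physical plan with `pruned.explain(true)` shows whether partition filters were applied.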

Re: Create external table with partitions using sqlContext.createExternalTable

2016-06-14 Thread Patrick Duin
> 1" )
> """
> sql(sqltext)
> sql("select count(1) from test.orctype").show
>
> res2: org.apache.spark.sql.DataFrame = [result: string]
> +---+
> |_c0|
> +---+
> |  0|
> +---+
>
> HTH
>
> Dr Mich Talebzadeh

Create external table with partitions using sqlContext.createExternalTable

2016-06-14 Thread Patrick Duin
Hi, I'm trying to use sqlContext.createExternalTable("my_table", "/tmp/location/", "orc") to create tables. This works fine for non-partitioned tables, but I'd like to create a partitioned table. How do I do that? Can I add some information in the options: Map[String, String] parameter?
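A sketch of one common workaround, assuming a HiveContext (the table name, schema, and paths below are illustrative): since this createExternalTable signature takes no partition specification, the table can instead be created with plain Hive DDL and its partitions registered explicitly.

```scala
// Sketch only: assumes a HiveContext and illustrative names/paths.
// Create the partitioned external table with Hive DDL instead of
// sqlContext.createExternalTable:
sqlContext.sql("""
  CREATE EXTERNAL TABLE my_table (id INT, name STRING)
  PARTITIONED BY (dt STRING)
  STORED AS ORC
  LOCATION '/tmp/location/'
""")

// Partitions are not discovered automatically; add them one by one...
sqlContext.sql(
  "ALTER TABLE my_table ADD PARTITION (dt='2016-06-14') " +
  "LOCATION '/tmp/location/dt=2016-06-14'")

// ...or discover all directories matching the partition layout in bulk:
sqlContext.sql("MSCK REPAIR TABLE my_table")
```

MSCK REPAIR TABLE only finds partitions whose directories follow the `key=value` naming convention under the table location; otherwise each partition has to be added with ALTER TABLE.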