Re: Query data in subdirectories in Hive Partitions using Spark SQL

2017-02-18 Thread Jon Gregg
Spark has partition discovery if your data is laid out in a Parquet-friendly directory structure: http://spark.apache.org/docs/latest/sql-programming-guide.html#partition-discovery. You can also use wildcards to read subdirectories (I'm using Spark 1.6 here) >> data2 =
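
The code in the message above is cut off in this archive view, so the following is only a rough sketch of the wildcard idea, not the author's snippet. It assumes a hypothetical layout such as /data/events/date=.../hour=.../*.parquet; the helper names glob_for_depth and read_nested_parquet, the path, and the depth are all illustrative. The basePath option (available since Spark 1.6) is used so that partition columns encoded in the directory names are still discovered when reading through a glob.

```python
def glob_for_depth(base, depth):
    """Build a glob matching files `depth` directory levels below `base`.

    Hypothetical helper: depth=0 matches files directly in `base`,
    depth=2 matches base/<dir>/<dir>/<file>.
    """
    return "/".join([base.rstrip("/")] + ["*"] * (depth + 1))


def read_nested_parquet(sql_context, base, depth):
    """Read Parquet files nested `depth` levels under `base`.

    `sql_context` is assumed to be a SQLContext (Spark 1.6) or a
    SparkSession (Spark 2.x); both expose `.read`.
    """
    return (sql_context.read
            # basePath tells partition discovery where the table root is,
            # so directory names like date=2017-02-15 become columns.
            .option("basePath", base)
            .parquet(glob_for_depth(base, depth)))
```

With the assumed layout, read_nested_parquet(sqlContext, "/data/events", 2) would read the glob /data/events/*/*/*. Whether this matches a given table depends on how the subdirectories are actually named and nested.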

Re: Query data in subdirectories in Hive Partitions using Spark SQL

2017-02-17 Thread Yan Facai
Hi, Abdelfatah, How do you read these files? spark.read.parquet or spark.sql? Could you show some code? On Wed, Feb 15, 2017 at 8:47 PM, Ahmed Kamal Abdelfatah <ahmed.abdelfa...@careem.com> wrote: > Hi folks, > > How can I force spark sql to recursively get data stored in parquet format >

Query data in subdirectories in Hive Partitions using Spark SQL

2017-02-15 Thread Ahmed Kamal Abdelfatah
Hi folks, How can I force Spark SQL to recursively read data stored in Parquet format from subdirectories? In Hive, I could achieve this by setting a few Hive configs. set hive.input.dir.recursive=true; set hive.mapred.supports.subdirectories=true; set hive.supports.subdirectories=true; set