Thanks Michael.
On Thursday, October 2, 2014 8:41 PM, Michael Armbrust <mich...@databricks.com> wrote:

We actually leave all the DDL commands up to Hive, so there is no programmatic way to access the things you are looking for.

On Thu, Oct 2, 2014 at 5:17 PM, Banias <calvi...@yahoo.com.invalid> wrote:

> Hi,
>
> Would anybody know how to get the following information from HiveContext, given a Hive table name?
>
> - partition key(s)
> - table directory
> - input/output format
>
> I am new to Spark, and I have a couple of tables created using Parquet data like:
>
> CREATE EXTERNAL TABLE parquet_table (
>   COL1 string,
>   COL2 string,
>   COL3 string
> )
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
>   INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
>   OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
> LOCATION '/user/foo/parquet_src';
>
> and some of the tables have partitions. In my Spark Java code, I am able to run queries using the HiveContext like:
>
> SparkConf sparkConf = new SparkConf().setAppName("example");
> JavaSparkContext ctx = new JavaSparkContext(sparkConf);
> JavaHiveContext hiveCtx = new JavaHiveContext(ctx);
> JavaSchemaRDD rdd = hiveCtx.sql("select * from parquet_table");
>
> Is there a way to get the INPUTFORMAT, OUTPUTFORMAT, LOCATION, and (where applicable) the partition key(s) programmatically through the HiveContext?
>
> The only way I know (pardon my ignorance) is to parse the SchemaRDD returned by hiveCtx.sql("describe extended parquet_table");
>
> If anybody could shed some light on a better way, I would appreciate it. Thanks :)
>
> -BC
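Since the DDL is left to Hive, the "describe extended" parse mentioned above seems to be the practical fallback. As a rough sketch: the "Detailed Table Information" row of that output contains the Thrift toString() of the table's StorageDescriptor, with fields like location:..., inputFormat:..., outputFormat:... separated by commas. The class name DescribeExtendedParser, the extractField helper, and the exact row layout are all my assumptions for illustration, not an official API; the real output format may differ between Hive versions, so verify against your own "describe extended" output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for pulling a named field (e.g. "location",
// "inputFormat") out of the "Detailed Table Information" row returned by
// hiveCtx.sql("describe extended parquet_table"). Assumes the Thrift
// toString() layout, e.g.:
//   Table(tableName:t, sd:StorageDescriptor(location:..., inputFormat:..., ...))
public class DescribeExtendedParser {

    // Field values in the Thrift toString() are terminated by ',' or ')',
    // so capture everything up to the first of those after "field:".
    public static String extractField(String detailedInfo, String field) {
        Matcher m = Pattern.compile(field + ":([^,)]+)").matcher(detailedInfo);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample row, modeled on the CREATE TABLE statement in the question.
        String row = "Table(tableName:parquet_table, "
            + "sd:StorageDescriptor(location:/user/foo/parquet_src, "
            + "inputFormat:parquet.hive.DeprecatedParquetInputFormat, "
            + "outputFormat:parquet.hive.DeprecatedParquetOutputFormat))";
        System.out.println(extractField(row, "location"));
        System.out.println(extractField(row, "inputFormat"));
        System.out.println(extractField(row, "outputFormat"));
    }
}
```

In a Spark job you would feed this the string from the matching row of the describe-extended SchemaRDD rather than a literal; partition keys would need a similar match on the partitionKeys:[FieldSchema(...)] portion of the same row.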