Hi, I had to use Pig for some preprocessing and to generate Parquet files for Spark to consume.
However, due to a limitation in Pig, the column names in the generated schema carry Pig's alias prefix, e.g. sorted::id, sorted::cre_ts, ...

I tried to put the schema inside CREATE EXTERNAL TABLE, e.g.

    create external table pmt (
      sorted::id bigint
    )
    stored as parquet
    location '...'

Obviously it didn't work. I also tried removing the sorted:: prefix, but the resulting rows contain only nulls.

Any idea how to create a table in HiveContext from these Parquet files?

Thanks,
Jianshi

--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/