Hi Hao, you might want to try parquet-cli, which uses Parquet's Avro support. That should be able to do what you're looking for.
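For context on why dumping c by itself prints nothing: c and c_tuple are groups, so they hold no values of their own, and the data lives only in the leaf columns d, e and f together with their repetition and definition levels. A reader that returns whole records has to reassemble them from those levels. Here is a rough sketch of that idea in plain Python. It is not how parquet-cli or parquet-avro are implemented; the function name `assemble`, the triple layout and the sample values are made up for illustration, and it only handles the R:1 D:3 shape from the schema quoted below.

```python
# Simplified, hypothetical illustration of record assembly for the schema
#   c (optional group) -> c_tuple (repeated group) -> d, e, f (optional leaves)
# Each leaf column is modeled as a list of (repetition, definition, value) triples.
# Assumed meaning of the levels for this schema:
#   repetition 0 = first entry of a new record, 1 = another c_tuple element
#   definition 0 = c is null, 1 = c present but c_tuple empty,
#              2 = element present but this leaf is null, 3 = leaf value present

MAX_DEF = 3  # definition level at which a leaf actually carries a value


def assemble(columns):
    """Rebuild nested records from aligned leaf columns (same length each)."""
    names = list(columns)
    records = []
    for row in zip(*(columns[name] for name in names)):
        rep, definition, _ = row[0]  # structural levels agree across siblings
        if rep == 0:
            if definition == 0:
                records.append({"c": None})  # whole group is null
                continue
            records.append({"c": {"c_tuple": []}})
            if definition == 1:
                continue  # group present, but no elements
        element = {
            name: (value if d == MAX_DEF else None)
            for name, (_, d, value) in zip(names, row)
        }
        records[-1]["c"]["c_tuple"].append(element)
    return records


# Made-up sample data: four records' worth of entries per leaf column.
columns = {
    "d": [(0, 3, 1), (1, 3, 2), (0, 0, None), (0, 1, None), (0, 2, None)],
    "e": [(0, 3, True), (1, 2, None), (0, 0, None), (0, 1, None), (0, 3, False)],
    "f": [(0, 3, "a"), (1, 3, "b"), (0, 0, None), (0, 1, None), (0, 2, None)],
}

records = assemble(columns)
for record in records:
    print(record)
```

The point is only that the nested value of c is derived from the three leaf columns at read time, which is the kind of work a record-oriented reader does for you.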
On Thu, Mar 1, 2018 at 2:02 PM, Hao Luo <h...@twitter.com.invalid> wrote:

> Hi,
> I have a parquet file with repetitive nested fields. The schema looks
> like:
>
> c: OPTIONAL F:1
> .c_tuple: REPEATED F:3
> ..d: OPTIONAL INT64 R:1 D:3
> ..e: OPTIONAL BOOLEAN R:1 D:3
> ..f: OPTIONAL BINARY O:UTF8 R:1 D:3
>
> When I try to dump the column c using parquet-tools, it prints nothing.
> Dumping all columns will print out each of individual d, e and f column.
>
> I am wondering does parquet supports reading the complex type columns
> without breaking it down into primitive columns?
>
> Thanks.
> Hao Luo
>

--
Ryan Blue
Software Engineer
Netflix