Hi,

We are using Avro files for all our logging, and they contain long timestamp_millis values.
When they are converted to Parquet using CTAS, we either need a hint (or something similar) to ensure that these columns become Timestamp values in Parquet, or we need to write a complex SELECT with casting. I'm wondering if there are any shortcuts/tricks available.

Separately, we have dictionary encoding turned on, and strangely the BigInt column (the Parquet field type selected for the long timestamp) uses dictionary encoding:

Nov 21, 2015 9:59:16 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 8,525B for [occurred_at] INT64: 6,615 values, 9,379B raw, 8,479B comp, 1 pages, encodings: [*PLAIN_DICTIONARY*, RLE, BIT_PACKED], dic { 2,812 entries, 22,496B raw, 2,812B comp}

Can someone shed some light on these two issues?

Regards,
-Stefan
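For reference, the cast-based CTAS I'd like to avoid looks roughly like this (table and column names are illustrative; TO_TIMESTAMP takes an epoch in seconds, hence the division by 1000 for our millisecond values):

    CREATE TABLE events_parquet AS
    SELECT TO_TIMESTAMP(occurred_at / 1000) AS occurred_at
    FROM events_avro;

This works, but doing it for every timestamp column in every table is what I was hoping a hint or shortcut could replace.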