Hello, Recently the Optimized Row Columnar (ORC) file format was spun off from Hive and became a top-level Apache project: https://orc.apache.org/
It is similar to Parquet in the sense that it uses a column-major format, but ORC has a more elaborate type system and stores basic column statistics at the file, stripe, and row-group level. I'd be interested in extending Beam with ORC support if others would find it helpful too. What do you think? - Tibor