Re: [PROPOSAL] ORC support

2017-04-03 Thread Jean-Baptiste Onofré
Thanks Tibor. Ready to help! (I also started the ParquetIO). Regards JB On 04/03/2017 02:11 PM, Tibor Kiss wrote: Thanks for your replies, I've created https://issues.apache.org/jira/browse/BEAM-1861 to track this effort. On Sun, Apr 2, 2017 at 7:40 AM, Jean-Baptiste Onofré wrote: +1 By

Re: [PROPOSAL] ORC support

2017-04-03 Thread Tibor Kiss
Thanks for your replies, I've created https://issues.apache.org/jira/browse/BEAM-1861 to track this effort. On Sun, Apr 2, 2017 at 7:40 AM, Jean-Baptiste Onofré wrote: > +1 > > By the way, around the same topic, I'm working on Apache CarbonData > support (http://carbondata.apache.org/). > > Rega

Re: [PROPOSAL] ORC support

2017-04-01 Thread Jean-Baptiste Onofré
+1 By the way, around the same topic, I'm working on Apache CarbonData support (http://carbondata.apache.org/). Regards JB On 04/01/2017 05:31 PM, Tibor Kiss wrote: Hello, Recently the Optimized Row Columnar (ORC) file format was spun off from Hive and became a top-level Apache project: htt

Re: [PROPOSAL] ORC support

2017-04-01 Thread Ismaël Mejía
+1 From my previous work experience, ORC performs better than Parquet in certain cases and really deserves to be supported. On Sat, Apr 1, 2017 at 5:58 PM, Ted Yu wrote: > +1 > >> On Apr 1, 2017, at 8:31 AM, Tibor Kiss wrote: >> >> Hello, >> >> Recently the Optimized Row Columnar (ORC) file fo

Re: [PROPOSAL] ORC support

2017-04-01 Thread Ted Yu
+1 > On Apr 1, 2017, at 8:31 AM, Tibor Kiss wrote: > > Hello, > > Recently the Optimized Row Columnar (ORC) file format was spun off from Hive > and became a top-level Apache project: https://orc.apache.org/ > > It is similar to Parquet in the sense that it uses a column-major format but > ORC has

[PROPOSAL] ORC support

2017-04-01 Thread Tibor Kiss
Hello, Recently the Optimized Row Columnar (ORC) file format was spun off from Hive and became a top-level Apache project: https://orc.apache.org/ It is similar to Parquet in the sense that it uses a column-major format, but ORC has a more elaborate type system and stores basic statistics about each r