Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-09 Thread James Yu
Sounds great, thanks! On Thu, Oct 9, 2014 at 2:22 PM, Michael Armbrust wrote: > Yes, the foreign sources work is only about exposing a stable set of APIs > for external libraries to link against (to avoid the Spark assembly > becoming a dependency mess). The code path these APIs use will be t

Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-09 Thread Michael Armbrust
Yes, the foreign sources work is only about exposing a stable set of APIs for external libraries to link against (to avoid the Spark assembly becoming a dependency mess). The code path these APIs use will be the same as that for data sources included in the core Spark SQL library. Michael On Thu,

Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-09 Thread James Yu
Performance-wise, will foreign data formats be supported on par with native ones? Thanks, James On Wed, Oct 8, 2014 at 11:03 PM, Cheng Lian wrote: > The foreign data source API PR also matters here > https://www.github.com/apache/spark/pull/2475 > > Foreign data sources like ORC can be added more easily

Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-08 Thread Cheng Lian
The foreign data source API PR also matters here: https://www.github.com/apache/spark/pull/2475 Foreign data sources like ORC can be added more easily and systematically after this PR is merged. On 10/9/14 8:22 AM, James Yu wrote: Thanks Mark! I will keep an eye on it. @Evan, I saw people use bo

Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-08 Thread James Yu
Thanks Mark! I will keep an eye on it. @Evan, I have seen people use both formats, so I would really like Spark to support ORCFile. On Wed, Oct 8, 2014 at 11:12 AM, Mark Hamstra wrote: > https://github.com/apache/spark/pull/2576 > > > > On Wed, Oct 8, 2014 at 11:01 AM, Evan Chan > wrote: > >> James, >>

Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-08 Thread Mark Hamstra
https://github.com/apache/spark/pull/2576 On Wed, Oct 8, 2014 at 11:01 AM, Evan Chan wrote: > James, > > Michael at the meetup last night said there was some development > activity around ORCFiles. > > I'm curious though, what are the pros and cons of ORCFiles vs Parquet? > > On Wed, Oct 8, 20

Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-08 Thread Evan Chan
James, Michael at the meetup last night said there was some development activity around ORCFiles. I'm curious though, what are the pros and cons of ORCFiles vs Parquet? On Wed, Oct 8, 2014 at 10:03 AM, James Yu wrote: > I didn't see anyone ask this question before, but I was wondering if anyone

will/when Spark/SparkSQL will support ORCFile format

2014-10-08 Thread James Yu
I didn't see anyone ask this question before, but I was wondering if anyone knows whether Spark/SparkSQL will support the ORCFile format soon? ORCFile is getting more and more popular in the Hive world. Thanks, James