Re: Any plan to support update, delete and others

2019-08-08 Thread Saisai Shao
Got it. Thanks a lot for the reply.

Best regards,
Saisai

Ryan Blue wrote on Fri, Aug 9, 2019 at 6:36 AM:
> We've actually been doing all of our API work in upstream Spark instead of
> adding APIs to Iceberg for row-level data manipulation. That's why I'm
> involved in the DataSourceV2 work.
> I think for De…

Re: Two newbie question about Iceberg

2019-08-08 Thread Saisai Shao
I'm still looking into this, to figure out a way to add the HIVE_LOCKS table
on the Spark side. Anyway, I will create an issue first to track this.

Best regards,
Saisai

Ryan Blue wrote on Fri, Aug 9, 2019 at 4:58 AM:
> Any ideas on how to fix this? Can we create the HIVE_LOCKS table
> automatically if it is missing?…
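For anyone following along, here is a minimal sketch of what creating the
table by hand would amount to, assuming the default embedded Derby metastore
and hypothetical connection settings. The column list is abridged; Hive's own
hive-txn-schema-<version>.<db>.sql scripts are the authoritative definition.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class CreateHiveLocksTable {
      public static void main(String[] args) throws Exception {
        // Hypothetical coordinates; the real ones come from
        // javax.jdo.option.ConnectionURL in hive-site.xml.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:derby:metastore_db;create=true");
             Statement stmt = conn.createStatement()) {
          // Abridged from Hive's hive-txn-schema-*.sql scripts.
          stmt.execute(
              "CREATE TABLE HIVE_LOCKS ("
              + " HL_LOCK_EXT_ID BIGINT NOT NULL,"
              + " HL_LOCK_INT_ID BIGINT NOT NULL,"
              + " HL_TXNID BIGINT,"
              + " HL_DB VARCHAR(128) NOT NULL,"
              + " HL_TABLE VARCHAR(128),"
              + " HL_PARTITION VARCHAR(767),"
              + " HL_LOCK_STATE CHAR(1) NOT NULL,"
              + " HL_LOCK_TYPE CHAR(1) NOT NULL,"
              + " HL_LAST_HEARTBEAT BIGINT NOT NULL,"
              + " HL_USER VARCHAR(128) NOT NULL,"
              + " HL_HOST VARCHAR(128) NOT NULL,"
              + " PRIMARY KEY (HL_LOCK_EXT_ID, HL_LOCK_INT_ID))");
        }
      }
    }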

Re: Encouraging performance results for Vectorized Iceberg code

2019-08-08 Thread Anjali Norwood
Yes, will do so early next week if not sooner.

Thanks,
Anjali

On Thu, Aug 8, 2019 at 4:45 PM Gautam Kowshik wrote:
> Thanks Anjali and Samarth,
> These look good! Great progress. Can you push your changes to the
> vectorized-read branch please?
> Sent from my iPhone
> On Aug 8, 2019, a…

Re: Encouraging performance results for Vectorized Iceberg code

2019-08-08 Thread Gautam Kowshik
Thanks Anjali and Samarth,

These look good! Great progress. Can you push your changes to the
vectorized-read branch please?

Sent from my iPhone

On Aug 8, 2019, at 11:56 AM, Anjali Norwood wrote:
> Good suggestion Ryan. Added dev@iceberg now.
> Dev: Please see early vectorized Icebe…

Re: Iceberg in Spark 3.0.0

2019-08-08 Thread Edgar Rodriguez
On Thu, Aug 8, 2019 at 3:37 PM Ryan Blue wrote:
> I think it's a great idea to branch and get ready for Spark 3.0.0. Right
> now, I'm focused on getting a release out, but I can review patches for
> Spark 3.0.
> Anyone know if there are nightly builds of Spark 3.0 to test with?

Seems like th…

Re: Iceberg in Spark 3.0.0

2019-08-08 Thread Ryan Blue
> One more thing: Spark 3.0.0 has several changes regarding DataSourceV2; it
> would be better to evaluate the changes and do the design by also
> considering the 3.0 changes.

This actually goes the other way. We've been influencing the design of
DataSourceV2 based on what we need for Iceberg. I'm track…
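For readers who haven't followed the DataSourceV2 work: the read-side
contract that a source like Iceberg plugs into, as it landed in Spark 3.0,
looks roughly like the sketch below. Interface and package names are from
org.apache.spark.sql.connector.* in Spark 3.0 and differ from the 2.x
previews; scan planning is left out.

    import java.util.Collections;
    import java.util.Set;

    import org.apache.spark.sql.connector.catalog.SupportsRead;
    import org.apache.spark.sql.connector.catalog.TableCapability;
    import org.apache.spark.sql.connector.read.ScanBuilder;
    import org.apache.spark.sql.types.StructType;
    import org.apache.spark.sql.util.CaseInsensitiveStringMap;

    // A minimal batch-readable table: Spark asks the source for a Table,
    // then calls newScanBuilder() to plan a read against it.
    public class ExampleTable implements SupportsRead {
      private final StructType schema;

      public ExampleTable(StructType schema) {
        this.schema = schema;
      }

      @Override
      public String name() {
        return "example";
      }

      @Override
      public StructType schema() {
        return schema;
      }

      @Override
      public Set<TableCapability> capabilities() {
        return Collections.singleton(TableCapability.BATCH_READ);
      }

      @Override
      public ScanBuilder newScanBuilder(CaseInsensitiveStringMap options) {
        throw new UnsupportedOperationException("scan planning left out of this sketch");
      }
    }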

Re: Iceberg in Spark 3.0.0

2019-08-08 Thread Ryan Blue
I think it's a great idea to branch and get ready for Spark 3.0.0. Right now,
I'm focused on getting a release out, but I can review patches for Spark 3.0.

Anyone know if there are nightly builds of Spark 3.0 to test with?

On Wed, Aug 7, 2019 at 7:34 PM Saisai Shao wrote:
> IMHO I agree that we…

Re: Any plan to support update, delete and others

2019-08-08 Thread Ryan Blue
We've actually been doing all of our API work in upstream Spark instead of
adding APIs to Iceberg for row-level data manipulation. That's why I'm
involved in the DataSourceV2 work.

I think for Delta, this is probably an effort to get some features out
earlier. I think that's easier for Delta becau…
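Concretely, the row-level work discussed here targets statements of the shape
below. Table names are hypothetical, and whether each statement actually runs
depends on the Spark and Iceberg versions in play: DELETE FROM for v2 sources
landed in Spark 3.0, while MERGE INTO support came later.

    import org.apache.spark.sql.SparkSession;

    public class RowLevelSqlExample {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("row-level-sql")
            .getOrCreate();

        // Delete matching rows in place, with no hand-written rewrite job.
        spark.sql("DELETE FROM db.events WHERE event_date < DATE '2019-01-01'");

        // Upsert: update matched rows, insert the rest.
        spark.sql("MERGE INTO db.events t USING db.updates s ON t.id = s.id "
            + "WHEN MATCHED THEN UPDATE SET t.payload = s.payload "
            + "WHEN NOT MATCHED THEN INSERT *");
      }
    }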

Re: Two newbie question about Iceberg

2019-08-08 Thread Ryan Blue
Any ideas on how to fix this? Can we create the HIVE_LOCKS table
automatically if it is missing?

On Wed, Aug 7, 2019 at 7:13 PM Saisai Shao wrote:
> Thanks guys for your reply.
> I didn't do anything special, I don't even have a configured Hive. I just
> simply put the iceberg (assembly) jar i…
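Background on why the table matters here: Iceberg's Hive catalog takes an
exclusive metastore lock around each metadata pointer swap, and the metastore
serves that lock by inserting rows into HIVE_LOCKS, so a metastore database
created without Hive's transaction schema fails at that call. A rough sketch
of the acquisition, using metastore client method names from Hive 2.x
(details vary by version):

    import java.util.Collections;

    import org.apache.hadoop.hive.metastore.IMetaStoreClient;
    import org.apache.hadoop.hive.metastore.api.LockComponent;
    import org.apache.hadoop.hive.metastore.api.LockLevel;
    import org.apache.hadoop.hive.metastore.api.LockRequest;
    import org.apache.hadoop.hive.metastore.api.LockResponse;
    import org.apache.hadoop.hive.metastore.api.LockState;
    import org.apache.hadoop.hive.metastore.api.LockType;

    public class MetastoreLockSketch {
      static long acquireLock(IMetaStoreClient client, String db, String table)
          throws Exception {
        LockComponent component =
            new LockComponent(LockType.EXCLUSIVE, LockLevel.TABLE, db);
        component.setTablename(table);
        LockRequest request = new LockRequest(
            Collections.singletonList(component),
            System.getProperty("user.name"), "localhost");
        // The metastore records this request in HIVE_LOCKS.
        LockResponse response = client.lock(request);
        if (response.getState() != LockState.ACQUIRED) {
          throw new IllegalStateException(
              "Could not acquire lock: " + response.getState());
        }
        return response.getLockid();
      }
    }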

Re: Encouraging performance results for Vectorized Iceberg code

2019-08-08 Thread Anjali Norwood
Good suggestion, Ryan. Added dev@iceberg now.

Dev: Please see the early vectorized Iceberg performance results a couple of
emails down. This is WIP.

Thanks,
Anjali

On Thu, Aug 8, 2019 at 10:39 AM Ryan Blue wrote:
> Hi everyone,
> Is it possible to copy the Iceberg dev list when sending these emails?…

Re: [DISCUSS] Implementation strategies for supporting Iceberg tables in Hive

2019-08-08 Thread Ryan Blue
> With an Iceberg raw store, I suspect that you might not need a storage
> handler and could go straight to an input/output format. You would probably
> need an input and output format for each of the storage formats:
> Iceberg{Orc,Parquet,Avro}{Input,Output}Format.

I don't think that would work because…
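To make the alternatives concrete: whatever shape it takes, one generic class
or one per file format, an Iceberg input format for Hive has to satisfy the
two-method mapred contract sketched below. The class name and value type are
hypothetical; the point is that split planning would come from Iceberg's
snapshot metadata rather than directory listing, while the record readers are
where per-format differences would surface.

    import java.io.IOException;

    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapred.InputFormat;
    import org.apache.hadoop.mapred.InputSplit;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RecordReader;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.iceberg.data.Record;

    public class IcebergInputFormatSketch
        implements InputFormat<NullWritable, Record> {
      @Override
      public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
        // Plan splits from the table's snapshot metadata (manifests), one
        // split per planned file scan task; left out of this sketch.
        throw new UnsupportedOperationException("sketch only");
      }

      @Override
      public RecordReader<NullWritable, Record> getRecordReader(
          InputSplit split, JobConf job, Reporter reporter) throws IOException {
        // Open the Parquet/ORC/Avro file named by the split and project it to
        // the table schema; this is where per-format classes would differ.
        throw new UnsupportedOperationException("sketch only");
      }
    }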