Re: Hive Iceberg writes

2020-09-01 Thread Peter Vary
Uploaded a working implementation for unpartitioned tables.
Those who are interested can take a look here: 
https://github.com/apache/iceberg/pull/1407 


> On Aug 31, 2020, at 16:34, Peter Vary wrote:
> 
> Thanks everyone for the quick answers.
> I will post a WIP patch as Adrian suggested in the next few days.
> 
> Thanks,
> Peter
> 
>> On Aug 27, 2020, at 19:35, RD <rdsr...@gmail.com> wrote:
>> 
>> Our stance has been similar at LinkedIn. Hive writes are not a priority for 
>> us as we plan to move more and more of our workloads on Hive to Spark SQL.
>> 
>> -R
>> 
>> On Thu, Aug 27, 2020 at 10:18 AM Edgar Rodriguez wrote:
>> Hi folks,
>> 
>> We have not started work on this either, but we've discussed internally 
>> whether to support Hive writes. Our first priority right now is getting 
>> Hive reads into production, so that we have read compatibility with our 
>> existing Hive clients. We'd be interested in this; however, at Airbnb we're 
>> moving to Spark, so writes in Hive most likely won't be at the top of our 
>> list.
>> 
>> Thanks!
>> 
>> Cheers,
>> 
>> On Thu, Aug 27, 2020 at 12:53 AM Mass Dosage wrote:
>> We're definitely interested in this too but haven't started work on it yet. 
>> It has been discussed at our community syncs as something quite a few people 
>> are interested in, so if nobody else responds, a good starting point would 
>> probably be an early WIP PR that everyone can follow and contribute to.
>> 
>> Thanks,
>> 
>> Adrian
>> 
>> On Wed, 26 Aug 2020 at 17:35, Ryan Blue wrote:
>> I think Edgar and Adrian, who have been contributing support for ORC and 
>> Hive, are interested in this as well.
>> 
>> On Wed, Aug 26, 2020 at 9:22 AM Peter Vary wrote:
>> Hi Team,
>> 
>> We are thinking about implementing HiveOutputFormat, so writes through Hive 
>> can work as well.
>> Is anybody working on this? Do you know of any ongoing effort related to 
>> Hive writes?
>> Asking because we would like to prevent duplicate effort.
>> Also, if anyone has some good pointers for an Iceberg newbie to get 
>> started, that would be good.
>> 
>> Thanks,
>> Peter 
>> 
>> 
>> 
>> -- 
>> Ryan Blue
>> Software Engineer
>> Netflix
>> 
>> 
>> -- 
>> Edgar R
> 



Re: Hive Iceberg writes

2020-08-31 Thread Peter Vary
Thanks everyone for the quick answers.
I will post a WIP patch as Adrian suggested in the next few days.

Thanks,
Peter

> On Aug 27, 2020, at 19:35, RD wrote:
> 
> Our stance has been similar at LinkedIn. Hive writes are not a priority for 
> us as we plan to move more and more of our workloads on Hive to Spark SQL.
> 
> -R
> 
> On Thu, Aug 27, 2020 at 10:18 AM Edgar Rodriguez wrote:
> Hi folks,
> 
> We have not started work on this either, but we've discussed internally 
> whether to support Hive writes. Our first priority right now is getting 
> Hive reads into production, so that we have read compatibility with our 
> existing Hive clients. We'd be interested in this; however, at Airbnb we're 
> moving to Spark, so writes in Hive most likely won't be at the top of our 
> list.
> 
> Thanks!
> 
> Cheers,
> 
> On Thu, Aug 27, 2020 at 12:53 AM Mass Dosage wrote:
> We're definitely interested in this too but haven't started work on it yet. 
> It has been discussed at our community syncs as something quite a few people 
> are interested in, so if nobody else responds, a good starting point would 
> probably be an early WIP PR that everyone can follow and contribute to.
> 
> Thanks,
> 
> Adrian
> 
> On Wed, 26 Aug 2020 at 17:35, Ryan Blue wrote:
> I think Edgar and Adrian, who have been contributing support for ORC and 
> Hive, are interested in this as well.
> 
> On Wed, Aug 26, 2020 at 9:22 AM Peter Vary wrote:
> Hi Team,
> 
> We are thinking about implementing HiveOutputFormat, so writes through Hive 
> can work as well.
> Is anybody working on this? Do you know of any ongoing effort related to 
> Hive writes?
> Asking because we would like to prevent duplicate effort.
> Also, if anyone has some good pointers for an Iceberg newbie to get 
> started, that would be good.
> 
> Thanks,
> Peter 
> 
> 
> 
> -- 
> Ryan Blue
> Software Engineer
> Netflix
> 
> 
> -- 
> Edgar R



Re: Hive Iceberg writes

2020-08-27 Thread RD
Our stance has been similar at LinkedIn. Hive writes are not a priority for
us as we plan to move more and more of our workloads on Hive to Spark SQL.

-R

On Thu, Aug 27, 2020 at 10:18 AM Edgar Rodriguez wrote:

> Hi folks,
>
> We have not started work on this either, but we've discussed internally
> whether to support Hive writes. Our first priority right now is getting
> Hive reads into production, so that we have read compatibility with our
> existing Hive clients. We'd be interested in this; however, at Airbnb we're
> moving to Spark, so writes in Hive most likely won't be at the top of our
> list.
>
> Thanks!
>
> Cheers,
>
> On Thu, Aug 27, 2020 at 12:53 AM Mass Dosage wrote:
>
>> We're definitely interested in this too but haven't started work on it
>> yet. It has been discussed at our community syncs as something quite a few
>> people are interested in, so if nobody else responds, a good starting point
>> would probably be an early WIP PR that everyone can follow and contribute
>> to.
>>
>> Thanks,
>>
>> Adrian
>>
>> On Wed, 26 Aug 2020 at 17:35, Ryan Blue wrote:
>>
>>> I think Edgar and Adrian, who have been contributing support for ORC and
>>> Hive, are interested in this as well.
>>>
>>> On Wed, Aug 26, 2020 at 9:22 AM Peter Vary wrote:
>>>
>>>> Hi Team,
>>>>
>>>> We are thinking about implementing HiveOutputFormat, so writes through
>>>> Hive can work as well.
>>>> Is anybody working on this? Do you know of any ongoing effort related to
>>>> Hive writes?
>>>> Asking because we would like to prevent duplicate effort.
>>>> Also, if anyone has some good pointers for an Iceberg newbie to get
>>>> started, that would be good.
>>>>
>>>> Thanks,
>>>> Peter
>>>>
>>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>
> --
> Edgar R
>


Re: Hive Iceberg writes

2020-08-27 Thread Edgar Rodriguez
Hi folks,

We have not started work on this either, but we've discussed internally
whether to support Hive writes. Our first priority right now is getting
Hive reads into production, so that we have read compatibility with our
existing Hive clients. We'd be interested in this; however, at Airbnb we're
moving to Spark, so writes in Hive most likely won't be at the top of our
list.

Thanks!

Cheers,

On Thu, Aug 27, 2020 at 12:53 AM Mass Dosage wrote:

> We're definitely interested in this too but haven't started work on it
> yet. It has been discussed at our community syncs as something quite a few
> people are interested in, so if nobody else responds, a good starting point
> would probably be an early WIP PR that everyone can follow and contribute
> to.
>
> Thanks,
>
> Adrian
>
> On Wed, 26 Aug 2020 at 17:35, Ryan Blue wrote:
>
>> I think Edgar and Adrian, who have been contributing support for ORC and
>> Hive, are interested in this as well.
>>
>> On Wed, Aug 26, 2020 at 9:22 AM Peter Vary wrote:
>>
>>> Hi Team,
>>>
>>> We are thinking about implementing HiveOutputFormat, so writes through
>>> Hive can work as well.
>>> Is anybody working on this? Do you know of any ongoing effort related to
>>> Hive writes?
>>> Asking because we would like to prevent duplicate effort.
>>> Also, if anyone has some good pointers for an Iceberg newbie to get
>>> started, that would be good.
>>>
>>> Thanks,
>>> Peter
>>>
>>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>

-- 
Edgar R


Re: Hive Iceberg writes

2020-08-27 Thread Mass Dosage
We're definitely interested in this too but haven't started work on it yet.
It has been discussed at our community syncs as something quite a few
people are interested in, so if nobody else responds, a good starting point
would probably be an early WIP PR that everyone can follow and contribute
to.

Thanks,

Adrian

On Wed, 26 Aug 2020 at 17:35, Ryan Blue wrote:

> I think Edgar and Adrian, who have been contributing support for ORC and
> Hive, are interested in this as well.
>
> On Wed, Aug 26, 2020 at 9:22 AM Peter Vary wrote:
>
>> Hi Team,
>>
>> We are thinking about implementing HiveOutputFormat, so writes through
>> Hive can work as well.
>> Is anybody working on this? Do you know of any ongoing effort related to
>> Hive writes?
>> Asking because we would like to prevent duplicate effort.
>> Also, if anyone has some good pointers for an Iceberg newbie to get
>> started, that would be good.
>>
>> Thanks,
>> Peter
>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


Re: Hive Iceberg writes

2020-08-26 Thread Ryan Blue
I think Edgar and Adrian, who have been contributing support for ORC and
Hive, are interested in this as well.

On Wed, Aug 26, 2020 at 9:22 AM Peter Vary wrote:

> Hi Team,
>
> We are thinking about implementing HiveOutputFormat, so writes through
> Hive can work as well.
> Is anybody working on this? Do you know of any ongoing effort related to
> Hive writes?
> Asking because we would like to prevent duplicate effort.
> Also, if anyone has some good pointers for an Iceberg newbie to get
> started, that would be good.
>
> Thanks,
> Peter
>
>

-- 
Ryan Blue
Software Engineer
Netflix


Hive Iceberg writes

2020-08-26 Thread Peter Vary
Hi Team,

We are thinking about implementing HiveOutputFormat, so writes through Hive can 
work as well.
Is anybody working on this? Do you know of any ongoing effort related to Hive 
writes?
Asking because we would like to prevent duplicate effort.
Also, if anyone has some good pointers for an Iceberg newbie to get started, 
that would be good.

Thanks,
Peter