Since Iceberg is a cloud-native table format standard for the big-data
ecosystem, I believe supporting multiple languages is the right direction,
so that engines written in different languages can work with the Apache
Iceberg table format.

But I also understand Kyle's point about lacking enough resources
(developers and reviewers) to accomplish this goal. In my mind, Python,
Golang, C++, and Rust can all be regarded as native language support. We
may only need to support a Rust SDK, and then the other languages can wrap
the Rust SDK to access the table format.

Anyway, we will need to wait for the REST catalog to be finished before we
introduce support for another language, because non-JVM languages cannot
access an Iceberg table by invoking the JVM catalog interfaces.

On Tue, Jun 7, 2022 at 4:41 AM Micah Kornfield <emkornfi...@gmail.com>
wrote:

>> There’s also the question of how useful this would be in practice given
>> the complexity of using C++ (or Rust etc) within some of the major
>> frameworks.
>>
>
> One place this would be useful is for Arrow's Dataset API [1].  An
> option the Arrow community might be open to is hosting parts of the code
> there (this is what is done for Apache Parquet C++).  This helps shape some
> of the answers to other questions posed (ORC and Parquet are already in the
> Repo, it provides a Filesystem interface, etc).  The project doesn't
> currently consume Avro, and I think the preferred approach is to make a
> clean room Avro parser.  But I agree this is a non-trivial effort to get
> underway.
>
> Another area to consider is compatibility testing.  I think before a third
> officially supported community library is introduced it would be good to
> have a compatibility framework in place to make sure implementations are
> all interpreting the specification correctly.  If there isn't already an
> effort here, I'd like to start contributing something (I will probably
> have bandwidth sometime in Q3).
>
> Thanks,
> -Micah
>
>
> [1] https://arrow.apache.org/docs/cpp/dataset.html
>
> On Sun, Jun 5, 2022 at 11:07 PM Kyle Bendickson <k...@tabular.io> wrote:
>
>> Hi caneGuy,
>>
>> I personally don’t dislike this idea. I understand the performance
>> benefits.
>>
>> But this would be a huge undertaking for the community. We’d need to
>> ensure we had sufficient developer support for reviews (likely one of the
>> biggest issues), as well as a number of other things, particularly
>> dependencies, package management, etc. We’d also need to scope support down
>> to specific OS / compilers etc.
>>
>> We’d also need to be sure we had adequate developer support from a wide
>> enough range of the community to support the project long term. One issue
>> in open source is that developers will work on something tangential to
>> their project in another repository, but nobody is available to maintain it.
>>
>> There’s also the question of how useful this would be in practice given
>> the complexity of using C++ (or Rust etc) within some of the major
>> frameworks.
>>
>> Again, I’m not opposed to the idea but just trying to be realistic about
>> the demands of such an undertaking. It would need full community support
>> (or at least support from enough community members to be sustainable).
>>
>> If you wanted to make a design doc, the milestones tab in the Iceberg
>> project has some that you might use as reference.
>>
>> *I highly suggest you come to the next community sync and bring this up
>> to the community then.*
>>
>> If you’re not already on the invite list for the monthly community sync,
>> you can get on it by joining the Google group. You’ll receive invites when
>> they go out:
>> https://groups.google.com/g/iceberg-sync
>>
>> Looking forward to seeing you at the next community sync.
>>
>> A design document and/or any prior art would be very helpful as the
>> community sync does discuss many topics (possibly there is existing C++
>> support in StarRocks for Iceberg V1?).
>>
>> Thank you,
>> Kyle Bendickson
>> GitHub: kbendick
>>
>> On Sun, Jun 5, 2022 at 10:44 PM Sam Redai <s...@tabular.io> wrote:
>>
>>> Currently there is no existing effort to develop a C++ package. That
>>> being said I think it would be awesome to have one! If anyone is willing to
>>> start that development effort, I can help with some of the ground work to
>>> kickstart it.
>>>
>>> I would say the first step would be for someone to prepare a high-level
>>> proposal.
>>>
>>> -Sam
>>>
>>> On Sun, Jun 5, 2022 at 11:02 PM 周康 <zhoukang199...@gmail.com> wrote:
>>>
>>>> Hi team,
>>>> I am a dev from the StarRocks community, and we already support the
>>>> Iceberg v1 format.
>>>> We are also planning to support the v2 format. If there were a C++
>>>> package, it would be very convenient for our implementation.
>>>> At the same time, other C++ computing engines could also add v2 format
>>>> support faster.
>>>>
>>>> Are there plans to provide a C++ SDK?
>>>> --
>>>> caneGuy
>>>>
>>> --
>>>
>>> Sam Redai <s...@tabular.io>
>>>
>>> Developer Advocate  |  Tabular <https://tabular.io/>
>>>
>>> c (267) 226-8606
>>>
>>
