From my understanding, the most abstracted approach to implementing a
multi-language SDK is to build the other language SDKs (Python, Ruby, Go,
etc.) on top of the Iceberg-Rust SDK. That way, we can focus our
community resources on the Rust native kernel, and all of the other
language bindings will share the same progress and support.

This may place higher demands on iceberg-rust:
1. The community needs more people who understand both Iceberg and Rust;
2. The API of iceberg-rust needs to be more abstract and universal, so that
we can more easily export it as a native API and use it in other language
bindings.

Anyway, I personally think that abstracting away the multi-language problem
and focusing on iceberg-rust is the right direction. We can see lancedb[1]
doing the same: it is very focused on its Rust kernel, on top of which it
has successfully built a multi-language ecosystem for Python and
Node.js. In the future, I think it would also be very reasonable and easy to
expand to a big data ecosystem like Java.

[1] https://github.com/lancedb/lancedb

On Thu, Aug 8, 2024 at 7:09 AM Chris Atkins <chri...@buildkite.com.invalid>
wrote:

> > Do you know how big the Ruby data community is? I think the most
> important part is that it gets some traction and will continue to be
> maintained.
>
> It's a great question, Fokko! I'd say that the data community in Ruby is
> nascent, but definitely exists. There are some prolific folks like Andrew
> Kane (the fellow who created pgvector) https://ankane.org/opensource who
> have released a lot of data-related gems, and there is a healthy set of
> bindings for Apache Arrow, Avro and friends.
>
> In my experience, data technology in Ruby often shows up for very
> particular use-cases within a larger Ruby on Rails application. An
> example would be using the DuckDB, Arrow or Polars binding gems to do data
> export or reverse-ETL with Parquet files in object storage; where parts of
> the process work with the core domain objects. Another use-case is for
> user or administrator-facing reporting features. My own use-case is wanting
> to (eventually) perform some reads/writes on some Iceberg tables directly
> from our Rails monolith without needing to call out to Trino.
>
> Thanks,
>
> Chris Atkins
> Principal Engineer
> Buildkite
>
> On Tue, 6 Aug 2024 at 17:14, Fokko Driesprong <fo...@apache.org> wrote:
>
>> Hi Chris,
>>
>> Thanks for raising this. Do you know how big the Ruby data community is?
>> I think the most important part is that it gets some traction and will
>> continue to be maintained.
>>
>> I fully agree that building on top of iceberg-rust makes a lot of sense,
>> since with PyIceberg we're also running into limitations when it comes to
>> performance and parallelism.
>>
>> Kind regards,
>> Fokko
>>
>> On Mon, 5 Aug 2024 at 14:14, Xuanwo <xua...@apache.org> wrote:
>>
>>> Hi, Chris
>>>
>>> I love this idea. One of the main reasons I started working on
>>> iceberg-rust is the potential that a Rust-powered Iceberg core can
>>> offer.
>>>
>>> I'm not an experienced ruby developer, but I'm willing to help with some
>>> CI setup or docs since I have some experience in the opendal community with
>>> ruby bindings.
>>>
>>> On Mon, Aug 5, 2024, at 20:03, Renjie Liu wrote:
>>>
>>> Hi, Chris:
>>>
>>> Thanks for raising this. Generally I'm +1 on building Ruby bindings on
>>> top of the Rust implementation, which would help introduce Iceberg into the
>>> Ruby ecosystem.
>>>
>>> On Mon, Aug 5, 2024 at 7:30 PM Chris Atkins
>>> <chri...@buildkite.com.invalid> wrote:
>>>
>>> Hi there,
>>>
>>> I'm following up on a discussion
>>> <https://apache-iceberg.slack.com/archives/C05HTENMJG4/p1722750831522969>
>>> from the #rust channel on the Iceberg community slack, so starting a thread
>>> here too.
>>>
>>> After seeing Xuanwo's and Song's recent proposals around leveraging
>>> iceberg-rust to power parts of PyIceberg, I was thinking it could be
>>> valuable to follow a similar pattern to build out Ruby bindings for
>>> Iceberg. Being able to stand on the shoulders of iceberg-rust could really
>>> help build out a robust Ruby interface, and also offer some opportunities
>>> for interop with things like datafusion and opendal.
>>>
>>> Recently in the Ruby ecosystem, writing native extensions in Rust has
>>> become more popular, and tools like rb-sys and magnus provide a lot of the
>>> required infrastructure. A good example is ruby-polars, which provides an
>>> interface that is idiomatic Ruby but retains good symmetry with the APIs
>>> exposed by py-polars. I wonder if we could eventually aim for a similar
>>> type of symmetry between PyIceberg and a Ruby gem?
>>>
>>> Is there much interest in this? I've started playing around with some of
>>> the basics, and started out with a plain native Ruby implementation of some
>>> of the basic metadata APIs, but quickly realised that building on
>>> iceberg-rust could be more productive than writing it all from scratch.
>>>
>>> *References*
>>>
>>> https://lists.apache.org/thread/5570vbdkrk7mdswt4jqy45lv7y58pz4b
>>> https://lists.apache.org/thread/33c0nkc3k6646lvro1lv22pvhwlp50ss
>>> https://github.com/apache/iceberg-rust/pull/518
>>>
>>> *Prior Art in Ruby*
>>>
>>> https://github.com/matsadler/magnus
>>> https://github.com/oxidize-rb/rb-sys
>>> https://github.com/ankane/ruby-polars
>>> https://github.com/apache/opendal/tree/main/bindings/ruby
>>>
>>> Thanks,
>>> Chris Atkins
>>>
>>> Xuanwo
>>>
>>> https://xuanwo.io/
>>>
>>>
