> Secondly, this will be the first time I will be maintaining an Apache
> project, and I am not very familiar with the internal processes you use.
> I feel I might move faster with a repo under my own user

This does sound like it might be another use case for the 'arrow-contrib' org:
Apache DataFusion has a community-run, non-Apache org called
'datafusion-contrib' [1], where unofficial extensions and
DataFusion-related crates are developed. Once a project is mature/used
enough, it can be donated to the ASF DataFusion TLP (though that is not
a necessity). This was done, for example, for DataFusion for Ray [2],
though apparently it will now be archived due to a lack of maintenance [3].
(So maybe not the best example xD)

The idea of creating a similar org for Arrow has been brought up a
number of times in the community meeting. This would not come with the
'red tape' of an ASF project and would allow faster initial
development for the Erlang implementation.

Regarding the IP clearance process (which, as you say, will need to
happen at some point when moving the implementation into
apache/arrow-erlang): IIRC, as long as the code has always been
licensed under ASL 2.0, the process is more of a formality and
shouldn't be too hard to do.

Best
Jacob

[1]: https://github.com/datafusion-contrib
[2]: https://github.com/datafusion-contrib/ray-sql
[3]: https://lists.apache.org/thread/7lprgjdrkfj8hq3xs17hslwmmgbj03mx

On Tue, 19 Aug 2025 at 10:40, Antoine Pitrou <[email protected]> wrote:
>
>
> Hi Benjamin,
>
> On 14/08/2025 at 20:17, Benjamin Philip wrote:
> >
> >> serialization/deserialization features but arrow-rs provides
> >> more features such as computation features.
> >
> > This reminds me. What features will I have to support out of
> > (de)serialization for an implementation to be considered complete?
>
> You're probably aware of https://arrow.apache.org/docs/dev/status.html ,
> otherwise it will give you an idea of the variety of features that *can*
> be implemented.
>
> There isn't an official criterion for declaring an implementation
> "complete" (and we don't really use that term, either).
>
> What is important is to address the most common needs that your users
> may have (such as OpenTelemetry data payloads). I would personally suggest:
>
> - support the most common data types (all primitive types + at least
> list and struct + dictionary + basic support for extension types)
> - support either the C Data Interface or the IPC format (preferably both)
>
> In the IPC format, you don't need to support everything (tensors are
> rarely used, for example; endianness conversion is only useful if you
> plan to exchange data with big-endian systems...).
>
> Regards
>
> Antoine.
>
