Any update on this? If you can send me a link to the IP clearance process
and the guidelines and development practices for Apache repositories, I can
notify the other stakeholders in the EEF and start the transfer process.

-- bp

On Wed, 20 Aug, 2025, 1:51 pm Benjamin Philip, <[email protected]>
wrote:

> On Wed, 20 Aug 2025 at 04:08, Jacob Wujciak <[email protected]> wrote:
>
>> > Secondly, this will be the first time I will be maintaining an Apache
>> > project, and I am not very familiar with the internal processes you
>> > use. I feel I might move faster with a repo under my own user
>>
>> This does sound like it might be another use case for the 'arrow-contrib'
>> org:
>> Apache Datafusion has a community run, non-apache org called
>> 'datafusion-contrib' [1], where unofficial extensions and datafusion
>> related crates are developed. Once a project is mature/used enough it
>> can be donated to the ASF Datafusion TLP (so that is not a necessity).
>> This was for example done for Datafusion for Ray [2]. Though
>> apparently it will now be archived due to a lack of maintenance [3].
>> (So maybe not the best example xD)
>>
>> The idea of creating a similar org for arrow has been brought up a
>> number of times in the community meeting. This would not come with the
>> 'red tape' of an ASF project and would allow faster initial
>> development for the Erlang implementation.
>>
>>
> That sounds like a good option. However, I don't want to rule out
> developing this as an ASF project from the start. I figure that this will
> eventually become a regular ASF project, so I might as well get accustomed
> to it now. Is there a document with all the "red tape" an ASF project
> entails?
>
> If we were to do this, would the Erlang implementation be considered
> "official" and linked from the docs? I would like to improve awareness of
> the project, and I'd prefer it be mentioned in the official docs even as an
> alpha release. I think that is important in addition to promoting it on
> Elixir/Erlang specific channels.
>
> I also forgot to mention this in my previous email, but would any Arrow
> maintainer be able to review PRs to this project, maybe multiple times a
> week? I remember having many Arrow-specific doubts while working on this,
> and I think it would be wise to have someone re-check my work to ensure I
> haven't misinterpreted anything in the specifications and generally keep an
> eye from the Apache side. I also have 2 other reviewers from the Erlang
> Ecosystem Foundation reviewing my Erlang code, so that part is already
> taken care of.
>
>> Regarding the IP clearance process (that, as you say, will need to
>> happen at some point of moving the implementation into
>> apache/arrow-erlang), IIRC as long as the code has always been
>> licensed under ASL 2.0 the process is more of a formality and
>> shouldn't be too hard to do.
>>
>
> The code is indeed licensed under ASL 2.0, so I think we can go with the
> IP clearance process then. Are there any other legal matters that need to
> be addressed?
>
> On Tue, 19 Aug 2025 at 14:09, Antoine Pitrou <[email protected]> wrote:
>
>> There isn't an official criterion for declaring an implementation
>> "complete" (and we don't really use that term, either).
>>
>> What is important is to address the most common needs that your users
>> may have (such as OpenTelemetry data payloads).
>
>
> That makes sense.
>
>
>> I would personally suggest:
>>
>> - support the most common data types (all primitive types + at least
>> list and struct + dictionary + basic support for extension types)
>> - support either the C Data Interface or the IPC format (preferably both)
>>
>> In the IPC format, you don't need to support everything (tensors are
>> rarely used, for example; endianness conversion is only useful if you
>> plan to exchange data with big-endian systems...).
>>
>>
> As of right now, we support about half of all primitive types and most of
> the lists (under nested types), but none of the special or extension types.
> We also have some rudimentary support for IPC (since that's needed for
> OTel). I plan to add support for everything under the Columnar Format
> anyway, so it's just a matter of time. Are Flight and friends handled by the
> Arrow team? How often and where is Flight used?
>
>> Hi Benjamin,
>>
>> Le 14/08/2025 à 20:17, Benjamin Philip a écrit :
>> >
>> >> serialization/deserialization features but arrow-rs provides
>> >> more features such as computation features.
>> >
>> > This reminds me. What features will I have to support out of
>> > (de)serialization
>> > for an implementation to be considered complete?
>>
>> You're probably aware of https://arrow.apache.org/docs/dev/status.html ,
>> otherwise it will give you an idea of the variety of features that *can*
>> be implemented.
>>
>
> This list only covers support for serialization and deserialization of
> various data types, whether that be the Columnar Format, the IPC Format or
> Flight. I realize that the words "out of" weren't very clear, but what I
> meant was what should I support *apart from* serde? For example, Sutou
> mentioned computation. I don't see a list of supported computations
> anywhere, what computations must I provide? I'm guessing serde (i.e. R/W of
> Arrow arrays) and computations (i.e. transformations of Arrow arrays) are
> it, but are there any other high-level features I should support?
>
> -- bp
>
