Hi Sutou,

Sorry about the long delay, but I wanted to follow up on this. I finally filled out the forms and some other documents, and I sent an initial draft in a new thread almost a week ago. Could you please have a look?

-- bp

Sutou Kouhei <[email protected]> writes:

> Hi,
>
> https://incubator.apache.org/ip-clearance/
>
> We need to fill the IP clearance template:
> https://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/ip-clearance-template.xml
>
> (It's linked the above IP clearance page.)
>
> https://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/arrow-flight-sql-odbc.xml
> is one of filled templates by us.
>
> Could you try filling the template as much as possible?
>
>
> Thanks,
> --
> kou
>
> In <camexywdchjdtjvdutjq4-zzc+faq0u3xhdwcwg_1xyhpetk...@mail.gmail.com>
> "Re: [DISCUSS][Erlang] Erlang Apache Arrow Implementation" on Mon, 1 Sep
> 2025 10:29:40 +0530,
> Benjamin Philip <[email protected]> wrote:
>
>> Any update on this? If you can send me a link to the IP clearance process
>> and the guidelines and development practices for Apache repositories, I can
>> notify the other stakeholders in the EEF and start the transfer process.
>>
>> -- bp
>>
>> On Wed, 20 Aug, 2025, 1:51 pm Benjamin Philip, <[email protected]>
>> wrote:
>>
>>> On Wed, 20 Aug 2025 at 04:08, Jacob Wujciak <[email protected]> wrote:
>>>
>>>> > Secondly, this will be the first time I will be maintaining an Apache
>>>> > project, and I am not very familiar with the internal processes you
>>>> use. I feel I might
>>>> > move faster with a repo under my own user
>>>>
>>>> This does sound like it might be another use case for the 'arrow-contrib'
>>>> org:
>>>> Apache Datafusion has a community run, non-apache org called
>>>> 'datafusion-contrib' [1], where unofficial extensions and datafusion
>>>> related crates are developed. Once a project is mature/used enough it
>>>> can be donated to the ASF Datafusion TLP (so that is not a necessity).
>>>> This was for example done for Datafusion for Ray [2]. Though
>>>> apparently it will now be archived due to a lack of maintenance [3].
>>>> (So maybe not the best example xD)
>>>>
>>>> The idea of creating a similar org for arrow has been brought up a
>>>> number of times in the community meeting, This would not come with the
>>>> 'red tape' of an ASF project and would allow faster initial
>>>> development for the Erlang implementation.
>>>>
>>>>
>>> That sounds like a good option. However, I don't want to eliminate
>>> developing this as an ASF project from the start. I figure that this will
>>> eventually become a regular ASF project, so I might as well get accustomed
>>> to it now. Is there a document with all the "red tape" an ASF project
>>> entails?
>>>
>>> If we were to do this, would the Erlang implementation be considered
>>> "official" and linked from the docs? I would like to improve awareness of
>>> the project, and I'd prefer it be mentioned in the official docs even as an
>>> alpha release. I think that is important in addition to promoting it on
>>> Elixir/Erlang specific channels.
>>>
>>> I also forgot to mention this in my previous email, but would any Arrow
>>> maintainer be able to review PRs to this project, maybe multiple times a
>>> week? I remember having many arrow specific doubts while working on this,
>>> and I think it would be wise to have someone re-check my work to ensure I
>>> haven't misinterpreted anything in the specifications and generally keep an
>>> eye from the Apache side. I also have 2 other reviewers from the Erlang
>>> Ecosystem Foundation reviewing my Erlang code, so that part is already
>>> taken care of.
>>>
>>> Regarding the ip clearance process (that as you say will need to
>>>> happen at some point of moving the implementation into
>>>> apache/arrow-erlang), IIRC as long as the code has always been
>>>> licensed under ASL 2.0 the process is more of a formality and
>>>> shouldn't be too hard to do.
>>>>
>>>
>>> The code is indeed licensed under ASL 2.0, so I think we can go with the
>>> ip clearance process then. Are there any other legal matters that need to
>>> be addressed?
>>>
>>> On Tue, 19 Aug 2025 at 14:09, Antoine Pitrou <[email protected]> wrote:
>>>
>>>> There isn't an official criterion for declaring an implementation
>>>> "complete" (and we don't really use that term, either).
>>>>
>>>> What is important is to address the most common needs that your users
>>>> may have (such as OpenTelemetry data payloads).
>>>
>>>
>>> That makes sense.
>>>
>>>
>>>> I would personally suggest:
>>>>
>>>> - support the most common data types (all primitive types + at least
>>>> list and struct + dictionary + basic support for extension types)
>>>> - support either the C Data Interface or the IPC format (preferably both)
>>>>
>>>> In the IPC format, you don't need to support everything (tensors are
>>>> rarely used, for example; endianness conversion is only useful if you
>>>> plan to exchange data with big-endian systems...).
>>>>
>>>>
>>> As of right now, we support about half of all primitive types and most of
>>> the lists (under nested types), but none of the special or extension types.
>>> We also have some rudimentary support for IPC (since that's needed for
>>> OTel). I plan to add support for everything under the Columnar Format
>>> anyway, so it's just a matter of time. Is Flight and friends handled by the
>>> Arrow team? How often and where is Flight used?
>>>
>>> Hi Benjamin,
>>>>
>>>> Le 14/08/2025 à 20:17, Benjamin Philip a écrit :
>>>> >
>>>> >> serialization/deserialization features but arrow-rs provides
>>>> >> more features such as computation features.
>>>> >
>>>> > This reminds me. What features will I have to support out of
>>>> > (de)serialization
>>>> > for an implementation to be considered complete?
>>>>
>>>> You're probably aware of https://arrow.apache.org/docs/dev/status.html ,
>>>> otherwise it will give you an idea of the variety of features that *can*
>>>> be implemented.
>>>>
>>>
>>> This list only lists support for serialization and deserialization of
>>> various data types, whether that be the Columnar Format, the IPC Format or
>>> Flight. I realize that the words "out of" weren't very clear, but what I
>>> meant was what should I support *apart from* serde? For example, Sutou
>>> mentioned computation. I don't see a list of supported computations
>>> anywhere, what computations must I provide? I'm guessing serde (i.e. R/W of
>>> Arrow arrays) and computations (i.e. transformations of Arrow arrays) are
>>> it, but are there any other high-level features I should support?
>>>
>>> -- bp
>>>
