Re: [DISCUSS] Canonical alternative layout proposal

2023-07-13 Thread Raphael Taylor-Davies
> clarify what constitutes support for a canonical alternative layout
I had envisaged, perhaps naively, that we would just add a new DataType containing a string layout name, perhaps DataType::Raw(String). This would have no restrictions on the number of buffers, children, etc. and would

Re: [RESULT][VOTE] Release Apache Arrow 12.0.1 - RC1

2023-07-13 Thread Jacob Wujciak-Jens
Hello Everyone, I have opened a PR to add branch protection to the main branch; this will prevent force pushing and direct commits without a reviewed PR: https://github.com/apache/arrow/pull/36678 Jacob On Tue, Jun 13, 2023 at 4:23 PM Raúl Cumplido wrote: > Hi, > > I've had an issue with the

Re: [DISCUSS] Canonical alternative layout proposal

2023-07-13 Thread Benjamin Kietzman
Canonical alternative layouts sounds like a workable path forward. Perhaps understandably, my immediate thought is how I could rephrase Utf8View as a canonical alternative layout for Utf8. In light of that, I have a few questions to clarify what constitutes support for a canonical alternative

Re: [DISCUSS] Canonical alternative layout proposal

2023-07-13 Thread Aldrin
Thanks Neal and Weston! I prepared a diagram to solidify my own understanding of the context, which can be found at [1]. I think alternative layouts sounds like a nice first approach to allowing new layouts that can be supported lazily (implemented when it is beneficial) by various

Re: [DISCUSS] Canonical alternative layout proposal

2023-07-13 Thread Dane Pitkin
I am in favor of this proposal. IMO the Arrow project is the right place to standardize both the interoperability *and operability* of columnar data layouts. Data engines are a core component of the Arrow ecosystem and the project should be able to grow with these data engines as they converge on

Re: [CROWDSOURCING] Board Report -- 2 DAYS -- Please provide feedback

2023-07-13 Thread Kevin Gurney
Hi All, Thanks for putting this together, Andrew! Sarah, Fiona, and I added some notes about the MATLAB interface. Best Regards, Kevin Gurney From: Sutou Kouhei Sent: Wednesday, July 12, 2023 9:36 PM To: dev@arrow.apache.org Subject: Re: [CROWDSOURCING] Board

Re: [DISCUSS] Canonical alternative layout proposal

2023-07-13 Thread Ian Cook
Thank you Weston for proposing this solution and Neal for describing its context and implications. I agree with the other replies here—this seems like an elegant solution to a growing need that could, if left unaddressed, increase the fragmentation of the ecosystem and reduce the centrality of the

Re: [DISCUSS] Canonical alternative layout proposal

2023-07-13 Thread Raphael Taylor-Davies
I like this proposal; I think it strikes a pragmatic balance between preserving interoperability and still allowing new ideas to be incorporated into the standard. Thank you for writing this up. On 13/07/2023 10:22, Matt Topol wrote: > I don't have much to add but I do want to second Jacob's

Re: [DISCUSS] Canonical alternative layout proposal

2023-07-13 Thread Matt Topol
I don't have much to add but I do want to second Jacob's comments. I agree that this is a good way to avoid the fragmentation while keeping Arrow relevant, and likely something we need to do so that we can ensure Arrow remains the way to do this data integration and interoperability. On Wed, Jul