> I think this is similar to the proposal with the exception that your
> suggestion would require amending existing types that happen to be
> alternatives to each other.
I want to avoid electing one canonical layout for a kind (AKA "logical
type"). And the existence of "alternative layouts"
> I would welcome a draft PR showcasing the changes necessary in the IPC
> format definition, and in the C Data Interface specification (no need to
> actually implement them for now :-)).
I've proposed something at [1].
> One sketch of an idea: define sets of types that we can call “kinds”**
>
A major difficulty in making the Arrow array types open for extension [1]
is that as soon as we define (a) a universal representation* or (b) an
abstract interface, we close the door to vectorization. (a) prevents
having new vectorization friendly formats and (b) limits the implementation
of new
I am also in favor of the idea of an alternative layout. IIRC, a new
alternative layout still goes through a process of standardization, though it
is the choice of each implementation to decide whether to support it now or
later. I'd like to ask if we can provide the flexibility for implementations
or downstream
Hello,
I'm trying to reason about the advantages and drawbacks of this
proposal, but it seems to me that it lacks definition.
I would welcome a draft PR showcasing the changes necessary in the IPC
format definition, and in the C Data Interface specification (no need to
actually implement
Thank you Neal for writing this summary and everyone whose thoughts went
into the discussions -- I think the proposal, as summarized, offers a great
path forward by allowing the various Arrow communities to specialize when
advantageous but remain compatible.
On Thu, Jul 13, 2023 at 11:59 AM Ian
clarify what constitutes support for a canonical alternative
layout
I had envisaged, perhaps naively, that we would just add a new DataType
containing a string layout name, perhaps DataType::Raw(String). This
would have no restrictions on the number of buffers, children, etc...
and would
Canonical alternative layouts sounds like a workable path forward. Perhaps
understandably, my immediate thought is how I could rephrase Utf8View as a
canonical alternative layout for Utf8. In light of that, I have a few
questions to clarify what constitutes support for a canonical alternative
Thanks Neal and Weston!
I prepared a diagram to solidify my own understanding of the context, which can
be found at [1].
I think alternative layouts sounds like a nice first approach to allowing new
layouts that can be supported lazily (implemented when it is beneficial) by
various
I am in favor of this proposal. IMO the Arrow project is the right place to
standardize both the interoperability *and operability* of columnar data
layouts. Data engines are a core component of the Arrow ecosystem and the
project should be able to grow with these data engines as they converge on
Thank you Weston for proposing this solution and Neal for describing
its context and implications. I agree with the other replies here—this
seems like an elegant solution to a growing need that could, if left
unaddressed, increase the fragmentation of the ecosystem and reduce
the centrality of the
I like this proposal, I think it strikes a pragmatic balance between
preserving interoperability whilst still allowing new ideas to be
incorporated into the standard. Thank you for writing this up.
On 13/07/2023 10:22, Matt Topol wrote:
I don't have much to add but I do want to second Jacob's comments. I agree
that this is a good way to avoid the fragmentation while keeping Arrow
relevant, and likely something we need to do so that we can ensure Arrow
remains the way to do this data integration and interoperability.
On Wed, Jul
Hello Everyone,
Thanks for this comprehensive but concise write up Neal! I think this
proposal is a good way to avoid both fragmentation of the Arrow ecosystem
and its obsolescence. In my opinion, of these two problems,
obsolescence is the bigger issue, as (as mentioned in the proposal)
Hi all,
As was previously raised in [1] and surfaced again in [2], there is a
proposal for representing alternative layouts. The intent, as I understand
it, is to be able to support memory layouts that some (but perhaps not all)
applications of Arrow find valuable, so that these nearly Arrow