Re: Question about Arrow Mutable/Immutable Arrays choice

2021-11-08 Thread Wes McKinney
I don't think there is a problem with having "internal" data structures that provide mutation and other capabilities, but when internal data structures are made external (exported to consumers through "public" C++ APIs / namespaces) then immutability is good there (or at least forcing a consumer
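As a rough illustration of the split described above (mutable internally, immutable once exported), here is a minimal Python sketch. It is not Arrow's C++ API: `Int64Builder` and `FrozenInt64Array` are hypothetical names made up for this example, and the read-only guarantee comes from the standard library's `memoryview.toreadonly()`.

```python
from array import array


class FrozenInt64Array:
    """Read-only view handed to consumers (hypothetical name, not an Arrow class)."""

    def __init__(self, values: array):
        # memoryview.toreadonly() rejects any write made through this view.
        self._view = memoryview(values).toreadonly()

    def __len__(self) -> int:
        return len(self._view)

    def __getitem__(self, index: int) -> int:
        return self._view[index]


class Int64Builder:
    """Mutable accumulator meant to stay internal (hypothetical name)."""

    def __init__(self):
        self._values = array("q")  # signed 64-bit integers

    def append(self, value: int) -> None:
        self._values.append(value)

    def finish(self) -> FrozenInt64Array:
        # Hand over the buffer and start a fresh one, so later appends
        # cannot change what the consumer already holds.
        finished, self._values = self._values, array("q")
        return FrozenInt64Array(finished)


builder = Int64Builder()
for v in (1, 2, 3):
    builder.append(v)
frozen = builder.finish()
builder.append(99)  # goes into a new buffer, not into `frozen`
```

The design choice in this sketch is that mutation stops at `finish()`: whatever crosses the public boundary can be read and shared but not written.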

Re: Question about Arrow Mutable/Immutable Arrays choice

2021-11-04 Thread Antoine Pitrou
On 04/11/2021 at 10:56, Alessandro Molina wrote:
> On Wed, Nov 3, 2021 at 11:34 PM Jacques Nadeau wrote:
>> In a perfect world we would have done a better job in the object hierarchy/behavior of making this explicit but we don't live in that world, unfortunately.
> Makes sense, but I thought

Re: Question about Arrow Mutable/Immutable Arrays choice

2021-11-04 Thread Alessandro Molina
On Wed, Nov 3, 2021 at 11:34 PM Jacques Nadeau wrote:
> In a perfect world we would have done a better job in the object hierarchy/behavior of making this explicit but we don't live in that world, unfortunately.
Makes sense, but I thought that was exactly the reason why set/setSafe are

Re: Question about Arrow Mutable/Immutable Arrays choice

2021-11-03 Thread Jacques Nadeau
Hey Alessandro, take a look at the top level docs on ValueVector: https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/ValueVector.html
Specifically the following:
- values need to be written in order (e.g. index 0, 1, 2, 5)
- null vectors start with all values as null
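The two rules quoted above can be made concrete with a small toy model in Python. This is only a sketch and not the Arrow Java API: `ToyIntVector` is a hypothetical class, and raising an error on an out-of-order write is an assumption of the sketch to make the rule visible, not documented ValueVector behavior.

```python
from typing import List, Optional


class ToyIntVector:
    """Toy model of the two rules quoted above; not the Arrow Java API."""

    def __init__(self, capacity: int):
        # Every slot starts out null (None) until something is written.
        self._values: List[Optional[int]] = [None] * capacity
        self._last_written = -1

    def set(self, index: int, value: int) -> None:
        # Writes must move forward; skipped slots are allowed and stay null.
        if index <= self._last_written:
            raise ValueError(f"index {index} written out of order")
        self._values[index] = value
        self._last_written = index

    def get(self, index: int) -> Optional[int]:
        return self._values[index]


vec = ToyIntVector(capacity=6)
for i in (0, 1, 2, 5):  # same index sequence as the example above
    vec.set(i, i * 10)
```

In this toy, slots 3 and 4 were skipped and remain null, while a later attempt to write index 4 would be rejected because index 5 has already been written.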

Re: Question about Arrow Mutable/Immutable Arrays choice

2021-11-03 Thread Jorge Cardoso Leitão
I think the C Data Interface requires the arrays to be immutable; otherwise two implementations will race when mutating/reading the shared regions, since we have no mechanism to synchronize read/write access across the boundary.
Best,
Jorge
On Wed, Nov 3, 2021 at 1:50 PM Alessandro Molina <

Question about Arrow Mutable/Immutable Arrays choice

2021-11-03 Thread Alessandro Molina
I recently noticed that in the Java implementation we expose a set/setSafe function that allows mutating Arrow Arrays [1]. This seems to be at odds with the general design of the C++ (and consequently Python and R) library, where Arrays are immutable and can be modified only through compute