Re: [datafusion] I want to learn datafusion but don't know where to get help,

2022-09-15 Thread Francis Du
Hi Shao: I am in China too. Welcome, and let's hack together. Here is an ASF Slack invite link [1]; you can ask anything in the arrow-rust channel. [1] https://join.slack.com/t/the-asf/shared_invite/zt-1g483mk86-wv7N58SmP6yRcEoVtFjO4A Regards, Francis On Fri, 16 Sept 2022 at 00:56, Andrew Lamb wrote:

Re: RLE array slicing

2022-09-15 Thread Weston Pace
Thank you everyone, I think I was pretty far off base in representing the work Tobias had done and both Tobias and Matt have clarified well. * There are two child arrays not necessarily for slicing but more to help distinguish between the logical length (there are no buffers with the logical leng

Re: [RESULT] [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 12.0.0 RC1

2022-09-15 Thread Andy Grove
DataFusion 12.0.0 is now released to crates.io On Thu, Sep 15, 2022 at 12:55 PM Andy Grove wrote: > The vote has passed with 9 +1 votes (4 binding). Thank you to all who > helped with the release verification. > > Andy. > > On Tue, Sep 13, 2022 at 7:31 AM Ian Joiner wrote: > >> +1 (Non-binding)

Re: [RESULT] [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion 12.0.0 RC1

2022-09-15 Thread Andy Grove
The vote has passed with 9 +1 votes (4 binding). Thank you to all who helped with the release verification. Andy. On Tue, Sep 13, 2022 at 7:31 AM Ian Joiner wrote: > +1 (Non-binding) > > Verified on my macOS 12.2.1 / Apple M1 Chip > > On Mon, Sep 12, 2022 at 2:55 PM Andy Grove wrote: > > > Hi,

Re: [datafusion] I want to learn datafusion but don't know where to get help,

2022-09-15 Thread Andrew Lamb
Hi Shao, Some other sources of information about DataFusion are the API docs [1] as well as the github repository [2]. There are also several examples of use in [3] Hope that helps, Andrew [1] https://docs.rs/datafusion/11.0.0/datafusion/ [2] https://github.com/apache/arrow-datafusion [3] http

Re: Usage of the name Feather?

2022-09-15 Thread SHIMA Tatsuya
Thank you all for sharing your opinions. Many believe that the name Feather V2 should be deprecated. However, as several have pointed out, simply deprecating the name "Feather" is not enough to end the confusion; a recommended name seems to need to be determined. Perhaps we need to have a vot

Re: RLE array slicing

2022-09-15 Thread Matt Topol
> why would the run ends and values have the same offset? That's why I liked the idea of the children arrays and having the parent offset being a "logical offset" and children being "physical offsets" because it maintains the independence of the arrays. Slicing the RLE is simply just setting the l

Re: PRs for RLE support

2022-09-15 Thread Matt Topol
IMHO I think it's worth parameterizing for the 16/32-bit case. Despite it being nice to be able to just assume it's a 32bit signed int in terms of code simplicity, I think it would be a good benefit for memory usage of RLE arrays. That said I don't have anything to back that up as I don't regularl
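As a rough illustration of the memory argument above, the following sketch compares the size of a run-ends buffer stored as 16-bit versus 32-bit signed integers. The helper names are hypothetical and this is not Arrow code; it only does the arithmetic, and it assumes run ends are cumulative positions, so a 16-bit run-ends buffer can only describe arrays of at most 32767 logical elements.

```python
import struct

def run_ends_bytes(num_runs, fmt):
    """Bytes needed for a run-ends buffer of num_runs entries.

    fmt is a struct format with standard sizes: "<h" is a 16-bit
    signed int, "<i" is a 32-bit signed int.
    """
    return num_runs * struct.calcsize(fmt)

def fits_int16(logical_length):
    """True if the last run end (the logical length) fits in int16."""
    return logical_length <= 2**15 - 1

small = run_ends_bytes(1000, "<h")  # 16-bit run ends
large = run_ends_bytes(1000, "<i")  # 32-bit run ends
```

For the same number of runs, the 16-bit buffer is half the size, at the cost of the length limit checked by `fits_int16`.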

[datafusion] I want to learn datafusion but don't know where to get help,

2022-09-15 Thread Shao Grant
I'm in China and cannot open YouTube, and I can't find enough documentation on how to use DataFusion. The official site, https://arrow.apache.org/datafusion/, has only general information. I'm a newbie to DataFusion; where can I get help?

Re: RLE array slicing

2022-09-15 Thread Tobias Zagorni
> {
>     length: 2
>     offset: 6
>     rle: {
>         length: 1 // actually physical length
>         offset: 2
>         buffer: [3, 5, 8]
>     }
>     values: {
>         length: 1
>         offset: 2
>         buffer: [5, 6, 7]
>     }
> }
> Does this make sense?
I think this is a valid way o
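The quoted layout can be checked with a small sketch that maps a logical slice (offset 6, length 2) onto physical offsets in the run-ends and values buffers. It assumes run ends are cumulative, exclusive end positions and that the slice lies within the array; the function names are hypothetical and not part of any Arrow API.

```python
from bisect import bisect_right

def physical_range(run_ends, logical_offset, logical_length):
    """Return (physical_offset, physical_length) of the runs that
    overlap the logical slice. Assumes the slice is within bounds."""
    if logical_length == 0:
        return 0, 0
    # Index of the run containing the first logical element.
    first = bisect_right(run_ends, logical_offset)
    # Index of the run containing the last logical element.
    last = bisect_right(run_ends, logical_offset + logical_length - 1)
    return first, last - first + 1

def decode_slice(run_ends, values, logical_offset, logical_length):
    """Expand the logical slice back into plain values."""
    phys_offset, phys_length = physical_range(
        run_ends, logical_offset, logical_length
    )
    stop = logical_offset + logical_length
    out = []
    for k in range(phys_offset, phys_offset + phys_length):
        run_start = run_ends[k - 1] if k > 0 else 0
        # Clip each run to the logical window of the slice.
        lo = max(run_start, logical_offset)
        hi = min(run_ends[k], stop)
        out.extend([values[k]] * (hi - lo))
    return out

run_ends = [3, 5, 8]
values = [5, 6, 7]
phys = physical_range(run_ends, 6, 2)
sliced = decode_slice(run_ends, values, 6, 2)
```

With the buffers from the quoted example, `phys` comes out as physical offset 2 and physical length 1, which agrees with the child `offset` and `length` fields shown in the structure. Finding the physical offset requires a binary search over the run ends rather than constant-time arithmetic.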

Re: I need C++ tutoring

2022-09-15 Thread Nic
Hi Pacha, This isn't a great topic for the Arrow dev mailing list; it's not related to Arrow and I think it would be fair to say it's completely off-topic and could be considered a bit spammy, so please don't post requests like this here again. That said, I might be able to help you, so feel free

Re: RLE array slicing

2022-09-15 Thread Micah Kornfield
> > Slicing is part of the C data interface (with the offset member). OK, so refreshing myself for the C data interface, IIUC, I think one needs to hack RLE at a parent Array with two children arrays, because otherwise in general, I don't think I see a way of actually communicating buffer size at

Re: RLE array slicing

2022-09-15 Thread Antoine Pitrou
On 15/09/2022 at 10:14, Micah Kornfield wrote:
> I agree slicing can be tricky here. Since slicing is not part of the specification, maybe there should be two separate discussions here. I'll be honest, I forget exactly how slicing works in the C++ implementation, but is
Slicing is part of t

Re: RLE array slicing

2022-09-15 Thread Micah Kornfield
I agree slicing can be tricky here. Since slicing is not part of the specification, maybe there should be two separate discussions here. I'll be honest, I forget exactly how slicing works in the C++ implementation, but is > Say you want to slice the RLE array from Logical Offset 4 (which doesn't

Re: RLE array slicing

2022-09-15 Thread Antoine Pitrou
On Thu, 15 Sep 2022 09:25:53 +0200 Antoine Pitrou wrote: > > Why would the run ends and the values have the same offset? > Also, how do you interpret the run ends if you have a physical offset > into the values array? > > > Say you have the logical values: [5, 5, 5, 6, 6, 7, 7, 7] > > Run end
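To make the example values concrete, here is a minimal sketch that encodes the logical values [5, 5, 5, 6, 6, 7, 7, 7] into run ends and values, and looks up a logical index with a binary search. It assumes run ends are cumulative, exclusive end positions, which is consistent with the [3, 5, 8] and [5, 6, 7] buffers quoted elsewhere in this thread. The function names are hypothetical and not Arrow APIs.

```python
from bisect import bisect_right

def rle_encode(logical):
    """Encode a list into (run_ends, values) with cumulative run ends."""
    run_ends, values = [], []
    for i, v in enumerate(logical):
        if values and values[-1] == v:
            run_ends[-1] = i + 1  # extend the current run
        else:
            values.append(v)      # start a new run
            run_ends.append(i + 1)
    return run_ends, values

def rle_get(run_ends, values, i):
    """Return the logical element at index i via binary search."""
    length = run_ends[-1] if run_ends else 0
    if not 0 <= i < length:
        raise IndexError(i)
    # The first run whose end is greater than i contains element i.
    return values[bisect_right(run_ends, i)]

run_ends, values = rle_encode([5, 5, 5, 6, 6, 7, 7, 7])
```

Element access costs a binary search over the run ends, which is why a logical offset alone does not directly give a physical buffer offset.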

Re: RLE array slicing

2022-09-15 Thread Antoine Pitrou
On 14/09/2022 at 20:18, Weston Pace wrote:
> I will clarify the offset problem. It essentially boils down to "if you don't have constant access to elements then an array length offset does not give you constant access to buffer offsets". We start with an RLE array of length 200. We slice it