Re: [VOTE] Donation of rust arrow2 and parquet2

2021-06-27 Thread Fernando Herrera
+1 On Sun, 27 Jun 2021, 07:40 Jorge Cardoso Leitão, wrote: > Hi, > > I would like to bring to this mailing list a proposal to donate the source > code of arrow2 [1] and parquet2 [2] as experimental repositories [3] within > Apache Arrow, conditional on IP clearance. > > The specific PRs are: > >

Re: [Rust] [Discuss] proposal to redesign Arrow crate to resolve safety violations

2021-05-26 Thread Fernando Herrera
Thanks Jorge for the update and the continuous development on a safer version of arrow. I would like to give my support for option 3 as well. IMHO it will give arrow2 the exposure it needs to be considered by a wider set of users. This exposure will open the possibility to receive more partici

Re: [ANNOUNCE] New Arrow committer: Daniël Heres

2021-04-28 Thread Fernando Herrera
Congrats Daniël, It is nice to see all the code you contribute to DataFusion and Ballista. On Wed, Apr 28, 2021 at 7:39 PM QP Hou wrote: > Congrats Daniël, well deserved! > > Thanks, > QP Hou > > On Wed, Apr 28, 2021 at 6:25 AM Andy Grove wrote: > > > > On behalf of the Arrow PMC, I'm happy to annou

Re: [VOTE] Move Rust components to new repos and process

2021-04-14 Thread Fernando Herrera
+1 On Thu, 15 Apr 2021, 05:57 Sutou Kouhei, wrote: > +1 > > In > "[VOTE] Move Rust components to new repos and process" on Wed, 14 Apr > 2021 18:04:44 -0600, > Andy Grove wrote: > > > This vote is to determine if the Arrow PMC is in favor of the Rust > > community moving the Rust implement

Re: [ALL] Integration tests for dense and sparse tensor

2021-03-16 Thread Fernando Herrera
y JSON) such as > > exists for other IPC types: > > https://arrow.apache.org/docs/format/Integration.html > > > > Regards > > > > Antoine. > > > > > > Le 16/03/2021 à 10:02, Fernando Herrera a écrit : > > > Are there any plans to inclu

Re: [ALL] Integration tests for dense and sparse tensor

2021-03-16 Thread Fernando Herrera
Are there any plans to include integration testing for tensors in the pipeline? Thanks, Fernando On Mon, Mar 15, 2021 at 8:16 PM Antoine Pitrou wrote: > On Mon, 15 Mar 2021 19:48:22 + > Fernando Herrera wrote: > > Hi Neal, > > > > Thanks for the update and the li

Re: [ALL] Integration tests for dense and sparse tensor

2021-03-15 Thread Fernando Herrera
testing I think > (though there may be some features listed as "implemented" that aren't > tested). > > Neal > > On Mon, Mar 15, 2021 at 5:00 AM Fernando Herrera < > fernando.j.herr...@gmail.com> wrote: > > > Hi all, > > > > Does anyone kn

[ALL] Integration tests for dense and sparse tensor

2021-03-15 Thread Fernando Herrera
Hi all, Does anyone know the status of the dense and sparse tensor tests? I was looking for a data file with a tensor created with the C++ implementation, but I couldn't find anything. Is anybody testing IPC for tensors? Thanks in advance, Fernando

Re: [ANNOUNCE] New Arrow PMC member: Jorge Leitão

2021-03-08 Thread Fernando Herrera
Congrats Jorge On Mon, 8 Mar 2021, 17:26 Micah Kornfield, wrote: > Congratulations Jorge! > > On Mon, Mar 8, 2021 at 9:25 AM Wes McKinney wrote: > > > The Project Management Committee (PMC) for Apache Arrow has invited > > Jorge Leitão to become a PMC member and we are pleased to announce > > t

Re: [ANNOUNCE] New Arrow PMC member: Andrew Lamb

2021-03-08 Thread Fernando Herrera
Congrats Andrew On Mon, 8 Mar 2021, 17:26 Micah Kornfield, wrote: > Congratulations Andrew! > > On Mon, Mar 8, 2021 at 9:23 AM Wes McKinney wrote: > > > The Project Management Committee (PMC) for Apache Arrow has invited > > Andrew Lamb to become a PMC member and we are pleased to announce > >

Re: Requirements on JIRA usage in Apache Arrow

2021-03-02 Thread Fernando Herrera
Hi, Adding my two cents to this thread. I would suggest that the Jira format raises a high barrier for newcomers. Since I started trying to help with the project, I have had to get familiar with Jira just to be able to help with small changes. I cannot imagine how much more work others that are contributi

Apache Arrow website

2021-02-26 Thread Fernando Herrera
Hi Apache Arrow website devs, Is there somebody who could help me set up two pages that we are working on for Rust? Both are Markdown books: one is a list of RFCs for the project and the other is a guide for Arrow in Rust. I don't know what the process would be to set up space in your server t

Re: [Rust] Arrow in WebAssembly

2021-02-26 Thread Fernando Herrera
Hi Dominik, I would be interested in a demo. I'm curious to see your implementation and what advantages you have seen over JavaScript. Thanks, Fernando On Thu, Feb 25, 2021 at 10:39 PM Dominik Moritz wrote: > Hello Rust Arrow Devs, > > I have been working on a wasm version of Arrow using the Rust

Re: [RUST] Error when running tests

2021-02-25 Thread Fernando Herrera
order to catch those errors. > > On Thu, Feb 25, 2021, 18:36 Fernando Herrera > > wrote: > > > It was a simple "cargo test" in the rust folder > > > > error[E0658]: binding by-move and by-ref in the same pattern is unstable > > > --> arrow/sr

Re: [RUST] Error when running tests

2021-02-25 Thread Fernando Herrera
plain E0658`. > error: could not compile `arrow` On Thu, Feb 25, 2021 at 5:34 PM Andrew Lamb wrote: > Could you possibly provide the exact error message / steps to reproduce the > problem you are seeing? I wonder if some dependent library pushed an > incompatible upgrade or somethin

[RUST] Error when running tests

2021-02-25 Thread Fernando Herrera
Today I was running the Rust tests on the master branch of my fork and got an error message regarding flight. The message reads: "binding by-move and by-ref in the same pattern is unstable". Does anyone know what is wrong with flight? Thanks, Fernando
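
For context, here is a minimal sketch of the kind of pattern this error message refers to. The `Message` type and the `split` function are hypothetical and are not taken from the arrow or flight code. To my knowledge, binding by move and by reference in the same pattern was stabilized in Rust 1.49, so the sketch assumes the failure came from an older toolchain; the thread snippets do not confirm the actual cause.

```rust
/// Hypothetical message type, used only for illustration.
#[derive(Debug)]
struct Message {
    header: String,
    body: Vec<u8>,
}

/// Returns the header length together with the owned body.
fn split(msg: Option<Message>) -> Option<(usize, Vec<u8>)> {
    match msg {
        // `header` is bound by reference and `body` is bound by move,
        // both in the same pattern. Older toolchains reject this form.
        Some(Message { ref header, body }) => Some((header.len(), body)),
        None => None,
    }
}

fn main() {
    let msg = Message {
        header: "flight".to_string(),
        body: vec![1, 2, 3],
    };
    println!("{:?}", split(Some(msg)));
}
```

If the toolchain is the cause, running `rustup update` and repeating `cargo test` would be the first thing to try.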

Re: [Rust][DataFusion] Supporting input_file_name()

2021-02-25 Thread Fernando Herrera
eddonm1:input-file > > I need to add some more tests (mainly ensure multipart parquet works as > expected) but I wanted to gather feedback on the proposal before cleaning > up for PR. > > Mike > > On Thu, Feb 25, 2021 at 8:30 PM Fernando Herrera < > fernando.j.herr...@g

Re: [Rust][DataFusion] Supporting input_file_name()

2021-02-25 Thread Fernando Herrera
Hi Mike, I've been thinking about how you are planning to add metadata to the RecordBatch. The struct is currently defined as pub struct RecordBatch { > schema: SchemaRef, > columns: Vec>, > } Are you suggesting something like this? pub struct RecordBatch { > schema: SchemaRef, > co

[Rust] Actix-web and IPC

2021-02-14 Thread Fernando Herrera
Hi all, I'm working on a chapter about IPC and how to use it. So far I have working examples using simple std::net and tokio, but I was wondering if it can be done using Actix-web. Does anyone have an idea how it could be done? I've seen the streaming response from Actix-web, but I couldn't find a w

Re: [Rust] [Discuss] proposal to redesign Arrow crate to resolve safety violations

2021-02-07 Thread Fernando Herrera
Hi Jorge, I tried running the code you pasted but it didn't compile. I get the following error: the trait `AsRef<[u8]>` is not implemented for `[i32; 2i32]`. I had to change it to this to get it to compile: let buffer = Buffer::from(&[0u8, 2]); > let data = ArrayData::new(DataType::Int64, 10, None, None, 0, >

Re: [RUST] Fields and schema metadata

2021-02-06 Thread Fernando Herrera
iest way to see this, is to replace it with a HashMap, and try > compile the arrow crate. > > Neville > > On Sat, 06 Feb 2021, 13:50 Fernando Herrera, > > wrote: > > > Hi all, Is there a reason why the Field metadata is a BTreeMap and > Schema's > > metada

[RUST] Fields and schema metadata

2021-02-06 Thread Fernando Herrera
Hi all, Is there a reason why the Field metadata is a BTreeMap and Schema's metadata is a HashMap? I'm just curious why different structures were selected for the same thing. Sorry if this is explained somewhere in the code, but I couldn't find anything about it. Fernando,
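
One practical difference between the two maps can be sketched as follows. `ToyField` and `ToySchema` are hypothetical stand-ins, not the arrow crate's types, and the idea that the field type needs to derive `Hash` and `Ord` is an assumption rather than something stated in this thread. What is certain is that `BTreeMap` implements `Hash` and `Ord` (when its keys and values do) while `HashMap` implements neither, so a struct holding a `HashMap` cannot derive them.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashMap};
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a field. Deriving `Hash` and `Ord` compiles
/// because `BTreeMap<String, String>` implements both traits.
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct ToyField {
    name: String,
    metadata: Option<BTreeMap<String, String>>,
}

/// Hypothetical stand-in for a schema. Only equality is derived here;
/// adding `Hash` or `Ord` to this list would fail to compile because
/// `HashMap` implements neither.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ToySchema {
    fields: Vec<ToyField>,
    metadata: HashMap<String, String>,
}

/// Hashes any `Hash` value with the standard library's default hasher.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Builds a toy field from key/value pairs given in any order.
fn make_field(name: &str, pairs: &[(&str, &str)]) -> ToyField {
    let metadata: BTreeMap<String, String> = pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    ToyField { name: name.to_string(), metadata: Some(metadata) }
}

fn main() {
    // Insertion order differs, but a BTreeMap stores entries sorted by key,
    // so the two fields compare equal and hash identically.
    let a = make_field("ts", &[("unit", "ms"), ("origin", "sensor")]);
    let b = make_field("ts", &[("origin", "sensor"), ("unit", "ms")]);
    assert_eq!(a, b);
    assert_eq!(hash_of(&a), hash_of(&b));

    let schema = ToySchema { fields: vec![a, b], metadata: HashMap::new() };
    println!("{:?}", schema);
}
```

Swapping the `BTreeMap` in `ToyField` for a `HashMap` makes the derive fail, which is the quickest way to see the constraint in this toy setting.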

Re: [Rust][DataFusion] DataFusion Overview / Architecture

2021-02-04 Thread Fernando Herrera
Hi Andy. I would like to take you up on your offer and get a copy of your book. It would help me understand DataFusion better and help Andrew with the project documentation. Fernando On Thu, 4 Feb 2021, 18:01 Andy Grove, wrote: > That's correct, Remi. I built the Kotlin query engine from scratch as I w

Re: [Rust] DataFusion TPCH benchmark overview

2021-02-04 Thread Fernando Herrera
ides as a Tech Talk > (for work, but will be open to the public) sometime in March. How about I > pull together some initial material, and then I can share that / go over it > with anyone who is interested? > > What do you think? > > Andrew > > > On Thu, Feb 4, 202

Re: [Rust] DataFusion TPCH benchmark overview

2021-02-04 Thread Fernando Herrera
Hi Andrew, I would like to work a little bit more on DataFusion, so I was wondering if you could give a small walkthrough of the code and how the queries are constructed. Do you think that would be possible? Fernando On Wed, Feb 3, 2021 at 11:13 PM Andrew Lamb wrote: > This is awesome, thank yo

Re: [RUST] Arrow guide

2021-02-02 Thread Fernando Herrera
in a separate repo for now makes sense, what do you > think about including a link to your guide in the Rust Arrow crate's > README.md? > > Andrew > > > On Mon, Feb 1, 2021 at 2:31 PM Fernando Herrera < > fernando.j.herr...@gmail.com> wrote: > > > Thank

Re: [RUST] Arrow guide

2021-02-01 Thread Fernando Herrera
ects. I for one would contribute and put time into enhancing > and maintaining it as part of the rust implementation, review changes to it > by other contributors, and keep it up to date. > > Best, > Jorge > > > > > On Sun, Jan 31, 2021 at 6:25 PM Fernando Herrera &l

[RUST] Arrow guide

2021-01-31 Thread Fernando Herrera
Hi all, Over the past few months I have been trying to read and understand the code base for the Rust implementation of Arrow. At the beginning I was just reading the code and figuring out what each part or module was used for. Unfortunately this approach didn't work very well and I had to start from

Re: [RUST] Implement value function with Array trait

2021-01-28 Thread Fernando Herrera
/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa/rust/datafusion/src/scalar.rs#L83). That's great. On Thu, Jan 28, 2021 at 12:21 PM Fernando Herrera < fernando.j.herr...@gmail.com> wrote: > In the application I'm working on I'm reading a parquet file and creating > a table to

Re: [RUST] Implement value function with Array trait

2021-01-28 Thread Fernando Herrera
adding to Arrow for usability), but I fear it will be > quite slow as now the program would have to do some sort of type dispatch > on each element in an array rather than once for the entire array. > > On Thu, Jan 28, 2021 at 5:50 AM Fernando Herrera < > fernando.j.herr...@gmail.co

Re: [RUST] Slow Parquet writer

2021-01-28 Thread Fernando Herrera
M Andrew Lamb wrote: > The first thing I would check is that you are using a release build (`cargo > build --release`) > > If you are, there may be additional optimizations needed in the Rust > implementations > > Andrew > > On Thu, Jan 28, 2021 at 6:19 AM Fernando

[RUST] Slow Parquet writer

2021-01-28 Thread Fernando Herrera
Hi, What writing speed should we expect from the Arrow Parquet writer? I'm writing a RecordBatch with two columns and 1,000,000 records and it takes a long time to write the batch to the file (close to 2 secs). This is what I'm doing: let schema = Schema::new(vec![ > Field::new

Re: [RUST] Implement value function with Array trait

2021-01-28 Thread Fernando Herrera
nested) possible variations of the > generic. > > So, overall, this exercise convinced me that what we have is already the > simplest (but no simpler) API that we can offer under the requirements we > have (But I would love to be proven wrong, as I share your concerns) > > B

Re: [RUST] Implement value function with Array trait

2021-01-28 Thread Fernando Herrera
4` type > functions > 2. Such access would likely be much slower (though possible more > convenient) as it would dispatch based on type for each row (whereas the > downcast_as pattern does that dispatch once per array) > > Andrew > > On Wed, Jan 27, 2021 at 6:27 AM Fernando Herr

[RUST] Implement value function with Array trait

2021-01-27 Thread Fernando Herrera
Hi, I'm wondering whether it has been considered to move the value function that is implemented in all the arrays (StringArray, BooleanArray, ListArray, etc.) into the Array trait. This would help when extracting values from generic arrays held as dyn Array without having to manually downc
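
The manual downcast in question can be sketched with toy types. The `Array` trait, `Int32Array`, and `StringArray` below are simplified stand-ins written for this illustration and are not the arrow crate's definitions; only the `std::any::Any` calls are real library API. The sketch shows the dispatch happening once per array, after which every element is read through the concrete type's own `value` method.

```rust
use std::any::Any;

/// Toy stand-in for a type-erased array trait.
trait Array {
    fn as_any(&self) -> &dyn Any;
    fn len(&self) -> usize;
}

/// Toy concrete arrays, each with its own typed `value` accessor.
struct Int32Array(Vec<i32>);
struct StringArray(Vec<String>);

impl Int32Array {
    fn value(&self, i: usize) -> i32 {
        self.0[i]
    }
}

impl StringArray {
    fn value(&self, i: usize) -> &str {
        &self.0[i]
    }
}

impl Array for Int32Array {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn len(&self) -> usize {
        self.0.len()
    }
}

impl Array for StringArray {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn len(&self) -> usize {
        self.0.len()
    }
}

/// Downcasts once, then sums through the typed accessor.
/// Returns `None` when the array is not an `Int32Array`.
fn sum_int32(array: &dyn Array) -> Option<i64> {
    let typed = array.as_any().downcast_ref::<Int32Array>()?;
    Some((0..typed.len()).map(|i| i64::from(typed.value(i))).sum())
}

fn main() {
    let arrays: Vec<Box<dyn Array>> = vec![
        Box::new(Int32Array(vec![1, 2, 3])),
        Box::new(StringArray(vec!["a".to_string()])),
    ];
    for array in &arrays {
        println!("len = {}, sum = {:?}", array.len(), sum_int32(array.as_ref()));
    }
    // The typed accessor is still available after a successful downcast.
    if let Some(s) = arrays[1].as_any().downcast_ref::<StringArray>() {
        println!("first string = {}", s.value(0));
    }
}
```

A `value` method on the trait itself would need a single return type covering every element type, which in a toy design like this one means an enum or a boxed value checked on every call.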

[RUST] Arrow crate guide

2020-10-28 Thread Fernando Herrera
I have been working on a guide for the Rust Arrow crate. So far I have finished the first two chapters: the first is an introduction to Apache Arrow and the second is a description of the types of arrays that can be created using the crate. I would appreciate it if you could give it a r

Re: [Rust] Blog post for 2.0.0

2020-10-18 Thread Fernando Herrera
eptic>, and include these on arrow's > official documentation on build (we would need to depend on a third-party > Sphinx extension <https://www.sphinx-doc.org/en/master/usage/markdown.html > > > for this). > > This way, we keep the examples up-to-date, and the style and loc

Re: [Rust] Blog post for 2.0.0

2020-10-16 Thread Fernando Herrera
t 3:59 PM Micah Kornfield > > wrote: > > > > > Java and C++ have tutorials in Restructured Text Format in the docs > > folder > > > [1]. I think creating something similar for Rust might be the best > place > > > to start. These are rendered on the we

Re: [Rust] Blog post for 2.0.0

2020-10-16 Thread Fernando Herrera
then not sure i’ve got it right. > > Usage examples would be great. > > Regards > > Mark > > > On Oct 14, 2020, at 4:08 PM, Fernando Herrera < > fernando.j.herr...@gmail.com> wrote: > > > > I was wondering if besides this blog post there should be anot

Re: [Rust] Blog post for 2.0.0

2020-10-14 Thread Fernando Herrera
I was wondering if, besides this blog post, there should be another one with an example of usage. I think that is one of the key things missing for Arrow in general. This example should show the problems that Arrow is solving and how to implement the solution in real life. On Tue, Oct 13, 2020 at 10: