Arthur Maciejewicz created ARROW-6583:
-----------------------------------------

             Summary: Question and Request for Examples of Array Operations
                 Key: ARROW-6583
                 URL: https://issues.apache.org/jira/browse/ARROW-6583
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust
            Reporter: Arthur Maciejewicz


Hi all, thank you for your excellent work on Arrow.

While going through the Rust Arrow implementation, specifically the read_csv 
example 
[https://github.com/apache/arrow/blob/master/rust/arrow/examples/read_csv.rs], 
the generated Rustdocs, and the unit tests, I could not tell what the intended 
usage is for operations such as filtering and masking over Arrays.

One particular use case I'm interested in is finding all values x in an Array 
such that x >= N. I came across arrow::compute::array_ops::filter, which seems 
close to what I want, but it expects a mask to be constructed before the filter 
operation is performed. It was also not easy to find in the documentation, 
which leads me to believe this might not be idiomatic usage.

More generally, is the expectation for Arrays on the Rust side that they are 
just simple data abstractions, without exposing higher-order methods such as 
filtering/masking? Is the intent to leave that to users? If I missed some piece 
of documentation, please let me know. For my use-case I ended up trying 
something like:


{code:java}
use arrow::array::{BooleanBuilder, Float64Array};
use arrow::compute::array_ops::filter;

let column = batch.column(0).as_any().downcast_ref::<Float64Array>().unwrap();
let mut builder = BooleanBuilder::new(batch.num_rows());
let n = 5.0;
for i in 0..batch.num_rows() {
   // value(i) returns the f64 directly, so there is nothing to unwrap here
   builder.append_value(column.value(i) >= n).unwrap();
}

let mask = builder.finish();
// filter takes the mask by reference and returns a Result
let filtered_column = filter(column, &mask).unwrap();{code}
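
To make the semantics I am after concrete, here is a minimal, dependency-free sketch of the same mask-then-filter pattern over plain slices. The names build_mask and apply_mask are hypothetical helpers made up for illustration. They are not Arrow APIs, and this says nothing about how Arrow implements filter internally.

```rust
// Dependency-free sketch of the mask-then-filter pattern.
// `build_mask` and `apply_mask` are hypothetical helpers, not Arrow APIs.

/// Returns one boolean per value: true where the value is >= `threshold`.
pub fn build_mask(values: &[f64], threshold: f64) -> Vec<bool> {
    values.iter().map(|&x| x >= threshold).collect()
}

/// Keeps the values whose corresponding mask entry is true.
/// Panics if the two slices differ in length.
pub fn apply_mask(values: &[f64], mask: &[bool]) -> Vec<f64> {
    assert_eq!(values.len(), mask.len(), "mask must match values in length");
    values
        .iter()
        .zip(mask)
        .filter_map(|(&x, &keep)| if keep { Some(x) } else { None })
        .collect()
}
```

Calling build_mask on a slice with N = 5.0 and passing the result to apply_mask keeps exactly the elements that are at least 5.0, in their original order. Ideally the Arrow equivalent would not need the explicit loop from my snippet above.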

If possible, could you provide examples of intended usage of Arrays? Thank you!

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
