alamb commented on code in PR #2196:
URL: https://github.com/apache/arrow-datafusion/pull/2196#discussion_r867879803
##########
datafusion/core/tests/sql/expr.rs:
##########
@@ -491,6 +491,16 @@ async fn test_crypto_expressions() -> Result<()> {
Ok(())
}
+#[tokio::test]
+async fn test_array_index() -> Result<()> {
+ test_expression!("([5,4,3,2,1])[1]", "5");
Review Comment:
https://www.postgresql.org/docs/current/arrays.html
I verified that the subscripts are 1 based 👍
> The array subscript numbers are written within square brackets. By default
PostgreSQL uses a one-based numbering convention for arrays, that is, an array
of n elements starts with array[1] and ends with array[n].
##########
datafusion/core/tests/sql/expr.rs:
##########
@@ -491,6 +491,16 @@ async fn test_crypto_expressions() -> Result<()> {
Ok(())
}
+#[tokio::test]
+async fn test_array_index() -> Result<()> {
+ test_expression!("([5,4,3,2,1])[1]", "5");
+ test_expression!("([5,4,3,2,1])[5]", "1");
+ test_expression!("([5,4,3,2,1])[100]", "NULL");
+ test_expression!("([5,4,3,2,1])[-1]", "NULL");
Review Comment:
I wonder if you want to potentially try nested lists. Something like
```suggestion
test_expression!("([5,4,3,2,1])[-1]", "NULL");
test_expression!("([[123],[4,5,6]])[2]", "[4,5,6]");
```
##########
datafusion/physical-expr/src/expressions/get_indexed_field.rs:
##########
@@ -107,9 +136,69 @@ impl PhysicalExpr for GetIndexedFieldExpr {
}
(dt, key) => Err(DataFusionError::NotImplemented(format!("get
indexed field is only possible on lists with int64 indexes. Tried {} with {}
index", dt, key))),
},
- ColumnarValue::Scalar(_) => Err(DataFusionError::NotImplemented(
- "field access is not yet implemented for scalar
values".to_string(),
- )),
+ ColumnarValue::Scalar(scalar) => match (scalar.get_datatype(),
&self.key) {
Review Comment:
I am not sure I fully follow this code -- Since it is creating a `ArrayRef`
from the `ColumnarValue::Scalar`, I wonder why it can't use the same code as
the `ColumnarValue::Array` case and call `to_arrow()`?
https://github.com/cube-js/arrow-datafusion/blob/scalar-array-index/ballista/rust/client/src/columnar_batch.rs#L150
So for example, rather than
```rust
fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
let arg = self.arg.evaluate(batch)?;
match arg {
ColumnarValue::Array(array) => match (array.data_type(),
&self.key) {
...
```
It could look like:
```rust
fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
let array = self.arg.evaluate(batch)?
// convert to Arrayref
.to_arrow();
match (array.data_type(), &self.key) {
...
```
That way the same code would be used for the array and scalar cases of
`ColumnarValue`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]