[GitHub] [arrow-datafusion] houqp commented on a change in pull request #1556: Officially maintained Arrow2 branch

2022-01-16 Thread GitBox
houqp commented on a change in pull request #1556: URL: https://github.com/apache/arrow-datafusion/pull/1556#discussion_r785403949 ## File path: datafusion/src/datasource/object_store/mod.rs ## @@ -33,6 +33,12 @@ use local::LocalFileSystem; use crate::error::{DataFusionError

[GitHub] [arrow-datafusion] houqp commented on a change in pull request #1556: Officially maintained Arrow2 branch

2022-01-16 Thread GitBox
houqp commented on a change in pull request #1556: URL: https://github.com/apache/arrow-datafusion/pull/1556#discussion_r785407092 ## File path: .github/workflows/rust.yml ## @@ -116,6 +116,7 @@ jobs: cargo test --no-default-features cargo run --example cs

[GitHub] [arrow-datafusion] houqp commented on a change in pull request #1556: Officially maintained Arrow2 branch

2022-01-16 Thread GitBox
houqp commented on a change in pull request #1556: URL: https://github.com/apache/arrow-datafusion/pull/1556#discussion_r785407309 ## File path: datafusion/tests/mod.rs ## @@ -1,18 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one Review comment: oh

[GitHub] [arrow-datafusion] houqp commented on pull request #1580: implement Hash for various types and replace PartialOrd

2022-01-16 Thread GitBox
houqp commented on pull request #1580: URL: https://github.com/apache/arrow-datafusion/pull/1580#issuecomment-1013834163 For context, this is because arrow2's datatype no longer derives PartialOrd, since an order between types is not semantically defined. -- This is an automated
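
The point in the comment above can be illustrated with a minimal sketch. This is not the code from #1580: the `DataType` enum and the `distinct_types` function are hypothetical, simplified stand-ins rather than arrow2's actual definitions. The sketch only shows that a type can be deduplicated through `Hash` and `Eq` without ever defining an order between its variants.

```rust
use std::collections::HashSet;

// Hypothetical, simplified stand-in for a data type enum (an assumption for
// illustration; not arrow2's real `DataType`). It derives `Hash` and `Eq`
// but deliberately not `PartialOrd`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum DataType {
    Int64,
    Float64,
    Utf8,
    List(Box<DataType>),
}

/// Counts distinct types using hashing and equality only; no ordering is needed.
fn distinct_types(types: &[DataType]) -> usize {
    types.iter().collect::<HashSet<_>>().len()
}

fn main() {
    let types = vec![
        DataType::Int64,
        DataType::Utf8,
        DataType::Int64,
        DataType::List(Box::new(DataType::Float64)),
        DataType::List(Box::new(DataType::Float64)),
    ];
    println!("{}", distinct_types(&types));
}
```

A `BTreeSet` or a sort-then-dedup approach would require `Ord`, which is why a hash-based container is the natural replacement when no meaningful order exists.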

[GitHub] [arrow-datafusion] houqp commented on pull request #1556: Officially maintained Arrow2 branch

2022-01-16 Thread GitBox
houqp commented on pull request #1556: URL: https://github.com/apache/arrow-datafusion/pull/1556#issuecomment-1013835311 Thank you everyone for all the reviews and comments so far. @Igosuki and I have addressed most of them. Here are the two remaining todo items: * Get the parquet r

[GitHub] [arrow-datafusion] xudong963 opened a new issue #1585: convert `outer join` to `inner join` to improve performance

2022-01-16 Thread GitBox
xudong963 opened a new issue #1585: URL: https://github.com/apache/arrow-datafusion/issues/1585 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Under certain conditions, convert `outer join` to `inner join` to improve performance
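
The preview above is cut off before it names the conditions, so the following is only a sketch of one well-known case, assumed here rather than taken from the issue: a filter on the right side of a left outer join that can never be true for NULL. Such a filter discards every unmatched row, so the outer join produces the same result as an inner join. All function names and the toy row layout are hypothetical.

```rust
/// Left outer join on key: unmatched left keys are kept with `None` on the right.
fn left_join(left: &[i32], right: &[(i32, i32)]) -> Vec<(i32, Option<i32>)> {
    let mut out = Vec::new();
    for &l in left {
        let mut matched = false;
        for &(k, v) in right {
            if k == l {
                out.push((l, Some(v)));
                matched = true;
            }
        }
        if !matched {
            out.push((l, None));
        }
    }
    out
}

/// Inner join on key: only matching pairs are produced.
fn inner_join(left: &[i32], right: &[(i32, i32)]) -> Vec<(i32, i32)> {
    let mut out = Vec::new();
    for &l in left {
        for &(k, v) in right {
            if k == l {
                out.push((l, v));
            }
        }
    }
    out
}

/// `LEFT JOIN ... WHERE right.v > threshold`: a NULL right value never
/// satisfies the comparison, so rows with `None` are dropped.
fn filtered_left_join(left: &[i32], right: &[(i32, i32)], threshold: i32) -> Vec<(i32, i32)> {
    left_join(left, right)
        .into_iter()
        .filter_map(|(k, v)| match v {
            Some(v) if v > threshold => Some((k, v)),
            _ => None,
        })
        .collect()
}

/// `INNER JOIN ... WHERE right.v > threshold`.
fn filtered_inner_join(left: &[i32], right: &[(i32, i32)], threshold: i32) -> Vec<(i32, i32)> {
    inner_join(left, right)
        .into_iter()
        .filter(|&(_, v)| v > threshold)
        .collect()
}

fn main() {
    let left = [1, 2, 3];
    let right = [(1, 5), (2, 20)];
    // Both plans yield the same rows under the null-rejecting filter.
    assert_eq!(
        filtered_left_join(&left, &right, 10),
        filtered_inner_join(&left, &right, 10)
    );
}
```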

[GitHub] [arrow-datafusion] xudong963 commented on pull request #1339: fix: filter pushdown with unsafe null

2022-01-16 Thread GitBox
xudong963 commented on pull request #1339: URL: https://github.com/apache/arrow-datafusion/pull/1339#issuecomment-1013836655 @alamb, I thought about this again. I think we can open a new issue to summarize the conclusion and close this old ticket.

[GitHub] [arrow-datafusion] xudong963 opened a new issue #1586: Predicate pushdown requires special process for outer Join.

2022-01-16 Thread GitBox
xudong963 opened a new issue #1586: URL: https://github.com/apache/arrow-datafusion/issues/1586 **Describe the bug** Currently datafusion doesn't handle predicate pushdown correctly when there is an `outer join`. I have discussed this at length in #1339 with @alamb. I think we ended up with a consens
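
As a rough illustration of why outer joins need special care here (a toy model with hypothetical names, not DataFusion's optimizer code and not the conclusion reached in #1339): pushing a filter on the right-hand input below a left outer join changes the result, because rows that lose their match reappear padded with NULL instead of being filtered out.

```rust
/// Left outer join on key; unmatched left keys get `None` on the right.
fn left_join(left: &[i32], right: &[(i32, i32)]) -> Vec<(i32, Option<i32>)> {
    left.iter()
        .flat_map(|&l| {
            let matches: Vec<(i32, Option<i32>)> = right
                .iter()
                .filter(|&&(k, _)| k == l)
                .map(|&(_, v)| (l, Some(v)))
                .collect();
            if matches.is_empty() {
                vec![(l, None)]
            } else {
                matches
            }
        })
        .collect()
}

/// Correct plan: apply `WHERE right.v > threshold` after the join.
fn filter_after_join(left: &[i32], right: &[(i32, i32)], threshold: i32) -> Vec<(i32, Option<i32>)> {
    left_join(left, right)
        .into_iter()
        .filter(|&(_, v)| matches!(v, Some(x) if x > threshold))
        .collect()
}

/// Unsafe plan: push the same filter down to the right input before joining.
fn filter_pushed_to_right(left: &[i32], right: &[(i32, i32)], threshold: i32) -> Vec<(i32, Option<i32>)> {
    let kept: Vec<(i32, i32)> = right
        .iter()
        .copied()
        .filter(|&(_, v)| v > threshold)
        .collect();
    left_join(left, &kept)
}

fn main() {
    let left = [1, 2];
    let right = [(1, 5), (2, 20)];
    // The two plans disagree: the pushed-down version resurrects key 1 with a NULL.
    println!("{:?}", filter_after_join(&left, &right, 10));
    println!("{:?}", filter_pushed_to_right(&left, &right, 10));
}
```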

[GitHub] [arrow-datafusion] xudong963 closed pull request #1339: fix: filter pushdown with unsafe null

2022-01-16 Thread GitBox
xudong963 closed pull request #1339: URL: https://github.com/apache/arrow-datafusion/pull/1339

[GitHub] [arrow-datafusion] xudong963 edited a comment on pull request #1339: fix: filter pushdown with unsafe null

2022-01-16 Thread GitBox
xudong963 edited a comment on pull request #1339: URL: https://github.com/apache/arrow-datafusion/pull/1339#issuecomment-1013836655 @alamb, I thought about this again. I think we can open a new issue, #1586, to summarize the conclusion and close this old ticket.

[GitHub] [arrow-datafusion] xudong963 commented on issue #1321: Sql query `LEFT JOIN WHERE right IS NULL` return unexpected result.

2022-01-16 Thread GitBox
xudong963 commented on issue #1321: URL: https://github.com/apache/arrow-datafusion/issues/1321#issuecomment-1013839989 I think this issue can be closed; #1586 is broader. @alamb

[GitHub] [arrow-rs] jhorstmann commented on a change in pull request #1183: Truncate bitmask on split

2022-01-16 Thread GitBox
jhorstmann commented on a change in pull request #1183: URL: https://github.com/apache/arrow-rs/pull/1183#discussion_r785417338 ## File path: arrow/src/array/builder.rs ## @@ -367,6 +367,14 @@ impl BooleanBufferBuilder { } } +/// Resizes the buffer, either t

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1183: Truncate bitmask on split

2022-01-16 Thread GitBox
tustvold commented on a change in pull request #1183: URL: https://github.com/apache/arrow-rs/pull/1183#discussion_r785417604 ## File path: arrow/src/array/builder.rs ## @@ -367,6 +367,14 @@ impl BooleanBufferBuilder { } } +/// Resizes the buffer, either tru

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1183: Truncate bitmask on split

2022-01-16 Thread GitBox
codecov-commenter edited a comment on pull request #1183: URL: https://github.com/apache/arrow-rs/pull/1183#issuecomment-1013707795 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1183?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
tustvold commented on a change in pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#discussion_r785424844 ## File path: parquet/src/arrow/array_reader/byte_array.rs ## @@ -192,7 +211,16 @@ impl OffsetBuffer { self.offsets.len() - 1 } -fn

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
codecov-commenter edited a comment on pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#issuecomment-998923929 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1082?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow-rs] alamb merged pull request #1165: Use tempfile for parquet tests

2022-01-16 Thread GitBox
alamb merged pull request #1165: URL: https://github.com/apache/arrow-rs/pull/1165

[GitHub] [arrow-rs] alamb closed issue #1164: Parquet Tests Cleanup Temporary Files

2022-01-16 Thread GitBox
alamb closed issue #1164: URL: https://github.com/apache/arrow-rs/issues/1164

[GitHub] [arrow-datafusion] alamb merged pull request #1561: add correlation function

2022-01-16 Thread GitBox
alamb merged pull request #1561: URL: https://github.com/apache/arrow-datafusion/pull/1561

[GitHub] [arrow-datafusion] alamb commented on pull request #1561: add correlation function

2022-01-16 Thread GitBox
alamb commented on pull request #1561: URL: https://github.com/apache/arrow-datafusion/pull/1561#issuecomment-1013864306 Thanks @realno!

[GitHub] [arrow-datafusion] alamb opened a new issue #1587: Proposal: new release of DataFusion?

2022-01-16 Thread GitBox
alamb opened a new issue #1587: URL: https://github.com/apache/arrow-datafusion/issues/1587 I wonder if it is time to release a new version of datafusion to crates.io? It would be great to crowdsource: 1. Update readme / changelog 2. Update version 3. (maybe) a blog post?

[GitHub] [arrow-datafusion] xudong963 commented on issue #1587: Proposal: new release of DataFusion?

2022-01-16 Thread GitBox
xudong963 commented on issue #1587: URL: https://github.com/apache/arrow-datafusion/issues/1587#issuecomment-1013866128 Yeah, I think we can wait for the arrow2-related tickets to be merged into master? BTW, I can help write a blog post!

[GitHub] [arrow-datafusion] xudong963 commented on issue #1586: datafusion doesn't process predicate pushdown correctly when there is outer join

2022-01-16 Thread GitBox
xudong963 commented on issue #1586: URL: https://github.com/apache/arrow-datafusion/issues/1586#issuecomment-1013866259 cc @alamb

[GitHub] [arrow-datafusion] alamb closed issue #1574: SQL integration tests named `mod`

2022-01-16 Thread GitBox
alamb closed issue #1574: URL: https://github.com/apache/arrow-datafusion/issues/1574

[GitHub] [arrow-datafusion] alamb merged pull request #1575: Rename sql integration tests from `mod` to `sql_integration`

2022-01-16 Thread GitBox
alamb merged pull request #1575: URL: https://github.com/apache/arrow-datafusion/pull/1575

[GitHub] [arrow-datafusion] alamb commented on issue #1587: Proposal: new release of DataFusion?

2022-01-16 Thread GitBox
alamb commented on issue #1587: URL: https://github.com/apache/arrow-datafusion/issues/1587#issuecomment-1013867577 arrow2 may be a good driver. I don't have a good sense of how many projects use datafusion from `crates.io` (aka what has been released) vs how many use it via a `git

[GitHub] [arrow-rs] Dandandan commented on a change in pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
Dandandan commented on a change in pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#discussion_r785434772 ## File path: parquet/src/arrow/array_reader/byte_array.rs ## @@ -192,7 +211,16 @@ impl OffsetBuffer { self.offsets.len() - 1 } -fn

[GitHub] [arrow-datafusion] alamb commented on pull request #1582: remove update and merge from accumulator

2022-01-16 Thread GitBox
alamb commented on pull request #1582: URL: https://github.com/apache/arrow-datafusion/pull/1582#issuecomment-1013868334 Nice @Jimexist, you beat me to it :) There are a few more instances that need updating; I will see if I can help fix them.

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1581: update reference to python and update readme

2022-01-16 Thread GitBox
alamb commented on a change in pull request #1581: URL: https://github.com/apache/arrow-datafusion/pull/1581#discussion_r785436468 ## File path: README.md ## @@ -49,14 +49,19 @@ the convenience of an SQL interface or a DataFrame API. ## Known Uses +Projects that adapt to o

[GitHub] [arrow-datafusion] alamb merged pull request #1581: update reference to python and update readme

2022-01-16 Thread GitBox
alamb merged pull request #1581: URL: https://github.com/apache/arrow-datafusion/pull/1581

[GitHub] [arrow-datafusion] alamb merged pull request #1567: minor: improve the benchmark readme

2022-01-16 Thread GitBox
alamb merged pull request #1567: URL: https://github.com/apache/arrow-datafusion/pull/1567

[GitHub] [arrow-datafusion] alamb commented on pull request #1567: minor: improve the benchmark readme

2022-01-16 Thread GitBox
alamb commented on pull request #1567: URL: https://github.com/apache/arrow-datafusion/pull/1567#issuecomment-1013869714 Thanks @xudong963!

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1580: implement Hash for various types and replace PartialOrd

2022-01-16 Thread GitBox
alamb commented on a change in pull request #1580: URL: https://github.com/apache/arrow-datafusion/pull/1580#discussion_r785436677 ## File path: datafusion/src/logical_plan/expr.rs ## @@ -372,6 +373,23 @@ pub enum Expr { Wildcard, } +/// Fixed seed for the hashing so th

[GitHub] [arrow-datafusion] alamb merged pull request #1465: Tests for support try_cast/cast decimal to numeric

2022-01-16 Thread GitBox
alamb merged pull request #1465: URL: https://github.com/apache/arrow-datafusion/pull/1465

[GitHub] [arrow-datafusion] alamb commented on pull request #1465: Tests for support try_cast/cast decimal to numeric

2022-01-16 Thread GitBox
alamb commented on pull request #1465: URL: https://github.com/apache/arrow-datafusion/pull/1465#issuecomment-1013870434 Thanks again @liukun4515

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #1582: remove update and merge from accumulator

2022-01-16 Thread GitBox
Jimexist commented on a change in pull request #1582: URL: https://github.com/apache/arrow-datafusion/pull/1582#discussion_r785437522 ## File path: datafusion/src/datasource/file_format/parquet.rs ## @@ -176,15 +188,21 @@ fn summarize_min_max( if let DataType::Int6

[GitHub] [arrow-datafusion] Jimexist merged pull request #1580: implement Hash for various types and replace PartialOrd

2022-01-16 Thread GitBox
Jimexist merged pull request #1580: URL: https://github.com/apache/arrow-datafusion/pull/1580

[GitHub] [arrow-datafusion] alamb commented on issue #1578: Write documentation explaining how to enable metrics

2022-01-16 Thread GitBox
alamb commented on issue #1578: URL: https://github.com/apache/arrow-datafusion/issues/1578#issuecomment-1013871212 This could perhaps start with showing `EXPLAIN` and then `EXPLAIN ANALYZE` to show the metrics and how to interpret them.

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1562: Consolidate `batch_size` configuration in `ExecutionConfig`, `RuntimeConfig` and `PhysicalPlanConfig`

2022-01-16 Thread GitBox
alamb commented on a change in pull request #1562: URL: https://github.com/apache/arrow-datafusion/pull/1562#discussion_r785437868 ## File path: ballista/rust/core/src/serde/logical_plan/from_proto.rs ## @@ -246,8 +246,8 @@ impl TryInto for &protobuf::LogicalPlanNode {

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1582: remove update and merge from accumulator

2022-01-16 Thread GitBox
alamb commented on a change in pull request #1582: URL: https://github.com/apache/arrow-datafusion/pull/1582#discussion_r785438626 ## File path: datafusion-examples/examples/simple_udaf.rs ## @@ -113,6 +99,20 @@ impl Accumulator for GeometricMean { }; Ok(())

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1582: remove update and merge from accumulator

2022-01-16 Thread GitBox
alamb commented on a change in pull request #1582: URL: https://github.com/apache/arrow-datafusion/pull/1582#discussion_r785439224 ## File path: datafusion-examples/examples/simple_udaf.rs ## @@ -124,6 +124,34 @@ impl Accumulator for GeometricMean { // Optimization hint: t

[GitHub] [arrow-datafusion] alamb commented on issue #1576: casting `Int64` to `Float64` unsuccessfully caused tpch8 to fail

2022-01-16 Thread GitBox
alamb commented on issue #1576: URL: https://github.com/apache/arrow-datafusion/issues/1576#issuecomment-1013874509 > Do type conversion in datafusion, such as we convert true_values and false_values type to final type in advance. I think it would be most appropriate to have datafus

[GitHub] [arrow-datafusion] alamb commented on pull request #1560: Introduce push-based task scheduling

2022-01-16 Thread GitBox
alamb commented on pull request #1560: URL: https://github.com/apache/arrow-datafusion/pull/1560#issuecomment-1013876597 Thank you for the contribution @yahoNanJing . This looks like some great work, but I don't feel qualified to review this at the moment. I think this is a symptom

[GitHub] [arrow-datafusion] xudong963 commented on pull request #1566: fix: sql planner creates cross join instead of inner join from select predicates

2022-01-16 Thread GitBox
xudong963 commented on pull request #1566: URL: https://github.com/apache/arrow-datafusion/pull/1566#issuecomment-1013877492 > Could you possibly provide some tests @xudong963 ? Sure. > I was expecting to see code that basically applied an algebraic transformation on pred

[GitHub] [arrow-datafusion] xudong963 edited a comment on pull request #1566: fix: sql planner creates cross join instead of inner join from select predicates

2022-01-16 Thread GitBox
xudong963 edited a comment on pull request #1566: URL: https://github.com/apache/arrow-datafusion/pull/1566#issuecomment-1013877492 > Could you possibly provide some tests @xudong963 ? Sure. The resulting correctness test can be overridden by the current test. I can add a test about

[GitHub] [arrow] dhruv9vats opened a new pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats opened a new pull request #12162: URL: https://github.com/apache/arrow/pull/12162 Implement a Scalar Kernel to look up a value for a given key in a `MapArray`, whose type is an alias for `List( Struct(, ) )`

[GitHub] [arrow] github-actions[bot] commented on pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
github-actions[bot] commented on pull request #12162: URL: https://github.com/apache/arrow/pull/12162#issuecomment-1013884366

[GitHub] [arrow-datafusion] Jimexist opened a new pull request #1588: add from_slice trait to ease arrow2 migration

2022-01-16 Thread GitBox
Jimexist opened a new pull request #1588: URL: https://github.com/apache/arrow-datafusion/pull/1588 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785449075 ## File path: cpp/src/arrow/compute/api_scalar.h ## @@ -470,6 +470,30 @@ class ARROW_EXPORT RandomOptions : public FunctionOptions { uint64_t seed;

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785449279 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -17,6 +17,7 @@ // Vector kernels involving nested types +#include Review comm

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785449504 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785449597 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785449919 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785450378 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow] bkmgit commented on pull request #11882: ARROW-9843: [C++][Python] Implement Between ternary kernel and python bindings

2022-01-16 Thread GitBox
bkmgit commented on pull request #11882: URL: https://github.com/apache/arrow/pull/11882#issuecomment-1013887736 Not able to remove/add labels

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785451341 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785451748 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785451936 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785452692 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
tustvold commented on a change in pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#discussion_r785452762 ## File path: parquet/src/column/reader.rs ## @@ -258,11 +258,15 @@ where // At this point we have read values, definition and repetition

[GitHub] [arrow-rs] liukun4515 commented on pull request #1172: support sort decimal data

2022-01-16 Thread GitBox
liukun4515 commented on pull request #1172: URL: https://github.com/apache/arrow-rs/pull/1172#issuecomment-1013891615 @alamb @houqp PTAL

[GitHub] [arrow] toppyy commented on pull request #12083: ARROW-14744: [R] open_dataset() error when `schema` argument supplied, but `column_names` not supplied to `CSVReadOptions`

2022-01-16 Thread GitBox
toppyy commented on pull request #12083: URL: https://github.com/apache/arrow/pull/12083#issuecomment-1013892092 I reverted the previous commits and implemented the following logic in `CsvFileFormat$create` as discussed: If a schema is specified and column names are set in read_options, we

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
tustvold commented on a change in pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#discussion_r785454161 ## File path: parquet/src/arrow/array_reader/byte_array.rs ## @@ -192,7 +211,16 @@ impl OffsetBuffer { self.offsets.len() - 1 } -fn

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785454432 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
tustvold commented on a change in pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#discussion_r785454423 ## File path: parquet/src/arrow/array_reader/offset_buffer.rs ## @@ -0,0 +1,211 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1588: add from_slice trait to ease arrow2 migration

2022-01-16 Thread GitBox
alamb commented on a change in pull request #1588: URL: https://github.com/apache/arrow-datafusion/pull/1588#discussion_r785455086 ## File path: datafusion/src/from_slice.rs ## @@ -0,0 +1,45 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribut

[GitHub] [arrow-datafusion] alamb commented on pull request #1559: Remove call_ip in the SchedulerServer

2022-01-16 Thread GitBox
alamb commented on pull request #1559: URL: https://github.com/apache/arrow-datafusion/pull/1559#issuecomment-1013893772 Thanks @yahoNanJing. CI seems to be failing on this PR. Can you please resolve the issues?

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #1588: add from_slice trait to ease arrow2 migration

2022-01-16 Thread GitBox
Jimexist commented on a change in pull request #1588: URL: https://github.com/apache/arrow-datafusion/pull/1588#discussion_r785455413 ## File path: datafusion/src/from_slice.rs ## @@ -0,0 +1,45 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785455620 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow-julia] jakkosdev commented on issue #244: `Arrow.write` doesn't play nicely when the target table is lazy.

2022-01-16 Thread GitBox
jakkosdev commented on issue #244: URL: https://github.com/apache/arrow-julia/issues/244#issuecomment-1013894181 > Ooh, good point, maybe #237 is related. Maybe we just shouldn’t have an `@async` there? Since `ntasks=1`. Not sure it's helpful, but removing the `@async` did let me wri

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785455925 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow-julia] jakkosdev removed a comment on issue #244: `Arrow.write` doesn't play nicely when the target table is lazy.

2022-01-16 Thread GitBox
jakkosdev removed a comment on issue #244: URL: https://github.com/apache/arrow-julia/issues/244#issuecomment-1013894181 > Ooh, good point, maybe #237 is related. Maybe we just shouldn’t have an `@async` there? Since `ntasks=1`. Not sure it's helpful, but removing the `@async` did le

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785456149 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785457583 ## File path: cpp/src/arrow/compute/kernels/scalar_nested_test.cc ## @@ -225,6 +225,30 @@ TEST(TestScalarNested, StructField) { } } +TEST(TestSca

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785458284 ## File path: cpp/src/arrow/compute/kernels/scalar_nested_test.cc ## @@ -225,6 +225,30 @@ TEST(TestScalarNested, StructField) { } } +TEST(TestSca

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785451341 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow-datafusion] Igosuki commented on pull request #1556: Officially maintained Arrow2 branch

2022-01-16 Thread GitBox
Igosuki commented on pull request #1556: URL: https://github.com/apache/arrow-datafusion/pull/1556#issuecomment-1013898492 Yes, sorry about that; these were simply comments to indicate that these particular feature tests were not passing. On Sun, 16 Jan 2022 at 09:52, QP Hou ***@*

[GitHub] [arrow] dhruv9vats commented on a change in pull request #12162: ARROW-15089: [C++][Compute] Implement kernel to lookup a MapArray item for a given key

2022-01-16 Thread GitBox
dhruv9vats commented on a change in pull request #12162: URL: https://github.com/apache/arrow/pull/12162#discussion_r785451341 ## File path: cpp/src/arrow/compute/kernels/scalar_nested.cc ## @@ -429,6 +430,97 @@ const FunctionDoc make_struct_doc{"Wrap Arrays into a StructArray

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
tustvold commented on a change in pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#discussion_r785469777 ## File path: parquet/src/arrow/array_reader/byte_array.rs ## @@ -0,0 +1,639 @@ +use crate::arrow::array_reader::{read_records, ArrayReader}; +use crate

[GitHub] [arrow-rs] tustvold commented on pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
tustvold commented on pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#issuecomment-1013914022 I'm going to continue to write tests for various parts of this, but I think the main bulk is ready for review and has fairly good coverage from the fuzz tests. Unfortu

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
tustvold commented on a change in pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#discussion_r785470260 ## File path: parquet/src/arrow/array_reader/byte_array.rs ## @@ -0,0 +1,639 @@ +use crate::arrow::array_reader::{read_records, ArrayReader}; +use crate

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
tustvold commented on a change in pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#discussion_r785470310 ## File path: parquet/src/arrow/array_reader/byte_array.rs ## @@ -0,0 +1,639 @@ +use crate::arrow::array_reader::{read_records, ArrayReader}; +use crate

[GitHub] [arrow-rs] tustvold edited a comment on pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
tustvold edited a comment on pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#issuecomment-1013914022 I'm going to continue to write tests for various parts of this, but I think the main bulk is ready for review and has fairly good coverage from the fuzz tests.

[GitHub] [arrow] ursabot commented on pull request #10371: ARROW-12549: [JS] Table and RecordBatch should not extend Vector, make JS lib smaller

2022-01-16 Thread GitBox
ursabot commented on pull request #10371: URL: https://github.com/apache/arrow/pull/10371#issuecomment-1013916366 Benchmark runs are scheduled for baseline = 7029f90ea3b39e97f1a671227ca932cbcdbcee05 and contender = 6619579f65926aec120b1fdf8c552657f4afebeb. Results will be available as each

[GitHub] [arrow] domoritz commented on pull request #10371: ARROW-12549: [JS] Table and RecordBatch should not extend Vector, make JS lib smaller

2022-01-16 Thread GitBox
domoritz commented on pull request #10371: URL: https://github.com/apache/arrow/pull/10371#issuecomment-1013916348 @ursabot please benchmark lang=JavaScript -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

[GitHub] [arrow] domoritz closed pull request #10371: ARROW-12549: [JS] Table and RecordBatch should not extend Vector, make JS lib smaller

2022-01-16 Thread GitBox
domoritz closed pull request #10371: URL: https://github.com/apache/arrow/pull/10371 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubs

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-16 Thread GitBox
codecov-commenter edited a comment on pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#issuecomment-998923929 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1082?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow] ursabot commented on pull request #10371: ARROW-12549: [JS] Table and RecordBatch should not extend Vector, make JS lib smaller

2022-01-16 Thread GitBox
ursabot commented on pull request #10371: URL: https://github.com/apache/arrow/pull/10371#issuecomment-1013918249 Benchmark runs are scheduled for baseline = 7029f90ea3b39e97f1a671227ca932cbcdbcee05 and contender = 20b66c255ff617c438775e54081eaa02d5b983e1. 20b66c255ff617c438775e54081eaa02

[GitHub] [arrow] ursabot edited a comment on pull request #10371: ARROW-12549: [JS] Table and RecordBatch should not extend Vector, make JS lib smaller

2022-01-16 Thread GitBox
ursabot edited a comment on pull request #10371: URL: https://github.com/apache/arrow/pull/10371#issuecomment-1013916366 Benchmark runs are scheduled for baseline = 7029f90ea3b39e97f1a671227ca932cbcdbcee05 and contender = 6619579f65926aec120b1fdf8c552657f4afebeb. Results will be available

[GitHub] [arrow-datafusion] jychen7 commented on issue #1507: Python bindings create duplicated qualified fields after joining

2022-01-16 Thread GitBox
jychen7 commented on issue #1507: URL: https://github.com/apache/arrow-datafusion/issues/1507#issuecomment-1013921187 I think it is just Python. I tried the following in datafusion-cli and it works: ``` select * from test join small using (c1); ``` with ``` CREATE EXTERNAL TABLE tes

[GitHub] [arrow-datafusion] domodwyer commented on pull request #1539: approx_quantile() aggregation function

2022-01-16 Thread GitBox
domodwyer commented on pull request #1539: URL: https://github.com/apache/arrow-datafusion/pull/1539#issuecomment-1013921251 The T-Digest algorithm returns an interpolated result, so I think `percentile_cont` makes most sense - however T-digest is an approximation of the quantile. Should

[GitHub] [arrow] thisisnic commented on a change in pull request #12083: ARROW-14744: [R] open_dataset() error when `schema` argument supplied, but `column_names` not supplied to `CSVReadOptions`

2022-01-16 Thread GitBox
thisisnic commented on a change in pull request #12083: URL: https://github.com/apache/arrow/pull/12083#discussion_r785474474 ## File path: r/R/dataset.R ## @@ -123,6 +123,7 @@ #' or call [`$NewScan()`][Scanner] to construct a query directly. #' @export #' @seealso `vignette

[GitHub] [arrow] ursabot edited a comment on pull request #10371: ARROW-12549: [JS] Table and RecordBatch should not extend Vector, make JS lib smaller

2022-01-16 Thread GitBox
ursabot edited a comment on pull request #10371: URL: https://github.com/apache/arrow/pull/10371#issuecomment-1013918249 Benchmark runs are scheduled for baseline = 7029f90ea3b39e97f1a671227ca932cbcdbcee05 and contender = 20b66c255ff617c438775e54081eaa02d5b983e1. 20b66c255ff617c438775e540
