[GitHub] [arrow-datafusion] Ted-Jiang commented on pull request #1665: Fix can not load parquet table form spark in datafusion-cli.

2022-01-29 Thread GitBox
Ted-Jiang commented on pull request #1665: URL: https://github.com/apache/arrow-datafusion/pull/1665#issuecomment-1024861121 @houqp Could please take a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [arrow-datafusion] Ted-Jiang commented on issue #1701: Ballista Enhancement Overview

2022-01-29 Thread GitBox
Ted-Jiang commented on issue #1701: URL: https://github.com/apache/arrow-datafusion/issues/1701#issuecomment-1024861355 This would be a milestone in Ballista! 👍

[GitHub] [arrow-datafusion] yahoNanJing opened a new issue #1702: [Ballista] Support to access remote object store, like HDFS, S3, etc

2022-01-29 Thread GitBox
yahoNanJing opened a new issue #1702: URL: https://github.com/apache/arrow-datafusion/issues/1702 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** After introducing the object store API, to support to access remote object

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1246: Fix NullArrayReader (#1245)

2022-01-29 Thread GitBox
tustvold commented on a change in pull request #1246: URL: https://github.com/apache/arrow-rs/pull/1246#discussion_r795025401 ## File path: parquet/src/arrow/array_reader.rs ## @@ -214,6 +214,10 @@ where // save definition and repetition buffers self.def_level

[GitHub] [arrow] kszucs commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda environment

2022-01-29 Thread GitBox
kszucs commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024865216 @github-actions crossbow submit verify-rc-source-cpp-linux-conda-amd64

[GitHub] [arrow] github-actions[bot] commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda envir

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024865385 Revision: 642fb17549a91e32d48a2918a18168c3e603d9a4 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1540](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] kszucs commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda environment

2022-01-29 Thread GitBox
kszucs commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024865716 @github-actions crossbow submit verify-rc-source-integration-linux-almalinux-8-amd64

[GitHub] [arrow] kszucs commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda environment

2022-01-29 Thread GitBox
kszucs commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024865879 @github-actions crossbow submit verify-rc-source-ruby-linux-almalinux-8-amd64

[GitHub] [arrow] github-actions[bot] commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda envir

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024865893 Revision: 8219b59f4a52d33915cd930ee71188bf9030848e Submitted crossbow builds: [ursacomputing/crossbow @ actions-1541](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] github-actions[bot] commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda envir

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024866063 Revision: 8219b59f4a52d33915cd930ee71188bf9030848e Submitted crossbow builds: [ursacomputing/crossbow @ actions-1542](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-datafusion] yahoNanJing opened a new issue #1703: [Ballista] Support to better manage cluster state, like alive executors, executor available task slots, etc

2022-01-29 Thread GitBox
yahoNanJing opened a new issue #1703: URL: https://github.com/apache/arrow-datafusion/issues/1703 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Currently all of the cluster state, like executor info, task info, are stor

[GitHub] [arrow] ursabot edited a comment on pull request #12278: ARROW-15488: [Go] Fix ipc.Writer corrupting null bitmaps

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12278: URL: https://github.com/apache/arrow/pull/12278#issuecomment-1023642629 Benchmark runs are scheduled for baseline = 03f3cf986314654e932587d01df59ad145faf5b9 and contender = 20e9e935899c8e11439ee16fb41d24190f2fabd6. 20e9e935899c8e11439ee16fb

[GitHub] [arrow-datafusion] yahoNanJing opened a new issue #1704: [Ballista] Introduce more cluster state info for future better management

2022-01-29 Thread GitBox
yahoNanJing opened a new issue #1704: URL: https://github.com/apache/arrow-datafusion/issues/1704 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Currently the cluster state info is not enough. For example, because of lack

[GitHub] [arrow-datafusion] liukun4515 commented on pull request #1685: Incorporate dyn scalar kernels

2022-01-29 Thread GitBox
liukun4515 commented on pull request #1685: URL: https://github.com/apache/arrow-datafusion/pull/1685#issuecomment-1024871151 > @liukun4515 @alamb turns out there was auto sort setting turned on for the syntax highlighter i was using for TOML files. Maybe I accidentally turned on. Apologi

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1700: Support `create_physical_expr` and `ExecutionContextState` or `DefaultPhysicalPlanner` for faster speed

2022-01-29 Thread GitBox
yjshen commented on a change in pull request #1700: URL: https://github.com/apache/arrow-datafusion/pull/1700#discussion_r795028433 ## File path: datafusion/src/execution/context.rs ## @@ -1115,9 +1114,14 @@ impl ExecutionConfig { /// An instance of this struct is created each

[GitHub] [arrow] kszucs commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda environment

2022-01-29 Thread GitBox
kszucs commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024873195 @github-actions crossbow submit verify-rc-source-cpp-linux-conda-amd64

[GitHub] [arrow] github-actions[bot] commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda envir

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024873437 Revision: f5301a3c96f6d7862d08b4241ac4118c2938936e Submitted crossbow builds: [ursacomputing/crossbow @ actions-1543](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-datafusion] cpcloud commented on a change in pull request #1699: Implement TableProvider for DataFrameImpl

2022-01-29 Thread GitBox
cpcloud commented on a change in pull request #1699: URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795036649 ## File path: datafusion/src/execution/dataframe_impl.rs ## @@ -488,6 +548,17 @@ mod tests { Ok(()) } +#[tokio::test] +as

[GitHub] [arrow] kszucs commented on pull request #12293: [Release] Verify 7.0.0 RC10 [WIP]

2022-01-29 Thread GitBox
kszucs commented on pull request #12293: URL: https://github.com/apache/arrow/pull/12293#issuecomment-1024885128 @github-actions crossbow submit --group verify-rc-binaries --group verify-rc-jars --group verify-rc-wheels --param release=7.0.0 --param rc=10

[GitHub] [arrow] github-actions[bot] commented on pull request #12293: [Release] Verify 7.0.0 RC10 [WIP]

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12293: URL: https://github.com/apache/arrow/pull/12293#issuecomment-1024885304 Revision: e90472e35b40f58b17d408438bb8de1641bfe6ef Submitted crossbow builds: [ursacomputing/crossbow @ actions-1544](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] kszucs commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda environment

2022-01-29 Thread GitBox
kszucs commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024885661 @github-actions crossbow submit verify-rc-source-integration-linux-almalinux-8-amd64

[GitHub] [arrow] kszucs commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda environment

2022-01-29 Thread GitBox
kszucs commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024885705 @github-actions crossbow submit verify-rc-source-ruby-linux-almalinux-8-amd64

[GitHub] [arrow-datafusion] cpcloud commented on a change in pull request #1699: Implement TableProvider for DataFrameImpl

2022-01-29 Thread GitBox
cpcloud commented on a change in pull request #1699: URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795037968 ## File path: datafusion/src/execution/dataframe_impl.rs ## @@ -62,6 +68,60 @@ impl DataFrameImpl { } } +#[async_trait] +impl TableProvid

[GitHub] [arrow] github-actions[bot] commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda envir

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024885834 Revision: eaf4317ebc81655d153995faef7984d3164ecc3e Submitted crossbow builds: [ursacomputing/crossbow @ actions-1545](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] github-actions[bot] commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda envir

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024885875 ``` Failed to push updated references, potentially because of credential issues: ['refs/heads/actions-1545-github-verify-rc-source-ruby-linux-almalinux-8-amd64',

[GitHub] [arrow-datafusion] cpcloud commented on a change in pull request #1699: Implement TableProvider for DataFrameImpl

2022-01-29 Thread GitBox
cpcloud commented on a change in pull request #1699: URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795038010 ## File path: datafusion/src/execution/dataframe_impl.rs ## @@ -488,6 +553,17 @@ mod tests { Ok(()) } +#[tokio::test] +as

[GitHub] [arrow] kszucs commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda environment

2022-01-29 Thread GitBox
kszucs commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024886162 @github-actions crossbow submit verify-rc-source-*

[GitHub] [arrow] github-actions[bot] commented on pull request #12283: ARROW-15483: [Release] Exercise source verification builds on a nightly basis; support release verification without a conda envir

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12283: URL: https://github.com/apache/arrow/pull/12283#issuecomment-1024886337 Revision: eaf4317ebc81655d153995faef7984d3164ecc3e Submitted crossbow builds: [ursacomputing/crossbow @ actions-1546](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] liyafan82 opened a new pull request #12294: Arrow-15501: [Java] Support validating decimal vectors

2022-01-29 Thread GitBox
liyafan82 opened a new pull request #12294: URL: https://github.com/apache/arrow/pull/12294 Support validating decimal vectors and check precisions.

[GitHub] [arrow] github-actions[bot] commented on pull request #12294: Arrow-15501: [Java] Support validating decimal vectors

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12294: URL: https://github.com/apache/arrow/pull/12294#issuecomment-1024887318 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you op

[GitHub] [arrow-rs] jhorstmann commented on a change in pull request #1228: Faster bitmask iteration

2022-01-29 Thread GitBox
jhorstmann commented on a change in pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#discussion_r795038980 ## File path: arrow/src/util/bit_chunk_iterator.rs ## @@ -272,4 +462,149 @@ mod tests { assert_eq!(u64::MAX, bitchunks.iter().last().unwrap()

[GitHub] [arrow] github-actions[bot] commented on pull request #12294: ARROW-15501: [Java] Support validating decimal vectors

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12294: URL: https://github.com/apache/arrow/pull/12294#issuecomment-1024887609

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1691: Add `MemTrackingMetrics` to ease memory tracking for non-limited memory consumers

2022-01-29 Thread GitBox
alamb commented on a change in pull request #1691: URL: https://github.com/apache/arrow-datafusion/pull/1691#discussion_r795040458 ## File path: datafusion/src/execution/memory_manager.rs ## @@ -270,25 +270,35 @@ impl MemoryManager { requesters: Arc::new(Mu

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1699: Implement TableProvider for DataFrameImpl

2022-01-29 Thread GitBox
alamb commented on a change in pull request #1699: URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795040558 ## File path: datafusion/src/execution/dataframe_impl.rs ## @@ -488,6 +553,17 @@ mod tests { Ok(()) } +#[tokio::test] +asyn

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1699: Implement TableProvider for DataFrameImpl

2022-01-29 Thread GitBox
alamb commented on a change in pull request #1699: URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795040657 ## File path: datafusion/src/execution/dataframe_impl.rs ## @@ -488,6 +547,44 @@ mod tests { Ok(()) } +#[tokio::test] +asyn

[GitHub] [arrow-rs] alamb commented on a change in pull request #1228: Faster bitmask iteration

2022-01-29 Thread GitBox
alamb commented on a change in pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#discussion_r795041965 ## File path: arrow/src/util/bit_chunk_iterator.rs ## @@ -15,8 +17,192 @@ // specific language governing permissions and limitations // under the License

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1228: Faster bitmask iteration

2022-01-29 Thread GitBox
tustvold commented on a change in pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#discussion_r795042188 ## File path: arrow/src/util/bit_chunk_iterator.rs ## @@ -272,4 +462,149 @@ mod tests { assert_eq!(u64::MAX, bitchunks.iter().last().unwrap());

[GitHub] [arrow-datafusion] liukun4515 commented on pull request #1653: suppport bitwise and as an example

2022-01-29 Thread GitBox
liukun4515 commented on pull request #1653: URL: https://github.com/apache/arrow-datafusion/pull/1653#issuecomment-1024893211 @alamb @houqp @FauxFaux PTAL

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1228: Faster bitmask iteration

2022-01-29 Thread GitBox
codecov-commenter edited a comment on pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#issuecomment-1019500860 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1228?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow-rs] tustvold commented on pull request #1228: Faster bitmask iteration

2022-01-29 Thread GitBox
tustvold commented on pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#issuecomment-1024895206 I think SlicesIterator has a bug :disappointed: - fixing...

[GitHub] [arrow-rs] tustvold edited a comment on pull request #1228: Faster bitmask iteration

2022-01-29 Thread GitBox
tustvold edited a comment on pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#issuecomment-1024895206 I think SlicesIterator has a bug :disappointed: - fixing... It's applying the offset in the wrong direction :sweat_smile:

[GitHub] [arrow-rs] alamb commented on a change in pull request #1228: Faster bitmask iteration

2022-01-29 Thread GitBox
alamb commented on a change in pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#discussion_r795043370 ## File path: arrow/src/compute/kernels/filter.rs ## @@ -17,184 +17,117 @@ //! Defines miscellaneous array kernels. +use crate::array::*; use crate::

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1228: Faster bitmask iteration

2022-01-29 Thread GitBox
tustvold commented on a change in pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#discussion_r795044493 ## File path: parquet/src/arrow/bit_util.rs ## @@ -15,40 +15,33 @@ // specific language governing permissions and limitations // under the License.

[GitHub] [arrow-datafusion] domodwyer commented on pull request #1539: approx_quantile() aggregation function

2022-01-29 Thread GitBox
domodwyer commented on pull request #1539: URL: https://github.com/apache/arrow-datafusion/pull/1539#issuecomment-1024897478 Done! ``` +---+ | APPROXPERCENTILECONT(test.b,Float64(0.5)) | +---+

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1691: Add `MemTrackingMetrics` to ease memory tracking for non-limited memory consumers

2022-01-29 Thread GitBox
yjshen commented on a change in pull request #1691: URL: https://github.com/apache/arrow-datafusion/pull/1691#discussion_r795044951 ## File path: datafusion/src/execution/memory_manager.rs ## @@ -270,25 +270,35 @@ impl MemoryManager { requesters: Arc::new(M

[GitHub] [arrow-datafusion] yjshen edited a comment on issue #944: Table Scan Enhancement Plan

2022-01-29 Thread GitBox
yjshen edited a comment on issue #944: URL: https://github.com/apache/arrow-datafusion/issues/944#issuecomment-905641087 Also, from my own perspective, I have some extensions to make: - [x] Datafusion-hdfs connector - [ ] Datafusion-hive metastore connector

[GitHub] [arrow-datafusion] Igosuki commented on pull request #1697: [arrow2] Merge arrow2 and datafusion latest

2022-01-29 Thread GitBox
Igosuki commented on pull request #1697: URL: https://github.com/apache/arrow-datafusion/pull/1697#issuecomment-1024900121 @houqp Some fixes here https://github.com/Igosuki/arrow-datafusion/tree/arrow_testfix_20220129 remain a parquet projection issue, the decimal type issue I mentioned h

[GitHub] [arrow-rs] alamb commented on a change in pull request #1154: Add `async` arrow parquet reader

2022-01-29 Thread GitBox
alamb commented on a change in pull request #1154: URL: https://github.com/apache/arrow-rs/pull/1154#discussion_r795044139 ## File path: parquet/Cargo.toml ## @@ -55,24 +57,26 @@ brotli = "3.3" flate2 = "1.0" lz4 = "1.23" serde_json = { version = "1.0", features = ["preserve

[GitHub] [arrow-rs] alamb commented on pull request #1154: Add `async` arrow parquet reader

2022-01-29 Thread GitBox
alamb commented on pull request #1154: URL: https://github.com/apache/arrow-rs/pull/1154#issuecomment-1024901197 Actually, don't we have to add `async` to the tests to ensure CI coverage?

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1228: Faster bitmask iteration

2022-01-29 Thread GitBox
codecov-commenter edited a comment on pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#issuecomment-1019500860 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1228?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1154: Add `async` arrow parquet reader

2022-01-29 Thread GitBox
tustvold commented on a change in pull request #1154: URL: https://github.com/apache/arrow-rs/pull/1154#discussion_r795049009 ## File path: parquet/src/arrow/async_reader.rs ## @@ -0,0 +1,481 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

[GitHub] [arrow-rs] tustvold commented on issue #1239: Add `test_validate` feature flag + CI check

2022-01-29 Thread GitBox
tustvold commented on issue #1239: URL: https://github.com/apache/arrow-rs/issues/1239#issuecomment-1024908585 Had a brief stab at adding this, the good news is it doesn't show any new problems, the bad news is it causes a lot of the tests of the validation logic to fail :laughing: I need

[GitHub] [arrow-datafusion] matthewmturner commented on issue #1702: [Ballista] Support to access remote object store, like HDFS, S3, etc

2022-01-29 Thread GitBox
matthewmturner commented on issue #1702: URL: https://github.com/apache/arrow-datafusion/issues/1702#issuecomment-1024909782 @yahoNanJing fyi @seddonm1 and I have been working on https://github.com/datafusion-contrib/datafusion-objectstore-s3. Still early stages but would be great to have

[GitHub] [arrow] ursabot edited a comment on pull request #12268: ARROW-15485: [Release][Java] Fix java jars upload script

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12268: URL: https://github.com/apache/arrow/pull/12268#issuecomment-1023670703 Benchmark runs are scheduled for baseline = 11f31839963077e5c793b040b1901f43812bbfe1 and contender = b1ae6029687aa3bb756c61a189c88125128cf026. b1ae6029687aa3bb756c61a18

[GitHub] [arrow-rs] alamb commented on pull request #1154: Add `async` arrow parquet reader

2022-01-29 Thread GitBox
alamb commented on pull request #1154: URL: https://github.com/apache/arrow-rs/pull/1154#issuecomment-1024915824 Actually, I see https://github.com/apache/arrow-datafusion/pull/1617 demonstrates what impacts this has on DataFusion, which seems just fine 👍

[GitHub] [arrow-datafusion] thinkharderdev commented on issue #1702: [Ballista] Support to access remote object store, like HDFS, S3, etc

2022-01-29 Thread GitBox
thinkharderdev commented on issue #1702: URL: https://github.com/apache/arrow-datafusion/issues/1702#issuecomment-1024926421 This PR would support this use case: https://github.com/apache/arrow-datafusion/pull/1677

[GitHub] [arrow-rs] bjchambers commented on pull request #1246: Fix NullArrayReader (#1245)

2022-01-29 Thread GitBox
bjchambers commented on pull request #1246: URL: https://github.com/apache/arrow-rs/pull/1246#issuecomment-1024929727 No need to rush. Next week should be fine. I can just wait to upgrade until this is out. Thanks for the speedy fix! On Sat, Jan 29, 2022 at 3:19 AM Andrew Lamb ***@

[GitHub] [arrow] okadakk commented on pull request #12269: ARROW-15462: [GLib] Add GArrow{Month,DayTime,MonthDayNano}Interval{Scalar,Array,ArrayBuilder}

2022-01-29 Thread GitBox
okadakk commented on pull request #12269: URL: https://github.com/apache/arrow/pull/12269#issuecomment-1024935015 @kou Thanks!! I remove redundant tests and rebase master!

[GitHub] [arrow-rs] alamb closed issue #1245: Parquet v8.0.0 panics when reading all null column to NullArray

2022-01-29 Thread GitBox
alamb closed issue #1245: URL: https://github.com/apache/arrow-rs/issues/1245

[GitHub] [arrow-rs] alamb merged pull request #1246: Fix NullArrayReader (#1245)

2022-01-29 Thread GitBox
alamb merged pull request #1246: URL: https://github.com/apache/arrow-rs/pull/1246

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1691: Add `MemTrackingMetrics` to ease memory tracking for non-limited memory consumers

2022-01-29 Thread GitBox
alamb commented on a change in pull request #1691: URL: https://github.com/apache/arrow-datafusion/pull/1691#discussion_r795069168 ## File path: datafusion/src/execution/memory_manager.rs ## @@ -270,25 +270,35 @@ impl MemoryManager { requesters: Arc::new(Mu

[GitHub] [arrow-datafusion] alamb closed issue #1569: Track memory usage in Non Limited Operators

2022-01-29 Thread GitBox
alamb closed issue #1569: URL: https://github.com/apache/arrow-datafusion/issues/1569

[GitHub] [arrow-datafusion] alamb merged pull request #1691: Add `MemTrackingMetrics` to ease memory tracking for non-limited memory consumers

2022-01-29 Thread GitBox
alamb merged pull request #1691: URL: https://github.com/apache/arrow-datafusion/pull/1691

[GitHub] [arrow-rs] alamb commented on a change in pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-29 Thread GitBox
alamb commented on a change in pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#discussion_r795069276 ## File path: arrow/src/array/array_binary.rs ## @@ -816,13 +827,80 @@ impl DecimalArray { let array_data = unsafe { builder.build_unchecked() };

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-29 Thread GitBox
codecov-commenter edited a comment on pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#issuecomment-1019476786 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1223?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow-rs] alamb opened a new pull request #1247: Use new DecimalArray creation API in parquet crate

2022-01-29 Thread GitBox
alamb opened a new pull request #1247: URL: https://github.com/apache/arrow-rs/pull/1247 Builds on https://github.com/apache/arrow-rs/pull/1223 so draft until that is done Rationale: https://github.com/apache/arrow-rs/pull/1223 introduces a more performant and idiomatic API f

[GitHub] [arrow] ursabot edited a comment on pull request #12276: ARROW-15461: [C++] Avoid clang bug in ReverseBitmap

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12276: URL: https://github.com/apache/arrow/pull/12276#issuecomment-1023885081 Benchmark runs are scheduled for baseline = b1ae6029687aa3bb756c61a189c88125128cf026 and contender = 8905de9b3db1667eff7678a3cad2de0b64ff46bf. 8905de9b3db1667eff7678a3c

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1247: Use new DecimalArray creation API in parquet crate

2022-01-29 Thread GitBox
codecov-commenter commented on pull request #1247: URL: https://github.com/apache/arrow-rs/pull/1247#issuecomment-1024943559 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1247?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow-rs] alamb commented on pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-29 Thread GitBox
alamb commented on pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#issuecomment-1024947863 🤔 while working to port some of the code over, it turns out that DecimalBuilder also does value validation on each value (so it is certain that the decimal values are within co

[GitHub] [arrow] ursabot edited a comment on pull request #12160: ARROW-13467: [C++] Support delta dictionaries in the IPC file format

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12160: URL: https://github.com/apache/arrow/pull/12160#issuecomment-1024007598 Benchmark runs are scheduled for baseline = 8905de9b3db1667eff7678a3cad2de0b64ff46bf and contender = 3663971f17cc5cc32bb389ad959eb5b30dacb1e1. 3663971f17cc5cc32bb389ad9

[GitHub] [arrow-rs] alamb commented on pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-29 Thread GitBox
alamb commented on pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#issuecomment-1024978546 back to draft while I think about this a bit

[GitHub] [arrow-datafusion] realno commented on a change in pull request #1539: approx_quantile() aggregation function

2022-01-29 Thread GitBox
realno commented on a change in pull request #1539: URL: https://github.com/apache/arrow-datafusion/pull/1539#discussion_r795090321 ## File path: datafusion/tests/sql/aggregates.rs ## @@ -316,6 +316,95 @@ async fn csv_query_approx_count() -> Result<()> { Ok(()) } +// Th

[GitHub] [arrow] kou closed pull request #12288: ARROW-15497: [C++][Homebrew] Use Clang Tools 12

2022-01-29 Thread GitBox
kou closed pull request #12288: URL: https://github.com/apache/arrow/pull/12288 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...

[GitHub] [arrow] ursabot commented on pull request #12288: ARROW-15497: [C++][Homebrew] Use Clang Tools 12

2022-01-29 Thread GitBox
ursabot commented on pull request #12288: URL: https://github.com/apache/arrow/pull/12288#issuecomment-1024991556 Benchmark runs are scheduled for baseline = fcab4814f658e3adf181f122d016c2b04a2667c6 and contender = ff37b7adf21b319c0d08b2eb09ecbd8db0794cbe. ff37b7adf21b319c0d08b2eb09ecbd8d

[GitHub] [arrow] ursabot edited a comment on pull request #12271: ARROW-15463: [GLib] Add arrow::compute::Utf8NormalizeOptions bindings

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12271: URL: https://github.com/apache/arrow/pull/12271#issuecomment-1024014621 Benchmark runs are scheduled for baseline = 3663971f17cc5cc32bb389ad959eb5b30dacb1e1 and contender = c692c8842a959c2af0e68bc2f20f72d5f2a2ec67. c692c8842a959c2af0e68bc2f

[GitHub] [arrow] ursabot edited a comment on pull request #12288: ARROW-15497: [C++][Homebrew] Use Clang Tools 12

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12288: URL: https://github.com/apache/arrow/pull/12288#issuecomment-1024991556 Benchmark runs are scheduled for baseline = fcab4814f658e3adf181f122d016c2b04a2667c6 and contender = ff37b7adf21b319c0d08b2eb09ecbd8db0794cbe. ff37b7adf21b319c0d08b2eb0

[GitHub] [arrow] kou closed pull request #12269: ARROW-15462: [GLib] Add GArrow{Month,DayTime,MonthDayNano}Interval{Scalar,Array,ArrayBuilder}

2022-01-29 Thread GitBox
kou closed pull request #12269: URL: https://github.com/apache/arrow/pull/12269 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...

[GitHub] [arrow] ursabot commented on pull request #12269: ARROW-15462: [GLib] Add GArrow{Month,DayTime,MonthDayNano}Interval{Scalar,Array,ArrayBuilder}

2022-01-29 Thread GitBox
ursabot commented on pull request #12269: URL: https://github.com/apache/arrow/pull/12269#issuecomment-1025001545 Benchmark runs are scheduled for baseline = ff37b7adf21b319c0d08b2eb09ecbd8db0794cbe and contender = cc4e2a54309813e636ba50bcd22a7b71d3d9. cc4e2a54309813e636ba50bcd22a

[GitHub] [arrow] ursabot edited a comment on pull request #12269: ARROW-15462: [GLib] Add GArrow{Month,DayTime,MonthDayNano}Interval{Scalar,Array,ArrayBuilder}

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12269: URL: https://github.com/apache/arrow/pull/12269#issuecomment-1025001545 Benchmark runs are scheduled for baseline = ff37b7adf21b319c0d08b2eb09ecbd8db0794cbe and contender = cc4e2a54309813e636ba50bcd22a7b71d3d9. cc4e2a54309813e636ba5

[GitHub] [arrow-datafusion] matthewmturner opened a new issue #1705: Simplify creating new `ListingTable`

2022-01-29 Thread GitBox
matthewmturner opened a new issue #1705: URL: https://github.com/apache/arrow-datafusion/issues/1705 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A clear and concise description of what the problem is. Ex. I'm always frustrate

[GitHub] [arrow-datafusion] matthewmturner commented on issue #1705: Simplify creating new `ListingTable`

2022-01-29 Thread GitBox
matthewmturner commented on issue #1705: URL: https://github.com/apache/arrow-datafusion/issues/1705#issuecomment-1025003439 @alamb @houqp @seddonm1 what do you think about this proposal? -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [arrow] ursabot edited a comment on pull request #12288: ARROW-15497: [C++][Homebrew] Use Clang Tools 12

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12288: URL: https://github.com/apache/arrow/pull/12288#issuecomment-1024991556 Benchmark runs are scheduled for baseline = fcab4814f658e3adf181f122d016c2b04a2667c6 and contender = ff37b7adf21b319c0d08b2eb09ecbd8db0794cbe. ff37b7adf21b319c0d08b2eb0

[GitHub] [arrow-rs] tustvold opened a new pull request #1248: POC: Specialized filter kernels

2022-01-29 Thread GitBox
tustvold opened a new pull request #1248: URL: https://github.com/apache/arrow-rs/pull/1248 **Unfortunately I didn't get as much time to work on this today as I hoped, but I thought I'd get it up for some visibility into what I'm playing around with** # Which issue does this PR clos

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1248: POC: Specialized filter kernels

2022-01-29 Thread GitBox
tustvold commented on a change in pull request #1248: URL: https://github.com/apache/arrow-rs/pull/1248#discussion_r795106504 ## File path: arrow/src/compute/kernels/filter.rs ## @@ -244,6 +419,110 @@ pub fn filter_record_batch( RecordBatch::try_new(record_batch.schema(),

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1248: POC: Specialized filter kernels

2022-01-29 Thread GitBox
tustvold commented on a change in pull request #1248: URL: https://github.com/apache/arrow-rs/pull/1248#discussion_r795106705 ## File path: arrow/src/compute/kernels/filter.rs ## @@ -180,37 +246,146 @@ pub fn prep_null_mask_filter(filter: &BooleanArray) -> BooleanArray { ///

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1248: POC: Specialized filter kernels

2022-01-29 Thread GitBox
codecov-commenter commented on pull request #1248: URL: https://github.com/apache/arrow-rs/pull/1248#issuecomment-1025014491 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1248?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow] ursabot edited a comment on pull request #12280: ARROW-15493: [C++][Gandiva] Init ExpressionCacheKey.mode_

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12280: URL: https://github.com/apache/arrow/pull/12280#issuecomment-1024148552 Benchmark runs are scheduled for baseline = c692c8842a959c2af0e68bc2f20f72d5f2a2ec67 and contender = a27c55660e575a3987283d5d9e443642db48f215. a27c55660e575a3987283d5d9

[GitHub] [arrow] ursabot edited a comment on pull request #12269: ARROW-15462: [GLib] Add GArrow{Month,DayTime,MonthDayNano}Interval{Scalar,Array,ArrayBuilder}

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12269: URL: https://github.com/apache/arrow/pull/12269#issuecomment-1025001545 Benchmark runs are scheduled for baseline = ff37b7adf21b319c0d08b2eb09ecbd8db0794cbe and contender = cc4e2a54309813e636ba50bcd22a7b71d3d9. cc4e2a54309813e636ba5

[GitHub] [arrow-rs] HaoYang670 commented on issue #1240: Get `Unknown configuration option rust-version` when running the rust format command

2022-01-29 Thread GitBox
HaoYang670 commented on issue #1240: URL: https://github.com/apache/arrow-rs/issues/1240#issuecomment-1025045965 @alamb Could you reproduce the problem? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [arrow] github-actions[bot] commented on pull request #12295: Fix typos in python

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12295: URL: https://github.com/apache/arrow/pull/12295#issuecomment-1025069407 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you op

[GitHub] [arrow] ursabot edited a comment on pull request #12032: ARROW-15126: [C++] Support Null type as group keys

2022-01-29 Thread GitBox
ursabot edited a comment on pull request #12032: URL: https://github.com/apache/arrow/pull/12032#issuecomment-1024252094 Benchmark runs are scheduled for baseline = d035ff048a3d87d39746f8559fe09010a2961599 and contender = 39367db2dab321dbbf4d12d2229020614b049dde. 39367db2dab321dbbf4d12d22

[GitHub] [arrow] liyafan82 opened a new pull request #12296: ARROW-15502: [Java] Detect exceptional footer size in Arrow file reader

2022-01-29 Thread GitBox
liyafan82 opened a new pull request #12296: URL: https://github.com/apache/arrow/pull/12296 When a malformed Arrow file containing an extremely large footer size (much larger than the file size) is fed to the ArrowFileReader, our implementation fails to detect the problem, due to integer arit

[GitHub] [arrow] github-actions[bot] commented on pull request #12296: ARROW-15502: [Java] Detect exceptional footer size in Arrow file reader

2022-01-29 Thread GitBox
github-actions[bot] commented on pull request #12296: URL: https://github.com/apache/arrow/pull/12296#issuecomment-1025079815 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow-datafusion] yjshen opened a new pull request #1706: Fuzz test for spillable sort

2022-01-29 Thread GitBox
yjshen opened a new pull request #1706: URL: https://github.com/apache/arrow-datafusion/pull/1706 # Which issue does this PR close? Closes #1573. # Rationale for this change Fuzz Test for various corner cases sorting RecordBatches exceeds available memory a