[GitHub] [arrow-datafusion] xudong963 edited a comment on pull request #1067: fix subquery alias

2021-10-07 Thread GitBox
xudong963 edited a comment on pull request #1067: URL: https://github.com/apache/arrow-datafusion/pull/1067#issuecomment-938362066 CI seems unstable. ```python === FAILURES === _ test_math_

[GitHub] [arrow] emkornfield commented on a change in pull request #11351: ARROW-13151: [C++][Parquet] Propagate schema changes from selection all the way up the stack

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11351: URL: https://github.com/apache/arrow/pull/11351#discussion_r724725027 ## File path: cpp/src/parquet/arrow/reader.cc ## @@ -842,36 +842,79 @@ Status GetReader(const SchemaField& field, const std::shared_ptr& arrow_f

[GitHub] [arrow-datafusion] Jimexist commented on issue #753: implement `generate_series` function

2021-10-07 Thread GitBox
Jimexist commented on issue #753: URL: https://github.com/apache/arrow-datafusion/issues/753#issuecomment-938364481 see related #1080 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

[GitHub] [arrow-datafusion] houqp commented on pull request #1067: fix subquery alias

2021-10-07 Thread GitBox
houqp commented on pull request #1067: URL: https://github.com/apache/arrow-datafusion/pull/1067#issuecomment-938363407 Yes, we can ignore that CI test for now.

[GitHub] [arrow-datafusion] xudong963 commented on pull request #1067: fix subquery alias

2021-10-07 Thread GitBox
xudong963 commented on pull request #1067: URL: https://github.com/apache/arrow-datafusion/pull/1067#issuecomment-938362066 CI seems to be unstable. ```python === FAILURES === _ test_math_f

[GitHub] [arrow-datafusion] Jimexist opened a new issue #1085: Implement `array_agg` aggregate function

2021-10-07 Thread GitBox
Jimexist opened a new issue #1085: URL: https://github.com/apache/arrow-datafusion/issues/1085 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A clear and concise description of what the problem is. Ex. I'm always frustrated when

[GitHub] [arrow-datafusion] Jimexist opened a new pull request #1084: use 2021 edition

2021-10-07 Thread GitBox
Jimexist opened a new pull request #1084: URL: https://github.com/apache/arrow-datafusion/pull/1084 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes

[GitHub] [arrow-datafusion] matthewmturner commented on pull request #1063: Optimize count agg expr with null column statistics

2021-10-07 Thread GitBox
matthewmturner commented on pull request #1063: URL: https://github.com/apache/arrow-datafusion/pull/1063#issuecomment-938356536 @rdettai @Dandandan Sorry for the delay on this. I've been on vacation with very limited internet / computer access. Will pick this up early next week when I'm back.

[GitHub] [arrow] emkornfield closed pull request #10911: ARROW-13604 [Java]: Remove deprecation annotations for APIs representing unsupported operations

2021-10-07 Thread GitBox
emkornfield closed pull request #10911: URL: https://github.com/apache/arrow/pull/10911

[GitHub] [arrow-datafusion] Jimexist edited a comment on issue #1083: Implement `approx_distinct` using HyperLogLog

2021-10-07 Thread GitBox
Jimexist edited a comment on issue #1083: URL: https://github.com/apache/arrow-datafusion/issues/1083#issuecomment-938352888 relevant links: 1. [smhasher](https://github.com/aappleby/smhasher/wiki) test suite 2. trino uses xxhash 64 bit version 2. [`hyperloglog.c` from redis](http

[GitHub] [arrow-datafusion] Jimexist edited a comment on issue #1083: Implement `approx_distinct` using HyperLogLog

2021-10-07 Thread GitBox
Jimexist edited a comment on issue #1083: URL: https://github.com/apache/arrow-datafusion/issues/1083#issuecomment-938352888 relevant links: 1. [smhasher](https://github.com/aappleby/smhasher/wiki) 2. [`hyperloglog.c` from redis](https://github.com/yahoo/redislite/blob/master/redis.s

[GitHub] [arrow-datafusion] Jimexist commented on issue #1083: Implement `approx_distinct` using HyperLogLog

2021-10-07 Thread GitBox
Jimexist commented on issue #1083: URL: https://github.com/apache/arrow-datafusion/issues/1083#issuecomment-938352888 relevant links: 1. [smhasher](https://github.com/aappleby/smhasher/wiki) 2. [`hyperloglog.c` from redis](https://github.com/yahoo/redislite/blob/master/redis.submodul

[GitHub] [arrow] kou commented on pull request #11362: ARROW-14261: [C++] Includes should be in alphabetical order

2021-10-07 Thread GitBox
kou commented on pull request #11362: URL: https://github.com/apache/arrow/pull/11362#issuecomment-938352699 @github-actions autotune

[GitHub] [arrow] emkornfield commented on pull request #10652: ARROW-13257: [Java][Dataset] Allow passing empty columns for projection

2021-10-07 Thread GitBox
emkornfield commented on pull request #10652: URL: https://github.com/apache/arrow/pull/10652#issuecomment-938349754 +1 Thank you @zhztheplayer

[GitHub] [arrow] emkornfield closed pull request #10652: ARROW-13257: [Java][Dataset] Allow passing empty columns for projection

2021-10-07 Thread GitBox
emkornfield closed pull request #10652: URL: https://github.com/apache/arrow/pull/10652

[GitHub] [arrow-datafusion] xudong963 commented on pull request #1067: fix subquery alias

2021-10-07 Thread GitBox
xudong963 commented on pull request #1067: URL: https://github.com/apache/arrow-datafusion/pull/1067#issuecomment-938344747 > @xudong963 could you also update LogicalPlan's `pub fn display` method to print the projection alias? This should help make the logical plan more readable :)

[GitHub] [arrow-datafusion] xudong963 commented on a change in pull request #1067: fix subquery alias

2021-10-07 Thread GitBox
xudong963 commented on a change in pull request #1067: URL: https://github.com/apache/arrow-datafusion/pull/1067#discussion_r724701491 ## File path: datafusion/src/sql/planner.rs ## @@ -565,14 +587,22 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { )));

[GitHub] [arrow-datafusion] houqp commented on issue #1083: Implement `approx_distinct` using HyperLogLog

2021-10-07 Thread GitBox
houqp commented on issue #1083: URL: https://github.com/apache/arrow-datafusion/issues/1083#issuecomment-938344053 It looks like a fun engineering challenge and a really cool feature to add :D I don't have a use-case for this at the moment but I think it's something we would need to suppo

[GitHub] [arrow-datafusion] xudong963 commented on a change in pull request #1067: fix subquery alias

2021-10-07 Thread GitBox
xudong963 commented on a change in pull request #1067: URL: https://github.com/apache/arrow-datafusion/pull/1067#discussion_r724698342 ## File path: datafusion/src/sql/planner.rs ## @@ -528,21 +528,43 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {

[GitHub] [arrow] sighingnow closed pull request #11265: ARROW-14065: [C++] Fixes behaviour of "Advance()" of FixedSizeBinaryBuilder.

2021-10-07 Thread GitBox
sighingnow closed pull request #11265: URL: https://github.com/apache/arrow/pull/11265

[GitHub] [arrow] sighingnow commented on pull request #11265: ARROW-14065: [C++] Fixes behaviour of "Advance()" of FixedSizeBinaryBuilder.

2021-10-07 Thread GitBox
sighingnow commented on pull request #11265: URL: https://github.com/apache/arrow/pull/11265#issuecomment-938340702 Close as won't fix.

[GitHub] [arrow] cyb70289 closed pull request #11323: ARROW-13975: [C++] Implement decimal round

2021-10-07 Thread GitBox
cyb70289 closed pull request #11323: URL: https://github.com/apache/arrow/pull/11323

[GitHub] [arrow] emkornfield commented on pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on pull request #11146: URL: https://github.com/apache/arrow/pull/11146#issuecomment-938339185 @zeroshade sorry still reviewing (there is a lot here) will try to do some more tomorrow. Sorry for the delay.

[GitHub] [arrow] aocsa commented on pull request #11210: ARROW-13576: [C++] Replace ExecNode::InputReceived with ::MakeTask

2021-10-07 Thread GitBox
aocsa commented on pull request #11210: URL: https://github.com/apache/arrow/pull/11210#issuecomment-938339174 Thanks @weston, I rebased this PR and addressed latest feedback. Moreover I ran some benchmarks to see the impact of: 1. the possible issue with ExecBatch copies; 2. async mode ex

[GitHub] [arrow] cyb70289 commented on pull request #11323: ARROW-13975: [C++] Implement decimal round

2021-10-07 Thread GitBox
cyb70289 commented on pull request #11323: URL: https://github.com/apache/arrow/pull/11323#issuecomment-938339038 macOS CI failures are not related. They also happen in other PRs.

[GitHub] [arrow] ursabot edited a comment on pull request #11352: ARROW-14243: [C++] Split vector_sort.cc

2021-10-07 Thread GitBox
ursabot edited a comment on pull request #11352: URL: https://github.com/apache/arrow/pull/11352#issuecomment-938073431 Benchmark runs are scheduled for baseline = 25a6f591d1f162106b74e29870ebd4012e9874cc and contender = 55d40f696022120567d2c7a3030c7da996bc6ad1. Results will be available a

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724695925 ## File path: go/parquet/internal/encoding/boolean_encoder.go ## @@ -47,6 +47,9 @@ func (enc *PlainBooleanEncoder) Put(in []bool) { if enc.wr

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724695859 ## File path: go/parquet/internal/encoding/boolean_decoder.go ## @@ -45,7 +45,7 @@ func (dec *PlainBooleanDecoder) Decode(out []bool) (int, error) {

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724695690 ## File path: go/parquet/internal/bmi/bmi.go ## @@ -254,7 +254,7 @@ func extractBitsGo(bitmap, selectBitmap uint64) uint64 { for selectBitmap

[GitHub] [arrow-datafusion] houqp commented on a change in pull request #1067: fix subquery alias

2021-10-07 Thread GitBox
houqp commented on a change in pull request #1067: URL: https://github.com/apache/arrow-datafusion/pull/1067#discussion_r724695469 ## File path: datafusion/src/sql/planner.rs ## @@ -565,14 +587,22 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { )));

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724695340 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724694645 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724694093 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724693896 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724693770 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724693070 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724692380 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] github-actions[bot] commented on pull request #11362: ARROW-14261: [C++] Includes should be in alphabetical order

2021-10-07 Thread GitBox
github-actions[bot] commented on pull request #11362: URL: https://github.com/apache/arrow/pull/11362#issuecomment-938334179 https://issues.apache.org/jira/browse/ARROW-14261

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724691643 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724690618 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] aocsa commented on a change in pull request #11210: ARROW-13576: [C++] Replace ExecNode::InputReceived with ::MakeTask

2021-10-07 Thread GitBox
aocsa commented on a change in pull request #11210: URL: https://github.com/apache/arrow/pull/11210#discussion_r724689820 ## File path: cpp/src/arrow/compute/exec/exec_plan.h ## @@ -243,6 +248,128 @@ class ARROW_EXPORT ExecNode { NodeVector outputs_; }; +/// \brief MapNod

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724689414 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724688094 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724687695 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724687610 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724687258 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724686944 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724686466 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724686343 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724686137 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724685540 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow-datafusion] Jimexist commented on issue #1083: Implement `approx_distinct` using HyperLogLog

2021-10-07 Thread GitBox
Jimexist commented on issue #1083: URL: https://github.com/apache/arrow-datafusion/issues/1083#issuecomment-938324722 @houqp @alamb is this of any use to your usecases?

[GitHub] [arrow-datafusion] Jimexist opened a new issue #1083: Implement `approx_distinct` using HyperLogLog

2021-10-07 Thread GitBox
Jimexist opened a new issue #1083: URL: https://github.com/apache/arrow-datafusion/issues/1083 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A clear and concise description of what the problem is. Ex. I'm always frustrated when

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724684285 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724683731 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] emkornfield commented on a change in pull request #11146: ARROW-13984: [Go][Parquet] file handling for go parquet, just the readers

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11146: URL: https://github.com/apache/arrow/pull/11146#discussion_r724682708 ## File path: go/parquet/file/column_reader.go ## @@ -0,0 +1,542 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow-cookbook] westonpace closed issue #25: Add linting to C++ cookbook

2021-10-07 Thread GitBox
westonpace closed issue #25: URL: https://github.com/apache/arrow-cookbook/issues/25

[GitHub] [arrow] bkmgit commented on a change in pull request #11337: ARROW-9843: [C++] WIP Implement Between trinary kernel

2021-10-07 Thread GitBox
bkmgit commented on a change in pull request #11337: URL: https://github.com/apache/arrow/pull/11337#discussion_r724676987 ## File path: cpp/src/arrow/compute/kernels/scalar_between.cc ## @@ -0,0 +1,152 @@ +// Licensed to the Apache Software Foundation (ASF) under one Review c

[GitHub] [arrow] emkornfield commented on pull request #11310: ARROW-13804: [Go] Add Interval type Month, Day, Nano

2021-10-07 Thread GitBox
emkornfield commented on pull request #11310: URL: https://github.com/apache/arrow/pull/11310#issuecomment-938305285 Thanks @zeroshade it looks like there might have been some large changes outside of flatbuf code unrelated to MonthDayNanoIntervals (it appears that they are formatting?) Wo

[GitHub] [arrow] emkornfield commented on a change in pull request #11310: ARROW-13804: [Go] Add Interval type Month, Day, Nano

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11310: URL: https://github.com/apache/arrow/pull/11310#discussion_r724670088 ## File path: go/arrow/type_traits_interval.go ## @@ -124,3 +125,53 @@ func (daytimeTraits) CastToBytes(b []DayTimeInterval) []byte { // Copy copi

[GitHub] [arrow] emkornfield commented on a change in pull request #11310: ARROW-13804: [Go] Add Interval type Month, Day, Nano

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11310: URL: https://github.com/apache/arrow/pull/11310#discussion_r724669537 ## File path: go/arrow/internal/arrjson/arrjson_test.go ## @@ -2786,6 +2833,44 @@ func makeIntervalsWantJSONs() string { "milliseconds"

[GitHub] [arrow] emkornfield commented on a change in pull request #11310: ARROW-13804: [Go] Add Interval type Month, Day, Nano

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11310: URL: https://github.com/apache/arrow/pull/11310#discussion_r724669439 ## File path: go/arrow/internal/arrjson/arrjson.go ## @@ -1670,6 +1689,35 @@ func daytimeintervalToJSON(arr *array.DayTimeInterval) []interface{} {

[GitHub] [arrow] emkornfield commented on a change in pull request #11310: ARROW-13804: [Go] Add Interval type Month, Day, Nano

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11310: URL: https://github.com/apache/arrow/pull/11310#discussion_r724668988 ## File path: go/arrow/datatype.go ## @@ -95,21 +95,30 @@ const ( // nanoseconds since midnight TIME64 - // INTERVAL is YEAR_M

[GitHub] [arrow] emkornfield commented on a change in pull request #11310: ARROW-13804: [Go] Add Interval type Month, Day, Nano

2021-10-07 Thread GitBox
emkornfield commented on a change in pull request #11310: URL: https://github.com/apache/arrow/pull/11310#discussion_r724668690 ## File path: go/arrow/array/interval_test.go ## @@ -274,3 +274,128 @@ func TestDayTimeIntervalBuilder_Empty(t *testing.T) { assert.Equal(t, w

[GitHub] [arrow] cyb70289 commented on pull request #11323: ARROW-13975: [C++] Implement decimal round

2021-10-07 Thread GitBox
cyb70289 commented on pull request #11323: URL: https://github.com/apache/arrow/pull/11323#issuecomment-938281976 Sorry I didn't make it clear. I've updated the doc. Will merge when CI done.

[GitHub] [arrow-datafusion] xudong963 edited a comment on issue #1082: Implement the rest Set Operators: INTERSECT & MINUS

2021-10-07 Thread GitBox
xudong963 edited a comment on issue #1082: URL: https://github.com/apache/arrow-datafusion/issues/1082#issuecomment-938268140 Please assign it to me, thanks! @houqp

[GitHub] [arrow-datafusion] xudong963 commented on issue #1082: Implement the rest Set Operators: INTERSECT & MINUS

2021-10-07 Thread GitBox
xudong963 commented on issue #1082: URL: https://github.com/apache/arrow-datafusion/issues/1082#issuecomment-938268140 Please assign me, thanks! @houqp -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

[GitHub] [arrow-datafusion] xudong963 opened a new issue #1082: Implement the rest Set Operators: INTERSECT & MINUS

2021-10-07 Thread GitBox
xudong963 opened a new issue #1082: URL: https://github.com/apache/arrow-datafusion/issues/1082 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** - [ ] INTERSECT - [ ] MINUS **Describe the solution you'd like** A cl

[GitHub] [arrow] kou commented on a change in pull request #11337: ARROW-9843: [C++] WIP Implement Between trinary kernel

2021-10-07 Thread GitBox
kou commented on a change in pull request #11337: URL: https://github.com/apache/arrow/pull/11337#discussion_r724632008 ## File path: cpp/src/arrow/compute/kernels/scalar_between.cc ## @@ -0,0 +1,152 @@ +// Licensed to the Apache Software Foundation (ASF) under one Review comm

[GitHub] [arrow] westonpace closed pull request #11017: ARROW-13542: [C++][Compute][Dataset] Add dataset::WriteNode for writing rows from an ExecPlan to disk

2021-10-07 Thread GitBox
westonpace closed pull request #11017: URL: https://github.com/apache/arrow/pull/11017 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsu

[GitHub] [arrow] westonpace commented on a change in pull request #11285: ARROW-13611: [C++] Scanning datasets does not enforce back pressure

2021-10-07 Thread GitBox
westonpace commented on a change in pull request #11285: URL: https://github.com/apache/arrow/pull/11285#discussion_r724626154 ## File path: cpp/src/arrow/dataset/scanner_test.cc ## @@ -1026,6 +1026,14 @@ class TestBackpressure : public ::testing::Test { return sum; }

[GitHub] [arrow] kou commented on a change in pull request #11331: ARROW-14222: [C++] implement GCSFileSystem skeleton

2021-10-07 Thread GitBox
kou commented on a change in pull request #11331: URL: https://github.com/apache/arrow/pull/11331#discussion_r724619541 ## File path: cpp/cmake_modules/ThirdpartyToolchain.cmake ## @@ -3704,6 +3706,13 @@ endmacro() if(ARROW_WITH_GOOGLE_CLOUD_CPP) resolve_dependency(google

[GitHub] [arrow] ianmcook commented on pull request #11266: ARROW-14166: [C++] update vcpkg builtin baseline

2021-10-07 Thread GitBox
ianmcook commented on pull request #11266: URL: https://github.com/apache/arrow/pull/11266#issuecomment-938245091 Jira for the linker error is ARROW-14260. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [arrow] pitrou commented on a change in pull request #11350: ARROW-14197: [C++][Compute] Fixing thread sanitizer problems in hash join node

2021-10-07 Thread GitBox
pitrou commented on a change in pull request #11350: URL: https://github.com/apache/arrow/pull/11350#discussion_r723927168 ## File path: cpp/src/arrow/compute/exec/tpch_test.cc ## @@ -0,0 +1,155 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contr

[GitHub] [arrow] emkornfield commented on pull request #11351: ARROW-13151: [C++][Parquet] Propagate schema changes from selection all the way up the stack

2021-10-07 Thread GitBox
emkornfield commented on pull request #11351: URL: https://github.com/apache/arrow/pull/11351#issuecomment-938225940 Also CC @zeroshade I know we haven't gotten there yet, but this might have crept into your port. -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [arrow] mpeterson-p4 commented on pull request #11324: ARROW-14228: [R] Allow for creation of nullable fields

2021-10-07 Thread GitBox
mpeterson-p4 commented on pull request #11324: URL: https://github.com/apache/arrow/pull/11324#issuecomment-937100414 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

[GitHub] [arrow] github-actions[bot] commented on pull request #11361: ARROW-14258: [R] Warn if an SF column is made into a table

2021-10-07 Thread GitBox
github-actions[bot] commented on pull request #11361: URL: https://github.com/apache/arrow/pull/11361#issuecomment-938224243 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow-datafusion] houqp edited a comment on pull request #1072: Expose a static object store registry

2021-10-07 Thread GitBox
houqp edited a comment on pull request #1072: URL: https://github.com/apache/arrow-datafusion/pull/1072#issuecomment-937478271 Interesting, from the discussion we had in https://github.com/rdettai/arrow-datafusion/pull/1, I got the opposite impression on where we are heading ;P I thought t

[GitHub] [arrow] pitrou commented on a change in pull request #11353: ARROW-14244: [C++] Reduce scalar_temporal.cc compilation time

2021-10-07 Thread GitBox
pitrou commented on a change in pull request #11353: URL: https://github.com/apache/arrow/pull/11353#discussion_r724192552 ## File path: cpp/src/arrow/compute/kernels/scalar_temporal_binary.cc ## @@ -0,0 +1,546 @@ +// Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [arrow] dragosmg commented on pull request #11355: ARROW-13800 [R] Use divide instead of divide_checked

2021-10-07 Thread GitBox
dragosmg commented on pull request #11355: URL: https://github.com/apache/arrow/pull/11355#issuecomment-937908839 The expected behaviour for integer division by 0 should probably be to: * error when `%/%` is mapped to `divide_checked` * result in `Inf` when `%/%` is mapped to `divide`

[GitHub] [arrow-rs] alamb commented on a change in pull request #820: Fewer ByteArray allocations when writing binary columns

2021-10-07 Thread GitBox
alamb commented on a change in pull request #820: URL: https://github.com/apache/arrow-rs/pull/820#discussion_r724365907 ## File path: parquet/src/arrow/arrow_writer.rs ## @@ -461,15 +461,25 @@ fn write_leaf( macro_rules! def_get_binary_array_fn { ($name:ident, $ty:ty) =>

[GitHub] [arrow] kou closed pull request #11349: ARROW-14240: [C++] Fix wrong nlohmann-json header path

2021-10-07 Thread GitBox
kou closed pull request #11349: URL: https://github.com/apache/arrow/pull/11349 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...

[GitHub] [arrow-cookbook] thisisnic commented on issue #83: [R] Recipe for random sampling

2021-10-07 Thread GitBox
thisisnic commented on issue #83: URL: https://github.com/apache/arrow-cookbook/issues/83#issuecomment-937789620 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscr

[GitHub] [arrow-rs] nevi-me commented on a change in pull request #818: Separate parquet writer benchmarks

2021-10-07 Thread GitBox
nevi-me commented on a change in pull request #818: URL: https://github.com/apache/arrow-rs/pull/818#discussion_r724066018 ## File path: parquet/benches/arrow_writer.rs ## @@ -36,25 +36,164 @@ fn create_primitive_bench_batch( true_density: f32, ) -> Result { let fiel

[GitHub] [arrow] lidavidm commented on pull request #11328: ARROW-14231: [C++] Support casting timestamp with timezone to string

2021-10-07 Thread GitBox
lidavidm commented on pull request #11328: URL: https://github.com/apache/arrow/pull/11328#issuecomment-937966754 Rebased on top of the recent temporal kernel refactoring. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

[GitHub] [arrow] kou commented on issue #11342: Can arrow be used to load a parquet file from an Azure blob store (dbfs://) using Arrow ruby

2021-10-07 Thread GitBox
kou commented on issue #11342: URL: https://github.com/apache/arrow/issues/11342#issuecomment-937162999 We haven't implemented Azure blob storage support yet. You can track it on ARROW-2034. -- This is an automated message from the Apache Git Service. To respond to the message, please lo

[GitHub] [arrow] kou commented on pull request #11300: ARROW-14207: [C++] Add missing dependencies for bundled Boost targets

2021-10-07 Thread GitBox
kou commented on pull request #11300: URL: https://github.com/apache/arrow/pull/11300#issuecomment-937146813 +1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscr

[GitHub] [arrow] pitrou closed pull request #11336: ARROW-14214: [Python][CI] Fix tests using OrcFileFormat for Python 3.6 + orc not built

2021-10-07 Thread GitBox
pitrou closed pull request #11336: URL: https://github.com/apache/arrow/pull/11336 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

[GitHub] [arrow-datafusion] alamb merged pull request #1071: Add function volatility to Signature

2021-10-07 Thread GitBox
alamb merged pull request #1071: URL: https://github.com/apache/arrow-datafusion/pull/1071 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-

[GitHub] [arrow] github-actions[bot] commented on pull request #11358: ARROW-12820: [C++] Support zone offset in ISO8601, strptime parser

2021-10-07 Thread GitBox
github-actions[bot] commented on pull request #11358: URL: https://github.com/apache/arrow/pull/11358#issuecomment-938108687 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow] pitrou commented on a change in pull request #11357: ARROW-13901: [R] Implement IndexOptions

2021-10-07 Thread GitBox
pitrou commented on a change in pull request #11357: URL: https://github.com/apache/arrow/pull/11357#discussion_r724374825 ## File path: r/src/compute.cpp ## @@ -254,6 +254,11 @@ std::shared_ptr make_compute_options( cpp11::as_cpp(options
