[GitHub] [arrow] bkmgit commented on a change in pull request #11882: ARROW-9843: [C++] Implement Between ternary kernel

2022-01-12 Thread GitBox
bkmgit commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r782791264 ## File path: cpp/src/arrow/util/bit_block_counter.h ## @@ -87,6 +87,147 @@ struct BitBlockOrNot { static bool Call(bool left, bool right) { return left

[GitHub] [arrow] ursabot edited a comment on pull request #12108: ARROW-14531: [Ruby] Add Arrow::Table#join

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #12108: URL: https://github.com/apache/arrow/pull/12108#issuecomment-1009380181 Benchmark runs are scheduled for baseline = 16d5554ad2010bc7d224c7e3cad9b87188c92054 and contender = 2d1bd96c951a9c6989c4c475781f59db7987d359. 2d1bd96c951a9c6989c4c4757

[GitHub] [arrow] bkmgit removed a comment on pull request #11882: ARROW-9843: [C++] Implement Between ternary kernel

2022-01-12 Thread GitBox
bkmgit removed a comment on pull request #11882: URL: https://github.com/apache/arrow/pull/11882#issuecomment-1010646334 @lidavidm Thanks for the feedback. Inclusive is in global space so that it can also be used in NotBetween - follow up issue https://issues.apache.org/jira/browse/ARROW-1

[GitHub] [arrow] bkmgit commented on a change in pull request #11882: ARROW-9843: [C++] Implement Between ternary kernel

2022-01-12 Thread GitBox
bkmgit commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r782792995 ## File path: cpp/src/arrow/compute/api_scalar.h ## @@ -316,6 +316,21 @@ struct ARROW_EXPORT CompareOptions { enum CompareOperator op; }; +enum class

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1526: A simplified memory manager for query execution

2022-01-12 Thread GitBox
yjshen commented on a change in pull request #1526: URL: https://github.com/apache/arrow-datafusion/pull/1526#discussion_r782802400 ## File path: datafusion/src/execution/memory_manager.rs ## @@ -0,0 +1,490 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// o

[GitHub] [arrow] vibhatha commented on pull request #12112: ARROW-15183: [Python][Docs] Add Missing Dataset Write Options

2022-01-12 Thread GitBox
vibhatha commented on pull request #12112: URL: https://github.com/apache/arrow/pull/12112#issuecomment-1010765139 @wjones127 thanks for the review, I will update the PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

[GitHub] [arrow] ursabot edited a comment on pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #12010: URL: https://github.com/apache/arrow/pull/12010#issuecomment-1010075009 Benchmark runs are scheduled for baseline = 7a0141a8cc867e5b406ed97e5decc227923eb3f5 and contender = ccffcea3fd383c448aa9da292baf2d0805ecab4d. ccffcea3fd383c448aa9da292

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1161: Simplify and reduce code duplication in arithmetic kernels

2022-01-12 Thread GitBox
codecov-commenter commented on pull request #1161: URL: https://github.com/apache/arrow-rs/pull/1161#issuecomment-1010836595 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1161?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow] ursabot edited a comment on pull request #12084: ARROW-15029: [C++] Split compute/kernels/scalar_string.cc

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #12084: URL: https://github.com/apache/arrow/pull/12084#issuecomment-1009634881 Benchmark runs are scheduled for baseline = 2d1bd96c951a9c6989c4c475781f59db7987d359 and contender = 540dbf6d58c4c17d772583d2516f5847ef7d34fd. 540dbf6d58c4c17d772583d25

[GitHub] [arrow] pitrou closed pull request #12129: ARROW-15306: [C++] S3FileSystem Should set the content-type header to application/octet-stream if not specified

2022-01-12 Thread GitBox
pitrou closed pull request #12129: URL: https://github.com/apache/arrow/pull/12129

[GitHub] [arrow] ursabot commented on pull request #12129: ARROW-15306: [C++] S3FileSystem Should set the content-type header to application/octet-stream if not specified

2022-01-12 Thread GitBox
ursabot commented on pull request #12129: URL: https://github.com/apache/arrow/pull/12129#issuecomment-1010902509 Benchmark runs are scheduled for baseline = ce639b03307c220ffb374bae888d7fc5788fe4ae and contender = 9359026bad4a626de3699e023f96f0c1383d7032. 9359026bad4a626de3699e023f96f0c1

[GitHub] [arrow] ursabot edited a comment on pull request #12122: ARROW-15302: [R] Followup to dropping R 3.3 support

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #12122: URL: https://github.com/apache/arrow/pull/12122#issuecomment-1010179189 Benchmark runs are scheduled for baseline = ccffcea3fd383c448aa9da292baf2d0805ecab4d and contender = aa59d17c4c3a1577bcf985f05b86024c7a33a57c. aa59d17c4c3a1577bcf985f05

[GitHub] [arrow] ursabot edited a comment on pull request #12129: ARROW-15306: [C++] S3FileSystem Should set the content-type header to application/octet-stream if not specified

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #12129: URL: https://github.com/apache/arrow/pull/12129#issuecomment-1010902509 Benchmark runs are scheduled for baseline = ce639b03307c220ffb374bae888d7fc5788fe4ae and contender = 9359026bad4a626de3699e023f96f0c1383d7032. 9359026bad4a626de3699e023

[GitHub] [arrow] zhixingheyi-tian commented on a change in pull request #11763: ARROW-14153: [C++][Dataset] Add support for batch_size in the ORC Scanner

2022-01-12 Thread GitBox
zhixingheyi-tian commented on a change in pull request #11763: URL: https://github.com/apache/arrow/pull/11763#discussion_r782965161 ## File path: cpp/src/arrow/dataset/file_orc.cc ## @@ -85,24 +85,20 @@ class OrcScanTask : public ScanTask { included_fields.push_back

[GitHub] [arrow-datafusion] Igosuki commented on pull request #68: Experimenting with arrow2

2022-01-12 Thread GitBox
Igosuki commented on pull request #68: URL: https://github.com/apache/arrow-datafusion/pull/68#issuecomment-1010929254 https://github.com/houqp/arrow-datafusion/pull/17

[GitHub] [arrow] bkmgit commented on a change in pull request #11882: ARROW-9843: [C++] Implement Between ternary kernel

2022-01-12 Thread GitBox
bkmgit commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r782966789 ## File path: cpp/src/arrow/compute/kernels/scalar_compare_test.cc ## @@ -1850,5 +1851,154 @@ TEST(TestMaxElementWiseMinElementWise, CommonTemporal) {

[GitHub] [arrow-cookbook] amol- commented on issue #116: Difference between cookbook and user guide docs

2022-01-12 Thread GitBox
amol- commented on issue #116: URL: https://github.com/apache/arrow-cookbook/issues/116#issuecomment-1010931643 Would be a great addition, my only suggestion would be to add it to README.rst as that's where the introduction to the cookbook and its purpose is currently written: https://git

[GitHub] [arrow] anthonylouisbsb commented on pull request #12130: ARROW-15200: [C++][Gandiva] Enable RTTI when building LLVM dependency using vcpkg [WIP]

2022-01-12 Thread GitBox
anthonylouisbsb commented on pull request #12130: URL: https://github.com/apache/arrow/pull/12130#issuecomment-1010937414 The nightly build breaks while executing the setup of the last job; I need to check whether it is related to the changes I made.

[GitHub] [arrow] ursabot edited a comment on pull request #12117: ARROW-15295: [R] Add 6.0.0 to our old versions to check

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #12117: URL: https://github.com/apache/arrow/pull/12117#issuecomment-1009733881 Benchmark runs are scheduled for baseline = 540dbf6d58c4c17d772583d2516f5847ef7d34fd and contender = 123a798288b59c080a2b624384313d390ceef9d7. 123a798288b59c080a2b62438

[GitHub] [arrow] thisisnic commented on a change in pull request #11942: ARROW-14762: [Doc] Additional info and resources

2022-01-12 Thread GitBox
thisisnic commented on a change in pull request #11942: URL: https://github.com/apache/arrow/pull/11942#discussion_r783012298 ## File path: docs/source/developers/guide/resources.rst ## @@ -27,3 +27,51 @@ Additional information and resourc

[GitHub] [arrow-datafusion] alamb commented on pull request #1523: Update to arrow-7.0.0

2022-01-12 Thread GitBox
alamb commented on pull request #1523: URL: https://github.com/apache/arrow-datafusion/pull/1523#issuecomment-1010985980 This PR is now ready for review / merge

[GitHub] [arrow-rs] alamb opened a new pull request #1162: Remove left over dev/README.md file from arrow/arrow-rs split

2022-01-12 Thread GitBox
alamb opened a new pull request #1162: URL: https://github.com/apache/arrow-rs/pull/1162 Minor cleanup: I noticed a readme file that is outdated

[GitHub] [arrow-rs] alamb merged pull request #1159: Add multiply_scalar kernel

2022-01-12 Thread GitBox
alamb merged pull request #1159: URL: https://github.com/apache/arrow-rs/pull/1159

[GitHub] [arrow-rs] alamb merged pull request #1152: Add subtract_scalar kernel

2022-01-12 Thread GitBox
alamb merged pull request #1152: URL: https://github.com/apache/arrow-rs/pull/1152

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1162: Remove left over dev/README.md file from arrow/arrow-rs split

2022-01-12 Thread GitBox
codecov-commenter commented on pull request #1162: URL: https://github.com/apache/arrow-rs/pull/1162#issuecomment-1010998208 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1162?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow-rs] tustvold edited a comment on pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-12 Thread GitBox
tustvold edited a comment on pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#issuecomment-1011005560

[GitHub] [arrow-rs] tustvold commented on pull request #1082: Optimized ByteArrayReader (#1040)

2022-01-12 Thread GitBox
tustvold commented on pull request #1082: URL: https://github.com/apache/arrow-rs/pull/1082#issuecomment-1011005560 I've added UTF-8 validation, including @jorgecarleitao's very helpful test case, so this should fix that also :tada:

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

2022-01-12 Thread GitBox
codecov-commenter edited a comment on pull request #1048: URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-996105420 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1048?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow-rs] jhorstmann commented on pull request #1048: Implement option to sort by dictionary keys in sort and partition kernels

2022-01-12 Thread GitBox
jhorstmann commented on pull request #1048: URL: https://github.com/apache/arrow-rs/pull/1048#issuecomment-1011009045 I changed to using the `is_ordered` flag as initially proposed and a `fn as_ordered(&self, is_ordered: bool) -> Self` that would return a DictionaryArray with the flag set

[GitHub] [arrow] thisisnic commented on issue #12114: Build just doesn't work on windows with MSVC 2017. Any chance to put together decent .vcproj files?

2022-01-12 Thread GitBox
thisisnic commented on issue #12114: URL: https://github.com/apache/arrow/issues/12114#issuecomment-1011013064 Thanks for those logs - looks like you may have some configuration issues there. It looks like it’s failing on linking to ucrtd.lib - people seem to get similar error messages wh

[GitHub] [arrow] ursabot edited a comment on pull request #11996: ARROW-15114: [C++] GcsFileSystem uses metadata for directory markers

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #11996: URL: https://github.com/apache/arrow/pull/11996#issuecomment-1010179206 Benchmark runs are scheduled for baseline = aa59d17c4c3a1577bcf985f05b86024c7a33a57c and contender = 2164b0bc6084ba8de1cd6011dc219de74ad57f04. 2164b0bc6084ba8de1cd6011d

[GitHub] [arrow] lidavidm commented on a change in pull request #12106: ARROW-13269: Improve metadata docs for partitioned datasets

2022-01-12 Thread GitBox
lidavidm commented on a change in pull request #12106: URL: https://github.com/apache/arrow/pull/12106#discussion_r783062792 ## File path: python/pyarrow/parquet.py ## @@ -2305,6 +2305,9 @@ def write_metadata(schema, where, metadata_collector=None, **kwargs): ... tabl

[GitHub] [arrow] anthonylouisbsb commented on pull request #12130: ARROW-15200: [C++][Gandiva] Enable RTTI when building LLVM dependency using vcpkg [WIP]

2022-01-12 Thread GitBox
anthonylouisbsb commented on pull request #12130: URL: https://github.com/apache/arrow/pull/12130#issuecomment-1011034204 @kou @kszucs the java-jars nightly build is broken when it is executing the checkout in the last step; I will check and fix it.

[GitHub] [arrow] lidavidm commented on pull request #12099: ARROW-15265: [C++] Fix hang in dataset writer with kDeleteMatchingPartitions and #partitions >= 8

2022-01-12 Thread GitBox
lidavidm commented on pull request #12099: URL: https://github.com/apache/arrow/pull/12099#issuecomment-1011035209 @westonpace I think I've addressed all feedback, any final comments here?

[GitHub] [arrow-rs] tustvold commented on pull request #1054: Preserve Parquet Bitmask (#1037)

2022-01-12 Thread GitBox
tustvold commented on pull request #1054: URL: https://github.com/apache/arrow-rs/pull/1054#issuecomment-1011035849 Looking into test failures

[GitHub] [arrow] github-actions[bot] commented on pull request #12131: ARROW-15127: [R] More visible documentation of AWS_EC2_METADATA_DISABLED=TRUE

2022-01-12 Thread GitBox
github-actions[bot] commented on pull request #12131: URL: https://github.com/apache/arrow/pull/12131#issuecomment-1011035876 https://issues.apache.org/jira/browse/ARROW-15127

[GitHub] [arrow-cookbook] BenjaminWolfe opened a new pull request #117: Update reading_and_writing_data.Rmd (formatting)

2022-01-12 Thread GitBox
BenjaminWolfe opened a new pull request #117: URL: https://github.com/apache/arrow-cookbook/pull/117 Add blank line to format bulleted list correctly

[GitHub] [arrow] thisisnic commented on a change in pull request #12131: ARROW-15127: [R] More visible documentation of AWS_EC2_METADATA_DISABLED=TRUE

2022-01-12 Thread GitBox
thisisnic commented on a change in pull request #12131: URL: https://github.com/apache/arrow/pull/12131#discussion_r783066056 ## File path: r/vignettes/fs.Rmd ## @@ -13,7 +13,10 @@ parts of the project to be able to read and write data with different storage backends. In the

[GitHub] [arrow] thisisnic commented on a change in pull request #12131: ARROW-15127: [R] More visible documentation of AWS_EC2_METADATA_DISABLED=TRUE

2022-01-12 Thread GitBox
thisisnic commented on a change in pull request #12131: URL: https://github.com/apache/arrow/pull/12131#discussion_r783067540 ## File path: r/vignettes/fs.Rmd ## @@ -144,3 +147,28 @@ s3://minioadmin:minioadmin@?scheme=http&endpoint_override=localhost%3A9000 Among other appl

[GitHub] [arrow-cookbook] BenjaminWolfe commented on pull request #117: Update reading_and_writing_data.Rmd (formatting)

2022-01-12 Thread GitBox
BenjaminWolfe commented on pull request #117: URL: https://github.com/apache/arrow-cookbook/pull/117#issuecomment-1011041856 I just tapped the edit button as I was reading from my phone; bookdown and GitHub make it so easy. I didn't see the contributor guidelines till I was about to submit

[GitHub] [arrow-datafusion] selvavm commented on issue #1536: Not able to get the table from register_listing_table

2022-01-12 Thread GitBox
selvavm commented on issue #1536: URL: https://github.com/apache/arrow-datafusion/issues/1536#issuecomment-1011046992 Hi @houqp. Sure I will try to reproduce the example and share. My understanding is - `Datafusion (or Arrow)` is able to understand the folder structure as Hive format and

[GitHub] [arrow] ursabot edited a comment on pull request #12007: ARROW-15087: [Python][Docs] Document MapArray and update parent class to ListArray

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #12007: URL: https://github.com/apache/arrow/pull/12007#issuecomment-1009832519 Benchmark runs are scheduled for baseline = 123a798288b59c080a2b624384313d390ceef9d7 and contender = 0363df1b44274707228af7274102bbe50cdb68be. 0363df1b44274707228af7274

[GitHub] [arrow] github-actions[bot] commented on pull request #12132: custom-arrow-branch

2022-01-12 Thread GitBox
github-actions[bot] commented on pull request #12132: URL: https://github.com/apache/arrow/pull/12132#issuecomment-1011051974 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you op

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1054: Preserve Parquet Bitmask (#1037)

2022-01-12 Thread GitBox
codecov-commenter edited a comment on pull request #1054: URL: https://github.com/apache/arrow-rs/pull/1054#issuecomment-1010153082 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1054?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow] lidavidm commented on a change in pull request #12100: ARROW-15061: [C++] Add logging for kernel functions and exec plan nodes

2022-01-12 Thread GitBox
lidavidm commented on a change in pull request #12100: URL: https://github.com/apache/arrow/pull/12100#discussion_r783078213 ## File path: cpp/src/arrow/util/tracing.h ## @@ -0,0 +1,75 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

[GitHub] [arrow] lidavidm commented on pull request #12124: ARROW-14093: [C++] subtract(date, date) -> interval kernel

2022-01-12 Thread GitBox
lidavidm commented on pull request #12124: URL: https://github.com/apache/arrow/pull/12124#issuecomment-1011056720 The title should be `-> duration`, not `-> interval`, right?

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1547: Add batch operations to stddev

2022-01-12 Thread GitBox
alamb commented on a change in pull request #1547: URL: https://github.com/apache/arrow-datafusion/pull/1547#discussion_r783085256 ## File path: datafusion/src/physical_plan/expressions/variance.rs ## @@ -230,93 +235,186 @@ impl VarianceAccumulator { self.count }

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1547: Add batch operations to stddev

2022-01-12 Thread GitBox
alamb commented on a change in pull request #1547: URL: https://github.com/apache/arrow-datafusion/pull/1547#discussion_r783085457 ## File path: datafusion/src/physical_plan/expressions/variance.rs ## @@ -230,93 +235,186 @@ impl VarianceAccumulator { self.count }

[GitHub] [arrow] multimeric commented on issue #12102: How to pass an in-memory arrow object from Rust into R

2022-01-12 Thread GitBox
multimeric commented on issue #12102: URL: https://github.com/apache/arrow/issues/12102#issuecomment-1011061466 And the FFI doesn't support `RecordBatch`es? So there's no way to pass entire data frames from one process to another? I suppose in that case it is necessary to reconstruct data

[GitHub] [arrow] lidavidm commented on a change in pull request #12124: ARROW-14093: [C++] subtract(date, date) -> interval kernel

2022-01-12 Thread GitBox
lidavidm commented on a change in pull request #12124: URL: https://github.com/apache/arrow/pull/12124#discussion_r783086678 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc ## @@ -1491,6 +1500,8 @@ ArrayKernelExec ArithmeticExecFromOp(detail::GetTypeId get_id)

[GitHub] [arrow] thisisnic commented on a change in pull request #12097: ARROW-14590: [R] Implement lubridate::week

2022-01-12 Thread GitBox
thisisnic commented on a change in pull request #12097: URL: https://github.com/apache/arrow/pull/12097#discussion_r783088569 ## File path: r/R/dplyr-funcs-datetime.R ## @@ -101,6 +101,10 @@ register_bindings_datetime <- function() { Expression$create("day_of_week", x, opt

[GitHub] [arrow] jorgecarleitao commented on issue #12102: How to pass an in-memory arrow object from Rust into R

2022-01-12 Thread GitBox
jorgecarleitao commented on issue #12102: URL: https://github.com/apache/arrow/issues/12102#issuecomment-1011065600 A RecordBatch is not part of the Arrow spec (and in particular the C data interface); it is something done ad hoc by implementations.

[GitHub] [arrow] lidavidm commented on a change in pull request #11882: ARROW-9843: [C++] Implement Between ternary kernel

2022-01-12 Thread GitBox
lidavidm commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r783092378 ## File path: cpp/src/arrow/compute/api_scalar.h ## @@ -316,6 +316,21 @@ struct ARROW_EXPORT CompareOptions { enum CompareOperator op; }; +enum cla

[GitHub] [arrow] anthonylouisbsb commented on pull request #12130: ARROW-15200: [C++][Gandiva] Enable RTTI when building LLVM dependency using vcpkg [WIP]

2022-01-12 Thread GitBox
anthonylouisbsb commented on pull request #12130: URL: https://github.com/apache/arrow/pull/12130#issuecomment-1011070449 > @kou @kszucs the java-jars nightly build is broken when it is executing the checkout in the last step; I will check and fix it. Apparently the problem is in the che

[GitHub] [arrow] dragosmg commented on pull request #12097: ARROW-14590: [R] Implement lubridate::week

2022-01-12 Thread GitBox
dragosmg commented on pull request #12097: URL: https://github.com/apache/arrow/pull/12097#issuecomment-1011070835 LGTM

[GitHub] [arrow] anthonylouisbsb commented on pull request #12130: ARROW-15200: [C++][Gandiva] Enable RTTI when building LLVM dependency using vcpkg [WIP]

2022-01-12 Thread GitBox
anthonylouisbsb commented on pull request #12130: URL: https://github.com/apache/arrow/pull/12130#issuecomment-1011071155 @github-actions crossbow submit java-jars

[GitHub] [arrow] github-actions[bot] commented on pull request #12130: ARROW-15200: [C++][Gandiva] Enable RTTI when building LLVM dependency using vcpkg [WIP]

2022-01-12 Thread GitBox
github-actions[bot] commented on pull request #12130: URL: https://github.com/apache/arrow/pull/12130#issuecomment-1011072004 Revision: d461b01ef0121d389bd66e8206550c8350eed744 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1391](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] mbrobbel commented on a change in pull request #12100: ARROW-15061: [C++] Add logging for kernel functions and exec plan nodes

2022-01-12 Thread GitBox
mbrobbel commented on a change in pull request #12100: URL: https://github.com/apache/arrow/pull/12100#discussion_r783097897 ## File path: cpp/src/arrow/compute/exec/aggregate_node.cc ## @@ -188,36 +194,40 @@ class ScalarAggregateNode : public ExecNode { } void ErrorRec

[GitHub] [arrow] anthonylouisbsb edited a comment on pull request #12130: ARROW-15200: [C++][Gandiva] Enable RTTI when building LLVM dependency using vcpkg [WIP]

2022-01-12 Thread GitBox
anthonylouisbsb edited a comment on pull request #12130: URL: https://github.com/apache/arrow/pull/12130#issuecomment-1011070449 > @kou @kszucs the java-jars nightly build is broken when it is executing the checkout in the last step; I will check and fix it. Apparently, the problem was i

[GitHub] [arrow-datafusion] alamb commented on pull request #1526: A simplified memory manager for query execution

2022-01-12 Thread GitBox
alamb commented on pull request #1526: URL: https://github.com/apache/arrow-datafusion/pull/1526#issuecomment-1011084878 FWIW I also plan to run the TPCH benchmarks on this PR and will post the results (I don't expect any changes)

[GitHub] [arrow] edponce commented on a change in pull request #11882: ARROW-9843: [C++] Implement Between ternary kernel

2022-01-12 Thread GitBox
edponce commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r783115242 ## File path: cpp/src/arrow/compute/api_scalar.h ## @@ -316,6 +316,21 @@ struct ARROW_EXPORT CompareOptions { enum CompareOperator op; }; +enum clas

[GitHub] [arrow-cookbook] amol- commented on a change in pull request #113: [Java]: Java cookbook recipes

2022-01-12 Thread GitBox
amol- commented on a change in pull request #113: URL: https://github.com/apache/arrow-cookbook/pull/113#discussion_r783120749 ## File path: Makefile ## @@ -13,6 +13,7 @@ help: @echo "make test Test cookbook for all platforms." @echo "make py Bui

[GitHub] [arrow] zeroshade commented on a change in pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-12 Thread GitBox
zeroshade commented on a change in pull request #11538: URL: https://github.com/apache/arrow/pull/11538#discussion_r783131136 ## File path: go/parquet/file/column_writer.go ## @@ -179,7 +179,7 @@ func (w *columnWriter) TotalBytesWritten() int64 { } func (w *columnWriter) Ro

[GitHub] [arrow] github-actions[bot] commented on pull request #12133: ARROW-10485: [R] Accept partitioning in open_dataset when file paths are hive-style

2022-01-12 Thread GitBox
github-actions[bot] commented on pull request #12133: URL: https://github.com/apache/arrow/pull/12133#issuecomment-104298 https://issues.apache.org/jira/browse/ARROW-10485

[GitHub] [arrow] nealrichardson commented on a change in pull request #12133: ARROW-10485: [R] Accept partitioning in open_dataset when file paths are hive-style

2022-01-12 Thread GitBox
nealrichardson commented on a change in pull request #12133: URL: https://github.com/apache/arrow/pull/12133#discussion_r783138108 ## File path: r/R/dataset-factory.R ## @@ -115,6 +165,10 @@ DatasetFactory$create <- function(x, #'by [hive_partition()] which parses explicit

[GitHub] [arrow-rs] alamb merged pull request #1156: Fuzz test different parquet encodings

2022-01-12 Thread GitBox
alamb merged pull request #1156: URL: https://github.com/apache/arrow-rs/pull/1156

[GitHub] [arrow] zeroshade commented on a change in pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-12 Thread GitBox
zeroshade commented on a change in pull request #11538: URL: https://github.com/apache/arrow/pull/11538#discussion_r783156109 ## File path: go/parquet/file/column_writer_test.go ## @@ -45,6 +49,73 @@ const ( DictionaryPageSize = 1024 * 1024 ) +type mockpagewriter str

[GitHub] [arrow] zeroshade commented on a change in pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-12 Thread GitBox
zeroshade commented on a change in pull request #11538: URL: https://github.com/apache/arrow/pull/11538#discussion_r783172423 ## File path: go/parquet/file/column_writer_test.go ## @@ -45,6 +49,73 @@ const ( DictionaryPageSize = 1024 * 1024 ) +type mockpagewriter str

[GitHub] [arrow] ursabot edited a comment on pull request #12078: ARROW-14448: [Python] Update pyarrow.array() docstring note on timestamp (timezone) conversion

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #12078: URL: https://github.com/apache/arrow/pull/12078#issuecomment-1009832533 Benchmark runs are scheduled for baseline = 0363df1b44274707228af7274102bbe50cdb68be and contender = 488f084280fa5e2acea76dcb02dd0c3ee655f55b. 488f084280fa5e2acea76dcb0

[GitHub] [arrow] ursabot edited a comment on pull request #12120: ARROW-15279: [R] Update "writing bindings" dev docs based on user feedback

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #12120: URL: https://github.com/apache/arrow/pull/12120#issuecomment-1010641597 Benchmark runs are scheduled for baseline = 2164b0bc6084ba8de1cd6011dc219de74ad57f04 and contender = ce639b03307c220ffb374bae888d7fc5788fe4ae. ce639b03307c220ffb374bae8

[GitHub] [arrow] rok commented on pull request #12124: ARROW-14093: [C++] subtract(date, date) -> interval kernel

2022-01-12 Thread GitBox
rok commented on pull request #12124: URL: https://github.com/apache/arrow/pull/12124#issuecomment-1011155210 > The title should be `-> duration`, not `-> interval`, right? I think so. I suppose same goes for most [ARROW-11090](https://issues.apache.org/jira/browse/ARROW-11090)

[GitHub] [arrow] rok commented on a change in pull request #12124: ARROW-14093: [C++] subtract(date, date) -> interval kernel

2022-01-12 Thread GitBox
rok commented on a change in pull request #12124: URL: https://github.com/apache/arrow/pull/12124#discussion_r783177577 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc ## @@ -1491,6 +1500,8 @@ ArrayKernelExec ArithmeticExecFromOp(detail::GetTypeId get_id) {

[GitHub] [arrow-rs] alamb commented on a change in pull request #1054: Improve parquet reading performance for columns with nulls by preserving bitmask when possible (#1037)

2022-01-12 Thread GitBox
alamb commented on a change in pull request #1054: URL: https://github.com/apache/arrow-rs/pull/1054#discussion_r783151026 ## File path: parquet/src/arrow/record_reader.rs ## @@ -73,9 +73,19 @@ where V: ValuesBuffer + Default, CV: ColumnValueDecoder, { +/// Creat

[GitHub] [arrow-rs] alamb commented on pull request #1054: Improve parquet reading performance for columns with nulls by preserving bitmask when possible (#1037)

2022-01-12 Thread GitBox
alamb commented on pull request #1054: URL: https://github.com/apache/arrow-rs/pull/1054#issuecomment-1011156589 Likewise cc @nevi-me @sunchao in case you are interested -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

[GitHub] [arrow-rs] alamb commented on pull request #1054: Improve parquet reading performance for columns with nulls by preserving bitmask when possible (#1037)

2022-01-12 Thread GitBox
alamb commented on pull request #1054: URL: https://github.com/apache/arrow-rs/pull/1054#issuecomment-1011157167 Unless anyone wants additional time to review, I'll plan to merge this tomorrow

[GitHub] [arrow-rs] tustvold opened a new issue #1163: Parquet Read/Write Traits

2022-01-12 Thread GitBox
tustvold opened a new issue #1163: URL: https://github.com/apache/arrow-rs/issues/1163 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Currently the `ParquetReader` and `ParquetWriter` traits pass around immutable references. The

[GitHub] [arrow] amol- commented on pull request #11726: ARROW-14738: [Python][Doc] Make return types clickable

2022-01-12 Thread GitBox
amol- commented on pull request #11726: URL: https://github.com/apache/arrow/pull/11726#issuecomment-1011168850 > Well that's sort of the problem here; the numpydoc's having a hard time reading the human-readable text 😉 That being said, I think it's totally fine in the case of `list of`; t

[GitHub] [arrow] amol- edited a comment on pull request #11726: ARROW-14738: [Python][Doc] Make return types clickable

2022-01-12 Thread GitBox
amol- edited a comment on pull request #11726: URL: https://github.com/apache/arrow/pull/11726#issuecomment-1011168850 > Well that's sort of the problem here; the numpydoc's having a hard time reading the human-readable text 😉 That being said, I think it's totally fine in the case of `list

[GitHub] [arrow-datafusion] xudong963 commented on a change in pull request #1526: Initial MemoryManager and DiskManager APIs for query execution + External Sort implementation

2022-01-12 Thread GitBox
xudong963 commented on a change in pull request #1526: URL: https://github.com/apache/arrow-datafusion/pull/1526#discussion_r783186162 ## File path: datafusion/src/execution/runtime_env.rs ## @@ -0,0 +1,149 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// o

[GitHub] [arrow-rs] viirya commented on pull request #1152: Add subtract_scalar kernel

2022-01-12 Thread GitBox
viirya commented on pull request #1152: URL: https://github.com/apache/arrow-rs/pull/1152#issuecomment-1011173643 Thanks @alamb!

[GitHub] [arrow-rs] viirya commented on pull request #1159: Add multiply_scalar kernel

2022-01-12 Thread GitBox
viirya commented on pull request #1159: URL: https://github.com/apache/arrow-rs/pull/1159#issuecomment-1011174011 Thanks @alamb!

[GitHub] [arrow-rs] tustvold opened a new issue #1164: Parquet Tests Cleanup Temporary Files

2022-01-12 Thread GitBox
tustvold opened a new issue #1164: URL: https://github.com/apache/arrow-rs/issues/1164 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** The parquet tests create a large amount of temporary files in `/target` when running tests

[GitHub] [arrow-cookbook] amol- commented on a change in pull request #113: [Java]: Java cookbook recipes

2022-01-12 Thread GitBox
amol- commented on a change in pull request #113: URL: https://github.com/apache/arrow-cookbook/pull/113#discussion_r783120749 ## File path: Makefile ## @@ -13,6 +13,7 @@ help: @echo "make test Test cookbook for all platforms." @echo "make py Bui

[GitHub] [arrow-cookbook] amol- commented on a change in pull request #113: [Java]: Java cookbook recipes

2022-01-12 Thread GitBox
amol- commented on a change in pull request #113: URL: https://github.com/apache/arrow-cookbook/pull/113#discussion_r783202899 ## File path: Makefile ## @@ -13,6 +13,7 @@ help: @echo "make test Test cookbook for all platforms." @echo "make py Bui

[GitHub] [arrow] github-actions[bot] commented on pull request #12134: ARROW-13617: [C++] Make Decimal representations

2022-01-12 Thread GitBox
github-actions[bot] commented on pull request #12134: URL: https://github.com/apache/arrow/pull/12134#issuecomment-1011193054 https://issues.apache.org/jira/browse/ARROW-13617

[GitHub] [arrow-rs] tustvold opened a new pull request #1165: Use tempfile for parquet tests

2022-01-12 Thread GitBox
tustvold opened a new pull request #1165: URL: https://github.com/apache/arrow-rs/pull/1165 # Which issue does this PR close? Closes #1163 # Rationale for this change This avoids running the parquet tests from "leaking" random parquet files in your local filesystem. It

[GitHub] [arrow] AlenkaF commented on a change in pull request #11942: ARROW-14762: [Doc] Additional info and resources

2022-01-12 Thread GitBox
AlenkaF commented on a change in pull request #11942: URL: https://github.com/apache/arrow/pull/11942#discussion_r783220279 ## File path: docs/source/developers/guide/resources.rst ## @@ -27,3 +27,51 @@ Additional information and resources

[GitHub] [arrow] thisisnic closed pull request #12125: ARROW-15303: [R] linting errors

2022-01-12 Thread GitBox
thisisnic closed pull request #12125: URL: https://github.com/apache/arrow/pull/12125

[GitHub] [arrow] thisisnic commented on a change in pull request #11942: ARROW-14762: [Doc] Additional info and resources

2022-01-12 Thread GitBox
thisisnic commented on a change in pull request #11942: URL: https://github.com/apache/arrow/pull/11942#discussion_r783229757 ## File path: docs/source/developers/guide/resources.rst ## @@ -27,3 +27,51 @@ Additional information and resourc

[GitHub] [arrow] thisisnic closed pull request #11942: ARROW-14762: [Doc] Additional info and resources

2022-01-12 Thread GitBox
thisisnic closed pull request #11942: URL: https://github.com/apache/arrow/pull/11942

[GitHub] [arrow] ursabot commented on pull request #11942: ARROW-14762: [Doc] Additional info and resources

2022-01-12 Thread GitBox
ursabot commented on pull request #11942: URL: https://github.com/apache/arrow/pull/11942#issuecomment-1011221219 Benchmark runs are scheduled for baseline = af7668e0e674c0ebbdade4afa3c8d2e2503e04d4 and contender = 7303b51ad2f9ac3f0c59bee7221771552ed4eb46. 7303b51ad2f9ac3f0c59bee722177155

[GitHub] [arrow] ursabot commented on pull request #12125: ARROW-15303: [R] linting errors

2022-01-12 Thread GitBox
ursabot commented on pull request #12125: URL: https://github.com/apache/arrow/pull/12125#issuecomment-1011221205 Benchmark runs are scheduled for baseline = 9359026bad4a626de3699e023f96f0c1383d7032 and contender = af7668e0e674c0ebbdade4afa3c8d2e2503e04d4. af7668e0e674c0ebbdade4afa3c8d2e2

[GitHub] [arrow] thisisnic commented on pull request #12083: ARROW-14744: [R] open_dataset() error when `schema` argument supplied, but `column_names` not supplied to `CSVReadOptions`

2022-01-12 Thread GitBox
thisisnic commented on pull request #12083: URL: https://github.com/apache/arrow/pull/12083#issuecomment-1011224682 Thanks for the updates here @toppyy . I've taken the time to have a proper think about this, and on reflection, I don't think we need to make `open_dataset( td, format = 'cs

[GitHub] [arrow] thisisnic edited a comment on pull request #12083: ARROW-14744: [R] open_dataset() error when `schema` argument supplied, but `column_names` not supplied to `CSVReadOptions`

2022-01-12 Thread GitBox
thisisnic edited a comment on pull request #12083: URL: https://github.com/apache/arrow/pull/12083#issuecomment-1011224682 Thanks for the updates here @toppyy . I've taken the time to have a proper think about this, and on reflection, I don't think we need to make `open_dataset( td, forma

[GitHub] [arrow-rs] alamb commented on issue #1163: Use Standard Library IO Abstractions in Parquet

2022-01-12 Thread GitBox
alamb commented on issue #1163: URL: https://github.com/apache/arrow-rs/issues/1163#issuecomment-1011226718 I think the difference I have observed in the past was that the parquet abstractions also allow `Clone` -- as I recall it was to allow clients to read from different columns concurre

[GitHub] [arrow-rs] alamb commented on issue #1163: Use Standard Library IO Abstractions in Parquet

2022-01-12 Thread GitBox
alamb commented on issue #1163: URL: https://github.com/apache/arrow-rs/issues/1163#issuecomment-1011227138 FWIW I think it would be great to avoid all the custom abstractions 👍

[GitHub] [arrow] zeroshade commented on a change in pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-12 Thread GitBox
zeroshade commented on a change in pull request #11538: URL: https://github.com/apache/arrow/pull/11538#discussion_r783244925 ## File path: go/parquet/file/column_writer_test.go ## @@ -45,6 +49,73 @@ const ( DictionaryPageSize = 1024 * 1024 ) +type mockpagewriter str

[GitHub] [arrow] ursabot edited a comment on pull request #12125: ARROW-15303: [R] linting errors

2022-01-12 Thread GitBox
ursabot edited a comment on pull request #12125: URL: https://github.com/apache/arrow/pull/12125#issuecomment-1011221205 Benchmark runs are scheduled for baseline = 9359026bad4a626de3699e023f96f0c1383d7032 and contender = af7668e0e674c0ebbdade4afa3c8d2e2503e04d4. af7668e0e674c0ebbdade4afa

[GitHub] [arrow] github-actions[bot] commented on pull request #11982: ARROW-15313: [FLIGHT-SQL] Implement type info method to flight-sql

2022-01-12 Thread GitBox
github-actions[bot] commented on pull request #11982: URL: https://github.com/apache/arrow/pull/11982#issuecomment-1011245587

[GitHub] [arrow-rs] tustvold commented on issue #1163: Use Standard Library IO Abstractions in Parquet

2022-01-12 Thread GitBox
tustvold commented on issue #1163: URL: https://github.com/apache/arrow-rs/issues/1163#issuecomment-1011246439 Aah yes, the `FileReader` API expects to be able to give out `RowGroupReader` which in turn give out `PageReader`. These are all owned constructs and so expect to be able to pass
