[GitHub] [arrow-datafusion] Ted-Jiang commented on a change in pull request #1537: Make call SchedulerServer::new once in ballista-scheduler process

2022-01-13 Thread GitBox
Ted-Jiang commented on a change in pull request #1537: URL: https://github.com/apache/arrow-datafusion/pull/1537#discussion_r783714595 ## File path: ballista/rust/scheduler/src/main.rs ## @@ -62,14 +63,18 @@ async fn start_server( "Ballista v{} Scheduler listening on {

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1526: Initial MemoryManager and DiskManager APIs for query execution + External Sort implementation

2022-01-13 Thread GitBox
yjshen commented on a change in pull request #1526: URL: https://github.com/apache/arrow-datafusion/pull/1526#discussion_r783720514 ## File path: datafusion/src/execution/mod.rs ## @@ -19,4 +19,7 @@ pub mod context; pub mod dataframe_impl; +pub mod disk_manager; Review com

[GitHub] [arrow] ursabot edited a comment on pull request #11790: ARROW-14843 [R] Implement `decimal128()` (to replace `decimal()`)

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #11790: URL: https://github.com/apache/arrow/pull/11790#issuecomment-985537478 Benchmark runs are scheduled for baseline = ee32aeaaadf1f112a0e850a9050ab822ccfe5d07 and contender = 310fca93d7b44081418fafd5fd30ce9342839ee4. 310fca93d7b44081418fafd5fd

[GitHub] [arrow-rs] Ted-Jiang commented on issue #1157: Return `error` from JSON writer rather than panic!

2022-01-13 Thread GitBox
Ted-Jiang commented on issue #1157: URL: https://github.com/apache/arrow-rs/issues/1157#issuecomment-1011904106 Thanks for your description! I am glad to do this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

[GitHub] [arrow] ursabot edited a comment on pull request #12129: ARROW-15306: [C++] S3FileSystem Should set the content-type header to application/octet-stream if not specified

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #12129: URL: https://github.com/apache/arrow/pull/12129#issuecomment-1010902509 Benchmark runs are scheduled for baseline = ce639b03307c220ffb374bae888d7fc5788fe4ae and contender = 9359026bad4a626de3699e023f96f0c1383d7032. 9359026bad4a626de3699e023

[GitHub] [arrow] AlenkaF commented on a change in pull request #12133: ARROW-10485: [R] Accept partitioning in open_dataset when file paths are hive-style

2022-01-13 Thread GitBox
AlenkaF commented on a change in pull request #12133: URL: https://github.com/apache/arrow/pull/12133#discussion_r783756457 ## File path: r/R/dataset-factory.R ## @@ -60,16 +61,63 @@ DatasetFactory$create <- function(x, return(FileSystemDatasetFactory$create(path_and_fs$fs

[GitHub] [arrow] amol- commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
amol- commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783757541 ## File path: python/pyarrow/_compute.pyx ## @@ -1027,6 +1266,15 @@ cdef class _ScalarAggregateOptions(FunctionOptions): class ScalarAggregateOptions(_

[GitHub] [arrow] pitrou commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
pitrou commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783775442 ## File path: python/pyarrow/_compute.pyx ## @@ -1027,6 +1266,15 @@ cdef class _ScalarAggregateOptions(FunctionOptions): class ScalarAggregateOptions(

[GitHub] [arrow] mbrobbel commented on a change in pull request #12100: ARROW-15061: [C++] Add logging for kernel functions and exec plan nodes

2022-01-13 Thread GitBox
mbrobbel commented on a change in pull request #12100: URL: https://github.com/apache/arrow/pull/12100#discussion_r783776026 ## File path: cpp/src/arrow/compute/function.cc ## @@ -215,9 +215,15 @@ Result<Datum> Function::Execute(const std::vector<Datum>& args, } util::tracing::Span

[GitHub] [arrow-rs] tustvold commented on issue #1163: Use Standard Library IO Abstractions in Parquet

2022-01-13 Thread GitBox
tustvold commented on issue #1163: URL: https://github.com/apache/arrow-rs/issues/1163#issuecomment-1011957344 I had a brief play around with this and found the following. The write side is serial, and so it should be possible to use standard library abstractions. The current trait t

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1054: Improve parquet reading performance for columns with nulls by preserving bitmask when possible (#1037)

2022-01-13 Thread GitBox
tustvold commented on a change in pull request #1054: URL: https://github.com/apache/arrow-rs/pull/1054#discussion_r783784038 ## File path: parquet/src/arrow/record_reader.rs ## @@ -73,9 +73,19 @@ where V: ValuesBuffer + Default, CV: ColumnValueDecoder, { +/// Cr

[GitHub] [arrow-datafusion] Igosuki commented on issue #1532: Discussion: Switch DataFusion to using arrow2?

2022-01-13 Thread GitBox
Igosuki commented on issue #1532: URL: https://github.com/apache/arrow-datafusion/issues/1532#issuecomment-1011959605 My branch got merged into the fork so now we only need to address a few remaining issues that break tests. -- This is an automated message from the Apache Git Service.

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1054: Improve parquet reading performance for columns with nulls by preserving bitmask when possible (#1037)

2022-01-13 Thread GitBox
tustvold commented on a change in pull request #1054: URL: https://github.com/apache/arrow-rs/pull/1054#discussion_r783785407 ## File path: parquet/src/arrow/record_reader/definition_levels.rs ## @@ -91,10 +127,287 @@ impl DefinitionLevelBuffer { &self, range:

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
jorisvandenbossche commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783769781 ## File path: python/pyarrow/compute.py ## @@ -105,66 +117,70 @@ def _decorate_compute_function(wrapper, exposed_name, func, options_class):

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
jorisvandenbossche commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783785071 ## File path: python/pyarrow/_compute.pyx ## @@ -1027,6 +1266,15 @@ cdef class _ScalarAggregateOptions(FunctionOptions): class ScalarAggre

[GitHub] [arrow] amol- commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
amol- commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783796105 ## File path: python/pyarrow/_compute.pyx ## @@ -1027,6 +1266,15 @@ cdef class _ScalarAggregateOptions(FunctionOptions): class ScalarAggregateOptions(_

[GitHub] [arrow] guyuqi opened a new pull request #12138: ARROW-15320: [Go] Implement memset_neon with Arm64 GoLang Assembly

2022-01-13 Thread GitBox
guyuqi opened a new pull request #12138: URL: https://github.com/apache/arrow/pull/12138 ### Benchmark: Before: ``` BenchmarkSet_32-46 50375077 20.79 ns/op 1539.46 MB/s BenchmarkSet_64-46 33885868 35.44 ns/op 1806.

[GitHub] [arrow] github-actions[bot] commented on pull request #12138: ARROW-15320: [Go] Implement memset_neon with Arm64 GoLang Assembly

2022-01-13 Thread GitBox
github-actions[bot] commented on pull request #12138: URL: https://github.com/apache/arrow/pull/12138#issuecomment-1011971740 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow] pitrou commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
pitrou commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783802323 ## File path: python/pyarrow/_compute.pyx ## @@ -1027,6 +1266,15 @@ cdef class _ScalarAggregateOptions(FunctionOptions): class ScalarAggregateOptions(

[GitHub] [arrow] ursabot edited a comment on pull request #12099: ARROW-15265: [C++] Fix hang in dataset writer with kDeleteMatchingPartitions and #partitions >= 8

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #12099: URL: https://github.com/apache/arrow/pull/12099#issuecomment-1011433964 Benchmark runs are scheduled for baseline = 0bce4404638f903a308f7f4afd7adc14ab164e08 and contender = bafaa76bad81c047b471071ce46245d261b62b97. bafaa76bad81c047b471071ce

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
jorisvandenbossche commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783804411 ## File path: python/pyarrow/_compute.pyx ## @@ -1027,6 +1266,15 @@ cdef class _ScalarAggregateOptions(FunctionOptions): class ScalarAggre

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
jorisvandenbossche commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783808329 ## File path: python/pyarrow/_compute.pyx ## @@ -1027,6 +1266,15 @@ cdef class _ScalarAggregateOptions(FunctionOptions): class ScalarAggre

[GitHub] [arrow] amol- commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
amol- commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783810077 ## File path: python/pyarrow/_compute.pyx ## @@ -1027,6 +1266,15 @@ cdef class _ScalarAggregateOptions(FunctionOptions): class ScalarAggregateOptions(_

[GitHub] [arrow-cookbook] amol- commented on a change in pull request #113: [Java]: Java cookbook recipes

2022-01-13 Thread GitBox
amol- commented on a change in pull request #113: URL: https://github.com/apache/arrow-cookbook/pull/113#discussion_r783202899 ## File path: Makefile ## @@ -13,6 +13,7 @@ help: @echo "make test Test cookbook for all platforms." @echo "make py Bui

[GitHub] [arrow] jorisvandenbossche commented on pull request #12137: ARROW-14095: [C++] subtract(timestamp, duration) -> timestamp kernel

2022-01-13 Thread GitBox
jorisvandenbossche commented on pull request #12137: URL: https://github.com/apache/arrow/pull/12137#issuecomment-1011995754 I would expect that for non-matching resolution in the input, we cast to the most detailed resolution. But, this doesn't necessarily need to have kernels for all com

[GitHub] [arrow] ursabot edited a comment on pull request #11977: ARROW-14930: [C++] Make S3 directory detection more robust

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #11977: URL: https://github.com/apache/arrow/pull/11977#issuecomment-997855104 Benchmark runs are scheduled for baseline = 836abb35bd0c4b69e87d45cdbc2ddd3ec408003a and contender = 97a81595d6d25e371b28cf9574623f4c67685c6e. 97a81595d6d25e371b28cf9574

[GitHub] [arrow] jorisvandenbossche edited a comment on pull request #12137: ARROW-14095: [C++] subtract(timestamp, duration) -> timestamp kernel

2022-01-13 Thread GitBox
jorisvandenbossche edited a comment on pull request #12137: URL: https://github.com/apache/arrow/pull/12137#issuecomment-1011995754 I would expect that for non-matching resolution in the input, we cast to the most detailed resolution. But, this doesn't necessarily need to have kernels for

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
jorisvandenbossche commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783826887 ## File path: python/pyarrow/_compute.pyx ## @@ -1027,6 +1266,15 @@ cdef class _ScalarAggregateOptions(FunctionOptions): class ScalarAggre

[GitHub] [arrow] rok opened a new pull request #12139: ARROW-14097: [C++] subtract(time, duration) -> time kernel

2022-01-13 Thread GitBox
rok opened a new pull request #12139: URL: https://github.com/apache/arrow/pull/12139 [ARROW-14097](https://issues.apache.org/jira/browse/ARROW-14097) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [arrow] github-actions[bot] commented on pull request #12139: ARROW-14097: [C++] subtract(time, duration) -> time kernel

2022-01-13 Thread GitBox
github-actions[bot] commented on pull request #12139: URL: https://github.com/apache/arrow/pull/12139#issuecomment-1012002939 https://issues.apache.org/jira/browse/ARROW-14097 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[GitHub] [arrow] ursabot edited a comment on pull request #12125: ARROW-15303: [R] linting errors

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #12125: URL: https://github.com/apache/arrow/pull/12125#issuecomment-1011221205 Benchmark runs are scheduled for baseline = 9359026bad4a626de3699e023f96f0c1383d7032 and contender = af7668e0e674c0ebbdade4afa3c8d2e2503e04d4. af7668e0e674c0ebbdade4afa

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
jorisvandenbossche commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783840936 ## File path: python/pyarrow/_compute.pyx ## @@ -1027,6 +1266,15 @@ cdef class _ScalarAggregateOptions(FunctionOptions): class ScalarAggre

[GitHub] [arrow] pitrou commented on a change in pull request #12134: ARROW-13617: [C++] Make Decimal representations consistent

2022-01-13 Thread GitBox
pitrou commented on a change in pull request #12134: URL: https://github.com/apache/arrow/pull/12134#discussion_r783849378 ## File path: cpp/src/arrow/util/basic_decimal.h ## @@ -37,50 +38,116 @@ enum class DecimalStatus { kRescaleDataLoss, }; +template Review comment:

[GitHub] [arrow] pitrou commented on a change in pull request #12134: ARROW-13617: [C++] Make Decimal representations consistent

2022-01-13 Thread GitBox
pitrou commented on a change in pull request #12134: URL: https://github.com/apache/arrow/pull/12134#discussion_r783849620 ## File path: cpp/src/arrow/util/basic_decimal.cc ## @@ -1391,4 +1347,8 @@ BasicDecimal256 operator/(const BasicDecimal256& left, const BasicDecimal256& r

[GitHub] [arrow] pitrou closed pull request #12134: ARROW-13617: [C++] Make Decimal representations consistent

2022-01-13 Thread GitBox
pitrou closed pull request #12134: URL: https://github.com/apache/arrow/pull/12134 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

[GitHub] [arrow] ursabot edited a comment on pull request #11972: ARROW-15022: [R] install vignette and installation dev vignette need alt text for images

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #11972: URL: https://github.com/apache/arrow/pull/11972#issuecomment-996869748 Benchmark runs are scheduled for baseline = 6e20c6b9d7131af41f2e979529d06e507c731373 and contender = 670af338bc740888bffea65b28ee2bcc065b555a. 670af338bc740888bffea65b28

[GitHub] [arrow] ursabot commented on pull request #12134: ARROW-13617: [C++] Make Decimal representations consistent

2022-01-13 Thread GitBox
ursabot commented on pull request #12134: URL: https://github.com/apache/arrow/pull/12134#issuecomment-1012033781 Benchmark runs are scheduled for baseline = ab86daf3f7c8a67bee6a175a749575fd40417d27 and contender = d67a210b8c50b1d109e3d7780591d010e94cc9cc. d67a210b8c50b1d109e3d7780591d010

[GitHub] [arrow] pitrou commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
pitrou commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783866373 ## File path: python/pyarrow/compute.py ## @@ -105,66 +117,70 @@ def _decorate_compute_function(wrapper, exposed_name, func, options_class): doc_piec

[GitHub] [arrow] pitrou commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
pitrou commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783867694 ## File path: python/pyarrow/_compute.pyx ## @@ -690,6 +690,26 @@ cdef class _CastOptions(FunctionOptions): class CastOptions(_CastOptions): +"""

[GitHub] [arrow] ursabot edited a comment on pull request #12134: ARROW-13617: [C++] Make Decimal representations consistent

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #12134: URL: https://github.com/apache/arrow/pull/12134#issuecomment-1012033781 Benchmark runs are scheduled for baseline = ab86daf3f7c8a67bee6a175a749575fd40417d27 and contender = d67a210b8c50b1d109e3d7780591d010e94cc9cc. d67a210b8c50b1d109e3d7780

[GitHub] [arrow] pitrou commented on a change in pull request #12076: ARROW-10317: [Python] Document compute function options

2022-01-13 Thread GitBox
pitrou commented on a change in pull request #12076: URL: https://github.com/apache/arrow/pull/12076#discussion_r783868958 ## File path: python/pyarrow/_compute.pyx ## @@ -1326,6 +1781,25 @@ cdef class _QuantileOptions(FunctionOptions): class QuantileOptions(_QuantileOptio

[GitHub] [arrow] dragosmg commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
dragosmg commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783875294 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,81 @@ In this section we outline steps needed for unit testing in A

[GitHub] [arrow] dragosmg commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
dragosmg commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783880285 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,81 @@ In this section we outline steps needed for unit testing in A

[GitHub] [arrow] github-actions[bot] commented on pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
github-actions[bot] commented on pull request #12140: URL: https://github.com/apache/arrow/pull/12140#issuecomment-1012060003 https://issues.apache.org/jira/browse/ARROW-14759 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[GitHub] [arrow] AlenkaF commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
AlenkaF commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783895444 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,81 @@ In this section we outline steps needed for unit testing in Ar

[GitHub] [arrow] AlenkaF commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
AlenkaF commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783895769 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,81 @@ In this section we outline steps needed for unit testing in Ar

[GitHub] [arrow] dragosmg commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
dragosmg commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783898731 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,81 @@ In this section we outline steps needed for unit testing in A

[GitHub] [arrow] dragosmg commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
dragosmg commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783898967 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,81 @@ In this section we outline steps needed for unit testing in A

[GitHub] [arrow] AlenkaF commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
AlenkaF commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783901311 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,81 @@ In this section we outline steps needed for unit testing in Ar

[GitHub] [arrow] AlenkaF commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
AlenkaF commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783902392 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,83 @@ In this section we outline steps needed for unit testing in Ar

[GitHub] [arrow] AlenkaF commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
AlenkaF commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783905316 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,83 @@ In this section we outline steps needed for unit testing in Ar

[GitHub] [arrow] ursabot edited a comment on pull request #12079: ARROW-15249: [R] Autobrew + AWS sdk dependency

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #12079: URL: https://github.com/apache/arrow/pull/12079#issuecomment-1011440814 Benchmark runs are scheduled for baseline = bafaa76bad81c047b471071ce46245d261b62b97 and contender = b582af1771a84ae3a13438cf721f5c13c3bcff2a. b582af1771a84ae3a13438cf7

[GitHub] [arrow] pitrou closed pull request #11984: PARQUET-2109: [C++] Check if Parquet page has too few values

2022-01-13 Thread GitBox
pitrou closed pull request #11984: URL: https://github.com/apache/arrow/pull/11984 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

[GitHub] [arrow] pitrou commented on pull request #11984: PARQUET-2109: [C++] Check if Parquet page has too few values

2022-01-13 Thread GitBox
pitrou commented on pull request #11984: URL: https://github.com/apache/arrow/pull/11984#issuecomment-1012111524 Merged, thanks for the review @emkornfield ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abo

[GitHub] [arrow] ursabot edited a comment on pull request #11942: ARROW-14762: [Doc] Additional info and resources

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #11942: URL: https://github.com/apache/arrow/pull/11942#issuecomment-1011221219 Benchmark runs are scheduled for baseline = af7668e0e674c0ebbdade4afa3c8d2e2503e04d4 and contender = 7303b51ad2f9ac3f0c59bee7221771552ed4eb46. 7303b51ad2f9ac3f0c59bee72

[GitHub] [arrow] ursabot commented on pull request #11984: PARQUET-2109: [C++] Check if Parquet page has too few values

2022-01-13 Thread GitBox
ursabot commented on pull request #11984: URL: https://github.com/apache/arrow/pull/11984#issuecomment-1012117682 Benchmark runs are scheduled for baseline = d67a210b8c50b1d109e3d7780591d010e94cc9cc and contender = 77fc23fcae0331da3adf94619a381a371a6e414f. 77fc23fcae0331da3adf94619a381a37

[GitHub] [arrow] coryan commented on pull request #12127: ARROW-14924: [C++] generic fs tests for GcsFileSystem

2022-01-13 Thread GitBox
coryan commented on pull request #12127: URL: https://github.com/apache/arrow/pull/12127#issuecomment-1012129075 @pitrou or @emkornfield, could you take a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [arrow] ursabot edited a comment on pull request #11984: PARQUET-2109: [C++] Check if Parquet page has too few values

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #11984: URL: https://github.com/apache/arrow/pull/11984#issuecomment-1012117682 Benchmark runs are scheduled for baseline = d67a210b8c50b1d109e3d7780591d010e94cc9cc and contender = 77fc23fcae0331da3adf94619a381a371a6e414f. 77fc23fcae0331da3adf94619

[GitHub] [arrow] rok opened a new pull request #12141: ARROW-14100: [C++] subtract(duration, duration) -> duration kernel

2022-01-13 Thread GitBox
rok opened a new pull request #12141: URL: https://github.com/apache/arrow/pull/12141 [ARROW-14100](https://issues.apache.org/jira/browse/ARROW-14100) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [arrow] thisisnic closed pull request #11751: ARROW-14694: [R] Let me dput a schema

2022-01-13 Thread GitBox
thisisnic closed pull request #11751: URL: https://github.com/apache/arrow/pull/11751 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsub

[GitHub] [arrow] github-actions[bot] commented on pull request #12141: ARROW-14100: [C++] subtract(duration, duration) -> duration kernel

2022-01-13 Thread GitBox
github-actions[bot] commented on pull request #12141: URL: https://github.com/apache/arrow/pull/12141#issuecomment-1012148276 https://issues.apache.org/jira/browse/ARROW-14100 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[GitHub] [arrow] ursabot commented on pull request #11751: ARROW-14694: [R] Let me dput a schema

2022-01-13 Thread GitBox
ursabot commented on pull request #11751: URL: https://github.com/apache/arrow/pull/11751#issuecomment-1012152838 Benchmark runs are scheduled for baseline = 77fc23fcae0331da3adf94619a381a371a6e414f and contender = c48353fd21b26bcb894d791d49c29371607eb9b9. c48353fd21b26bcb894d791d49c29371

[GitHub] [arrow] dragosmg commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
dragosmg commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783974449 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,83 @@ In this section we outline steps needed for unit testing in A

[GitHub] [arrow] dragosmg commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
dragosmg commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783976168 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,81 @@ In this section we outline steps needed for unit testing in A

[GitHub] [arrow] dragosmg commented on a change in pull request #12140: ARROW-14759 [Doc] Steps in making your first PR - test in R

2022-01-13 Thread GitBox
dragosmg commented on a change in pull request #12140: URL: https://github.com/apache/arrow/pull/12140#discussion_r783977113 ## File path: docs/source/developers/guide/step_by_step/testing.rst ## @@ -62,29 +62,83 @@ In this section we outline steps needed for unit testing in A

[GitHub] [arrow] nealrichardson commented on a change in pull request #12133: ARROW-10485: [R] Accept partitioning in open_dataset when file paths are hive-style

2022-01-13 Thread GitBox
nealrichardson commented on a change in pull request #12133: URL: https://github.com/apache/arrow/pull/12133#discussion_r783981221 ## File path: r/R/dataset-factory.R ## @@ -60,16 +61,65 @@ DatasetFactory$create <- function(x, return(FileSystemDatasetFactory$create(path_an

[GitHub] [arrow] dragosmg commented on a change in pull request #11921: ARROW-12743 [R] Add DESCRIPTION fields for dev dependencies

2022-01-13 Thread GitBox
dragosmg commented on a change in pull request #11921: URL: https://github.com/apache/arrow/pull/11921#discussion_r783982988 ## File path: r/vignettes/developers/workflow.Rmd ## @@ -4,6 +4,22 @@ knitr::opts_chunk$set(error = TRUE, eval = FALSE) ``` +The Arrow R package uses

[GitHub] [arrow] dragosmg commented on a change in pull request #11921: ARROW-12743 [R] Add DESCRIPTION fields for dev dependencies

2022-01-13 Thread GitBox
dragosmg commented on a change in pull request #11921: URL: https://github.com/apache/arrow/pull/11921#discussion_r783983716 ## File path: r/vignettes/developers/workflow.Rmd ## @@ -34,7 +44,7 @@ pkgdown::build_site(preview=TRUE) The R code in the package follows [the tidyve

[GitHub] [arrow-datafusion] alamb commented on pull request #1523: Update to arrow-7.0.0

2022-01-13 Thread GitBox
alamb commented on pull request #1523: URL: https://github.com/apache/arrow-datafusion/pull/1523#issuecomment-1012166129 > thanks, @alamb I can go on works depending on arrow-rs 7.0.0. Sorry it took so long @liukun4515 -- I hope this delay will be reduced by the proposal in https:/

[GitHub] [arrow] ursabot edited a comment on pull request #11751: ARROW-14694: [R] Let me dput a schema

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #11751: URL: https://github.com/apache/arrow/pull/11751#issuecomment-1012152838 Benchmark runs are scheduled for baseline = 77fc23fcae0331da3adf94619a381a371a6e414f and contender = c48353fd21b26bcb894d791d49c29371607eb9b9. c48353fd21b26bcb894d791d4

[GitHub] [arrow] thisisnic commented on a change in pull request #12133: ARROW-10485: [R] Accept partitioning in open_dataset when file paths are hive-style

2022-01-13 Thread GitBox
thisisnic commented on a change in pull request #12133: URL: https://github.com/apache/arrow/pull/12133#discussion_r783984955 ## File path: r/R/dataset-factory.R ## @@ -60,16 +61,71 @@ DatasetFactory$create <- function(x, return(FileSystemDatasetFactory$create(path_and_fs$

[GitHub] [arrow] jonkeane commented on a change in pull request #11738: ARROW-14169: [R] altrep for factors

2022-01-13 Thread GitBox
jonkeane commented on a change in pull request #11738: URL: https://github.com/apache/arrow/pull/11738#discussion_r783999870 ## File path: r/src/altrep.cpp ## @@ -23,6 +23,7 @@ #include #include #include +#include Review comment: Aaah, this is failing because `vi

[GitHub] [arrow-datafusion] alamb commented on issue #1540: Replace DataFusionError/Result with impl Error for ObjectStore and Reader

2022-01-13 Thread GitBox
alamb commented on issue #1540: URL: https://github.com/apache/arrow-datafusion/issues/1540#issuecomment-1012184525 It may need to be ``` impl From> for DataFusionError { fn from(err: Box) -> Self { DataFusionError::External(err) } ``` Though I
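
For readers following the snippet above: the generic parameters seem to have been lost when the message was archived, so the exact types alamb proposed are not recoverable from this listing. What follows is only a minimal sketch of the general pattern (a `From` impl that wraps a boxed error in an `External` variant), assuming a boxed `dyn Error + Send + Sync`. `QueryError`, `parse_port` and `load_port` are hypothetical names for illustration and are not DataFusion APIs.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical stand-in for an error enum with an `External` variant.
// The boxed trait-object type is an assumption, not taken from the thread.
#[derive(Debug)]
pub enum QueryError {
    External(Box<dyn Error + Send + Sync>),
}

impl fmt::Display for QueryError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            QueryError::External(err) => write!(f, "external error: {}", err),
        }
    }
}

impl Error for QueryError {}

// The conversion the snippet appears to describe: wrap any boxed error.
impl From<Box<dyn Error + Send + Sync>> for QueryError {
    fn from(err: Box<dyn Error + Send + Sync>) -> Self {
        QueryError::External(err)
    }
}

// A lower layer that reports failures as a boxed error.
pub fn parse_port(s: &str) -> Result<u16, Box<dyn Error + Send + Sync>> {
    // `?` boxes the `ParseIntError` via the standard library's `From` impl.
    Ok(s.parse::<u16>()?)
}

// A caller that returns the enum; `?` uses the `From` impl defined above.
pub fn load_port(s: &str) -> Result<u16, QueryError> {
    Ok(parse_port(s)?)
}
```

With such an impl in place, code that returns the enum can use `?` on any call yielding a boxed error, without an explicit `map_err` at each call site.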

[GitHub] [arrow] lidavidm commented on a change in pull request #12100: ARROW-15061: [C++] Add logging for kernel functions and exec plan nodes

2022-01-13 Thread GitBox
lidavidm commented on a change in pull request #12100: URL: https://github.com/apache/arrow/pull/12100#discussion_r784008528 ## File path: cpp/src/arrow/util/tracing_internal.h ## @@ -97,6 +100,63 @@ AsyncGenerator WrapAsyncGenerator(AsyncGenerator wrapped, return fut;

[GitHub] [arrow-cookbook] davisusanibar commented on a change in pull request #113: [Java]: Java cookbook recipes

2022-01-13 Thread GitBox
davisusanibar commented on a change in pull request #113: URL: https://github.com/apache/arrow-cookbook/pull/113#discussion_r784012184 ## File path: Makefile ## @@ -13,6 +13,7 @@ help: @echo "make testTest cookbook for all platforms." @echo "make py

[GitHub] [arrow] nealrichardson commented on a change in pull request #12133: ARROW-10485: [R] Accept partitioning in open_dataset when file paths are hive-style

2022-01-13 Thread GitBox
nealrichardson commented on a change in pull request #12133: URL: https://github.com/apache/arrow/pull/12133#discussion_r784014538 ## File path: r/R/dataset-factory.R ## @@ -60,16 +61,71 @@ DatasetFactory$create <- function(x, return(FileSystemDatasetFactory$create(path_an

[GitHub] [arrow] ursabot edited a comment on pull request #11035: ARROW-13811: [Java] Provide a general out-of-place sorter

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #11035: URL: https://github.com/apache/arrow/pull/11035#issuecomment-986163875 Benchmark runs are scheduled for baseline = 310fca93d7b44081418fafd5fd30ce9342839ee4 and contender = 6cd288e686abcdf2957582d6652ddae3d031d0fa. 6cd288e686abcdf2957582d665

[GitHub] [arrow] nealrichardson commented on a change in pull request #12133: ARROW-10485: [R] Accept partitioning in open_dataset when file paths are hive-style

2022-01-13 Thread GitBox
nealrichardson commented on a change in pull request #12133: URL: https://github.com/apache/arrow/pull/12133#discussion_r784016948 ## File path: r/R/dataset-factory.R ## @@ -60,16 +61,71 @@ DatasetFactory$create <- function(x, return(FileSystemDatasetFactory$create(path_an

[GitHub] [arrow] nealrichardson commented on a change in pull request #12133: ARROW-10485: [R] Accept partitioning in open_dataset when file paths are hive-style

2022-01-13 Thread GitBox
nealrichardson commented on a change in pull request #12133: URL: https://github.com/apache/arrow/pull/12133#discussion_r784018084 ## File path: r/R/dataset.R ## @@ -23,8 +23,51 @@ #' `open_dataset()` to point to a directory of data files and return a #' `Dataset`, then use `

[GitHub] [arrow] ursabot edited a comment on pull request #12097: ARROW-14590: [R] Implement lubridate::week

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #12097: URL: https://github.com/apache/arrow/pull/12097#issuecomment-1011490798 Benchmark runs are scheduled for baseline = b582af1771a84ae3a13438cf721f5c13c3bcff2a and contender = d1fb8e70d4a7f49bcb21ca1ce5c8127a45d1c4f7. d1fb8e70d4a7f49bcb21ca1ce

[GitHub] [arrow-datafusion] alamb commented on issue #1549: Proposal: Remove `Accumulator::update` and `Accumulator::merge`

2022-01-13 Thread GitBox
alamb commented on issue #1549: URL: https://github.com/apache/arrow-datafusion/issues/1549#issuecomment-1012197926 @liukun4515 > Can we change the default implementation of the accumulator trait to this? We could change the default implementation, however, I am not sure what

[GitHub] [arrow] ursabot edited a comment on pull request #11035: ARROW-13811: [Java] Provide a general out-of-place sorter

2022-01-13 Thread GitBox
ursabot edited a comment on pull request #11035: URL: https://github.com/apache/arrow/pull/11035#issuecomment-986163875 Benchmark runs are scheduled for baseline = 310fca93d7b44081418fafd5fd30ce9342839ee4 and contender = 6cd288e686abcdf2957582d6652ddae3d031d0fa. 6cd288e686abcdf2957582d665

[GitHub] [arrow-datafusion] liukun4515 commented on a change in pull request #1526: Initial MemoryManager and DiskManager APIs for query execution + External Sort implementation

2022-01-13 Thread GitBox
liukun4515 commented on a change in pull request #1526: URL: https://github.com/apache/arrow-datafusion/pull/1526#discussion_r784025939 ## File path: datafusion/src/execution/mod.rs ## @@ -19,4 +19,7 @@ pub mod context; pub mod dataframe_impl; +pub mod disk_manager; Review

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1526: Initial MemoryManager and DiskManager APIs for query execution + External Sort implementation

2022-01-13 Thread GitBox
alamb commented on a change in pull request #1526: URL: https://github.com/apache/arrow-datafusion/pull/1526#discussion_r784032226 ## File path: datafusion/src/execution/mod.rs ## @@ -19,4 +19,7 @@ pub mod context; pub mod dataframe_impl; +pub mod disk_manager; Review comm

[GitHub] [arrow] thisisnic commented on a change in pull request #12133: ARROW-10485: [R] Accept partitioning in open_dataset when file paths are hive-style

2022-01-13 Thread GitBox
thisisnic commented on a change in pull request #12133: URL: https://github.com/apache/arrow/pull/12133#discussion_r784044260 ## File path: r/R/dataset.R ## @@ -23,8 +23,51 @@ #' `open_dataset()` to point to a directory of data files and return a #' `Dataset`, then use `dplyr

[GitHub] [arrow-datafusion] liukun4515 commented on issue #1549: Proposal: Remove `Accumulator::update` and `Accumulator::merge`

2022-01-13 Thread GitBox
liukun4515 commented on issue #1549: URL: https://github.com/apache/arrow-datafusion/issues/1549#issuecomment-1012225389 > It makes sense to me. Removing the `update` and `merge` is more clear for developers. -- This is an automated message from the Apache Git Service. To respond

[GitHub] [arrow-datafusion] alamb edited a comment on issue #1549: Proposal: Remove `Accumulator::update` and `Accumulator::merge`

2022-01-13 Thread GitBox
alamb edited a comment on issue #1549: URL: https://github.com/apache/arrow-datafusion/issues/1549#issuecomment-1012197926 @liukun4515 > Can we change the default implementation of the accumulator trait to this? We could change the default implementation, however, I am not su

[GitHub] [arrow-rs] alamb commented on a change in pull request #1054: Improve parquet reading performance for columns with nulls by preserving bitmask when possible (#1037)

2022-01-13 Thread GitBox
alamb commented on a change in pull request #1054: URL: https://github.com/apache/arrow-rs/pull/1054#discussion_r784056677 ## File path: parquet/src/arrow/record_reader/definition_levels.rs ## @@ -228,6 +232,20 @@ impl ColumnLevelDecoder for DefinitionLevelDecoder { } }

[GitHub] [arrow-rs] alamb closed issue #1037: Parquet Preserve BitMask

2022-01-13 Thread GitBox
alamb closed issue #1037: URL: https://github.com/apache/arrow-rs/issues/1037 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@a

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1526: Initial MemoryManager and DiskManager APIs for query execution + External Sort implementation

2022-01-13 Thread GitBox
yjshen commented on a change in pull request #1526: URL: https://github.com/apache/arrow-datafusion/pull/1526#discussion_r784057787 ## File path: datafusion/src/execution/mod.rs ## @@ -19,4 +19,7 @@ pub mod context; pub mod dataframe_impl; +pub mod disk_manager; Review com

[GitHub] [arrow-rs] alamb merged pull request #1054: Improve parquet reading performance for columns with nulls by preserving bitmask when possible (#1037)

2022-01-13 Thread GitBox
alamb merged pull request #1054: URL: https://github.com/apache/arrow-rs/pull/1054 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

[GitHub] [arrow-rs] alamb commented on pull request #1054: Improve parquet reading performance for columns with nulls by preserving bitmask when possible (#1037)

2022-01-13 Thread GitBox
alamb commented on pull request #1054: URL: https://github.com/apache/arrow-rs/pull/1054#issuecomment-1012233931 Thanks @tustvold -- this is pretty epic -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1526: Initial MemoryManager and DiskManager APIs for query execution + External Sort implementation

2022-01-13 Thread GitBox
yjshen commented on a change in pull request #1526: URL: https://github.com/apache/arrow-datafusion/pull/1526#discussion_r784058382 ## File path: datafusion/src/execution/runtime_env.rs ## @@ -0,0 +1,149 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or m
