[GitHub] [arrow-rs] codecov-commenter commented on pull request #531: fix take kernel null handling on structs

2021-07-08 Thread GitBox
codecov-commenter commented on pull request #531: URL: https://github.com/apache/arrow-rs/pull/531#issuecomment-876905023 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/531?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+A

[GitHub] [arrow-rs] bjchambers opened a new pull request #531: fix take kernel null handling on structs

2021-07-08 Thread GitBox
bjchambers opened a new pull request #531: URL: https://github.com/apache/arrow-rs/pull/531 This closes #530. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscrib

[GitHub] [arrow-rs] bjchambers opened a new issue #530: Take kernel doesn't handle nulls and structs correctly

2021-07-08 Thread GitBox
bjchambers opened a new issue #530: URL: https://github.com/apache/arrow-rs/issues/530 **Describe the bug** Applying the Take kernel to an array of indices and an array of structs, I expect nulls in the output in two cases: 1. If the index is null for the current row 2. If t

[GitHub] [arrow-datafusion] lvheyang opened a new issue #699: Datafusion: math function does not support type f32

2021-07-08 Thread GitBox
lvheyang opened a new issue #699: URL: https://github.com/apache/arrow-datafusion/issues/699 **Describe the bug** Math functions such as sqrt() / sin() ... do not support the f32 type. **To Reproduce** Example Code ``` use std::sync::Arc; use datafusion::arrow:

[GitHub] [arrow] cyb70289 closed pull request #10685: ARROW-13290: [C++] Add missing include

2021-07-08 Thread GitBox
cyb70289 closed pull request #10685: URL: https://github.com/apache/arrow/pull/10685

[GitHub] [arrow] cyb70289 commented on pull request #10685: ARROW-13290: [C++] Add missing include

2021-07-08 Thread GitBox
cyb70289 commented on pull request #10685: URL: https://github.com/apache/arrow/pull/10685#issuecomment-876864385 The CI failure is not related. I've created a Jira issue to track it: https://issues.apache.org/jira/browse/ARROW-13292

[GitHub] [arrow] jonkeane closed pull request #10676: ARROW-13265: [R] cli valgrind errors in nightlies

2021-07-08 Thread GitBox
jonkeane closed pull request #10676: URL: https://github.com/apache/arrow/pull/10676

[GitHub] [arrow] ZMZ91 commented on pull request #10606: ARROW-13005: [C++] Add support for take implementation on dense union type

2021-07-08 Thread GitBox
ZMZ91 commented on pull request #10606: URL: https://github.com/apache/arrow/pull/10606#issuecomment-876854392 Hi @bkietz and @pitrou, just a question not quite related to this PR and about dense union arrays. I've seen that the limit for array data is under 2GB. Does it apply to nested arr

[GitHub] [arrow-datafusion] Jimexist commented on pull request #687: #554: Lead/lag window function with offset and default value arguments

2021-07-08 Thread GitBox
Jimexist commented on pull request #687: URL: https://github.com/apache/arrow-datafusion/pull/687#issuecomment-876838160 If the first example works then it's fine. The second example might fail for different reasons not related to this change.

[GitHub] [arrow] westonpace commented on a change in pull request #10628: ARROW-12364: [Python] [Dataset] Add metadata_collector option to ds.write_dataset()

2021-07-08 Thread GitBox
westonpace commented on a change in pull request #10628: URL: https://github.com/apache/arrow/pull/10628#discussion_r02953 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -2672,47 +2672,57 @@ def test_feather_format(tempdir, dataset_reader): dataset_reader

[GitHub] [arrow] westonpace commented on a change in pull request #10628: ARROW-12364: [Python] [Dataset] Add metadata_collector option to ds.write_dataset()

2021-07-08 Thread GitBox
westonpace commented on a change in pull request #10628: URL: https://github.com/apache/arrow/pull/10628#discussion_r01945 ## File path: python/pyarrow/dataset.py ## @@ -731,6 +731,12 @@ def write_dataset(data, base_dir, basename_template=None, format=None, (e.g.

[GitHub] [arrow] rok commented on pull request #10457: ARROW-12980: [C++] Kernels to extract datetime components should be timezone aware

2021-07-08 Thread GitBox
rok commented on pull request #10457: URL: https://github.com/apache/arrow/pull/10457#issuecomment-876821558 @jorisvandenbossche @pitrou I'm going to look into the R issues but other than that this is ready for review.

[GitHub] [arrow] kou closed pull request #10687: ARROW-13291: [GLib][CI] Require gobject-introspection 3.4.5 or later

2021-07-08 Thread GitBox
kou closed pull request #10687: URL: https://github.com/apache/arrow/pull/10687

[GitHub] [arrow] kou commented on pull request #10687: ARROW-13291: [GLib][CI] Require gobject-introspection 3.4.5 or later

2021-07-08 Thread GitBox
kou commented on pull request #10687: URL: https://github.com/apache/arrow/pull/10687#issuecomment-876818607 +1

[GitHub] [arrow] kou commented on pull request #10687: ARROW-13291: [GLib][CI] Require gobject-introspection 3.4.5 or later

2021-07-08 Thread GitBox
kou commented on pull request #10687: URL: https://github.com/apache/arrow/pull/10687#issuecomment-876818270 Passed: https://github.com/kou/crossbow/runs/3024237635

[GitHub] [arrow] kou edited a comment on pull request #10687: ARROW-13291: [GLib][CI] Require gobject-introspection 3.4.5 or later

2021-07-08 Thread GitBox
kou edited a comment on pull request #10687: URL: https://github.com/apache/arrow/pull/10687#issuecomment-876809827 It seems that crossbow doesn't work... https://github.com/ursacomputing/crossbow/actions/runs/1012864521 > actions/setup-python@v2 is not allowed to be used in u

[GitHub] [arrow] kou commented on pull request #10687: ARROW-13291: [GLib][CI] Require gobject-introspection 3.4.5 or later

2021-07-08 Thread GitBox
kou commented on pull request #10687: URL: https://github.com/apache/arrow/pull/10687#issuecomment-876809827 It seems that crossbow doesn't work... https://github.com/ursacomputing/crossbow/actions/runs/1012864521 > actions/setup-python@v2 is not allowed to be used in ursac

[GitHub] [arrow] kszucs commented on pull request #10684: ARROW-13239: [Python] [Doc] Expose signatures in pyx modules

2021-07-08 Thread GitBox
kszucs commented on pull request #10684: URL: https://github.com/apache/arrow/pull/10684#issuecomment-876807659 Shall we include other directives too, similarly to [lib.pyx](https://github.com/apache/arrow/blob/master/python/pyarrow/lib.pyx#L18-L21)? Perhaps there is a way to set those glo

[GitHub] [arrow] alexbaden commented on pull request #10685: ARROW-13290: [C++] Add missing include

2021-07-08 Thread GitBox
alexbaden commented on pull request #10685: URL: https://github.com/apache/arrow/pull/10685#issuecomment-876793810 I am not sure why the test failure occurred, but I think it is unrelated to the small PR change.

[GitHub] [arrow] kou commented on a change in pull request #10614: ARROW-13100: [MATLAB] Integrate GoogleTest with MATLAB Interface C++ Code

2021-07-08 Thread GitBox
kou commented on a change in pull request #10614: URL: https://github.com/apache/arrow/pull/10614#discussion_r666514068 ## File path: matlab/CMakeLists.txt ## @@ -32,8 +110,29 @@ endif() set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/cmake_modules) -# Arrow

[GitHub] [arrow] westonpace commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
westonpace commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666506838 ## File path: cpp/src/arrow/compute/exec/exec_plan.cc ## @@ -282,20 +305,21 @@ struct SourceNode : ExecNode { void StopProducing(ExecNode* output)

[GitHub] [arrow] westonpace commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
westonpace commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666505934 ## File path: cpp/src/arrow/compute/exec/exec_plan.cc ## @@ -220,58 +240,61 @@ struct SourceNode : ExecNode { const char* kind_name() override { r

[GitHub] [arrow] westonpace commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
westonpace commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666505229 ## File path: cpp/src/arrow/dataset/scanner.cc ## @@ -604,11 +585,90 @@ Result AsyncScanner::ScanBatchesUnorderedAsync() return ScanBatchesUnordere

[GitHub] [arrow] eerhardt commented on a change in pull request #10527: ARROW-6870: [C#] Add Support for Dictionary Arrays and Dictionary Encoding

2021-07-08 Thread GitBox
eerhardt commented on a change in pull request #10527: URL: https://github.com/apache/arrow/pull/10527#discussion_r666503779 ## File path: csharp/src/Apache.Arrow/Ipc/ArrowStreamWriter.cs ## @@ -248,6 +263,13 @@ private protected void WriteRecordBatchInternal(RecordBatch recor

[GitHub] [arrow] eerhardt commented on a change in pull request #10527: ARROW-6870: [C#] Add Support for Dictionary Arrays and Dictionary Encoding

2021-07-08 Thread GitBox
eerhardt commented on a change in pull request #10527: URL: https://github.com/apache/arrow/pull/10527#discussion_r666495112 ## File path: csharp/src/Apache.Arrow/Ipc/ArrowStreamWriter.cs ## @@ -723,4 +868,62 @@ public virtual void Dispose() } } } + +

[GitHub] [arrow] eerhardt commented on a change in pull request #10527: ARROW-6870: [C#] Add Support for Dictionary Arrays and Dictionary Encoding

2021-07-08 Thread GitBox
eerhardt commented on a change in pull request #10527: URL: https://github.com/apache/arrow/pull/10527#discussion_r666494514 ## File path: csharp/src/Apache.Arrow/Ipc/ArrowStreamWriter.cs ## @@ -723,4 +868,62 @@ public virtual void Dispose() } } } + +

[GitHub] [arrow] kevingurney commented on pull request #10614: ARROW-13100: [MATLAB] Integrate GoogleTest with MATLAB Interface C++ Code

2021-07-08 Thread GitBox
kevingurney commented on pull request #10614: URL: https://github.com/apache/arrow/pull/10614#issuecomment-876721697 Thanks, @kou, for all of your really helpful feedback! We've addressed all of your comments. We will now focus on adding the additional logic to make this work on Wind

[GitHub] [arrow] eerhardt commented on a change in pull request #10527: ARROW-6870: [C#] Add Support for Dictionary Arrays and Dictionary Encoding

2021-07-08 Thread GitBox
eerhardt commented on a change in pull request #10527: URL: https://github.com/apache/arrow/pull/10527#discussion_r666493486 ## File path: csharp/src/Apache.Arrow/Ipc/ArrowStreamWriter.cs ## @@ -723,4 +868,62 @@ public virtual void Dispose() } } } + +

[GitHub] [arrow] kevingurney commented on a change in pull request #10614: ARROW-13100: [MATLAB] Integrate GoogleTest with MATLAB Interface C++ Code

2021-07-08 Thread GitBox
kevingurney commented on a change in pull request #10614: URL: https://github.com/apache/arrow/pull/10614#discussion_r666492483 ## File path: matlab/CMakeLists.txt ## @@ -17,6 +17,84 @@ cmake_minimum_required(VERSION 3.20) +# Build the Arrow C++ libraries WITHOUT bundled G

[GitHub] [arrow] eerhardt commented on a change in pull request #10527: ARROW-6870: [C#] Add Support for Dictionary Arrays and Dictionary Encoding

2021-07-08 Thread GitBox
eerhardt commented on a change in pull request #10527: URL: https://github.com/apache/arrow/pull/10527#discussion_r666491771 ## File path: csharp/src/Apache.Arrow/Ipc/ArrowStreamWriter.cs ## @@ -723,4 +868,62 @@ public virtual void Dispose() } } } + +

[GitHub] [arrow] eerhardt commented on a change in pull request #10527: ARROW-6870: [C#] Add Support for Dictionary Arrays and Dictionary Encoding

2021-07-08 Thread GitBox
eerhardt commented on a change in pull request #10527: URL: https://github.com/apache/arrow/pull/10527#discussion_r666462335 ## File path: csharp/src/Apache.Arrow/Ipc/ArrowReaderImplementation.cs ## @@ -30,6 +30,13 @@ internal abstract class ArrowReaderImplementation : IDispos

[GitHub] [arrow] github-actions[bot] commented on pull request #10687: ARROW-13291: [GLib][CI] Require gobject-introspection 3.4.5 or later

2021-07-08 Thread GitBox
github-actions[bot] commented on pull request #10687: URL: https://github.com/apache/arrow/pull/10687#issuecomment-876698848 Revision: 0d3fa87f4a8acce6513c2eab64d2fa67e5182df1 Submitted crossbow builds: [ursacomputing/crossbow @ actions-581](https://github.com/ursacomputing/crossbow/

[GitHub] [arrow] github-actions[bot] commented on pull request #10687: ARROW-13291: [GLib][CI] Require gobject-introspection 3.4.5 or later

2021-07-08 Thread GitBox
github-actions[bot] commented on pull request #10687: URL: https://github.com/apache/arrow/pull/10687#issuecomment-876698412 https://issues.apache.org/jira/browse/ARROW-13291

[GitHub] [arrow] kou commented on pull request #10687: ARROW-13291: [GLib][CI] Require gobject-introspection 3.4.5 or later

2021-07-08 Thread GitBox
kou commented on pull request #10687: URL: https://github.com/apache/arrow/pull/10687#issuecomment-876698385 @github-actions crossbow submit test-debian-c-glib

[GitHub] [arrow] kou opened a new pull request #10687: ARROW-13291: [GLib][CI] Require gobject-introspection 3.4.5 or later

2021-07-08 Thread GitBox
kou opened a new pull request #10687: URL: https://github.com/apache/arrow/pull/10687 It's needed for Flight tests.

[GitHub] [arrow] rok commented on pull request #10610: ARROW-13033: [C++] Kernel to localize naive timestamps to a timezone (preserving clock-time)

2021-07-08 Thread GitBox
rok commented on pull request #10610: URL: https://github.com/apache/arrow/pull/10610#issuecomment-876688529 @jorisvandenbossche I've added nonexistent and ambiguous time handling. I'll try to template calls instead of having a single general one. What I'm not sure about now is if tr

[GitHub] [arrow] eerhardt commented on a change in pull request #10527: ARROW-6870: [C#] Add Support for Dictionary Arrays and Dictionary Encoding

2021-07-08 Thread GitBox
eerhardt commented on a change in pull request #10527: URL: https://github.com/apache/arrow/pull/10527#discussion_r666460092 ## File path: csharp/src/Apache.Arrow/Arrays/DictionaryArray.cs ## @@ -0,0 +1,67 @@ +// Licensed to the Apache Software Foundation (ASF) under one or mo

[GitHub] [arrow-datafusion] adamhooper commented on issue #686: Specific timezone support for `to_timetamp*()`

2021-07-08 Thread GitBox
adamhooper commented on issue #686: URL: https://github.com/apache/arrow-datafusion/issues/686#issuecomment-876681175 @velvia Great points! My last two suggestions were exactly about interop -- and a transition period. Today, DataFusion users interpret timezone=null to mean `TIMESTAM

[GitHub] [arrow] github-actions[bot] commented on pull request #10686: ARROW-13289: [C++] Accept integer args in trig/log functions via promotion to double

2021-07-08 Thread GitBox
github-actions[bot] commented on pull request #10686: URL: https://github.com/apache/arrow/pull/10686#issuecomment-87148 https://issues.apache.org/jira/browse/ARROW-13289

[GitHub] [arrow] lidavidm opened a new pull request #10686: ARROW-13289: [C++] Accept integer args in trig/log functions via promotion to double

2021-07-08 Thread GitBox
lidavidm opened a new pull request #10686: URL: https://github.com/apache/arrow/pull/10686 Instead of adding/generating separate kernels for integers, just promote the arguments instead.
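
The idea in #10686 (promote integer arguments rather than generate integer-specific kernels) can be illustrated with a minimal sketch. This is plain Python over lists, not Arrow's C++ kernel dispatch; `promote_to_double` and `apply_float_kernel` are hypothetical names introduced only for illustration.

```python
import math
from typing import Callable, List, Optional, Union

Number = Union[int, float]


def promote_to_double(values: List[Optional[Number]]) -> List[Optional[float]]:
    """Cast every non-null value to float (hypothetical helper, not Arrow API)."""
    return [None if v is None else float(v) for v in values]


def apply_float_kernel(
    kernel: Callable[[float], float],
    values: List[Optional[Number]],
) -> List[Optional[float]]:
    """Run one float-only kernel over mixed int/float input.

    Integers are promoted first, so no separate integer kernel is needed.
    Nulls (None) pass through unchanged.
    """
    promoted = promote_to_double(values)
    return [None if v is None else kernel(v) for v in promoted]
```

With this shape, a single float implementation such as `math.sqrt` or `math.log` serves integer and floating-point input alike, at the cost of one cast per element.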

[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc

2021-07-08 Thread GitBox
nirandaperera commented on pull request #10679: URL: https://github.com/apache/arrow/pull/10679#issuecomment-876664929 @wesm with super scalar variant, I get the following, ``` BEFORE: -

[GitHub] [arrow] shollyman commented on pull request #10603: ARROW-13191: [Go] allow external schema in ipc readers

2021-07-08 Thread GitBox
shollyman commented on pull request #10603: URL: https://github.com/apache/arrow/pull/10603#issuecomment-876651126 The issue is the difference in abstractions. In other languages, there exists a concept of a single master "reader" which is schema aware, and to which we supply a seri

[GitHub] [arrow] github-actions[bot] commented on pull request #10685: ARROW-13290: [C++] Add missing include

2021-07-08 Thread GitBox
github-actions[bot] commented on pull request #10685: URL: https://github.com/apache/arrow/pull/10685#issuecomment-876650321 https://issues.apache.org/jira/browse/ARROW-13290

[GitHub] [arrow] github-actions[bot] commented on pull request #10685: [ARROW-13290] Add missing include

2021-07-08 Thread GitBox
github-actions[bot] commented on pull request #10685: URL: https://github.com/apache/arrow/pull/10685#issuecomment-876650096 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you ope

[GitHub] [arrow] alexbaden opened a new pull request #10685: [ARROW-13290] Add missing include

2021-07-08 Thread GitBox
alexbaden opened a new pull request #10685: URL: https://github.com/apache/arrow/pull/10685 Noticed this issue when compiling with clang-12 and gcc-11 on Arch Linux (both using the PKGBUILD and building from source).

[GitHub] [arrow-datafusion] Dandandan opened a new pull request #698: Avoid sleeping between tasks

2021-07-08 Thread GitBox
Dandandan opened a new pull request #698: URL: https://github.com/apache/arrow-datafusion/pull/698 # Which issue does this PR close? Closes #697 # Rationale for this change Currently, when polling for work, the executor always waits 250 ms in between polling for tasks.
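
The rationale in #698 can be sketched as a polling loop that sleeps only when the last poll returned no work. This is an illustrative Python sketch, not the Ballista executor code; `run_executor`, `poll`, `execute`, and `max_polls` are hypothetical names, and the 0.25 s default simply mirrors the 250 ms interval mentioned above.

```python
import time
from typing import Any, Callable, Optional


def run_executor(
    poll: Callable[[], Optional[Any]],
    execute: Callable[[Any], None],
    idle_sleep: float = 0.25,
    max_polls: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll for tasks, sleeping only when a poll comes back empty.

    Returns the number of tasks executed. `max_polls` bounds the loop so the
    sketch terminates; a real executor would loop until shutdown.
    """
    executed = 0
    for _ in range(max_polls):
        task = poll()
        if task is None:
            # Nothing to do: back off before asking again.
            sleep(idle_sleep)
            continue
        # Work was available: run it and poll again immediately.
        execute(task)
        executed += 1
    return executed
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.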

[GitHub] [arrow-datafusion] Dandandan opened a new issue #697: Avoid sleeping in between task in Ballista

2021-07-08 Thread GitBox
Dandandan opened a new issue #697: URL: https://github.com/apache/arrow-datafusion/issues/697 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Currently, when polling for work, the executor always waits 250 ms in between recei

[GitHub] [arrow-datafusion] velvia commented on issue #686: Specific timezone support for `to_timetamp*()`

2021-07-08 Thread GitBox
velvia commented on issue #686: URL: https://github.com/apache/arrow-datafusion/issues/686#issuecomment-876625232 @adamhooper personally I'd be OK with your recommendations, but some thoughts: - There are some interop considerations between `TIMESTAMP WITH TIMEZONE` i.e. TimestampType(_,

[GitHub] [arrow] thisisnic commented on pull request #10404: ARROW-12876: [R] Fix build flags on Raspberry Pi

2021-07-08 Thread GitBox
thisisnic commented on pull request #10404: URL: https://github.com/apache/arrow/pull/10404#issuecomment-876585699 OK, thanks @nealrichardson, will give you a shout then

[GitHub] [arrow] bkietz commented on a change in pull request #10606: ARROW-13005: [C++] Add support for take implementation on dense union type

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10606: URL: https://github.com/apache/arrow/pull/10606#discussion_r666344417 ## File path: cpp/src/arrow/compute/kernels/vector_selection_test.cc ## @@ -607,31 +607,29 @@ TEST_F(TestFilterKernelWithStruct, FilterStruct) { class T

[GitHub] [arrow] bkietz commented on a change in pull request #10661: ARROW-8655: [C++][Python] Preserve partitioning information for a discovered Dataset

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10661: URL: https://github.com/apache/arrow/pull/10661#discussion_r666340416 ## File path: cpp/src/arrow/dataset/file_base.h ## @@ -236,7 +236,8 @@ class ARROW_DS_EXPORT FileSystemDataset : public Dataset { static Result> Make(

[GitHub] [arrow] bkietz commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666334162 ## File path: cpp/src/arrow/compute/exec/exec_plan.cc ## @@ -282,20 +305,21 @@ struct SourceNode : ExecNode { void StopProducing(ExecNode* output) ove

[GitHub] [arrow] bkietz commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666332778 ## File path: cpp/src/arrow/dataset/scanner.cc ## @@ -604,11 +585,90 @@ Result AsyncScanner::ScanBatchesUnorderedAsync() return ScanBatchesUnorderedAsy

[GitHub] [arrow] bkietz commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666331058 ## File path: cpp/src/arrow/util/future.cc ## @@ -272,6 +272,8 @@ class ConcreteFutureImpl : public FutureImpl { return true; case ShouldSc

[GitHub] [arrow] bkietz commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666329817 ## File path: cpp/src/arrow/compute/exec/plan_test.cc ## @@ -144,7 +144,14 @@ TEST(ExecPlan, DummyStartProducing) { // Note that any correct reverse top

[GitHub] [arrow] nealrichardson commented on pull request #10404: ARROW-12876: [R] Fix build flags on Raspberry Pi

2021-07-08 Thread GitBox
nealrichardson commented on pull request #10404: URL: https://github.com/apache/arrow/pull/10404#issuecomment-876557468 We will need to do more work to wire up the changes that kou made, I think. I can help next week.

[GitHub] [arrow-experimental-rs-parquet2] jorgecarleitao merged pull request #2: Change project name to make it clear it is experimental not official

2021-07-08 Thread GitBox
jorgecarleitao merged pull request #2: URL: https://github.com/apache/arrow-experimental-rs-parquet2/pull/2

[GitHub] [arrow-experimental-rs-arrow2] jorgecarleitao merged pull request #2: Change name to make it clear this is an experimental implementation of arrow

2021-07-08 Thread GitBox
jorgecarleitao merged pull request #2: URL: https://github.com/apache/arrow-experimental-rs-arrow2/pull/2

[GitHub] [arrow] thisisnic commented on pull request #10404: ARROW-12876: [R] Fix build flags on Raspberry Pi

2021-07-08 Thread GitBox
thisisnic commented on pull request #10404: URL: https://github.com/apache/arrow/pull/10404#issuecomment-876554188 I wiped my Pi and started again just to make sure I hadn't installed things which aren't mentioned in the dev guides which may need adding, etc. I got an install error

[GitHub] [arrow] danielricecodes commented on issue #10616: Having trouble getting Arrow and Red Parquet to work on OS X 10.15.7 (Catalina)

2021-07-08 Thread GitBox
danielricecodes commented on issue #10616: URL: https://github.com/apache/arrow/issues/10616#issuecomment-876551547 > Can you remove gems that may use `fork`? > I think that removing `spring` may work. Disabling `spring` definitely did the trick here. I can even process parquet f

[GitHub] [arrow] github-actions[bot] commented on pull request #10684: ARROW-13239: [Python] [Doc] Expose signatures in pyx modules

2021-07-08 Thread GitBox
github-actions[bot] commented on pull request #10684: URL: https://github.com/apache/arrow/pull/10684#issuecomment-876541079 https://issues.apache.org/jira/browse/ARROW-13239

[GitHub] [arrow] amol- opened a new pull request #10684: ARROW-13239: [Python] [Doc] Expose signatures in pyx modules

2021-07-08 Thread GitBox
amol- opened a new pull request #10684: URL: https://github.com/apache/arrow/pull/10684

[GitHub] [arrow] ursabot edited a comment on pull request #10412: ARROW-9430: [C++] Implement replace_with_mask kernel

2021-07-08 Thread GitBox
ursabot edited a comment on pull request #10412: URL: https://github.com/apache/arrow/pull/10412#issuecomment-876502374 Benchmark runs are scheduled for baseline = 7eea2f53a1002552bbb87db5611e75c15b88b504 and contender = f79438d96c42b7728c3f9860aadad545cc5ac483. Results will be available a

[GitHub] [arrow] ianmcook closed pull request #10638: ARROW-13171: [R] Add binding for str_pad()

2021-07-08 Thread GitBox
ianmcook closed pull request #10638: URL: https://github.com/apache/arrow/pull/10638

[GitHub] [arrow] ursabot commented on pull request #10412: ARROW-9430: [C++] Implement replace_with_mask kernel

2021-07-08 Thread GitBox
ursabot commented on pull request #10412: URL: https://github.com/apache/arrow/pull/10412#issuecomment-876502374 Benchmark runs are scheduled for baseline = 7eea2f53a1002552bbb87db5611e75c15b88b504 and contender = f79438d96c42b7728c3f9860aadad545cc5ac483. Results will be available as each

[GitHub] [arrow] lidavidm commented on pull request #10412: ARROW-9430: [C++] Implement replace_with_mask kernel

2021-07-08 Thread GitBox
lidavidm commented on pull request #10412: URL: https://github.com/apache/arrow/pull/10412#issuecomment-876501576 @ursabot please benchmark lang=C++

[GitHub] [arrow] lidavidm removed a comment on pull request #10412: ARROW-9430: [C++] Implement replace_with_mask kernel

2021-07-08 Thread GitBox
lidavidm removed a comment on pull request #10412: URL: https://github.com/apache/arrow/pull/10412#issuecomment-876500592 @ursabot please benchmark language=C++

[GitHub] [arrow] ursabot commented on pull request #10412: ARROW-9430: [C++] Implement replace_with_mask kernel

2021-07-08 Thread GitBox
ursabot commented on pull request #10412: URL: https://github.com/apache/arrow/pull/10412#issuecomment-876500599 Supported benchmark command examples: `@ursabot benchmark help` To run all benchmarks: `@ursabot please benchmark` To filter benchmarks by language:

[GitHub] [arrow] lidavidm commented on pull request #10412: ARROW-9430: [C++] Implement replace_with_mask kernel

2021-07-08 Thread GitBox
lidavidm commented on pull request #10412: URL: https://github.com/apache/arrow/pull/10412#issuecomment-876500592 @ursabot please benchmark language=C++

[GitHub] [arrow] kevingurney commented on a change in pull request #10614: ARROW-13100: [MATLAB] Integrate GoogleTest with MATLAB Interface C++ Code

2021-07-08 Thread GitBox
kevingurney commented on a change in pull request #10614: URL: https://github.com/apache/arrow/pull/10614#discussion_r666260445 ## File path: matlab/CMakeLists.txt ## @@ -32,8 +110,29 @@ endif() set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/cmake_modules)

[GitHub] [arrow] kevingurney commented on a change in pull request #10614: ARROW-13100: [MATLAB] Integrate GoogleTest with MATLAB Interface C++ Code

2021-07-08 Thread GitBox
kevingurney commented on a change in pull request #10614: URL: https://github.com/apache/arrow/pull/10614#discussion_r666260099 ## File path: matlab/CMakeLists.txt ## @@ -32,8 +110,29 @@ endif() set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/cmake_modules)

[GitHub] [arrow] lidavidm commented on pull request #10412: ARROW-9430: [C++] Implement replace_with_mask kernel

2021-07-08 Thread GitBox
lidavidm commented on pull request #10412: URL: https://github.com/apache/arrow/pull/10412#issuecomment-876490819 And now we're about 2-3x faster! ``` --- Benchmark

[GitHub] [arrow] kevingurney commented on a change in pull request #10614: ARROW-13100: [MATLAB] Integrate GoogleTest with MATLAB Interface C++ Code

2021-07-08 Thread GitBox
kevingurney commented on a change in pull request #10614: URL: https://github.com/apache/arrow/pull/10614#discussion_r666228358 ## File path: matlab/CMakeLists.txt ## @@ -17,6 +17,84 @@ cmake_minimum_required(VERSION 3.20) +# Build the Arrow C++ libraries WITHOUT bundled G

[GitHub] [arrow] kevingurney commented on a change in pull request #10614: ARROW-13100: [MATLAB] Integrate GoogleTest with MATLAB Interface C++ Code

2021-07-08 Thread GitBox
kevingurney commented on a change in pull request #10614: URL: https://github.com/apache/arrow/pull/10614#discussion_r666223346 ## File path: matlab/CMakeLists.txt ## @@ -17,6 +17,84 @@ cmake_minimum_required(VERSION 3.20) +# Build the Arrow C++ libraries WITHOUT bundled G

[GitHub] [arrow] lidavidm commented on pull request #10412: ARROW-9430: [C++] Implement replace_with_mask kernel

2021-07-08 Thread GitBox
lidavidm commented on pull request #10412: URL: https://github.com/apache/arrow/pull/10412#issuecomment-876453122 Performance here looks highly variable - the benchmark seems to be one of the bimodal ones; we either run at ~1.3 G/s or ~1.7 G/s when repeatedly running the benchmark.

[GitHub] [arrow] kevingurney commented on a change in pull request #10614: ARROW-13100: [MATLAB] Integrate GoogleTest with MATLAB Interface C++ Code

2021-07-08 Thread GitBox
kevingurney commented on a change in pull request #10614: URL: https://github.com/apache/arrow/pull/10614#discussion_r666208411 ## File path: matlab/CMakeLists.txt ## @@ -32,8 +110,29 @@ endif() set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/cmake_modules)

[GitHub] [arrow] kevingurney commented on a change in pull request #10614: ARROW-13100: [MATLAB] Integrate GoogleTest with MATLAB Interface C++ Code

2021-07-08 Thread GitBox
kevingurney commented on a change in pull request #10614: URL: https://github.com/apache/arrow/pull/10614#discussion_r666204614 ## File path: matlab/CMakeLists.txt ## @@ -32,8 +110,29 @@ endif() set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/cmake_modules)

[GitHub] [arrow] kevingurney commented on a change in pull request #10614: ARROW-13100: [MATLAB] Integrate GoogleTest with MATLAB Interface C++ Code

2021-07-08 Thread GitBox
kevingurney commented on a change in pull request #10614: URL: https://github.com/apache/arrow/pull/10614#discussion_r666203962 ## File path: matlab/CMakeLists.txt ## @@ -32,8 +110,29 @@ endif() set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/cmake_modules)

[GitHub] [arrow] bkietz commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666203110 ## File path: cpp/src/arrow/compute/exec/exec_plan.h ## @@ -65,16 +61,24 @@ class ARROW_EXPORT ExecPlan : public std::enable_shared_from_this { Statu

[GitHub] [arrow] bkietz commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666202502 ## File path: cpp/src/arrow/dataset/scanner.cc ## @@ -604,11 +585,90 @@ Result AsyncScanner::ScanBatchesUnorderedAsync() return ScanBatchesUnorderedAsy

[GitHub] [arrow] bkietz commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666201112 ## File path: cpp/src/arrow/compute/exec/exec_plan.cc ## @@ -220,58 +240,61 @@ struct SourceNode : ExecNode { const char* kind_name() override { retur

[GitHub] [arrow] bkietz commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666200686 ## File path: cpp/src/arrow/compute/exec/exec_plan.cc ## @@ -220,58 +240,61 @@ struct SourceNode : ExecNode { const char* kind_name() override { retur

[GitHub] [arrow] bkietz commented on a change in pull request #10664: ARROW-13238: [C++][Compute][Dataset] Use an ExecPlan for dataset scans

2021-07-08 Thread GitBox
bkietz commented on a change in pull request #10664: URL: https://github.com/apache/arrow/pull/10664#discussion_r666199751 ## File path: cpp/src/arrow/compute/exec/exec_plan.cc ## @@ -220,58 +240,61 @@ struct SourceNode : ExecNode { const char* kind_name() override { retur

[GitHub] [arrow] thisisnic commented on a change in pull request #10638: ARROW-13171: [R] Add binding for str_pad()

2021-07-08 Thread GitBox
thisisnic commented on a change in pull request #10638: URL: https://github.com/apache/arrow/pull/10638#discussion_r666191374 ## File path: r/tests/testthat/test-dplyr-string-functions.R ## @@ -866,3 +866,44 @@ test_that("str_like", { df ) }) + +test_that("str_pad", {

[GitHub] [arrow] jorisvandenbossche commented on pull request #10661: ARROW-8655: [C++][Python] Preserve partitioning information for a discovered Dataset

2021-07-08 Thread GitBox
jorisvandenbossche commented on pull request #10661: URL: https://github.com/apache/arrow/pull/10661#issuecomment-876422802 Updated the PR according to the comments + added tests and docstrings. > Another approach could be to expose the dictionaries as part of the partitioning facto

[GitHub] [arrow-datafusion] adamhooper commented on issue #686: Specific timezone support for `to_timetamp*()`

2021-07-08 Thread GitBox
adamhooper commented on issue #686: URL: https://github.com/apache/arrow-datafusion/issues/686#issuecomment-876420215 @alamb would anybody be horrified/appalled at the following: * `to_timestamp()` always returns `TIMESTAMP WITH TIMEZONE` -- i.e., UTC. * `my_parquet_file_with_isAd

[GitHub] [arrow] jonkeane commented on pull request #10676: ARROW-13265: [R] cli valgrind errors in nightlies

2021-07-08 Thread GitBox
jonkeane commented on pull request #10676: URL: https://github.com/apache/arrow/pull/10676#issuecomment-876415939 Thanks, I've added comments + a link. It also looks like a {cli} fix is already in a PR (though it'll need to be released to cran to help us on this front). I do think

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10661: ARROW-8655: [C++][Python] Preserve partitioning information for a discovered Dataset

2021-07-08 Thread GitBox
jorisvandenbossche commented on a change in pull request #10661: URL: https://github.com/apache/arrow/pull/10661#discussion_r666166080 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -1999,39 +2003,46 @@ def test_scan_iterator(use_threads, use_async): scan

[GitHub] [arrow] nealrichardson commented on pull request #10676: ARROW-13265: [R] cli valgrind errors in nightlies

2021-07-08 Thread GitBox
nealrichardson commented on pull request #10676: URL: https://github.com/apache/arrow/pull/10676#issuecomment-876411737 Can you add a comment that explains why this env var is being set? Link to the cli issue maybe too. Should we remove this change once cli is fixed or does it not really

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10661: ARROW-8655: [C++][Python] Preserve partitioning information for a discovered Dataset

2021-07-08 Thread GitBox
jorisvandenbossche commented on a change in pull request #10661: URL: https://github.com/apache/arrow/pull/10661#discussion_r666150986 ## File path: python/pyarrow/_dataset.pyx ## @@ -2083,6 +2090,15 @@ cdef class DirectoryPartitioning(Partitioning): return Partitionin

[GitHub] [arrow-datafusion] jgoday commented on pull request #687: #554: Lead/lag window function with offset and default value arguments

2021-07-08 Thread GitBox
jgoday commented on pull request #687: URL: https://github.com/apache/arrow-datafusion/pull/687#issuecomment-876383651 @Jimexist I was trying to execute the integration tests manually, executing test_psql_parity.py with the following sql ``` SELECT c8, LEAD(c8) OVER () nex

[GitHub] [arrow-datafusion] jgoday commented on a change in pull request #687: #554: Lead/lag window function with offset and default value arguments

2021-07-08 Thread GitBox
jgoday commented on a change in pull request #687: URL: https://github.com/apache/arrow-datafusion/pull/687#discussion_r666130346 ## File path: datafusion/src/physical_plan/windows.rs ## @@ -110,13 +110,49 @@ fn create_built_in_window_expr( let coerced_args = coerc

[GitHub] [arrow] anthonylouisbsb closed pull request #10493: ARROW-13020: [C++] [Gandiva] Enable Gandiva to run the LLVM in Interpreted mode - WIP

2021-07-08 Thread GitBox
anthonylouisbsb closed pull request #10493: URL: https://github.com/apache/arrow/pull/10493

[GitHub] [arrow-datafusion] alamb closed issue #695: DataFusion will not build with Rust 1.52.1

2021-07-08 Thread GitBox
alamb closed issue #695: URL: https://github.com/apache/arrow-datafusion/issues/695

[GitHub] [arrow-datafusion] alamb merged pull request #696: Fix build with 1.52.1

2021-07-08 Thread GitBox
alamb merged pull request #696: URL: https://github.com/apache/arrow-datafusion/pull/696

[GitHub] [arrow-datafusion] alamb merged pull request #691: perf: Improve materialisation performance of SortPreservingMergeExec

2021-07-08 Thread GitBox
alamb merged pull request #691: URL: https://github.com/apache/arrow-datafusion/pull/691

[GitHub] [arrow-datafusion] alamb commented on pull request #690: Fix Date32 and Date64 parquet row group pruning

2021-07-08 Thread GitBox
alamb commented on pull request #690: URL: https://github.com/apache/arrow-datafusion/pull/690#issuecomment-876366765 @yordan-pavlov this is ready for review. cc @Dandandan

[GitHub] [arrow-datafusion] alamb merged pull request #668: add more integration tests

2021-07-08 Thread GitBox
alamb merged pull request #668: URL: https://github.com/apache/arrow-datafusion/pull/668

[GitHub] [arrow-datafusion] alamb commented on issue #686: Specific timezone support for `to_timetamp*()`

2021-07-08 Thread GitBox
alamb commented on issue #686: URL: https://github.com/apache/arrow-datafusion/issues/686#issuecomment-876362941 @adamhooper thank you for the perspectives. I see the challenges with adding a second argument to `to_timestamp` and would love to consider other alternatives. Can you

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10661: ARROW-8655: [C++][Python] Preserve partitioning information for a discovered Dataset

2021-07-08 Thread GitBox
jorisvandenbossche commented on a change in pull request #10661: URL: https://github.com/apache/arrow/pull/10661#discussion_r666102851 ## File path: cpp/src/arrow/dataset/file_base.h ## @@ -238,6 +238,12 @@ class ARROW_DS_EXPORT FileSystemDataset : public Dataset { std::
