[GitHub] [arrow] pitrou commented on pull request #8357: ARROW-10189: [C] Fixed typo in C-Data interface example

2020-10-06 Thread GitBox
pitrou commented on pull request #8357: URL: https://github.com/apache/arrow/pull/8357#issuecomment-704086450 Unrelated, but what's up with the Rust "dev_labeler"? This is an automated message from the Apache Git Service. To

[GitHub] [arrow] pitrou edited a comment on pull request #8357: ARROW-10189: [C] Fixed typo in C-Data interface example

2020-10-06 Thread GitBox
pitrou edited a comment on pull request #8357: URL: https://github.com/apache/arrow/pull/8357#issuecomment-704086450 Unrelated, but what's up with the Rust "dev_labeler"? It seems to be failing on every PR.

[GitHub] [arrow] pitrou closed pull request #8357: ARROW-10189: [Doc] Fixed typo in C-Data interface example

2020-10-06 Thread GitBox
pitrou closed pull request #8357: URL: https://github.com/apache/arrow/pull/8357

[GitHub] [arrow] jorgecarleitao commented on pull request #8357: ARROW-10189: [Doc] Fixed typo in C-Data interface example

2020-10-06 Thread GitBox
jorgecarleitao commented on pull request #8357: URL: https://github.com/apache/arrow/pull/8357#issuecomment-704087532 > Unrelated, but what's up with the Rust "dev_labeler"? It seems to be failing on every PR. I made a mistake. It should be fixed in https://github.com/apache/arrow/p

[GitHub] [arrow] jorgecarleitao opened a new pull request #8358: Testing labeler.

2020-10-06 Thread GitBox
jorgecarleitao opened a new pull request #8358: URL: https://github.com/apache/arrow/pull/8358 This is an acceptance test for the labeler. Please ignore.

[GitHub] [arrow] jorgecarleitao commented on pull request #8358: Testing labeler.

2020-10-06 Thread GitBox
jorgecarleitao commented on pull request #8358: URL: https://github.com/apache/arrow/pull/8358#issuecomment-704090285 The test passes: the bot works as expected. I am thus closing this.

[GitHub] [arrow] jorgecarleitao closed pull request #8358: Testing labeler.

2020-10-06 Thread GitBox
jorgecarleitao closed pull request #8358: URL: https://github.com/apache/arrow/pull/8358

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8271: ARROW-9991: [C++] split kernels for strings/binary

2020-10-06 Thread GitBox
maartenbreddels commented on a change in pull request #8271: URL: https://github.com/apache/arrow/pull/8271#discussion_r500071766 ## File path: python/pyarrow/compute.py ## @@ -253,6 +255,68 @@ def match_substring(array, pattern): MatchSubstringOptions

[GitHub] [arrow] romainfrancois commented on a change in pull request #8341: ARROW-10093: [R] Add ability to opt-out of int64 -> int demotion

2020-10-06 Thread GitBox
romainfrancois commented on a change in pull request #8341: URL: https://github.com/apache/arrow/pull/8341#discussion_r500076684 ## File path: r/tests/testthat/test-Array.R ## @@ -749,3 +749,17 @@ test_that("Array$ApproxEquals", { expect_true(a$ApproxEquals(b)) expect_fal

[GitHub] [arrow] romainfrancois commented on a change in pull request #8341: ARROW-10093: [R] Add ability to opt-out of int64 -> int demotion

2020-10-06 Thread GitBox
romainfrancois commented on a change in pull request #8341: URL: https://github.com/apache/arrow/pull/8341#discussion_r500074475 ## File path: r/src/array_to_vector.cpp ## @@ -960,6 +960,14 @@ bool ArraysCanFitInteger(ArrayVector arrays) { return all_can_fit; } +bool opti

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8271: ARROW-9991: [C++] split kernels for strings/binary

2020-10-06 Thread GitBox
maartenbreddels commented on a change in pull request #8271: URL: https://github.com/apache/arrow/pull/8271#discussion_r500084727 ## File path: python/pyarrow/compute.py ## @@ -253,6 +255,68 @@ def match_substring(array, pattern): MatchSubstringOptions

[GitHub] [arrow] pitrou commented on a change in pull request #8337: ARROW-10151: [Python] Add support for MapArray conversion to Pandas

2020-10-06 Thread GitBox
pitrou commented on a change in pull request #8337: URL: https://github.com/apache/arrow/pull/8337#discussion_r500077813 ## File path: cpp/src/arrow/python/arrow_to_pandas.cc ## @@ -791,6 +791,117 @@ Status ConvertListsLike(const PandasOptions& options, const ChunkedArray& dat

[GitHub] [arrow] alamb commented on pull request #8333: ARROW-10167: [Rust] [DataFusion] Support DictionaryArray in sql.rs tests, by using standard pretty printer

2020-10-06 Thread GitBox
alamb commented on pull request #8333: URL: https://github.com/apache/arrow/pull/8333#issuecomment-704152397 @jorgecarleitao FYI rebased

[GitHub] [arrow] kszucs commented on pull request #8315: ARROW-9266: [Python][Packaging] Enable S3 support in macOS wheels

2020-10-06 Thread GitBox
kszucs commented on pull request #8315: URL: https://github.com/apache/arrow/pull/8315#issuecomment-704176534 @github-actions crossbow submit wheel-osx-*

[GitHub] [arrow] github-actions[bot] commented on pull request #8315: ARROW-9266: [Python][Packaging] Enable S3 support in macOS wheels

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8315: URL: https://github.com/apache/arrow/pull/8315#issuecomment-704177368 Revision: 97b5b3248a2becd7bf86d89cb8bd71de2bf43e30 Submitted crossbow builds: [ursa-labs/crossbow @ actions-605](https://github.com/ursa-labs/crossbow/branches/a

[GitHub] [arrow] nevi-me commented on pull request #8330: ARROW-10191: [Rust] [Parquet] Add roundtrip Arrow -> Parquet tests for all supported Arrow DataTypes

2020-10-06 Thread GitBox
nevi-me commented on pull request #8330: URL: https://github.com/apache/arrow/pull/8330#issuecomment-704204575 @carols10cents may you please create a JIRA account at https://issues.apache.org, so that we can assign the tasks that you've worked on to you. Thanks :)

[GitHub] [arrow] github-actions[bot] commented on pull request #8330: ARROW-10191: [Rust] [Parquet] Add roundtrip Arrow -> Parquet tests for all supported Arrow DataTypes

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8330: URL: https://github.com/apache/arrow/pull/8330#issuecomment-704207669 https://issues.apache.org/jira/browse/ARROW-10191

[GitHub] [arrow] nevi-me commented on pull request #8330: ARROW-10191: [Rust] [Parquet] Add roundtrip Arrow -> Parquet tests for all supported Arrow DataTypes

2020-10-06 Thread GitBox
nevi-me commented on pull request #8330: URL: https://github.com/apache/arrow/pull/8330#issuecomment-704208332 Hi @andygrove @jorgecarleitao may you please kindly merge this for me, the merge tool's failing to push to GH on my end :(

[GitHub] [arrow] carols10cents commented on pull request #8330: ARROW-10191: [Rust] [Parquet] Add roundtrip Arrow -> Parquet tests for all supported Arrow DataTypes

2020-10-06 Thread GitBox
carols10cents commented on pull request #8330: URL: https://github.com/apache/arrow/pull/8330#issuecomment-704223459 @nevi-me Done! https://issues.apache.org/jira/secure/ViewProfile.jspa?name=carols10cents

[GitHub] [arrow] github-actions[bot] commented on pull request #8354: ARROW-10168: [Rust] [Parquet] Schema roundtrip - use Arrow schema from Parquet metadata when available

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8354: URL: https://github.com/apache/arrow/pull/8354#issuecomment-704224496 https://issues.apache.org/jira/browse/ARROW-10168

[GitHub] [arrow] alamb opened a new pull request #8359: ARROW-10163: [Rust] [DataFusion] Add DictionaryArray coercion support to physical plans

2020-10-06 Thread GitBox
alamb opened a new pull request #8359: URL: https://github.com/apache/arrow/pull/8359 NOTE: this builds on #8333 and #8340 and #8346 so leaving as a draft until those are merged This PR adds basic physical expression / casting support to DataFusion. Right now, it will cause all Dict

[GitHub] [arrow] bkietz commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
bkietz commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r500235525 ## File path: cpp/src/arrow/dataset/file_base.cc ## @@ -143,97 +150,235 @@ FragmentIterator FileSystemDataset::GetFragmentsImpl( return MakeVectorIterator

[GitHub] [arrow] github-actions[bot] commented on pull request #8359: ARROW-10163: [Rust] [DataFusion] Add DictionaryArray coercion support to physical plans

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8359: URL: https://github.com/apache/arrow/pull/8359#issuecomment-704237332 https://issues.apache.org/jira/browse/ARROW-10163

[GitHub] [arrow] alamb commented on a change in pull request #8359: ARROW-10163: [Rust] [DataFusion] Add DictionaryArray coercion support to physical plans

2020-10-06 Thread GitBox
alamb commented on a change in pull request #8359: URL: https://github.com/apache/arrow/pull/8359#discussion_r500236277 ## File path: rust/datafusion/src/physical_plan/expressions.rs ## @@ -1166,16 +1201,13 @@ fn order_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Opti

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8271: ARROW-9991: [C++] split kernels for strings/binary

2020-10-06 Thread GitBox
maartenbreddels commented on a change in pull request #8271: URL: https://github.com/apache/arrow/pull/8271#discussion_r500243584 ## File path: cpp/src/arrow/compute/kernels/scalar_string_test.cc ## @@ -53,8 +53,19 @@ class BaseTestStringKernels : public ::testing::Test { st

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8271: ARROW-9991: [C++] split kernels for strings/binary

2020-10-06 Thread GitBox
maartenbreddels commented on a change in pull request #8271: URL: https://github.com/apache/arrow/pull/8271#discussion_r500275273 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -809,6 +809,475 @@ struct IsUpperAscii : CharacterPredicateAscii { } }; +/

[GitHub] [arrow] pitrou commented on a change in pull request #8271: ARROW-9991: [C++] split kernels for strings/binary

2020-10-06 Thread GitBox
pitrou commented on a change in pull request #8271: URL: https://github.com/apache/arrow/pull/8271#discussion_r500282031 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -809,6 +809,475 @@ struct IsUpperAscii : CharacterPredicateAscii { } }; +// splitti

[GitHub] [arrow] pitrou opened a new pull request #8361: ARROW-10192: [Python] Always decode inner dictionaries when converting array to Pandas

2020-10-06 Thread GitBox
pitrou opened a new pull request #8361: URL: https://github.com/apache/arrow/pull/8361 Fix a crash on conversion of e.g. struct of dictionaries.

[GitHub] [arrow] pitrou commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
pitrou commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r500202061 ## File path: cpp/src/arrow/util/mutex.h ## @@ -31,16 +31,20 @@ namespace util { class ARROW_EXPORT Mutex { public: Mutex(); + Mutex(Mutex&&) = defaul

[GitHub] [arrow] pitrou commented on a change in pull request #8271: ARROW-9991: [C++] split kernels for strings/binary

2020-10-06 Thread GitBox
pitrou commented on a change in pull request #8271: URL: https://github.com/apache/arrow/pull/8271#discussion_r500265136 ## File path: cpp/src/arrow/compute/kernels/scalar_string_test.cc ## @@ -53,8 +53,19 @@ class BaseTestStringKernels : public ::testing::Test { std::shared

[GitHub] [arrow] bkietz commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
bkietz commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r500287304 ## File path: cpp/src/arrow/dataset/file_base.h ## @@ -128,6 +128,9 @@ class ARROW_DS_EXPORT FileFormat : public std::enable_shared_from_this

[GitHub] [arrow] pitrou commented on a change in pull request #8271: ARROW-9991: [C++] split kernels for strings/binary

2020-10-06 Thread GitBox
pitrou commented on a change in pull request #8271: URL: https://github.com/apache/arrow/pull/8271#discussion_r500266219 ## File path: python/pyarrow/compute.py ## @@ -253,6 +255,68 @@ def match_substring(array, pattern): MatchSubstringOptions(pattern)

[GitHub] [arrow] bkietz commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
bkietz commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r500288334 ## File path: cpp/src/arrow/dataset/file_ipc.cc ## @@ -159,18 +163,44 @@ Result IpcFileFormat::ScanFile(std::shared_ptr op

[GitHub] [arrow] bkietz commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
bkietz commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r500290327 ## File path: cpp/src/arrow/util/string.h ## @@ -56,5 +57,9 @@ bool AsciiEqualsCaseInsensitive(util::string_view left, util::string_view right) ARROW_EXPOR

[GitHub] [arrow] pitrou commented on a change in pull request #8271: ARROW-9991: [C++] split kernels for strings/binary

2020-10-06 Thread GitBox
pitrou commented on a change in pull request #8271: URL: https://github.com/apache/arrow/pull/8271#discussion_r500268258 ## File path: python/pyarrow/compute.py ## @@ -253,6 +255,68 @@ def match_substring(array, pattern): MatchSubstringOptions(pattern)

[GitHub] [arrow] pitrou commented on pull request #8356: ARROW-10139: [C++] Add support for building arrow_testing without building tests

2020-10-06 Thread GitBox
pitrou commented on pull request #8356: URL: https://github.com/apache/arrow/pull/8356#issuecomment-704263971 Do we want to update some CI build configurations with this? (I assume this can make builds shorter)

[GitHub] [arrow] naman1996 closed pull request #8231: ARROW-10023: [C++][Gandiva] Implement split_part function in gandiva

2020-10-06 Thread GitBox
naman1996 closed pull request #8231: URL: https://github.com/apache/arrow/pull/8231

[GitHub] [arrow] bkietz commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
bkietz commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r500295452 ## File path: cpp/src/arrow/util/mutex.cc ## @@ -19,25 +19,36 @@ #include +#include "arrow/util/logging.h" + namespace arrow { namespace util { -st

[GitHub] [arrow] pitrou commented on a change in pull request #8343: ARROW-9147: [C++][Dataset] Support projection from null->any type

2020-10-06 Thread GitBox
pitrou commented on a change in pull request #8343: URL: https://github.com/apache/arrow/pull/8343#discussion_r500272157 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -2124,6 +2124,23 @@ def test_dataset_project_only_partition_columns(tempdir): assert all_cols.c

[GitHub] [arrow] bkietz commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
bkietz commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r500297392 ## File path: cpp/src/arrow/util/mutex.h ## @@ -31,16 +31,20 @@ namespace util { class ARROW_EXPORT Mutex { public: Mutex(); + Mutex(Mutex&&) = defaul

[GitHub] [arrow] github-actions[bot] commented on pull request #8361: ARROW-10192: [Python] Always decode inner dictionaries when converting array to Pandas

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8361: URL: https://github.com/apache/arrow/pull/8361#issuecomment-704266502 https://issues.apache.org/jira/browse/ARROW-10192

[GitHub] [arrow] github-actions[bot] commented on pull request #8360: ARROW-10193: [Python] Segfault when converting to fixed size binary array

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8360: URL: https://github.com/apache/arrow/pull/8360#issuecomment-704255542 https://issues.apache.org/jira/browse/ARROW-10193

[GitHub] [arrow] bkietz commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
bkietz commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r499859844 ## File path: python/pyarrow/dataset.py ## @@ -694,22 +697,23 @@ def _ensure_write_partitioning(scheme): return scheme -def write_dataset(data, base

[GitHub] [arrow] andygrove closed pull request #8355: ARROW-10188: [Rust] [DataFusion] Fixed DataFusion examples.

2020-10-06 Thread GitBox
andygrove closed pull request #8355: URL: https://github.com/apache/arrow/pull/8355

[GitHub] [arrow] andygrove closed pull request #8333: ARROW-10167: [Rust] [DataFusion] Support DictionaryArray in sql.rs tests, by using standard pretty printer

2020-10-06 Thread GitBox
andygrove closed pull request #8333: URL: https://github.com/apache/arrow/pull/8333

[GitHub] [arrow] andygrove edited a comment on pull request #8330: ARROW-10191: [Rust] [Parquet] Add roundtrip Arrow -> Parquet tests for all supported Arrow DataTypes

2020-10-06 Thread GitBox
andygrove edited a comment on pull request #8330: URL: https://github.com/apache/arrow/pull/8330#issuecomment-704320070 @nevi-me @carols10cents I ran the merge tool and the changes appear to have been committed to the branch but this PR did not get closed, so I am closing it manually.

[GitHub] [arrow] andygrove commented on pull request #8330: ARROW-10191: [Rust] [Parquet] Add roundtrip Arrow -> Parquet tests for all supported Arrow DataTypes

2020-10-06 Thread GitBox
andygrove commented on pull request #8330: URL: https://github.com/apache/arrow/pull/8330#issuecomment-704320070 @nevi-me @carols10cents I ran the merge tool and the changes appear to have been committed to the branch but this issue did not get closed, so I am closing it manually.

[GitHub] [arrow] andygrove closed pull request #8330: ARROW-10191: [Rust] [Parquet] Add roundtrip Arrow -> Parquet tests for all supported Arrow DataTypes

2020-10-06 Thread GitBox
andygrove closed pull request #8330: URL: https://github.com/apache/arrow/pull/8330

[GitHub] [arrow] mrkn commented on pull request #6302: ARROW-7633: [C++][CI] Create fuzz targets for tensors and sparse tensors

2020-10-06 Thread GitBox
mrkn commented on pull request #6302: URL: https://github.com/apache/arrow/pull/6302#issuecomment-704320830 This pull request is ready to review. @pitrou Could you have a look?

[GitHub] [arrow] bkietz opened a new pull request #8362: ARROW-10196: [C++] Add Future::DeferNotOk

2020-10-06 Thread GitBox
bkietz opened a new pull request #8362: URL: https://github.com/apache/arrow/pull/8362

[GitHub] [arrow] nealrichardson commented on a change in pull request #8341: ARROW-10093: [R] Add ability to opt-out of int64 -> int demotion

2020-10-06 Thread GitBox
nealrichardson commented on a change in pull request #8341: URL: https://github.com/apache/arrow/pull/8341#discussion_r500365879 ## File path: r/src/array_to_vector.cpp ## @@ -1068,8 +1080,9 @@ std::shared_ptr Converter::Make(const std::shared_ptr& type return std::mak

[GitHub] [arrow] github-actions[bot] commented on pull request #8362: ARROW-10196: [C++] Add Future::DeferNotOk

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8362: URL: https://github.com/apache/arrow/pull/8362#issuecomment-704335087 https://issues.apache.org/jira/browse/ARROW-10196

[GitHub] [arrow] HedgehogCode opened a new pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-06 Thread GitBox
HedgehogCode opened a new pull request #8363: URL: https://github.com/apache/arrow/pull/8363 When translating between the memory FieldType and message FieldType for dictionary encoded vectors the children of the dictionary field were not handled correctly. * When going from memory f

[GitHub] [arrow] github-actions[bot] commented on pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8363: URL: https://github.com/apache/arrow/pull/8363#issuecomment-704359506 https://issues.apache.org/jira/browse/ARROW-10174

[GitHub] [arrow] nevi-me opened a new pull request #8364: ARROW-5350: [Rust] Allow filtering on simple lists

2020-10-06 Thread GitBox
nevi-me opened a new pull request #8364: URL: https://github.com/apache/arrow/pull/8364 This extends filters to simple lists. CC @yordan-pavlov

[GitHub] [arrow] github-actions[bot] commented on pull request #8364: ARROW-5350: [Rust] Allow filtering on simple lists

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8364: URL: https://github.com/apache/arrow/pull/8364#issuecomment-704369422 https://issues.apache.org/jira/browse/ARROW-5350

[GitHub] [arrow] nevi-me commented on a change in pull request #8346: ARROW-10164: [Rust] Add support for DictionaryArray to cast kernel

2020-10-06 Thread GitBox
nevi-me commented on a change in pull request #8346: URL: https://github.com/apache/arrow/pull/8346#discussion_r500409499 ## File path: rust/arrow/src/compute/kernels/cast.rs ## @@ -755,10 +784,253 @@ where Ok(b.finish()) } +/// Attempts to cast an `ArrayDictionary` wit

[GitHub] [arrow] nealrichardson opened a new pull request #8365: ARROW-6582: [R] Arrow to R fails with embedded nuls in strings

2020-10-06 Thread GitBox
nealrichardson opened a new pull request #8365: URL: https://github.com/apache/arrow/pull/8365 @romainfrancois see the comment I added with the reprex: this fails differently now, presumably due to the cpp11 change

[GitHub] [arrow] bkietz commented on a change in pull request #8343: ARROW-9147: [C++][Dataset] Support projection from null->any type

2020-10-06 Thread GitBox
bkietz commented on a change in pull request #8343: URL: https://github.com/apache/arrow/pull/8343#discussion_r500413021 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -2124,6 +2124,23 @@ def test_dataset_project_only_partition_columns(tempdir): assert all_cols.c

[GitHub] [arrow] nevi-me commented on pull request #8330: ARROW-10191: [Rust] [Parquet] Add roundtrip Arrow -> Parquet tests for all supported Arrow DataTypes

2020-10-06 Thread GitBox
nevi-me commented on pull request #8330: URL: https://github.com/apache/arrow/pull/8330#issuecomment-704378123 > @nevi-me @carols10cents I ran the merge tool and the changes appear to have been committed to the branch but this PR did not get closed, so I am closing it manually. It's

[GitHub] [arrow] jorgecarleitao edited a comment on pull request #8287: ARROW-10111: [Rust] Added new crate with code that consumes C Data interface

2020-10-06 Thread GitBox
jorgecarleitao edited a comment on pull request #8287: URL: https://github.com/apache/arrow/pull/8287#issuecomment-704388901 I have been working heavily on this problem based on your ideas, @pitrou, on a [separate branch](https://github.com/jorgecarleitao/arrow/pull/13/files), and I think I

[GitHub] [arrow] jorgecarleitao commented on pull request #8287: ARROW-10111: [Rust] Added new crate with code that consumes C Data interface

2020-10-06 Thread GitBox
jorgecarleitao commented on pull request #8287: URL: https://github.com/apache/arrow/pull/8287#issuecomment-704388901 I have been working heavily on this problem based on your ideas, @pitrou, on a [separate branch](https://github.com/jorgecarleitao/arrow/pull/13/files), and I think I need s

[GitHub] [arrow] pitrou commented on pull request #8366: ARROW-9943: [C++] Recursively apply Arrow metadata when reading from Parquet

2020-10-06 Thread GitBox
pitrou commented on pull request #8366: URL: https://github.com/apache/arrow/pull/8366#issuecomment-704409884 Also cc @jorisvandenbossche

[GitHub] [arrow] pitrou commented on pull request #8342: ARROW-10120: [C++] Add two-level nested Parquet read to Arrow benchmarks

2020-10-06 Thread GitBox
pitrou commented on pull request #8342: URL: https://github.com/apache/arrow/pull/8342#issuecomment-704413585 +1, will merge.

[GitHub] [arrow] pitrou commented on a change in pull request #8366: ARROW-9943: [C++] Recursively apply Arrow metadata when reading from Parquet

2020-10-06 Thread GitBox
pitrou commented on a change in pull request #8366: URL: https://github.com/apache/arrow/pull/8366#discussion_r500443163 ## File path: cpp/src/parquet/arrow/schema.cc ## @@ -689,10 +686,91 @@ Status GetOriginSchema(const std::shared_ptr& metadata, // but that is not necessaril

[GitHub] [arrow] pitrou opened a new pull request #8366: ARROW-9943: [C++] Recursively apply Arrow metadata when reading from Parquet

2020-10-06 Thread GitBox
pitrou opened a new pull request #8366: URL: https://github.com/apache/arrow/pull/8366 This allows roundtripping complex types such as `list>`, `list`, etc.

[GitHub] [arrow] github-actions[bot] commented on pull request #8365: ARROW-6582: [R] Arrow to R fails with embedded nuls in strings

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8365: URL: https://github.com/apache/arrow/pull/8365#issuecomment-704396850 https://issues.apache.org/jira/browse/ARROW-6582

[GitHub] [arrow] alamb commented on a change in pull request #8346: ARROW-10164: [Rust] Add support for DictionaryArray to cast kernel

2020-10-06 Thread GitBox
alamb commented on a change in pull request #8346: URL: https://github.com/apache/arrow/pull/8346#discussion_r500443425 ## File path: rust/arrow/src/compute/kernels/cast.rs ## @@ -755,10 +784,253 @@ where Ok(b.finish()) } +/// Attempts to cast an `ArrayDictionary` with

[GitHub] [arrow] kszucs commented on pull request #8360: ARROW-10193: [Python] Segfault when converting to fixed size binary array

2020-10-06 Thread GitBox
kszucs commented on pull request #8360: URL: https://github.com/apache/arrow/pull/8360#issuecomment-704412948 The CI failures are unrelated, merging. Thanks Antoine!

[GitHub] [arrow] github-actions[bot] commented on pull request #8366: ARROW-9943: [C++] Recursively apply Arrow metadata when reading from Parquet

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8366: URL: https://github.com/apache/arrow/pull/8366#issuecomment-704421003 https://issues.apache.org/jira/browse/ARROW-9943

[GitHub] [arrow] pitrou commented on pull request #8287: ARROW-10111: [Rust] Added new crate with code that consumes C Data interface

2020-10-06 Thread GitBox
pitrou commented on pull request #8287: URL: https://github.com/apache/arrow/pull/8287#issuecomment-704423199 You're doing it wrong. I suggest again that you try to follow how C++ does things, otherwise you'll get lost. For example, your `release` callback assumes that buffers have b

[GitHub] [arrow] pitrou closed pull request #8342: ARROW-10120: [C++] Add two-level nested Parquet read to Arrow benchmarks

2020-10-06 Thread GitBox
pitrou closed pull request #8342: URL: https://github.com/apache/arrow/pull/8342

[GitHub] [arrow] pitrou commented on pull request #6302: ARROW-7633: [C++][CI] Create fuzz targets for tensors and sparse tensors

2020-10-06 Thread GitBox
pitrou commented on pull request #6302: URL: https://github.com/apache/arrow/pull/6302#issuecomment-704429373 I'll take a look later this week.

[GitHub] [arrow] pitrou edited a comment on pull request #8287: ARROW-10111: [Rust] Added new crate with code that consumes C Data interface

2020-10-06 Thread GitBox
pitrou edited a comment on pull request #8287: URL: https://github.com/apache/arrow/pull/8287#issuecomment-704423199 You're doing it wrong. I suggest again that you try to follow how C++ does things, otherwise you'll get lost. For example, your `release` callback assumes that buffers

[GitHub] [arrow] kszucs closed pull request #8360: ARROW-10193: [Python] Segfault when converting to fixed size binary array

2020-10-06 Thread GitBox
kszucs closed pull request #8360: URL: https://github.com/apache/arrow/pull/8360

[GitHub] [arrow] nealrichardson commented on pull request #8151: ARROW-9279: [C++] Implement PrettyPrint for Scalars

2020-10-06 Thread GitBox
nealrichardson commented on pull request #8151: URL: https://github.com/apache/arrow/pull/8151#issuecomment-704440114 @tianchen92 could you please rebase this? would you like to try to get this in the 2.0 release? @pitrou could you please take a look?

[GitHub] [arrow] pitrou commented on pull request #8151: ARROW-9279: [C++] Implement PrettyPrint for Scalars

2020-10-06 Thread GitBox
pitrou commented on pull request #8151: URL: https://github.com/apache/arrow/pull/8151#issuecomment-704440473 This is too involved for the 2.0 release.

[GitHub] [arrow] kszucs commented on a change in pull request #8361: ARROW-10192: [Python] Always decode inner dictionaries when converting array to Pandas

2020-10-06 Thread GitBox
kszucs commented on a change in pull request #8361: URL: https://github.com/apache/arrow/pull/8361#discussion_r500483306 ## File path: cpp/src/arrow/python/arrow_to_pandas.cc ## @@ -641,29 +659,18 @@ inline Status ConvertStruct(const PandasOptions& options, const ChunkedArray&

[GitHub] [arrow] nealrichardson commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

2020-10-06 Thread GitBox
nealrichardson commented on pull request #7887: URL: https://github.com/apache/arrow/pull/7887#issuecomment-704441085 @tianchen92 do you want to pick this up again? the bug that @emkornfield mentioned has been fixed.

[GitHub] [arrow] nealrichardson commented on pull request #7942: ARROW-9704: [Java] TestEndianness.testLittleEndian supports little- and big-endian platforms

2020-10-06 Thread GitBox
nealrichardson commented on pull request #7942: URL: https://github.com/apache/arrow/pull/7942#issuecomment-704443194 Is this mergeable now? According to the mailing list discussion: > It does not seem too invasive to support native endianness in implementation libraries. As long

[GitHub] [arrow] nealrichardson commented on pull request #8265: ARROW-9586: [FlightRPC][Java] implement per-call allocator

2020-10-06 Thread GitBox
nealrichardson commented on pull request #8265: URL: https://github.com/apache/arrow/pull/8265#issuecomment-70724 Is this good to merge? Or needs review? 2.0 is fast approaching.

[GitHub] [arrow] nealrichardson commented on pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
nealrichardson commented on pull request #8305: URL: https://github.com/apache/arrow/pull/8305#issuecomment-704450332 Is this done, or what is left?

[GitHub] [arrow] kiszk commented on pull request #7942: ARROW-9704: [Java] TestEndianness.testLittleEndian supports little- and big-endian platforms

2020-10-06 Thread GitBox
kiszk commented on pull request #7942: URL: https://github.com/apache/arrow/pull/7942#issuecomment-704450710 Thank you for your comment. According to Micah's [comment](https://github.com/apache/arrow/pull/7938#issuecomment-703381572), we may wait until he updates the document s

[GitHub] [arrow] pitrou commented on a change in pull request #8361: ARROW-10192: [Python] Always decode inner dictionaries when converting array to Pandas

2020-10-06 Thread GitBox
pitrou commented on a change in pull request #8361: URL: https://github.com/apache/arrow/pull/8361#discussion_r500496450 ## File path: cpp/src/arrow/python/arrow_to_pandas.cc ## @@ -641,29 +659,18 @@ inline Status ConvertStruct(const PandasOptions& options, const ChunkedArray&

[GitHub] [arrow] bkietz commented on pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
bkietz commented on pull request #8305: URL: https://github.com/apache/arrow/pull/8305#issuecomment-704453473 @pitrou are you planning to review C++ again? @jorisvandenbossche are you planning to review python?

[GitHub] [arrow] kszucs commented on a change in pull request #8361: ARROW-10192: [Python] Always decode inner dictionaries when converting array to Pandas

2020-10-06 Thread GitBox
kszucs commented on a change in pull request #8361: URL: https://github.com/apache/arrow/pull/8361#discussion_r500499238 ## File path: cpp/src/arrow/python/arrow_to_pandas.cc ## @@ -641,29 +659,18 @@ inline Status ConvertStruct(const PandasOptions& options, const ChunkedArray&

[GitHub] [arrow] kszucs commented on a change in pull request #8361: ARROW-10192: [Python] Always decode inner dictionaries when converting array to Pandas

2020-10-06 Thread GitBox
kszucs commented on a change in pull request #8361: URL: https://github.com/apache/arrow/pull/8361#discussion_r500499506 ## File path: python/pyarrow/tests/test_pandas.py ## @@ -2297,6 +2310,24 @@ def test_from_tuples(self): df, expected=expected_df, schema=expecte

[GitHub] [arrow] Luminarys commented on pull request #8344: Add BasicDecimal256 Multiplication Support (PR for decimal256 branch, not master)

2020-10-06 Thread GitBox
Luminarys commented on pull request #8344: URL: https://github.com/apache/arrow/pull/8344#issuecomment-704457623 I've looked through the CI failures, it seems there are a few kinds: 1. aws connector failure (I think this isn't our issue) 2. a python lint error (this should be fixed, bu

[GitHub] [arrow] pitrou commented on pull request #8320: ARROW-10058: [C++] Improve repeated levels conversion without BMI2

2020-10-06 Thread GitBox
pitrou commented on pull request #8320: URL: https://github.com/apache/arrow/pull/8320#issuecomment-704460267 Updated benchmarks on AMD Ryzen: ``` benchmark baseline contender change %

[GitHub] [arrow] lidavidm commented on pull request #8265: ARROW-9586: [FlightRPC][Java] implement per-call allocator

2020-10-06 Thread GitBox
lidavidm commented on pull request #8265: URL: https://github.com/apache/arrow/pull/8265#issuecomment-704466369 Still needs review. We can take it off the 2.0 milestone.

[GitHub] [arrow] lidavidm edited a comment on pull request #8265: ARROW-9586: [FlightRPC][Java] implement per-call allocator

2020-10-06 Thread GitBox
lidavidm edited a comment on pull request #8265: URL: https://github.com/apache/arrow/pull/8265#issuecomment-704466369 Still needs review. I've taken it off the 2.0 milestone.

[GitHub] [arrow] emkornfield commented on pull request #8320: ARROW-10058: [C++] Improve repeated levels conversion without BMI2

2020-10-06 Thread GitBox
emkornfield commented on pull request #8320: URL: https://github.com/apache/arrow/pull/8320#issuecomment-704482050 sorry some personal issues came up. hope to have time tonight to review this and other parquet related CLs T

[GitHub] [arrow] bkietz opened a new pull request #8367: ARROW-10099: [C++][Dataset] Simplify type inference for partition columns

2020-10-06 Thread GitBox
bkietz opened a new pull request #8367: URL: https://github.com/apache/arrow/pull/8367 Removed cardinality considerations for inferring partition field types. There is now a boolean flag (`inspect_dictionary`, default false) which if set causes fields to be inferred as a dictionary encoded

[GitHub] [arrow] github-actions[bot] commented on pull request #8367: ARROW-10099: [C++][Dataset] Simplify type inference for partition columns

2020-10-06 Thread GitBox
github-actions[bot] commented on pull request #8367: URL: https://github.com/apache/arrow/pull/8367#issuecomment-704490021 https://issues.apache.org/jira/browse/ARROW-10099

[GitHub] [arrow] pitrou commented on pull request #8320: ARROW-10058: [C++] Improve repeated levels conversion without BMI2

2020-10-06 Thread GitBox
pitrou commented on pull request #8320: URL: https://github.com/apache/arrow/pull/8320#issuecomment-704491476 For the record, if I profile `BM_ReadStructOfListColumn/50`, I get the following hot spots (in cycles spent): * ~19% in `DefRepLevelsToListInfo` * ~15% in `DelimitRecords`

[GitHub] [arrow] pitrou commented on pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-06 Thread GitBox
pitrou commented on pull request #8305: URL: https://github.com/apache/arrow/pull/8305#issuecomment-704497970 The C++ changes addressed my comments. It would be nice though if @fsaintjacques could take a look.

[GitHub] [arrow] pitrou commented on a change in pull request #8362: ARROW-10196: [C++] Add Future::DeferNotOk

2020-10-06 Thread GitBox
pitrou commented on a change in pull request #8362: URL: https://github.com/apache/arrow/pull/8362#discussion_r500539248 ## File path: cpp/src/arrow/util/future.h ## @@ -358,6 +357,13 @@ class Future { return fut; } + static Future DeferNotOk(Result maybe_future) {
