[GitHub] [arrow] kszucs closed pull request #8072: ARROW-9879: [Python] Add support for numpy scalars to ChunkedArray.__getitem__

2020-10-09 Thread GitBox
kszucs closed pull request #8072: URL: https://github.com/apache/arrow/pull/8072 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] jduo commented on a change in pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
jduo commented on a change in pull request #8325: URL: https://github.com/apache/arrow/pull/8325#discussion_r502241411 ## File path: cpp/src/arrow/flight/client.h ## @@ -90,6 +90,8 @@ class ARROW_FLIGHT_EXPORT FlightWriteSizeStatusDetail : public arrow::StatusDeta class ARR

[GitHub] [arrow] jorisvandenbossche commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8162: URL: https://github.com/apache/arrow/pull/8162#issuecomment-706026472 Yes, it's an actual failure that I still needed to take a look at. But the problem is that I can't reproduce it locally (well, it might be specific to Python 3.5, but I

[GitHub] [arrow] jduo commented on a change in pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
jduo commented on a change in pull request #8325: URL: https://github.com/apache/arrow/pull/8325#discussion_r502248222 ## File path: cpp/src/arrow/flight/client.cc ## @@ -835,6 +843,31 @@ class GrpcMetadataReader : public FlightMetadataReader { std::shared_ptr read_mutex_;

[GitHub] [arrow] jhorstmann opened a new pull request #8409: ARROW-10240: [Rust] Optionally load data into memory before running benchmark query

2020-10-09 Thread GitBox
jhorstmann opened a new pull request #8409: URL: https://github.com/apache/arrow/pull/8409

[GitHub] [arrow] jhorstmann commented on a change in pull request #8409: ARROW-10240: [Rust] Optionally load data into memory before running benchmark query

2020-10-09 Thread GitBox
jhorstmann commented on a change in pull request #8409: URL: https://github.com/apache/arrow/pull/8409#discussion_r502249059 ## File path: rust/benchmarks/src/bin/tpch.rs ## @@ -59,6 +61,10 @@ struct TpchOpt { /// File format: `csv` or `parquet` #[structopt(short = "f

[GitHub] [arrow] github-actions[bot] commented on pull request #8409: ARROW-10240: [Rust] Optionally load data into memory before running benchmark query

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8409: URL: https://github.com/apache/arrow/pull/8409#issuecomment-706036980 https://issues.apache.org/jira/browse/ARROW-10240

[GitHub] [arrow] jorisvandenbossche opened a new pull request #8410: ARROW-10244: [Python] Document pyarrow.dataset.parquet_dataset

2020-10-09 Thread GitBox
jorisvandenbossche opened a new pull request #8410: URL: https://github.com/apache/arrow/pull/8410

[GitHub] [arrow] jorisvandenbossche closed pull request #7515: ARROW-2801: [Python] Add split_row_group keyword to ParquetDataset / document split_by_row_group

2020-10-09 Thread GitBox
jorisvandenbossche closed pull request #7515: URL: https://github.com/apache/arrow/pull/7515

[GitHub] [arrow] jorisvandenbossche commented on pull request #7515: ARROW-2801: [Python] Add split_row_group keyword to ParquetDataset / document split_by_row_group

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #7515: URL: https://github.com/apache/arrow/pull/7515#issuecomment-706045938 I am actually not fully convinced this is a worthwhile change. But split off the docs I added here (which are useful anyway) into its own PR: https://github.com/apache/a

[GitHub] [arrow] jorisvandenbossche commented on pull request #8410: ARROW-10244: [Python] Document pyarrow.dataset.parquet_dataset

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8410: URL: https://github.com/apache/arrow/pull/8410#issuecomment-706046234 The useful parts split off from #7515

[GitHub] [arrow] jorisvandenbossche commented on pull request #8255: ARROW-9518: [Python] Deprecate pyarrow serialization

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8255: URL: https://github.com/apache/arrow/pull/8255#issuecomment-706046582 Rebased this

[GitHub] [arrow] github-actions[bot] commented on pull request #8410: ARROW-10244: [Python] Document pyarrow.dataset.parquet_dataset

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8410: URL: https://github.com/apache/arrow/pull/8410#issuecomment-706047018 https://issues.apache.org/jira/browse/ARROW-10244

[GitHub] [arrow] praveenbingo closed pull request #8201: ARROW-9956: [C++] [Gandiva] Implementation of binary_string function in gandiva

2020-10-09 Thread GitBox
praveenbingo closed pull request #8201: URL: https://github.com/apache/arrow/pull/8201

[GitHub] [arrow] kszucs commented on a change in pull request #8255: ARROW-9518: [Python] Deprecate pyarrow serialization

2020-10-09 Thread GitBox
kszucs commented on a change in pull request #8255: URL: https://github.com/apache/arrow/pull/8255#discussion_r502312300 ## File path: python/pyarrow/tests/test_serialization_deprecated.py ## @@ -0,0 +1,57 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or m

[GitHub] [arrow] kszucs commented on a change in pull request #8255: ARROW-9518: [Python] Deprecate pyarrow serialization

2020-10-09 Thread GitBox
kszucs commented on a change in pull request #8255: URL: https://github.com/apache/arrow/pull/8255#discussion_r502312518 ## File path: python/pyarrow/__init__.py ## @@ -203,30 +202,49 @@ def show_versions(): import pyarrow.types as types -# deprecated filesystems +# deprec

[GitHub] [arrow] jorisvandenbossche commented on pull request #8301: ARROW-10100: [C++][Python][Dataset] Add ParquetFileFragment::Subset method

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8301: URL: https://github.com/apache/arrow/pull/8301#issuecomment-706084228 @bkietz so I was now thinking it is actually fine to return a fragment with 0 row groups (it can then be the responsibility of the user to check for this, if they want

[GitHub] [arrow] kszucs commented on pull request #8255: ARROW-9518: [Python] Deprecate pyarrow serialization

2020-10-09 Thread GitBox
kszucs commented on pull request #8255: URL: https://github.com/apache/arrow/pull/8255#issuecomment-706087221 Why are the tests in a separate file?

[GitHub] [arrow] sagnikc-dremio commented on pull request #8398: ARROW-10234: [C++][Gandiva] Fix logic of round() for floats/decimals in Gandiva

2020-10-09 Thread GitBox
sagnikc-dremio commented on pull request #8398: URL: https://github.com/apache/arrow/pull/8398#issuecomment-706087612 > Hi @sagnikc-dremio does this by chance resolve the build failure observed on [ARROW-10177](https://issues.apache.org/jira/browse/ARROW-10177)? Actually--let's run the tes

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #8255: ARROW-9518: [Python] Deprecate pyarrow serialization

2020-10-09 Thread GitBox
jorisvandenbossche commented on a change in pull request #8255: URL: https://github.com/apache/arrow/pull/8255#discussion_r502324218 ## File path: python/pyarrow/__init__.py ## @@ -203,30 +202,49 @@ def show_versions(): import pyarrow.types as types -# deprecated filesyste

[GitHub] [arrow] jorisvandenbossche commented on pull request #8255: ARROW-9518: [Python] Deprecate pyarrow serialization

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8255: URL: https://github.com/apache/arrow/pull/8255#issuecomment-706093394 > Why are the tests in a separate file? Because that allows me to simply ignore all deprecation warnings in the actual test file (see the comment at the pytestmar

[GitHub] [arrow] xhochy commented on pull request #8371: WIP: ARROW-4960: [R] Build r-arrow conda package in crossbow

2020-10-09 Thread GitBox
xhochy commented on pull request #8371: URL: https://github.com/apache/arrow/pull/8371#issuecomment-706095916 I can happily report that the iterator is the only issue and the build passes locally for me when I add a `const` at the end of the declaration and to all implementations. --

[GitHub] [arrow] kszucs commented on pull request #8255: ARROW-9518: [Python] Deprecate pyarrow serialization

2020-10-09 Thread GitBox
kszucs commented on pull request #8255: URL: https://github.com/apache/arrow/pull/8255#issuecomment-706102256 Thanks Joris, merging!

[GitHub] [arrow] kszucs closed pull request #8255: ARROW-9518: [Python] Deprecate pyarrow serialization

2020-10-09 Thread GitBox
kszucs closed pull request #8255: URL: https://github.com/apache/arrow/pull/8255

[GitHub] [arrow] pitrou commented on a change in pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
pitrou commented on a change in pull request #8325: URL: https://github.com/apache/arrow/pull/8325#discussion_r502338793 ## File path: cpp/src/arrow/flight/client.cc ## @@ -845,18 +878,48 @@ class FlightClient::FlightClientImpl { if (scheme == kSchemeGrpc || scheme == kSch

[GitHub] [arrow] pitrou commented on a change in pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
pitrou commented on a change in pull request #8325: URL: https://github.com/apache/arrow/pull/8325#discussion_r502339168 ## File path: cpp/src/arrow/flight/client.cc ## @@ -845,18 +878,48 @@ class FlightClient::FlightClientImpl { if (scheme == kSchemeGrpc || scheme == kSch

[GitHub] [arrow] alamb commented on a change in pull request #8300: ARROW-10135: [Rust] [Parquet] Refactor file module to help adding sources

2020-10-09 Thread GitBox
alamb commented on a change in pull request #8300: URL: https://github.com/apache/arrow/pull/8300#discussion_r502329941 ## File path: rust/parquet/src/file/reader.rs ## @@ -18,35 +18,37 @@ //! Contains file reader API and provides methods to access file metadata, row group /

[GitHub] [arrow] alamb commented on a change in pull request #8300: ARROW-10135: [Rust] [Parquet] Refactor file module to help adding sources

2020-10-09 Thread GitBox
alamb commented on a change in pull request #8300: URL: https://github.com/apache/arrow/pull/8300#discussion_r502332349 ## File path: rust/parquet/src/file/reader.rs ## @@ -18,35 +18,37 @@ //! Contains file reader API and provides methods to access file metadata, row group /

[GitHub] [arrow] kszucs commented on pull request #8395: ARROW-10230: [JS][Doc] JavaScript documentation fails to build

2020-10-09 Thread GitBox
kszucs commented on pull request #8395: URL: https://github.com/apache/arrow/pull/8395#issuecomment-706121867 @github-actions crossbow submit test-ubuntu-18.04-docs

[GitHub] [arrow] github-actions[bot] commented on pull request #8395: ARROW-10230: [JS][Doc] JavaScript documentation fails to build

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8395: URL: https://github.com/apache/arrow/pull/8395#issuecomment-706123279 Revision: 7ace0ae2a17007cb77228f000f713d0b13695688 Submitted crossbow builds: [ursa-labs/crossbow @ actions-629](https://github.com/ursa-labs/crossbow/branches/a

[GitHub] [arrow] romainfrancois opened a new pull request #8411: ARROW-10114: [R] Segfault in to_dataframe_parallel with deeply nested structs

2020-10-09 Thread GitBox
romainfrancois opened a new pull request #8411: URL: https://github.com/apache/arrow/pull/8411

[GitHub] [arrow] github-actions[bot] commented on pull request #8411: ARROW-10114: [R] Segfault in to_dataframe_parallel with deeply nested structs

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8411: URL: https://github.com/apache/arrow/pull/8411#issuecomment-706125380 https://issues.apache.org/jira/browse/ARROW-10114

[GitHub] [arrow] liyafan82 commented on a change in pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-09 Thread GitBox
liyafan82 commented on a change in pull request #8363: URL: https://github.com/apache/arrow/pull/8363#discussion_r502369246 ## File path: java/vector/src/main/java/org/apache/arrow/vector/util/DictionaryUtility.java ## @@ -115,25 +118,28 @@ public static Field toMemoryFormat(F

[GitHub] [arrow] bkietz commented on pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on pull request #8389: URL: https://github.com/apache/arrow/pull/8389#issuecomment-706132376 Thanks @kou!

[GitHub] [arrow] liyafan82 commented on a change in pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-09 Thread GitBox
liyafan82 commented on a change in pull request #8363: URL: https://github.com/apache/arrow/pull/8363#discussion_r502370575 ## File path: java/vector/src/test/java/org/apache/arrow/vector/ipc/TestArrowReaderWriter.java ## @@ -305,6 +321,64 @@ public void testWriteReadWithDicti

[GitHub] [arrow] liyafan82 commented on a change in pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-09 Thread GitBox
liyafan82 commented on a change in pull request #8363: URL: https://github.com/apache/arrow/pull/8363#discussion_r502371916 ## File path: java/vector/src/test/java/org/apache/arrow/vector/ipc/TestArrowReaderWriter.java ## @@ -305,6 +321,64 @@ public void testWriteReadWithDicti

[GitHub] [arrow] liyafan82 commented on a change in pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-09 Thread GitBox
liyafan82 commented on a change in pull request #8363: URL: https://github.com/apache/arrow/pull/8363#discussion_r502372509 ## File path: java/vector/src/test/java/org/apache/arrow/vector/ipc/TestArrowReaderWriter.java ## @@ -305,6 +321,64 @@ public void testWriteReadWithDicti

[GitHub] [arrow] liyafan82 commented on a change in pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-09 Thread GitBox
liyafan82 commented on a change in pull request #8363: URL: https://github.com/apache/arrow/pull/8363#discussion_r502373097 ## File path: java/vector/src/test/java/org/apache/arrow/vector/ipc/TestArrowReaderWriter.java ## @@ -305,6 +321,64 @@ public void testWriteReadWithDicti

[GitHub] [arrow] liyafan82 commented on a change in pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-09 Thread GitBox
liyafan82 commented on a change in pull request #8363: URL: https://github.com/apache/arrow/pull/8363#discussion_r502373348 ## File path: java/vector/src/test/java/org/apache/arrow/vector/testing/ValueVectorDataPopulator.java ## @@ -673,4 +677,28 @@ public static void setVecto

[GitHub] [arrow] liyafan82 commented on a change in pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-09 Thread GitBox
liyafan82 commented on a change in pull request #8363: URL: https://github.com/apache/arrow/pull/8363#discussion_r502373894 ## File path: java/vector/src/test/java/org/apache/arrow/vector/testing/ValueVectorDataPopulator.java ## @@ -673,4 +677,28 @@ public static void setVecto

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-09 Thread GitBox
jorisvandenbossche commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r502406437 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -2200,7 +2196,7 @@ def test_write_dataset(tempdir): dataset = ds.dataset(directory)

[GitHub] [arrow] jorisvandenbossche commented on pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8305: URL: https://github.com/apache/arrow/pull/8305#issuecomment-706166335 > FileSystemDataset::Write now parallelizes across scan tasks rather than fragments so there will be no difference in performance/written files even if we create a sing

[GitHub] [arrow] jorisvandenbossche opened a new pull request #8412: ARROW-9952: [Python] Optionally use pyarrow.dataset in parquet.write_to_dataset

2020-10-09 Thread GitBox
jorisvandenbossche opened a new pull request #8412: URL: https://github.com/apache/arrow/pull/8412

[GitHub] [arrow] drusso commented on pull request #8222: ARROW-10043: [Rust][DataFusion] Implement COUNT(DISTINCT col)

2020-10-09 Thread GitBox
drusso commented on pull request #8222: URL: https://github.com/apache/arrow/pull/8222#issuecomment-706177533 Thanks @jorgecarleitao! And also, thank you for taking the time for the thorough review. This was a fun feature to work on. --

[GitHub] [arrow] jorgecarleitao commented on pull request #8316: ARROW-10149: [Rust] Improved support for externally owned memory regions

2020-10-09 Thread GitBox
jorgecarleitao commented on pull request #8316: URL: https://github.com/apache/arrow/pull/8316#issuecomment-705944089 Closing in favor of #8401

[GitHub] [arrow] pitrou closed pull request #8145: ARROW-9967: [Python] Add compute module docs + expose more option classes

2020-10-09 Thread GitBox
pitrou closed pull request #8145: URL: https://github.com/apache/arrow/pull/8145

[GitHub] [arrow] terencehonles commented on a change in pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-10-09 Thread GitBox
terencehonles commented on a change in pull request #8386: URL: https://github.com/apache/arrow/pull/8386#discussion_r501463131 ## File path: dev/tasks/python-wheels/travis.osx.yml ## @@ -31,7 +31,7 @@ addons: - git - [email protected] - protobuf - - python@

[GitHub] [arrow] nealrichardson commented on pull request #8406: ARROW-10239: [C++] Add missing zlib dependency to aws-sdk-cpp

2020-10-09 Thread GitBox
nealrichardson commented on pull request #8406: URL: https://github.com/apache/arrow/pull/8406#issuecomment-705931636

[GitHub] [arrow] nevi-me commented on a change in pull request #8262: ARROW-10040: [Rust] Iterate over and combine boolean buffers with arbitrary offsets

2020-10-09 Thread GitBox
nevi-me commented on a change in pull request #8262: URL: https://github.com/apache/arrow/pull/8262#discussion_r501681537 ## File path: rust/arrow/src/buffer.rs ## @@ -254,6 +256,21 @@ impl Buffer { ) } +/// Returns a slice of this buffer starting at a certa

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8262: ARROW-10040: [Rust] Iterate over and combine boolean buffers with arbitrary offsets

2020-10-09 Thread GitBox
jorgecarleitao commented on a change in pull request #8262: URL: https://github.com/apache/arrow/pull/8262#discussion_r501772813 ## File path: rust/arrow/src/buffer.rs ## @@ -369,120 +394,171 @@ where result.freeze() } +/// Apply a bitwise operation `op` to two inputs a

[GitHub] [arrow] liyafan82 commented on pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-09 Thread GitBox
liyafan82 commented on pull request #8363: URL: https://github.com/apache/arrow/pull/8363#issuecomment-70592 > @liyafan82 do you have time to review? @emkornfield Sure. I will take a look in one or two days.

[GitHub] [arrow] drusso commented on a change in pull request #8222: ARROW-10043: [Rust][DataFusion] Implement COUNT(DISTINCT col)

2020-10-09 Thread GitBox
drusso commented on a change in pull request #8222: URL: https://github.com/apache/arrow/pull/8222#discussion_r501684962 ## File path: rust/datafusion/src/physical_plan/distinct_expressions.rs ## @@ -0,0 +1,203 @@ +// Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [arrow] kou commented on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-10-09 Thread GitBox
kou commented on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-705332674

[GitHub] [arrow] kszucs commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

2020-10-09 Thread GitBox
kszucs commented on pull request #8162: URL: https://github.com/apache/arrow/pull/8162#issuecomment-705456869 @jorisvandenbossche seems like python3.5 build is failing

[GitHub] [arrow] wesm closed issue #8384: how to test whether arrow works correctly in R?

2020-10-09 Thread GitBox
wesm closed issue #8384: URL: https://github.com/apache/arrow/issues/8384

[GitHub] [arrow] projjal commented on pull request #8158: ARROW-7215: [C++][Gandiva] Implement castVARCHAR(numeric_type) functions

2020-10-09 Thread GitBox
projjal commented on pull request #8158: URL: https://github.com/apache/arrow/pull/8158#issuecomment-705958047 Thanks @wesm

[GitHub] [arrow] github-actions[bot] commented on pull request #8402: ARROW-8426: [Rust] [Parquet] - Add more support for converting Dicts

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8402: URL: https://github.com/apache/arrow/pull/8402#issuecomment-705795175 https://issues.apache.org/jira/browse/ARROW-8426

[GitHub] [arrow] github-actions[bot] commented on pull request #8397: ARROW-10233: [Rust] Make array_value_to_string available in all Arrow builds

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8397: URL: https://github.com/apache/arrow/pull/8397#issuecomment-705497672 https://issues.apache.org/jira/browse/ARROW-10233

[GitHub] [arrow] github-actions[bot] commented on pull request #8406: ARROW-10239: [C++] Add missing zlib dependency to aws-sdk-cpp

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8406: URL: https://github.com/apache/arrow/pull/8406#issuecomment-705896055

[GitHub] [arrow] emkornfield commented on pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

2020-10-09 Thread GitBox
emkornfield commented on pull request #6979: URL: https://github.com/apache/arrow/pull/6979#issuecomment-705321792 @wesm does your +1 still hold? Can this be merged now?

[GitHub] [arrow] pitrou commented on a change in pull request #8349: ARROW-3080: [Python] Unify Arrow to Python object conversion paths

2020-10-09 Thread GitBox
pitrou commented on a change in pull request #8349: URL: https://github.com/apache/arrow/pull/8349#discussion_r501762533 ## File path: cpp/src/arrow/python/helpers.cc ## @@ -258,30 +266,46 @@ bool PyFloat_IsNaN(PyObject* obj) { namespace { static std::once_flag pandas_stati

[GitHub] [arrow] emkornfield commented on a change in pull request #8366: ARROW-9943: [C++] Recursively apply Arrow metadata when reading from Parquet

2020-10-09 Thread GitBox
emkornfield commented on a change in pull request #8366: URL: https://github.com/apache/arrow/pull/8366#discussion_r501446625 ## File path: cpp/src/parquet/arrow/schema.h ## @@ -91,7 +91,6 @@ struct PARQUET_EXPORT SchemaField { std::shared_ptr<::arrow::Field> field; // If

[GitHub] [arrow] nealrichardson closed pull request #7807: ARROW-6537 [R]: Pass column_types to CSV reader

2020-10-09 Thread GitBox
nealrichardson closed pull request #7807: URL: https://github.com/apache/arrow/pull/7807

[GitHub] [arrow] github-actions[bot] commented on pull request #8405: One definition/repetition level test

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8405: URL: https://github.com/apache/arrow/pull/8405#issuecomment-705823515 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then could

[GitHub] [arrow] nealrichardson commented on pull request #8398: ARROW-10234: [C++][Gandiva] Fix logic of round() for floats/decimals in Gandiva

2020-10-09 Thread GitBox
nealrichardson commented on pull request #8398: URL: https://github.com/apache/arrow/pull/8398#issuecomment-705640747

[GitHub] [arrow] github-actions[bot] commented on pull request #8407: ARROW-10241: [C++][Compute] Add variance kernel benchmark

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8407: URL: https://github.com/apache/arrow/pull/8407#issuecomment-705923975 https://issues.apache.org/jira/browse/ARROW-10241

[GitHub] [arrow] jduo commented on pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
jduo commented on pull request #8325: URL: https://github.com/apache/arrow/pull/8325#issuecomment-705342986

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8222: ARROW-10043: [Rust][DataFusion] Implement COUNT(DISTINCT col)

2020-10-09 Thread GitBox
jorgecarleitao commented on a change in pull request #8222: URL: https://github.com/apache/arrow/pull/8222#discussion_r501440700 ## File path: rust/datafusion/src/test/mod.rs ## @@ -135,6 +135,13 @@ pub fn format_batch(batch: &RecordBatch) -> Vec { } l

[GitHub] [arrow] pitrou commented on a change in pull request #8366: ARROW-9943: [C++] Recursively apply Arrow metadata when reading from Parquet

2020-10-09 Thread GitBox
pitrou commented on a change in pull request #8366: URL: https://github.com/apache/arrow/pull/8366#discussion_r501548523 ## File path: cpp/src/parquet/arrow/schema.h ## @@ -91,7 +91,6 @@ struct PARQUET_EXPORT SchemaField { std::shared_ptr<::arrow::Field> field; // If fiel

[GitHub] [arrow] nevi-me commented on pull request #8346: ARROW-10164: [Rust] Add support for DictionaryArray to cast kernel

2020-10-09 Thread GitBox
nevi-me commented on pull request #8346: URL: https://github.com/apache/arrow/pull/8346#issuecomment-705532225 > 🤔 so the 'pretty-print' feature is not enabled by default for arrow and thus I can't use it in the tests. I put in a hack (copy/paste of the pretty printing) in [9f8b9ba](https

[GitHub] [arrow] kszucs commented on pull request #8395: ARROW-10230: [JS][Doc] JavaScript documentation fails to build

2020-10-09 Thread GitBox
kszucs commented on pull request #8395: URL: https://github.com/apache/arrow/pull/8395#issuecomment-705456261

[GitHub] [arrow] nealrichardson commented on pull request #8371: WIP: ARROW-4960: [R] Build r-arrow conda package in crossbow

2020-10-09 Thread GitBox
nealrichardson commented on pull request #8371: URL: https://github.com/apache/arrow/pull/8371#issuecomment-705626091 @bkietz can you take a look at that? Windows conda uses MSVC so it wouldn't surprise me if there were some corners in `cpp11` that weren't robust to that.

[GitHub] [arrow] sagnikc-dremio commented on a change in pull request #8398: ARROW-10234: [C++][Gandiva] Fix logic of round() for floats/decimals in Gandiva

2020-10-09 Thread GitBox
sagnikc-dremio commented on a change in pull request #8398: URL: https://github.com/apache/arrow/pull/8398#discussion_r501676101 ## File path: cpp/src/gandiva/precompiled/extended_math_ops_test.cc ## @@ -93,6 +93,9 @@ TEST(TestExtendedMathOps, TestRoundDecimal) { EXPECT_FLOA

[GitHub] [arrow] github-actions[bot] commented on pull request #8403: ARROW-10237: [C++] Duplicate dict values cause corrupt parquet

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8403: URL: https://github.com/apache/arrow/pull/8403#issuecomment-705808504 https://issues.apache.org/jira/browse/ARROW-10237

[GitHub] [arrow] jhorstmann commented on a change in pull request #8262: ARROW-10040: [Rust] Iterate over and combine boolean buffers with arbitrary offsets

2020-10-09 Thread GitBox
jhorstmann commented on a change in pull request #8262: URL: https://github.com/apache/arrow/pull/8262#discussion_r501769149 ## File path: rust/arrow/src/buffer.rs ## @@ -369,120 +394,171 @@ where result.freeze() } +/// Apply a bitwise operation `op` to two inputs and r

[GitHub] [arrow] pitrou commented on pull request #8392: ARROW-10229: [C++] Remove errant log line

2020-10-09 Thread GitBox
pitrou commented on pull request #8392: URL: https://github.com/apache/arrow/pull/8392#issuecomment-705428653 Ha, sorry. Thank you for noticing :-)

[GitHub] [arrow] kszucs commented on pull request #8255: ARROW-9518: [Python] Deprecate pyarrow serialization

2020-10-09 Thread GitBox
kszucs commented on pull request #8255: URL: https://github.com/apache/arrow/pull/8255#issuecomment-706087221

[GitHub] [arrow] pitrou commented on a change in pull request #8271: ARROW-9991: [C++] split kernels for strings/binary

2020-10-09 Thread GitBox
pitrou commented on a change in pull request #8271: URL: https://github.com/apache/arrow/pull/8271#discussion_r501601274 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -809,6 +812,418 @@ struct IsUpperAscii : CharacterPredicateAscii { } }; +// splitti

[GitHub] [arrow] jorgecarleitao closed pull request #8316: ARROW-10149: [Rust] Improved support for externally owned memory regions

2020-10-09 Thread GitBox
jorgecarleitao closed pull request #8316: URL: https://github.com/apache/arrow/pull/8316

[GitHub] [arrow] pprudhvi commented on a change in pull request #8398: ARROW-10234: [C++][Gandiva] Fix logic of round() for floats/decimals in Gandiva

2020-10-09 Thread GitBox
pprudhvi commented on a change in pull request #8398: URL: https://github.com/apache/arrow/pull/8398#discussion_r501669631 ## File path: cpp/src/gandiva/precompiled/extended_math_ops_test.cc ## @@ -93,6 +93,9 @@ TEST(TestExtendedMathOps, TestRoundDecimal) { EXPECT_FLOAT_EQ(r

[GitHub] [arrow] andygrove commented on pull request #8370: ARROW-10015: [Rust] Simd aggregate kernels

2020-10-09 Thread GitBox
andygrove commented on pull request #8370: URL: https://github.com/apache/arrow/pull/8370#issuecomment-705747085 I'm planning on running TPC-H benchmarks later today with and without this patch.

[GitHub] [arrow] bkietz commented on pull request #8371: WIP: ARROW-4960: [R] Build r-arrow conda package in crossbow

2020-10-09 Thread GitBox
bkietz commented on pull request #8371: URL: https://github.com/apache/arrow/pull/8371#issuecomment-705748359 unfortunately `cpp11::r_vector::const_iterator::operator*` is not marked const, but MSVC's impl of `std::copy` marks its [arguments const](https://github.com/microsoft/STL/blob/3d8

[GitHub] [arrow] wesm commented on pull request #8393: [NEEDS IP CLEARANCE] ARROW-10228: Contribute Julia implementation

2020-10-09 Thread GitBox
wesm commented on pull request #8393: URL: https://github.com/apache/arrow/pull/8393#issuecomment-705663389 I think it would be best to conduct the IP clearance process for this codebase. The first step is to have a vote on the mailing list about accepting the code donation. We'll need to

[GitHub] [arrow] github-actions[bot] commented on pull request #8408: ARROW-10215: [Rust] [DataFusion] Renamed Source to SendableRecordBatchReader.

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8408: URL: https://github.com/apache/arrow/pull/8408#issuecomment-705988214 https://issues.apache.org/jira/browse/ARROW-10215

[GitHub] [arrow] jorisvandenbossche commented on pull request #8255: ARROW-9518: [Python] Deprecate pyarrow serialization

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8255: URL: https://github.com/apache/arrow/pull/8255#issuecomment-706046582

[GitHub] [arrow] nevi-me commented on a change in pull request #8400: ARROW-10236: [Rust][DataFusion] Unify type casting logic in DataFusion

2020-10-09 Thread GitBox
nevi-me commented on a change in pull request #8400: URL: https://github.com/apache/arrow/pull/8400#discussion_r501850814 ## File path: rust/datafusion/src/physical_plan/type_casting.rs ## @@ -0,0 +1,218 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or m

[GitHub] [arrow] pitrou commented on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-10-09 Thread GitBox
pitrou commented on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-705440089

[GitHub] [arrow] nealrichardson commented on pull request #8393: [NEEDS IP CLEARANCE] ARROW-10228: Contribute Julia implementation

2020-10-09 Thread GitBox
nealrichardson commented on pull request #8393: URL: https://github.com/apache/arrow/pull/8393#issuecomment-705695770 @quinnj I can help you with the IP clearance process. To get started, here's a link to the Apache CLA: https://www.apache.org/licenses/contributor-agreements.html -

[GitHub] [arrow] wesm commented on issue #8384: how to test whether arrow works correctly in R?

2020-10-09 Thread GitBox
wesm commented on issue #8384: URL: https://github.com/apache/arrow/issues/8384#issuecomment-705811182 You can run the vignette examples, or the unit test suite, I think. If you wanted something more integrated with the install you could open a JIRA issue to describe exactly what you would

[GitHub] [arrow] github-actions[bot] commented on pull request #8371: WIP: ARROW-4960: [R] Build r-arrow conda package in crossbow

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8371: URL: https://github.com/apache/arrow/pull/8371#issuecomment-705459887

[GitHub] [arrow] kiszk commented on a change in pull request #8374: ARROW-10203: Give guidance on big-endian support in the contributors docs

2020-10-09 Thread GitBox
kiszk commented on a change in pull request #8374: URL: https://github.com/apache/arrow/pull/8374#discussion_r501448247 ## File path: docs/source/developers/contributing.rst ## @@ -304,3 +304,40 @@ to your branch, which they sometimes do to help move a pull request along. In

[GitHub] [arrow] terencehonles edited a comment on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-10-09 Thread GitBox
terencehonles edited a comment on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-705697163

[GitHub] [arrow] nevi-me closed pull request #8262: ARROW-10040: [Rust] Iterate over and combine boolean buffers with arbitrary offsets

2020-10-09 Thread GitBox
nevi-me closed pull request #8262: URL: https://github.com/apache/arrow/pull/8262

[GitHub] [arrow] vvellanki commented on a change in pull request #8398: ARROW-10234: [C++][Gandiva] Fix logic of round() for floats/decimals in Gandiva

2020-10-09 Thread GitBox
vvellanki commented on a change in pull request #8398: URL: https://github.com/apache/arrow/pull/8398#discussion_r501670246 ## File path: cpp/src/gandiva/precompiled/extended_math_ops_test.cc ## @@ -93,6 +93,9 @@ TEST(TestExtendedMathOps, TestRoundDecimal) { EXPECT_FLOAT_EQ(

[GitHub] [arrow] github-actions[bot] commented on pull request #8395: ARROW-10230: [JS][Doc] JavaScript documentation fails to build

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8395: URL: https://github.com/apache/arrow/pull/8395#issuecomment-705458155

[GitHub] [arrow] terencehonles removed a comment on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-10-09 Thread GitBox
terencehonles removed a comment on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-705332230
