[GitHub] [arrow] zanmato1984 opened a new pull request #8423: ARROW-10262: [C++] Fix TypeClass for BinaryScalar and LargeBinaryScalar

2020-10-09 Thread GitBox
zanmato1984 opened a new pull request #8423: URL: https://github.com/apache/arrow/pull/8423 The `TypeClass` aliases in `BinaryScalar` and `LargeBinaryScalar` appear to be mistyped as `BinaryScalar` and `LargeBinaryScalar`. This causes issues when using `ScalarType::TypeClass`, esp. with `Type

[GitHub] [arrow] BryanCutler commented on pull request #8422: ARROW-10260 [Python] Missing MapType to Pandas dtype

2020-10-09 Thread GitBox
BryanCutler commented on pull request #8422: URL: https://github.com/apache/arrow/pull/8422#issuecomment-706492831 Thanks @dmarsh19 ! I think you will want to add the type mapping to the test here https://github.com/apache/arrow/blob/eec7277561b9e5c8aac03e44ac11f492ff1385fd/python/pyarrow

[GitHub] [arrow] github-actions[bot] commented on pull request #8422: ARROW-10260 [Python] Missing MapType to Pandas dtype

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8422: URL: https://github.com/apache/arrow/pull/8422#issuecomment-706483446 https://issues.apache.org/jira/browse/ARROW-10260 This is an automated message from the Apache Git Ser

[GitHub] [arrow] dmarsh19 opened a new pull request #8422: ARROW-10260 [Python] Missing MapType to Pandas dtype

2020-10-09 Thread GitBox
dmarsh19 opened a new pull request #8422: URL: https://github.com/apache/arrow/pull/8422 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [arrow] nealrichardson closed pull request #8421: ARROW-10257: [R] Prepare news/docs for 2.0 release

2020-10-09 Thread GitBox
nealrichardson closed pull request #8421: URL: https://github.com/apache/arrow/pull/8421

[GitHub] [arrow] github-actions[bot] commented on pull request #8421: ARROW-10257: [R] Prepare news/docs for 2.0 release

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8421: URL: https://github.com/apache/arrow/pull/8421#issuecomment-706443378 https://issues.apache.org/jira/browse/ARROW-10257

[GitHub] [arrow] nealrichardson opened a new pull request #8421: ARROW-10257: [R] Prepare news/docs for 2.0 release

2020-10-09 Thread GitBox
nealrichardson opened a new pull request #8421: URL: https://github.com/apache/arrow/pull/8421

[GitHub] [arrow] nealrichardson closed pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
nealrichardson closed pull request #8389: URL: https://github.com/apache/arrow/pull/8389

[GitHub] [arrow] nealrichardson closed pull request #8351: ARROW-9870: [R] Friendly interface for filesystems (S3)

2020-10-09 Thread GitBox
nealrichardson closed pull request #8351: URL: https://github.com/apache/arrow/pull/8351

[GitHub] [arrow] emkornfield commented on pull request #8417: WIP: [C++] Get rid of code duplication in Decimal##bit_width

2020-10-09 Thread GitBox
emkornfield commented on pull request #8417: URL: https://github.com/apache/arrow/pull/8417#issuecomment-706437971 I can look in a bit more detail, but two high-level comments: 1. Let's not use macros in the header files for declarations. The [style guide strongly discourages it](http

[GitHub] [arrow] andygrove commented on pull request #8409: ARROW-10240: [Rust] Optionally load data into memory before running benchmark query

2020-10-09 Thread GitBox
andygrove commented on pull request #8409: URL: https://github.com/apache/arrow/pull/8409#issuecomment-706429236 The results are pretty interesting for me. Without `--mem-table`: ``` Running benchmarks with the following options: TpchOpt { query: 1, debug: false, iterations

[GitHub] [arrow] nealrichardson commented on pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
nealrichardson commented on pull request #8389: URL: https://github.com/apache/arrow/pull/8389#issuecomment-706431994 I'm fixing the R and then will merge

[GitHub] [arrow] wesm commented on pull request #8419: ARROW-10256: [C++][Flight] Disable -Werror carefully

2020-10-09 Thread GitBox
wesm commented on pull request #8419: URL: https://github.com/apache/arrow/pull/8419#issuecomment-706426958 Oh, thank you for catching this!

[GitHub] [arrow] nealrichardson closed pull request #8411: ARROW-10114: [R] Segfault in to_dataframe_parallel with deeply nested structs

2020-10-09 Thread GitBox
nealrichardson closed pull request #8411: URL: https://github.com/apache/arrow/pull/8411

[GitHub] [arrow] quazzuk closed issue #8420: Is this expected behaviour?

2020-10-09 Thread GitBox
quazzuk closed issue #8420: URL: https://github.com/apache/arrow/issues/8420

[GitHub] [arrow] quazzuk opened a new issue #8420: Is this expected behaviour?

2020-10-09 Thread GitBox
quazzuk opened a new issue #8420: URL: https://github.com/apache/arrow/issues/8420 ``` import os import pyarrow as pa import pyarrow.parquet as pq df = pd.DataFrame(dict(symbol=["A", "B", "C", "D"], year=[2017, 2018, 2019, 2020], close=np.arange(4))) root_path = "test

[GitHub] [arrow] kou opened a new pull request #8419: ARROW-10256: [C++][Flight] Disable -Werror carefully

2020-10-09 Thread GitBox
kou opened a new pull request #8419: URL: https://github.com/apache/arrow/pull/8419 If we replace "-Werror" with "", "-Werror=SOMETHING" becomes "=SOMETHING". "=SOMETHING" always causes a build error because it is treated as an input file. For example, CentOS 8 RPM build uses
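
The pitfall described in this message can be reproduced with plain string handling. The following is only a minimal sketch in Python, not the CMake change made in the pull request: the function names `naive_strip` and `careful_strip` are hypothetical, and the flag `-Werror=format-security` is an assumed example value, since the original message is cut off before naming the flags actually involved.

```python
def naive_strip(flags: str) -> str:
    # Blind substring replacement: "-Werror=foo" is mangled into "=foo",
    # which a compiler driver would then treat as an input file name.
    return flags.replace("-Werror", "")


def careful_strip(flags: str) -> str:
    # Drop only the bare "-Werror" token; "-Werror=..." tokens are kept intact.
    return " ".join(tok for tok in flags.split() if tok != "-Werror")


# Assumed example flag set (hypothetical, for illustration only).
flags = "-O2 -Wall -Werror -Werror=format-security"

print(naive_strip(flags).split())
print(careful_strip(flags))
```

Matching whole tokens rather than substrings is one way to disable `-Werror` without corrupting the `-Werror=SOMETHING` forms; the pull request itself may take a different approach.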

[GitHub] [arrow] github-actions[bot] commented on pull request #8419: ARROW-10256: [C++][Flight] Disable -Werror carefully

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8419: URL: https://github.com/apache/arrow/pull/8419#issuecomment-706421641

[GitHub] [arrow] github-actions[bot] commented on pull request #8418: ARROW-10255: [JS] [WIP] Reorganize exports for ESM tree-shaking

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8418: URL: https://github.com/apache/arrow/pull/8418#issuecomment-706415050 https://issues.apache.org/jira/browse/ARROW-10255

[GitHub] [arrow] kou commented on a change in pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
kou commented on a change in pull request #8325: URL: https://github.com/apache/arrow/pull/8325#discussion_r502686050 ## File path: cpp/src/arrow/flight/CMakeLists.txt ## @@ -61,6 +61,48 @@ set_source_files_properties(${FLIGHT_GENERATED_PROTO_FILES} PROPERTIES GENERATED add

[GitHub] [arrow] github-actions[bot] commented on pull request #8417: WIP: [C++] Get rid of code duplication in Decimal##bit_width

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8417: URL: https://github.com/apache/arrow/pull/8417#issuecomment-706413313 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then could

[GitHub] [arrow] trxcllnt opened a new pull request #8418: ARROW-10255: [JS] [WIP] Reorganize exports for ESM tree-shaking

2020-10-09 Thread GitBox
trxcllnt opened a new pull request #8418: URL: https://github.com/apache/arrow/pull/8418 Related JIRA: https://issues.apache.org/jira/browse/ARROW-10255 Opening this as a draft PR for ongoing work to make our ESM exports more friendly to [tree-shaking](https://webpack.js.org/guides/t

[GitHub] [arrow] kou commented on pull request #8419: ARROW-10256: [C++][Flight] Disable -Werror carefully

2020-10-09 Thread GitBox
kou commented on pull request #8419: URL: https://github.com/apache/arrow/pull/8419#issuecomment-706416164 @github-actions crossbow submit centos-8-*

[GitHub] [arrow] jhorstmann commented on a change in pull request #8409: ARROW-10240: [Rust] Optionally load data into memory before running benchmark query

2020-10-09 Thread GitBox
jhorstmann commented on a change in pull request #8409: URL: https://github.com/apache/arrow/pull/8409#discussion_r502679298 ## File path: rust/benchmarks/src/bin/tpch.rs ## @@ -73,31 +79,42 @@ async fn main() -> Result<()> { let path = opt.path.to_str().unwrap(); -

[GitHub] [arrow] dchigarev opened a new pull request #8417: WIP: [C++] Get rid of code duplication in Decimal##bit_width

2020-10-09 Thread GitBox
dchigarev opened a new pull request #8417: URL: https://github.com/apache/arrow/pull/8417 PR to get rid of code duplication while declaring new `Decimal##BIT_WIDTH##Type`. Since there are plans to add support for lower-bit-width Decimals as well, I think the Decimal256 branch is the best place to deci

[GitHub] [arrow] github-actions[bot] commented on pull request #8416: ARROW-10252: [Python] Add option to skip inclusion of Arrow headers in Python installation

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8416: URL: https://github.com/apache/arrow/pull/8416#issuecomment-706402495 https://issues.apache.org/jira/browse/ARROW-10252

[GitHub] [arrow] github-actions[bot] commented on pull request #8416: ARROW-10252: [Python] Add option to skip inclusion of Arrow headers in Python installation

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8416: URL: https://github.com/apache/arrow/pull/8416#issuecomment-706396399 Revision: 24dca89942249c42cbd765fc73545b597e9f3548 Submitted crossbow builds: [ursa-labs/crossbow @ actions-634](https://github.com/ursa-labs/crossbow/branches/a

[GitHub] [arrow] xhochy commented on pull request #8416: WIP: ARROW-10252: [Python] Add option to skip inclusion of Arrow headers in Python installation

2020-10-09 Thread GitBox
xhochy commented on pull request #8416: URL: https://github.com/apache/arrow/pull/8416#issuecomment-706395409 @github-actions crossbow submit -g conda

[GitHub] [arrow] xhochy commented on pull request #8371: WIP: ARROW-4960: [R] Build r-arrow conda package in crossbow

2020-10-09 Thread GitBox
xhochy commented on pull request #8371: URL: https://github.com/apache/arrow/pull/8371#issuecomment-706395120 > Great! Can we apply that patch to https://github.com/conda-forge/r-cpp11-feedstock, and then this will work? Yes!

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502661252 ## File path: r/tests/testthat/test-dataset.R ## @@ -943,10 +943,42 @@ test_that("Dataset writing: from RecordBatch", { ) }) +test_that("Writing a data

[GitHub] [arrow] nealrichardson commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
nealrichardson commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502656503 ## File path: r/R/dataset-format.R ## @@ -139,6 +139,13 @@ FileWriteOptions <- R6Class("FileWriteOptions", inherit = ArrowObject, dataset__

[GitHub] [arrow] nealrichardson commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
nealrichardson commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502656073 ## File path: r/tests/testthat/test-dataset.R ## @@ -943,10 +943,42 @@ test_that("Dataset writing: from RecordBatch", { ) }) +test_that("Writin

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502651537 ## File path: r/tests/testthat/test-dataset.R ## @@ -943,10 +943,42 @@ test_that("Dataset writing: from RecordBatch", { ) }) +test_that("Writing a data

[GitHub] [arrow] nealrichardson commented on pull request #8371: WIP: ARROW-4960: [R] Build r-arrow conda package in crossbow

2020-10-09 Thread GitBox
nealrichardson commented on pull request #8371: URL: https://github.com/apache/arrow/pull/8371#issuecomment-706382865 Great! Can we apply that patch to https://github.com/conda-forge/r-cpp11-feedstock, and then this will work?

[GitHub] [arrow] github-actions[bot] commented on pull request #8416: WIP: ARROW-10252: [Python] Add option to skip inclusion of Arrow headers in Python installation

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8416: URL: https://github.com/apache/arrow/pull/8416#issuecomment-706372068 Revision: 24dca89942249c42cbd765fc73545b597e9f3548 Submitted crossbow builds: [ursa-labs/crossbow @ actions-633](https://github.com/ursa-labs/crossbow/branches/a

[GitHub] [arrow] github-actions[bot] commented on pull request #8416: WIP: ARROW-10252: [Python] Add option to skip inclusion of Arrow headers in Python installation

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8416: URL: https://github.com/apache/arrow/pull/8416#issuecomment-706371217 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then could

[GitHub] [arrow] xhochy commented on pull request #8416: WIP: ARROW-10252: [Python] Add option to skip inclusion of Arrow headers in Python installation

2020-10-09 Thread GitBox
xhochy commented on pull request #8416: URL: https://github.com/apache/arrow/pull/8416#issuecomment-706371417 @github-actions crossbow submit conda-linux-gcc-py36-cpu

[GitHub] [arrow] xhochy commented on pull request #8416: WIP: ARROW-10252: [Python] Add option to skip inclusion of Arrow headers in Python installation

2020-10-09 Thread GitBox
xhochy commented on pull request #8416: URL: https://github.com/apache/arrow/pull/8416#issuecomment-706371282 @github-actions cross submit conda-linux-gcc-py36-cpu

[GitHub] [arrow] xhochy opened a new pull request #8416: WIP: ARROW-10252: [Python] Add option to skip inclusion of Arrow headers in Python installation

2020-10-09 Thread GitBox
xhochy opened a new pull request #8416: URL: https://github.com/apache/arrow/pull/8416

[GitHub] [arrow] andygrove commented on a change in pull request #8409: ARROW-10240: [Rust] Optionally load data into memory before running benchmark query

2020-10-09 Thread GitBox
andygrove commented on a change in pull request #8409: URL: https://github.com/apache/arrow/pull/8409#discussion_r502598284 ## File path: rust/benchmarks/src/bin/tpch.rs ## @@ -73,31 +79,42 @@ async fn main() -> Result<()> { let path = opt.path.to_str().unwrap(); -

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502628716 ## File path: r/R/dataset-format.R ## @@ -139,6 +139,13 @@ FileWriteOptions <- R6Class("FileWriteOptions", inherit = ArrowObject, dataset___Parquet

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502627767 ## File path: r/R/dataset-format.R ## @@ -139,6 +139,13 @@ FileWriteOptions <- R6Class("FileWriteOptions", inherit = ArrowObject, dataset___Parquet

[GitHub] [arrow] nealrichardson commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
nealrichardson commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502591443 ## File path: r/R/dataset-format.R ## @@ -139,6 +139,13 @@ FileWriteOptions <- R6Class("FileWriteOptions", inherit = ArrowObject, dataset__

[GitHub] [arrow] nealrichardson commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
nealrichardson commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502590671 ## File path: r/src/arrow_metadata.h ## @@ -0,0 +1,38 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

[GitHub] [arrow] jorgecarleitao closed pull request #8408: ARROW-10215: [Rust] [DataFusion] Renamed Source to SendableRecordBatchReader.

2020-10-09 Thread GitBox
jorgecarleitao closed pull request #8408: URL: https://github.com/apache/arrow/pull/8408

[GitHub] [arrow] wesm commented on pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
wesm commented on pull request #8325: URL: https://github.com/apache/arrow/pull/8325#issuecomment-706311621 thanks everyone

[GitHub] [arrow] wesm closed pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
wesm closed pull request #8325: URL: https://github.com/apache/arrow/pull/8325

[GitHub] [arrow] andygrove commented on a change in pull request #8409: ARROW-10240: [Rust] Optionally load data into memory before running benchmark query

2020-10-09 Thread GitBox
andygrove commented on a change in pull request #8409: URL: https://github.com/apache/arrow/pull/8409#discussion_r502573069 ## File path: rust/benchmarks/src/bin/tpch.rs ## @@ -59,6 +61,10 @@ struct TpchOpt { /// File format: `csv` or `parquet` #[structopt(short = "f"

[GitHub] [arrow] lidavidm commented on a change in pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
lidavidm commented on a change in pull request #8325: URL: https://github.com/apache/arrow/pull/8325#discussion_r502572751 ## File path: cpp/src/arrow/flight/client.h ## @@ -90,6 +90,8 @@ class ARROW_FLIGHT_EXPORT FlightWriteSizeStatusDetail : public arrow::StatusDeta class

[GitHub] [arrow] github-actions[bot] commented on pull request #8415: ARROW-10248: [Python][Dataset] Always apply Python's default write properties

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8415: URL: https://github.com/apache/arrow/pull/8415#issuecomment-706306636 https://issues.apache.org/jira/browse/ARROW-10248

[GitHub] [arrow] bkietz opened a new pull request #8415: ARROW-10248: [Python][Dataset] Always apply Python's default write properties

2020-10-09 Thread GitBox
bkietz opened a new pull request #8415: URL: https://github.com/apache/arrow/pull/8415

[GitHub] [arrow] lidavidm commented on pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
lidavidm commented on pull request #8325: URL: https://github.com/apache/arrow/pull/8325#issuecomment-706296052 Waiting for CI here...

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502561853 ## File path: r/R/dataset-format.R ## @@ -139,6 +139,13 @@ FileWriteOptions <- R6Class("FileWriteOptions", inherit = ArrowObject, dataset___Parquet

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502559021 ## File path: r/R/dataset-write.R ## @@ -44,7 +44,8 @@ #' @param filesystem A [FileSystem] where the dataset should be written if it is a #' string file p

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r50255 ## File path: r/tests/testthat/test-dataset.R ## @@ -943,10 +943,42 @@ test_that("Dataset writing: from RecordBatch", { ) }) +test_that("Writing a data

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502558340 ## File path: r/src/recordbatchreader.cpp ## @@ -74,6 +76,12 @@ int ipc___RecordBatchFileReader__num_record_batches( return reader->num_record_batches();

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502558047 ## File path: r/src/arrow_metadata.h ## @@ -0,0 +1,38 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agree

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502557047 ## File path: r/src/arrow_cpp11.h ## @@ -157,19 +157,42 @@ struct ns { static SEXP arrow; }; +// Specialize this struct to define a default value to be

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502556438 ## File path: r/configure ## @@ -201,7 +201,10 @@ else fi # Write to Makevars -sed -e "s|@cflags@|$PKG_CFLAGS|" -e "s|@libs@|$PKG_LIBS|" src/Makevars.in

[GitHub] [arrow] nealrichardson commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
nealrichardson commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502545157 ## File path: r/src/arrow_metadata.h ## @@ -0,0 +1,38 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

[GitHub] [arrow] github-actions[bot] commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8162: URL: https://github.com/apache/arrow/pull/8162#issuecomment-706278606 Revision: 0e2a734c1b1ac7aca11e0c17f8251a645579e4fd Submitted crossbow builds: [ursa-labs/crossbow @ actions-632](https://github.com/ursa-labs/crossbow/branches/a

[GitHub] [arrow] jorisvandenbossche commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8162: URL: https://github.com/apache/arrow/pull/8162#issuecomment-706270759 OK, I am just going to skip it for older pandas versions .. On the older pandas version, it's the assert_frame_equal that is failing, but the actual result and expected

[GitHub] [arrow] jorisvandenbossche commented on pull request #8162: ARROW-9962: [Python] Fix conversion to_pandas with tz-aware index column and fixed offset timezones

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8162: URL: https://github.com/apache/arrow/pull/8162#issuecomment-706269990 @github-actions crossbow submit -g integration

[GitHub] [arrow] jduo commented on a change in pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
jduo commented on a change in pull request #8325: URL: https://github.com/apache/arrow/pull/8325#discussion_r502531900 ## File path: cpp/src/arrow/flight/client.cc ## @@ -845,18 +878,48 @@ class FlightClient::FlightClientImpl { if (scheme == kSchemeGrpc || scheme == kSchem

[GitHub] [arrow] github-actions[bot] commented on pull request #8395: ARROW-10230: [JS][Doc] JavaScript documentation fails to build

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8395: URL: https://github.com/apache/arrow/pull/8395#issuecomment-706261073 Revision: 501fb03f46fc4a34bb3d86f52df4bb81d088f23d Submitted crossbow builds: [ursa-labs/crossbow @ actions-631](https://github.com/ursa-labs/crossbow/branches/a

[GitHub] [arrow] kszucs commented on pull request #8395: ARROW-10230: [JS][Doc] JavaScript documentation fails to build

2020-10-09 Thread GitBox
kszucs commented on pull request #8395: URL: https://github.com/apache/arrow/pull/8395#issuecomment-706259921 @github-actions crossbow submit test-ubuntu-18.04-docs

[GitHub] [arrow] bkietz commented on pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-09 Thread GitBox
bkietz commented on pull request #8305: URL: https://github.com/apache/arrow/pull/8305#issuecomment-706251324 If you're writing with no partitioning then yes, everything will be written to a single file. In a follow up we'll probably add a special case for unpartitioned writing which alloc

[GitHub] [arrow] bkietz commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r502511455 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -2258,11 +2254,6 @@ def test_write_dataset_use_threads(tempdir): use_threads=False )

[GitHub] [arrow] bkietz commented on a change in pull request #8305: ARROW-9782: [C++][Dataset] More configurable Dataset writing

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8305: URL: https://github.com/apache/arrow/pull/8305#discussion_r502509714 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -2200,7 +2196,7 @@ def test_write_dataset(tempdir): dataset = ds.dataset(directory) targ

[GitHub] [arrow] kszucs closed pull request #8413: ARROW-10175: [CI] Fix nightly HDFS integration tests (ensure to use legacy dataset)

2020-10-09 Thread GitBox
kszucs closed pull request #8413: URL: https://github.com/apache/arrow/pull/8413

[GitHub] [arrow] kszucs commented on a change in pull request #8413: ARROW-10175: [CI] Fix nightly HDFS integration tests (ensure to use legacy dataset)

2020-10-09 Thread GitBox
kszucs commented on a change in pull request #8413: URL: https://github.com/apache/arrow/pull/8413#discussion_r502506815 ## File path: python/pyarrow/tests/test_hdfs.py ## @@ -321,7 +321,8 @@ def test_read_multiple_parquet_files_with_uri(self): expected = self._write

[GitHub] [arrow] bkietz commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502506462 ## File path: cpp/src/arrow/dataset/file_ipc.cc ## @@ -185,7 +185,13 @@ Result> IpcFileFormat::MakeWriter( auto ipc_options = checked_pointer_cast(opti

[GitHub] [arrow] kszucs commented on a change in pull request #8349: ARROW-3080: [Python] Unify Arrow to Python object conversion paths

2020-10-09 Thread GitBox
kszucs commented on a change in pull request #8349: URL: https://github.com/apache/arrow/pull/8349#discussion_r502497329 ## File path: python/pyarrow/tests/test_convert_builtin.py ## @@ -1903,18 +1946,40 @@ def test_dictionary_from_strings(): assert a.dictionary.equals(exp

[GitHub] [arrow] kszucs closed pull request #8396: ARROW-10231: [CI] Unable to download minio in arm32v7 docker image

2020-10-09 Thread GitBox
kszucs closed pull request #8396: URL: https://github.com/apache/arrow/pull/8396

[GitHub] [arrow] kszucs commented on pull request #8396: ARROW-10231: [CI] Unable to download minio in arm32v7 docker image

2020-10-09 Thread GitBox
kszucs commented on pull request #8396: URL: https://github.com/apache/arrow/pull/8396#issuecomment-706234971 The build failure is unrelated.

[GitHub] [arrow] kszucs commented on pull request #8396: ARROW-10231: [CI] Unable to download minio in arm32v7 docker image

2020-10-09 Thread GitBox
kszucs commented on pull request #8396: URL: https://github.com/apache/arrow/pull/8396#issuecomment-706234037 Ran the build on the arm32 machine and it works although both machines are down at the moment, so I'm disabling these builds to prevent failing commits on the main branch.

[GitHub] [arrow] kszucs commented on a change in pull request #8349: ARROW-3080: [Python] Unify Arrow to Python object conversion paths

2020-10-09 Thread GitBox
kszucs commented on a change in pull request #8349: URL: https://github.com/apache/arrow/pull/8349#discussion_r502490505 ## File path: python/pyarrow/tests/test_convert_builtin.py ## @@ -1903,18 +1946,40 @@ def test_dictionary_from_strings(): assert a.dictionary.equals(exp

[GitHub] [arrow] arw2019 commented on pull request #8244: ARROW-8355: [Python] Remove hard pandas dependency from FeatherDataset and minimize pandas dependency in test_feather.py

2020-10-09 Thread GitBox
arw2019 commented on pull request #8244: URL: https://github.com/apache/arrow/pull/8244#issuecomment-706229088 Thanks @jorisvandenbossche for reviewing!

[GitHub] [arrow] nealrichardson commented on pull request #8411: ARROW-10114: [R] Segfault in to_dataframe_parallel with deeply nested structs

2020-10-09 Thread GitBox
nealrichardson commented on pull request #8411: URL: https://github.com/apache/arrow/pull/8411#issuecomment-706227714 Do we want to add head.jsonl, or perhaps some other minimal fixture that triggered the bug, to the test suite?

[GitHub] [arrow] jorisvandenbossche closed pull request #8244: ARROW-8355: [Python] Remove hard pandas dependency from FeatherDataset and minimize pandas dependency in test_feather.py

2020-10-09 Thread GitBox
jorisvandenbossche closed pull request #8244: URL: https://github.com/apache/arrow/pull/8244

[GitHub] [arrow] jorisvandenbossche commented on pull request #8244: ARROW-8355: [Python] Remove hard pandas dependency from FeatherDataset and minimize pandas dependency in test_feather.py

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8244: URL: https://github.com/apache/arrow/pull/8244#issuecomment-706225605 Thanks @arw2019 !

[GitHub] [arrow] jorisvandenbossche commented on pull request #8413: ARROW-10175: [CI] Fix nightly HDFS integration tests (ensure to use legacy dataset)

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8413: URL: https://github.com/apache/arrow/pull/8413#issuecomment-706219230 HDFS integration build is green now ;)

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #8389: ARROW-8296: [C++][Dataset] Add IpcFileWriteOptions

2020-10-09 Thread GitBox
jorisvandenbossche commented on a change in pull request #8389: URL: https://github.com/apache/arrow/pull/8389#discussion_r502468990 ## File path: cpp/src/arrow/dataset/file_ipc.cc ## @@ -185,7 +185,13 @@ Result> IpcFileFormat::MakeWriter( auto ipc_options = checked_point

[GitHub] [arrow] github-actions[bot] commented on pull request #8414: ARROW-7957: [Python] Handle new FileSystem in ParquetDataset by automatically using new implementation

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8414: URL: https://github.com/apache/arrow/pull/8414#issuecomment-706213956 https://issues.apache.org/jira/browse/ARROW-7957

[GitHub] [arrow] bkietz commented on a change in pull request #8301: ARROW-10100: [C++][Python][Dataset] Add ParquetFileFragment::Subset method

2020-10-09 Thread GitBox
bkietz commented on a change in pull request #8301: URL: https://github.com/apache/arrow/pull/8301#discussion_r502461050 ## File path: cpp/src/arrow/dataset/file_parquet.cc ## @@ -631,6 +631,39 @@ Result ParquetFileFragment::SplitByRowGroup( return fragments; } +Result>

[GitHub] [arrow] jorisvandenbossche opened a new pull request #8414: ARROW-7957: [Python] Handle new FileSystem in ParquetDataset by automatically using new implementation

2020-10-09 Thread GitBox
jorisvandenbossche opened a new pull request #8414: URL: https://github.com/apache/arrow/pull/8414 In principle, the user can also simply pass `use_legacy_dataset=False`, but I think it is 1) a bit more user friendly to do this automatically (it's clear that you need this with a new filesy

[GitHub] [arrow] github-actions[bot] commented on pull request #8413: ARROW-10175: [CI] Fix nightly HDFS integration tests (ensure to use legacy dataset)

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8413: URL: https://github.com/apache/arrow/pull/8413#issuecomment-706194766

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #8413: ARROW-10175: [CI] Fix nightly HDFS integration tests (ensure to use legacy dataset)

2020-10-09 Thread GitBox
jorisvandenbossche commented on a change in pull request #8413: URL: https://github.com/apache/arrow/pull/8413#discussion_r502441529 ## File path: python/pyarrow/tests/test_hdfs.py ## @@ -321,7 +321,8 @@ def test_read_multiple_parquet_files_with_uri(self): expected =

[GitHub] [arrow] github-actions[bot] commented on pull request #8412: ARROW-9952: [Python] Optionally use pyarrow.dataset in parquet.write_to_dataset

2020-10-09 Thread GitBox
github-actions[bot] commented on pull request #8412: URL: https://github.com/apache/arrow/pull/8412#issuecomment-706178576

[GitHub] [arrow] jorisvandenbossche commented on pull request #8413: ARROW-10175: [CI] Fix nightly HDFS integration tests (ensure to use legacy dataset)

2020-10-09 Thread GitBox
jorisvandenbossche commented on pull request #8413: URL: https://github.com/apache/arrow/pull/8413#issuecomment-706193657 @github-actions crossbow submit test-conda-python-3.7-hdfs-2.9.2

[GitHub] [arrow] HedgehogCode commented on a change in pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-09 Thread GitBox
HedgehogCode commented on a change in pull request #8363: URL: https://github.com/apache/arrow/pull/8363#discussion_r502432327 ## File path: java/vector/src/main/java/org/apache/arrow/vector/util/DictionaryUtility.java ## @@ -115,25 +118,28 @@ public static Field toMemoryForma

[GitHub] [arrow] jorisvandenbossche opened a new pull request #8413: ARROW-10175: [CI] Fix nightly HDFS integration tests (ensure to use legacy dataset)

2020-10-09 Thread GitBox
jorisvandenbossche opened a new pull request #8413: URL: https://github.com/apache/arrow/pull/8413

[GitHub] [arrow] github-actions[bot] removed a comment on pull request #8412: ARROW-9952: [Python] Optionally use pyarrow.dataset in parquet.write_to_dataset

2020-10-09 Thread GitBox
github-actions[bot] removed a comment on pull request #8412: URL: https://github.com/apache/arrow/pull/8412#issuecomment-706178576 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Th

[GitHub] [arrow] HedgehogCode commented on pull request #8363: ARROW-10174: [Java] Fix reading/writing dict structs

2020-10-09 Thread GitBox
HedgehogCode commented on pull request #8363: URL: https://github.com/apache/arrow/pull/8363#issuecomment-706193122 I implemented the changes and force pushed them. Thank you for the advice with the `RangeEqualsVisitor`. Using it I discovered another bug in `DictionaryUtility`. The f

[GitHub] [arrow] bkietz edited a comment on pull request #8371: WIP: ARROW-4960: [R] Build r-arrow conda package in crossbow

2020-10-09 Thread GitBox
bkietz edited a comment on pull request #8371: URL: https://github.com/apache/arrow/pull/8371#issuecomment-705748359 unfortunately `cpp11::r_vector::const_iterator::operator*` is not marked const, but MSVC's impl of `std::copy` marks its [arguments const](https://github.com/microsoft/STL/b

[GitHub] [arrow] nevi-me commented on pull request #8388: ARROW-10225: [Rust] [Parquet] Fix null comparison in roundtrip

2020-10-09 Thread GitBox
nevi-me commented on pull request #8388: URL: https://github.com/apache/arrow/pull/8388#issuecomment-705636359 Merged

[GitHub] [arrow] alamb commented on a change in pull request #8340: ARROW-10165: [Rust] [DataFusion]: Remove special case DataFusion casting checks in favor of Arrow cast kernel

2020-10-09 Thread GitBox
alamb commented on a change in pull request #8340: URL: https://github.com/apache/arrow/pull/8340#discussion_r501797053 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -323,21 +322,19 @@ impl Expr { /// /// # Errors /// -/// This function errors w

[GitHub] [arrow] wesm edited a comment on pull request #8325: ARROW-10206: [C++][Python][FlightRPC] Allow disabling server validation

2020-10-09 Thread GitBox
wesm edited a comment on pull request #8325: URL: https://github.com/apache/arrow/pull/8325#issuecomment-705851235 There's some broken stuff for me locally with clang-8, I'm trying to fix ``` In file included from /home/wesm/code/arrow/cpp/src/arrow/flight/try_compile/check_tls_op
