[GitHub] [arrow] pitrou commented on a change in pull request #12231: ARROW-14783: [C++][Python] Fix the BytesIO support issue

2022-01-23 Thread GitBox
pitrou commented on a change in pull request #12231: URL: https://github.com/apache/arrow/pull/12231#discussion_r790245578 ## File path: python/pyarrow/orc.py ## @@ -22,6 +22,7 @@ from pyarrow.lib import Table import pyarrow._orc as _orc from pyarrow.fs import _resolve_files

[GitHub] [arrow] toppyy commented on pull request #12083: ARROW-14744: [R] open_dataset() error when `schema` argument supplied, but `column_names` not supplied to `CSVReadOptions`

2022-01-23 Thread GitBox
toppyy commented on pull request #12083: URL: https://github.com/apache/arrow/pull/12083#issuecomment-1019446733 I refactored/simplified the column_names vs. schema-names comparison a bit. While doing this, I realized that the solution did not solve the original issue. I was comparin

[GitHub] [arrow-rs] jhorstmann commented on pull request #1219: Do not concatenate identical dictionaries

2022-01-23 Thread GitBox
jhorstmann commented on pull request #1219: URL: https://github.com/apache/arrow-rs/pull/1219#issuecomment-1019447411 Looks good to me. I initially thought `ptr_eq` for dictionary values could get away with only comparing the data buffer, but in theory the same buffer could be combined wit

[GitHub] [arrow-rs] jhorstmann commented on issue #1218: Cast Dictionary Options

2022-01-23 Thread GitBox
jhorstmann commented on issue #1218: URL: https://github.com/apache/arrow-rs/issues/1218#issuecomment-1019448610 Slightly related to the `make_ordered` function from this draft PR: https://github.com/apache/arrow-rs/pull/1048/files Doing this in the `cast` kernels seems a bit more ge

[GitHub] [arrow-datafusion] alamb commented on pull request #1639: fix a cte block with same name for many times

2022-01-23 Thread GitBox
alamb commented on pull request #1639: URL: https://github.com/apache/arrow-datafusion/pull/1639#issuecomment-1019451797 Thanks @xudong963 , @Dandandan and @houqp ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

[GitHub] [arrow-datafusion] alamb merged pull request #1639: fix a cte block with same name for many times

2022-01-23 Thread GitBox
alamb merged pull request #1639: URL: https://github.com/apache/arrow-datafusion/pull/1639

[GitHub] [arrow-rs] alamb merged pull request #1220: Add non utf8 values into the test cases of BinaryArray comparison

2022-01-23 Thread GitBox
alamb merged pull request #1220: URL: https://github.com/apache/arrow-rs/pull/1220

[GitHub] [arrow-rs] jhorstmann opened a new pull request #1222: Remove memory-check feature

2022-01-23 Thread GitBox
jhorstmann opened a new pull request #1222: URL: https://github.com/apache/arrow-rs/pull/1222 # Which issue does this PR close? Closes #1171. # Rationale for this change The `memory-check` feature was supposed to help in finding memory leaks, but the correspond

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1222: Remove memory-check feature

2022-01-23 Thread GitBox
codecov-commenter commented on pull request #1222: URL: https://github.com/apache/arrow-rs/pull/1222#issuecomment-1019454677 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1222?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow] kszucs commented on pull request #12227: [Python][Packaging] Use vcpkg manifest and update vcpkg version

2022-01-23 Thread GitBox
kszucs commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019456330 @github-actions crossbow submit wheel-manylinux2014-cp38-amd64

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: [Python][Packaging] Use vcpkg manifest and update vcpkg version

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019456474 Revision: 11d069496e41bc883c857d535bbc5c3417a54f71 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1431](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-datafusion] Igosuki commented on pull request #1556: Officially maintained Arrow2 branch

2022-01-23 Thread GitBox
Igosuki commented on pull request #1556: URL: https://github.com/apache/arrow-datafusion/pull/1556#issuecomment-1019463521 > Query 1 iteration 0 took 27109.6 ms > Query 1 avg time: 27109.65 ms > ``` @andygrove What tool did you use to get such a smooth CPU chart ? -- Thi

[GitHub] [arrow-datafusion] Igosuki edited a comment on pull request #1556: Officially maintained Arrow2 branch

2022-01-23 Thread GitBox
Igosuki edited a comment on pull request #1556: URL: https://github.com/apache/arrow-datafusion/pull/1556#issuecomment-1019463521 > ![arrow2](https://user-images.githubusercontent.com/934084/149633991-a1615ac0-a1ca-46f0-9b76-f0abb6917d2c.png) @andygrove What tool did you use to get

[GitHub] [arrow] wzx140 commented on pull request #12229: ARROW-15414: [java] RangeEqualsVisitor does not work for BitVector

2022-01-23 Thread GitBox
wzx140 commented on pull request #12229: URL: https://github.com/apache/arrow/pull/12229#issuecomment-1019464165 @liyafan82 Could you please review it? Really thanks! This problem is very similar to ARROW-13981

[GitHub] [arrow] wzx140 edited a comment on pull request #12229: ARROW-15414: [java] RangeEqualsVisitor does not work for BitVector

2022-01-23 Thread GitBox
wzx140 edited a comment on pull request #12229: URL: https://github.com/apache/arrow/pull/12229#issuecomment-1019464165 @lidavidm Could you please review it? Really thanks! This problem is very similar to ARROW-13981

[GitHub] [arrow] ursabot edited a comment on pull request #11993: ARROW-15153: [Python] Expose ReferencedBufferSize to python

2022-01-23 Thread GitBox
ursabot edited a comment on pull request #11993: URL: https://github.com/apache/arrow/pull/11993#issuecomment-1018108378 Benchmark runs are scheduled for baseline = cba3ee6017d1cf621a8cd805aa1963fae7af9ad9 and contender = 658bec37aa5cbdd53b5e4cdc81b8ba3962e67f11. 658bec37aa5cbdd53b5e4cdc8

[GitHub] [arrow-rs] alamb opened a new pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-23 Thread GitBox
alamb opened a new pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223 # Which issue does this PR close? Closes https://github.com/apache/arrow-rs/issues/1009 Closes https://github.com/apache/arrow-rs/issues/1083 # Rationale for this change

[GitHub] [arrow] kou commented on a change in pull request #11889: ARROW-14708: [C++] Adding missing abseil dependencies to enable static flight build

2022-01-23 Thread GitBox
kou commented on a change in pull request #11889: URL: https://github.com/apache/arrow/pull/11889#discussion_r790269489 ## File path: cpp/cmake_modules/gRPCVariables.cmake ## @@ -0,0 +1,25 @@ +# Licensed to the Apache Software Foundation (ASF) under one Review comment:

[GitHub] [arrow-rs] alamb commented on a change in pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-23 Thread GitBox
alamb commented on a change in pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#discussion_r790269438 ## File path: arrow/src/array/array_binary.rs ## @@ -816,13 +827,80 @@ impl DecimalArray { let array_data = unsafe { builder.build_unchecked() };

[GitHub] [arrow-rs] alamb commented on issue #1009: Add creator from Iterator of `i128` to get the decimalarray

2022-01-23 Thread GitBox
alamb commented on issue #1009: URL: https://github.com/apache/arrow-rs/issues/1009#issuecomment-1019475331 I took a crack at this in https://github.com/apache/arrow-rs/pull/1223

[GitHub] [arrow-rs] alamb commented on issue #1083: Add into iter for decimal array

2022-01-23 Thread GitBox
alamb commented on issue #1083: URL: https://github.com/apache/arrow-rs/issues/1083#issuecomment-1019475412 I took a crack at this one in https://github.com/apache/arrow-rs/pull/1223

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-23 Thread GitBox
codecov-commenter commented on pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#issuecomment-1019476786 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1223?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow-datafusion] xudong963 opened a new pull request #1649: feat: implement exists subquery

2022-01-23 Thread GitBox
xudong963 opened a new pull request #1649: URL: https://github.com/apache/arrow-datafusion/pull/1649 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing change

[GitHub] [arrow-datafusion] xudong963 removed a comment on pull request #1373: Support subqueries

2022-01-23 Thread GitBox
xudong963 removed a comment on pull request #1373: URL: https://github.com/apache/arrow-datafusion/pull/1373#issuecomment-1019430328 Plz reopen the PR, I start to write the ticket 😄 @alamb

[GitHub] [arrow] kszucs commented on pull request #12232: [Python] Add option to skip gdb tests

2022-01-23 Thread GitBox
kszucs commented on pull request #12232: URL: https://github.com/apache/arrow/pull/12232#issuecomment-1019480308 @github-actions crossbow submit wheel-macos-big-sur-cp39-arm64

[GitHub] [arrow] github-actions[bot] commented on pull request #12232: [Python] Add option to skip gdb tests

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12232: URL: https://github.com/apache/arrow/pull/12232#issuecomment-1019480503 Revision: 959b6111acbba2ce9c70347252e64a26c88bc312 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1432](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] github-actions[bot] commented on pull request #12232: ARROW-15416: [Python] Add option to skip gdb tests

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12232: URL: https://github.com/apache/arrow/pull/12232#issuecomment-1019480614 https://issues.apache.org/jira/browse/ARROW-15416

[GitHub] [arrow] kszucs commented on pull request #12232: ARROW-15416: [Python] Add option to skip gdb tests

2022-01-23 Thread GitBox
kszucs commented on pull request #12232: URL: https://github.com/apache/arrow/pull/12232#issuecomment-1019482976 @github-actions crossbow submit wheel-macos-big-sur-cp39-arm64

[GitHub] [arrow] github-actions[bot] commented on pull request #12232: ARROW-15416: [Python] Add option to skip gdb tests

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12232: URL: https://github.com/apache/arrow/pull/12232#issuecomment-1019483232 Revision: a0674cf5a01a2c9410859ca907e54e71c1107f65 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1433](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-datafusion] longwusha commented on a change in pull request #1632: Update clap requirement from 2.33 to 3.0

2022-01-23 Thread GitBox
longwusha commented on a change in pull request #1632: URL: https://github.com/apache/arrow-datafusion/pull/1632#discussion_r790276052 ## File path: datafusion-cli/Cargo.toml ## @@ -27,7 +27,7 @@ repository = "https://github.com/apache/arrow-datafusion" rust-version = "1.58"

[GitHub] [arrow-rs] tustvold opened a new issue #1224: MutableArrayData Builds Null Mask Unnecessarily

2022-01-23 Thread GitBox
tustvold opened a new issue #1224: URL: https://github.com/apache/arrow-rs/issues/1224 **Describe the bug** `MutableArrayData` contains logic to skip computing a null mask if the source data contains no nulls. In such a scenario it skips pre-allocating space for the null buffer. How
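
The optimisation described in this issue can be illustrated with a small, library-independent sketch. It is not the arrow-rs implementation: `concat_with_validity` is a hypothetical helper, and plain Python lists stand in for Arrow buffers. The only point it shows is that a validity (null) mask is allocated and filled solely when at least one input actually contains a null.

```python
from typing import List, Optional, Sequence, Tuple


def concat_with_validity(
    chunks: Sequence[Sequence[Optional[int]]],
) -> Tuple[List[int], Optional[List[bool]]]:
    """Concatenate chunks, building a validity mask only when it is needed.

    Hypothetical helper for illustration; ``None`` marks a null slot.
    """
    # Decide up front whether any source contains nulls.
    needs_mask = any(v is None for chunk in chunks for v in chunk)

    values: List[int] = []
    # Skip allocating the mask entirely when no input has nulls.
    validity: Optional[List[bool]] = [] if needs_mask else None

    for chunk in chunks:
        for v in chunk:
            values.append(0 if v is None else v)  # placeholder value for nulls
            if validity is not None:
                validity.append(v is not None)
    return values, validity
```

When every input is null-free the function returns `None` for the mask, so callers can skip all null handling downstream.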

[GitHub] [arrow-rs] tustvold opened a new pull request #1225: Skip building null mask in MutableArrayData (#1224)

2022-01-23 Thread GitBox
tustvold opened a new pull request #1225: URL: https://github.com/apache/arrow-rs/pull/1225 # Which issue does this PR close? Closes #1224 # Rationale for this change See ticket # What changes are included in this PR? This changes MutableArrayData to skip

[GitHub] [arrow-datafusion] xudong963 opened a new pull request #1650: refine match pattern related code

2022-01-23 Thread GitBox
xudong963 opened a new pull request #1650: URL: https://github.com/apache/arrow-datafusion/pull/1650 # Which issue does this PR close? No # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes?

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1225: Skip building null mask in MutableArrayData (#1224)

2022-01-23 Thread GitBox
tustvold commented on a change in pull request #1225: URL: https://github.com/apache/arrow-rs/pull/1225#discussion_r790276434 ## File path: arrow/src/array/transform/mod.rs ## @@ -552,17 +552,19 @@ impl<'a> MutableArrayData<'a> { let extend_nulls = build_extend_nulls

[GitHub] [arrow-datafusion] xudong963 commented on pull request #1650: refine match pattern related code

2022-01-23 Thread GitBox
xudong963 commented on pull request #1650: URL: https://github.com/apache/arrow-datafusion/pull/1650#issuecomment-1019485657 While writing #1649, I found we can refine the match-pattern related code to make it clearer.

[GitHub] [arrow] xhochy commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-23 Thread GitBox
xhochy commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1019486224 @github-actions crossbow submit -g conda

[GitHub] [arrow-datafusion] yjshen commented on issue #1573: SQL tests for when sorting exceeded available memory and had to spill to disk

2022-01-23 Thread GitBox
yjshen commented on issue #1573: URL: https://github.com/apache/arrow-datafusion/issues/1573#issuecomment-1019486353 @alamb, I think this one can be closed. Solved in #1641.

[GitHub] [arrow] github-actions[bot] commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1019486444 Revision: 584f3f43878d514d59c6d47ece447c015da23d9c Submitted crossbow builds: [ursacomputing/crossbow @ actions-1434](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-datafusion] xudong963 commented on a change in pull request #1632: Update clap requirement from 2.33 to 3.0

2022-01-23 Thread GitBox
xudong963 commented on a change in pull request #1632: URL: https://github.com/apache/arrow-datafusion/pull/1632#discussion_r790277440 ## File path: datafusion-cli/Cargo.toml ## @@ -27,7 +27,7 @@ repository = "https://github.com/apache/arrow-datafusion" rust-version = "1.58"

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1622: Handle merging of evolved schemas in ParquetExec

2022-01-23 Thread GitBox
alamb commented on a change in pull request #1622: URL: https://github.com/apache/arrow-datafusion/pull/1622#discussion_r790277607 ## File path: datafusion/src/physical_plan/file_format/parquet.rs ## @@ -457,22 +518,313 @@ fn read_partition( #[cfg(test)] mod tests { -us

[GitHub] [arrow-datafusion] alamb merged pull request #1622: Handle merging of evolved schemas in ParquetExec

2022-01-23 Thread GitBox
alamb merged pull request #1622: URL: https://github.com/apache/arrow-datafusion/pull/1622

[GitHub] [arrow-datafusion] alamb closed issue #132: Add support for Parquet schema merging

2022-01-23 Thread GitBox
alamb closed issue #132: URL: https://github.com/apache/arrow-datafusion/issues/132

[GitHub] [arrow-datafusion] alamb commented on pull request #1622: Handle merging of evolved schemas in ParquetExec

2022-01-23 Thread GitBox
alamb commented on pull request #1622: URL: https://github.com/apache/arrow-datafusion/pull/1622#issuecomment-1019486887 Thanks @thinkharderdev !

[GitHub] [arrow-rs] tustvold commented on pull request #1225: Skip building null mask in MutableArrayData (#1224)

2022-01-23 Thread GitBox
tustvold commented on pull request #1225: URL: https://github.com/apache/arrow-rs/pull/1225#issuecomment-1019486970 Benchmarks ``` filter u8 time: [289.42 us 290.41 us 291.38 us] change: [-40.610% -40.339% -40.079%]

[GitHub] [arrow] kszucs closed pull request #12232: ARROW-15416: [Python] Add option to skip gdb tests

2022-01-23 Thread GitBox
kszucs closed pull request #12232: URL: https://github.com/apache/arrow/pull/12232

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1225: Skip building null mask in MutableArrayData (#1224)

2022-01-23 Thread GitBox
codecov-commenter commented on pull request #1225: URL: https://github.com/apache/arrow-rs/pull/1225#issuecomment-1019487289 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1225?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow-datafusion] alamb opened a new issue #1651: Panic/dropped data when reading parquet files with incompatible schemas

2022-01-23 Thread GitBox
alamb opened a new issue #1651: URL: https://github.com/apache/arrow-datafusion/issues/1651 @thinkharderdev added the ability to read from multiple parquet files that have different schemas in https://github.com/apache/arrow-datafusion/pull/1622 However, if some of the files have "in

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1622: Handle merging of evolved schemas in ParquetExec

2022-01-23 Thread GitBox
alamb commented on a change in pull request #1622: URL: https://github.com/apache/arrow-datafusion/pull/1622#discussion_r790277965 ## File path: datafusion/src/physical_plan/file_format/parquet.rs ## @@ -457,22 +518,313 @@ fn read_partition( #[cfg(test)] mod tests { -us

[GitHub] [arrow] github-actions[bot] removed a comment on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
github-actions[bot] removed a comment on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1018933030 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Coul

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019487690 https://issues.apache.org/jira/browse/ARROW-15417

[GitHub] [arrow] ursabot commented on pull request #12232: ARROW-15416: [Python] Add option to skip gdb tests

2022-01-23 Thread GitBox
ursabot commented on pull request #12232: URL: https://github.com/apache/arrow/pull/12232#issuecomment-1019488065 Benchmark runs are scheduled for baseline = 664a15896c8eb07916beb054b45db3e35b9810da and contender = 7f0867e616c23fe3e0ebaed8e8ef11be1b9f1dd9. 7f0867e616c23fe3e0ebaed8e8ef11be

[GitHub] [arrow] kszucs commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
kszucs commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019488283 @github-actions crossbow submit wheel-manylinux* wheel-macos*

[GitHub] [arrow-rs] alamb commented on a change in pull request #1214: Batch multiple records in ArrowWriter

2022-01-23 Thread GitBox
alamb commented on a change in pull request #1214: URL: https://github.com/apache/arrow-rs/pull/1214#discussion_r790278532 ## File path: parquet/src/arrow/arrow_writer.rs ## @@ -40,14 +43,23 @@ use crate::{ /// Arrow writer /// -/// Writes Arrow `RecordBatch`es to a Parquet

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019488588 Revision: ae130414bbc882646abd28d4e711692aae314af6 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1435](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-rs] alamb opened a new issue #1226: RecordBatch::slice does not take offset into account

2022-01-23 Thread GitBox
alamb opened a new issue #1226: URL: https://github.com/apache/arrow-rs/issues/1226 **Describe the bug** RecordBatch::slice() does not properly take `offset` into account Write **To Reproduce** Slice a `RecordBatch` that has an offset; the slice will be relative to 0 **
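
The behaviour the report expects can be sketched without any Arrow dependency. The `Batch` class below is a hypothetical stand-in, not the arrow-rs `RecordBatch`: it models a zero-copy view as shared storage plus an offset and a length, and shows that slicing a view must add the new offset to the existing one rather than restart from 0.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Batch:
    """Hypothetical zero-copy view: shared storage plus (offset, length)."""

    data: List[int]
    offset: int
    length: int

    def slice(self, offset: int, length: int) -> "Batch":
        if offset < 0 or length < 0 or offset + length > self.length:
            raise ValueError("slice out of bounds")
        # The new offset is relative to this view, so the offsets compose.
        return Batch(self.data, self.offset + offset, length)

    def to_list(self) -> List[int]:
        return self.data[self.offset : self.offset + self.length]
```

A slice of a slice then selects rows relative to the first slice: taking `slice(2, 6)` of ten rows and then `slice(1, 3)` of the result yields rows 3, 4 and 5 of the original storage. The bug described above would correspond to dropping the `self.offset +` term.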

[GitHub] [arrow-datafusion] Ted-Jiang commented on issue #1648: Cannot query parquet files generated by Apache Spark from datafusion-cli

2022-01-23 Thread GitBox
Ted-Jiang commented on issue #1648: URL: https://github.com/apache/arrow-datafusion/issues/1648#issuecomment-1019490399 @houqp plz assign this to me 😊.

[GitHub] [arrow-rs] alamb commented on a change in pull request #1214: Batch multiple records in ArrowWriter

2022-01-23 Thread GitBox
alamb commented on a change in pull request #1214: URL: https://github.com/apache/arrow-rs/pull/1214#discussion_r790278638 ## File path: parquet/src/arrow/arrow_writer.rs ## @@ -75,54 +87,109 @@ impl ArrowWriter { Ok(Self { writer: file_writer, +

[GitHub] [arrow] ursabot edited a comment on pull request #12232: ARROW-15416: [Python] Add option to skip gdb tests

2022-01-23 Thread GitBox
ursabot edited a comment on pull request #12232: URL: https://github.com/apache/arrow/pull/12232#issuecomment-1019488065 Benchmark runs are scheduled for baseline = 664a15896c8eb07916beb054b45db3e35b9810da and contender = 7f0867e616c23fe3e0ebaed8e8ef11be1b9f1dd9. 7f0867e616c23fe3e0ebaed8e

[GitHub] [arrow-rs] alamb commented on a change in pull request #1222: Remove memory-check feature

2022-01-23 Thread GitBox
alamb commented on a change in pull request #1222: URL: https://github.com/apache/arrow-rs/pull/1222#discussion_r790281175 ## File path: arrow/CONTRIBUTING.md ## @@ -26,19 +26,6 @@ Rust [README.md](../README.md). Please refer to [lib.rs](src/lib.rs) for an introduction to this

[GitHub] [arrow-datafusion] thinkharderdev commented on pull request #1622: Handle merging of evolved schemas in ParquetExec

2022-01-23 Thread GitBox
thinkharderdev commented on pull request #1622: URL: https://github.com/apache/arrow-datafusion/pull/1622#issuecomment-1019492564 > Thanks @thinkharderdev ! Thank you for your help!

[GitHub] [arrow-rs] jhorstmann commented on pull request #1225: Skip building null mask in MutableArrayData (#1224)

2022-01-23 Thread GitBox
jhorstmann commented on pull request #1225: URL: https://github.com/apache/arrow-rs/pull/1225#issuecomment-1019493554 Nice, I think this should also speed up the `concatenate_kernel` benchmarks.

[GitHub] [arrow-julia] kou commented on pull request #278: Create CONTRIBUTING.md

2022-01-23 Thread GitBox
kou commented on pull request #278: URL: https://github.com/apache/arrow-julia/pull/278#issuecomment-1019494122 We can use `JuliaRegistries/TagBot` action because it's added to allow list: See #273 and the linked Jira issue for details. We need a vote for each new release before we p

[GitHub] [arrow] kszucs commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
kszucs commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019494414 @github-actions crossbow submit wheel-macos-big-sur-cp39-universal2

[GitHub] [arrow-rs] alamb commented on a change in pull request #1219: Do not concatenate identical dictionaries

2022-01-23 Thread GitBox
alamb commented on a change in pull request #1219: URL: https://github.com/apache/arrow-rs/pull/1219#discussion_r790282646 ## File path: arrow/src/array/data.rs ## @@ -1155,6 +1155,41 @@ impl ArrayData { Ok(()) }) } + +/// Returns true if this `Ar

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019494627 Revision: 979825bc3c874bce1a2e82a9600e10dc99b0ecd6 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1436](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-rs] tustvold opened a new issue #1227: UnalignedBitChunkIterator

2022-01-23 Thread GitBox
tustvold opened a new issue #1227: URL: https://github.com/apache/arrow-rs/issues/1227 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** `BitChunks` and the associated `BitChunkIterator` allow iterating over a bitmask in u64 si
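
As background for this request, the general idea of walking a packed bitmask in 64-bit chunks can be sketched in a few lines. This is not the arrow-rs `BitChunks` API: `iter_u64_chunks` and `count_set_bits` are hypothetical names, the mask is assumed to use LSB-first bit numbering within little-endian bytes, and the big-integer shortcut is chosen for clarity rather than speed.

```python
from typing import Iterator


def iter_u64_chunks(buf: bytes, bit_offset: int, bit_len: int) -> Iterator[int]:
    """Yield the bits [bit_offset, bit_offset + bit_len) as 64-bit integers.

    Hypothetical helper; the final chunk is zero-padded in its high bits.
    """
    if bit_offset < 0 or bit_len < 0 or bit_offset + bit_len > 8 * len(buf):
        raise ValueError("bit range out of bounds")
    # Treat the buffer as one little-endian integer so bit i is (value >> i) & 1.
    value = int.from_bytes(buf, "little")
    bits = (value >> bit_offset) & ((1 << bit_len) - 1)
    mask64 = (1 << 64) - 1
    for start in range(0, bit_len, 64):
        yield (bits >> start) & mask64


def count_set_bits(buf: bytes, bit_offset: int, bit_len: int) -> int:
    """Count set bits by summing the popcount of each 64-bit chunk."""
    return sum(bin(chunk).count("1") for chunk in iter_u64_chunks(buf, bit_offset, bit_len))
```

Processing whole 64-bit words lets a consumer test or count many bits per step instead of inspecting them one at a time, which is the kind of access pattern filter-style kernels benefit from.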

[GitHub] [arrow-rs] alamb commented on a change in pull request #1219: Do not concatenate identical dictionaries

2022-01-23 Thread GitBox
alamb commented on a change in pull request #1219: URL: https://github.com/apache/arrow-rs/pull/1219#discussion_r790283770 ## File path: arrow/src/compute/kernels/concat.rs ## @@ -525,4 +525,44 @@ mod tests { Ok(()) } + +#[test] +fn test_dictionary_conca

[GitHub] [arrow-rs] alamb commented on issue #504: Do not copy dictionary values when they are the same in `concat`

2022-01-23 Thread GitBox
alamb commented on issue #504: URL: https://github.com/apache/arrow-rs/issues/504#issuecomment-1019496497 FYI @tustvold has a proposed implementation in https://github.com/apache/arrow-rs/pull/1219

[GitHub] [arrow] bkmgit commented on a change in pull request #11882: ARROW-9843: [C++][Python] Implement Between ternary kernel and Python bindings

2022-01-23 Thread GitBox
bkmgit commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r790283940 ## File path: cpp/src/arrow/compute/kernels/scalar_compare.cc ## @@ -746,6 +808,279 @@ std::shared_ptr MakeScalarMinMax(std::string name, return func;

[GitHub] [arrow] kszucs commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
kszucs commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019497281 @github-actions crossbow submit wheel-macos-big-sur-cp39-universal2

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019497500 Revision: 29936c6682b65ffd8e754a74cddced57739f9ec6 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1437](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1219: Do not concatenate identical dictionaries

2022-01-23 Thread GitBox
tustvold commented on a change in pull request #1219: URL: https://github.com/apache/arrow-rs/pull/1219#discussion_r790284597 ## File path: arrow/src/array/data.rs ## @@ -1155,6 +1155,41 @@ impl ArrayData { Ok(()) }) } + +/// Returns true if this

[GitHub] [arrow-rs] tustvold opened a new pull request #1228: Unaligned bit chunk

2022-01-23 Thread GitBox
tustvold opened a new pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228 **Draft as builds on #1225** # Which issue does this PR close? Closes #1227. # Rationale for this change This improves the filter benchmarks by a factor of 2x, and likely

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1228: Unaligned bit chunk

2022-01-23 Thread GitBox
tustvold commented on a change in pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#discussion_r790285853 ## File path: arrow/src/compute/kernels/filter.rs ## @@ -17,184 +17,117 @@ //! Defines miscellaneous array kernels. +use crate::array::*; use crat

[GitHub] [arrow-rs] jhorstmann commented on pull request #1221: Remove explicit simd arithmetic kernels except for division/modulo

2022-01-23 Thread GitBox
jhorstmann commented on pull request #1221: URL: https://github.com/apache/arrow-rs/pull/1221#issuecomment-1019499436 Hi @alamb, the [benchmark results and my analysis are in the issue](https://github.com/apache/arrow-rs/issues/1182#issuecomment-1013669825). For the removed kernels, the au

[GitHub] [arrow] xhochy commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-23 Thread GitBox
xhochy commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1019499508 @github-actions crossbow submit conda-linux-gcc-py38-cpu -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [arrow] github-actions[bot] commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1019499632 Revision: c0ef3c5c0031483b8d3ca9c60f01c539683b6211 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1438](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1228: Unaligned bit chunk

2022-01-23 Thread GitBox
tustvold commented on a change in pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#discussion_r790286234 ## File path: parquet/src/arrow/record_reader/definition_levels.rs ## @@ -353,34 +353,27 @@ impl PackedDecoder { /// Counts the number of set bits in

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1228: Unaligned bit chunk

2022-01-23 Thread GitBox
codecov-commenter commented on pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#issuecomment-1019500860 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1228?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow] kszucs commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
kszucs commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019501287 @github-actions crossbow submit wheel-macos-big-sur-cp39-universal2

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019501471 Revision: 2cac9512aab4fcd0d4be3c8e08785bc667a6c316 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1439](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] kszucs commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
kszucs commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019503425 @github-actions crossbow submit wheel-macos-big-sur-cp39-universal2

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019503668 Revision: 9a94a76139279e5c1379fc998b39397badc0fb6c Submitted crossbow builds: [ursacomputing/crossbow @ actions-1440](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] kszucs commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
kszucs commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019504120 @github-actions crossbow submit wheel-macos-big-sur-cp39-universal2

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; update vcpkg version; use bundled AWS SDK

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019504307 Revision: 40ff446eecc9acffdeeeaf72499e5339020b02f5 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1441](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] iajoiner commented on a change in pull request #12231: ARROW-14783: [C++][Python] Fix the BytesIO support issue

2022-01-23 Thread GitBox
iajoiner commented on a change in pull request #12231: URL: https://github.com/apache/arrow/pull/12231#discussion_r790289895 ## File path: cpp/src/arrow/adapters/orc/adapter.cc ## @@ -720,11 +720,7 @@ class ArrowOutputStream : public liborc::OutputStream { return filename;

[GitHub] [arrow] xhochy commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-23 Thread GitBox
xhochy commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1019506498 @github-actions crossbow submit conda-linux-gcc-py38-cpu

[GitHub] [arrow] github-actions[bot] commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1019506775 Revision: 7f3f56722864074e45754c6217b75b1c9e1ca178 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1442](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] ursabot edited a comment on pull request #12232: ARROW-15416: [Python] Add option to skip gdb tests

2022-01-23 Thread GitBox
ursabot edited a comment on pull request #12232: URL: https://github.com/apache/arrow/pull/12232#issuecomment-1019488065 Benchmark runs are scheduled for baseline = 664a15896c8eb07916beb054b45db3e35b9810da and contender = 7f0867e616c23fe3e0ebaed8e8ef11be1b9f1dd9. 7f0867e616c23fe3e0ebaed8e

[GitHub] [arrow-rs] tustvold commented on pull request #1225: Skip building null mask in MutableArrayData (#1224)

2022-01-23 Thread GitBox
tustvold commented on pull request #1225: URL: https://github.com/apache/arrow-rs/pull/1225#issuecomment-1019509705 I just need to double-check how this interacts with ExtendNulls

[GitHub] [arrow-rs] tustvold opened a new issue #1229: Add MutableArrayData::extend_ranges

2022-01-23 Thread GitBox
tustvold opened a new issue #1229: URL: https://github.com/apache/arrow-rs/issues/1229 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** `MutableArrayData` is created with one or more `ArrayData` and can be used to copy across

[GitHub] [arrow] ursabot edited a comment on pull request #12214: ARROW-15376: [Go][Release] cpu_arm64 needs +build comment

2022-01-23 Thread GitBox
ursabot edited a comment on pull request #12214: URL: https://github.com/apache/arrow/pull/12214#issuecomment-1018269017 Benchmark runs are scheduled for baseline = 658bec37aa5cbdd53b5e4cdc81b8ba3962e67f11 and contender = a2fe24fe970bce082782225c16ff7ac45989884b. a2fe24fe970bce082782225c1

[GitHub] [arrow] kszucs commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; use bundled AWS SDK

2022-01-23 Thread GitBox
kszucs commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019510250

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; use bundled AWS SDK

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019510471

[GitHub] [arrow] kszucs commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; use bundled AWS SDK

2022-01-23 Thread GitBox
kszucs commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019511761 @github-actions crossbow submit wheel-manylinux* wheel-macos*

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install dependencies; use bundled AWS SDK

2022-01-23 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019511972 Revision: 9a94a76139279e5c1379fc998b39397badc0fb6c Submitted crossbow builds: [ursacomputing/crossbow @ actions-1444](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-rs] tustvold commented on issue #1229: Add MutableArrayData::extend_ranges

2022-01-23 Thread GitBox
tustvold commented on issue #1229: URL: https://github.com/apache/arrow-rs/issues/1229#issuecomment-1019516203 Linking https://github.com/apache/arrow-datafusion/issues/416 and https://github.com/apache/arrow-datafusion/issues/1572 as at least historically `MutableArrayData` was one of the
