[GitHub] [arrow-rs] viirya commented on pull request #1736: Fix projection in IPC reader

2022-05-24 Thread GitBox
viirya commented on PR #1736: URL: https://github.com/apache/arrow-rs/pull/1736#issuecomment-1136802521 Thank you @iyupeng -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow-datafusion] waynexia commented on pull request #2613: Support CREATE OR REPLACE TABLE

2022-05-24 Thread GitBox
waynexia commented on PR #2613: URL: https://github.com/apache/arrow-datafusion/pull/2613#issuecomment-1136800236 >if 'if not exists' and 'or replace' are both specified, treat as 'or replace' I personally feel weird about `IF NOT EXISTS` coexisting with `OR REPLACE`. They are asking fo

[GitHub] [arrow-rs] iyupeng commented on pull request #1736: Fix projection in IPC reader

2022-05-24 Thread GitBox
iyupeng commented on PR #1736: URL: https://github.com/apache/arrow-rs/pull/1736#issuecomment-1136785856 Hi @viirya, I can add more commits to improve the test on projection. Please wait for some time before merging.

[GitHub] [arrow-datafusion] waynexia commented on a diff in pull request #2612: [MINOR] remove datafusion-cli's ballista feature

2022-05-24 Thread GitBox
waynexia commented on code in PR #2612: URL: https://github.com/apache/arrow-datafusion/pull/2612#discussion_r881260305 ## docs/source/user-guide/cli.md: ## @@ -19,8 +19,7 @@ # DataFusion Command-line Interface -The DataFusion CLI allows SQL queries to be executed by an in-

[GitHub] [arrow] ursabot commented on pull request #13221: ARROW-16638: [Go][Parquet] Fix boolean column skip

2022-05-24 Thread GitBox
ursabot commented on PR #13221: URL: https://github.com/apache/arrow/pull/13221#issuecomment-1136779497 Benchmark runs are scheduled for baseline = 7adda73dc4ef6f28f8e5701eb4be1dc9526c0e1b and contender = 5994fd88259b8a6ee0efef7233d75352b7360188. 5994fd88259b8a6ee0efef7233d75352b7360188 is

[GitHub] [arrow-cookbook] thisisnic closed issue #190: [Java] Clarify build requirements

2022-05-24 Thread GitBox
thisisnic closed issue #190: [Java] Clarify build requirements URL: https://github.com/apache/arrow-cookbook/issues/190

[GitHub] [arrow-cookbook] toddfarmer opened a new issue, #190: [Java] Clarify build requirements

2022-05-24 Thread GitBox
toddfarmer opened a new issue, #190: URL: https://github.com/apache/arrow-cookbook/issues/190 The [Java CONTRIBUTING.rst page](https://github.com/apache/arrow-cookbook/blob/main/java/CONTRIBUTING.rst) makes no mention of a Python dependency, but the `make java` command can fail due to pyth

[GitHub] [arrow-datafusion] tustvold commented on issue #2581: Introduce ProjectionMask To Allow Nested Projection Pushdown

2022-05-24 Thread GitBox
tustvold commented on issue #2581: URL: https://github.com/apache/arrow-datafusion/issues/2581#issuecomment-1136757045 At least in the case of arrow and parquet, list element pushdown is more of a filter than a cheap projection - it requires rewriting the buffers. Perhaps we could do

[GitHub] [arrow-rs] viirya commented on a diff in pull request #1736: Fix projection in IPC reader

2022-05-24 Thread GitBox
viirya commented on code in PR #1736: URL: https://github.com/apache/arrow-rs/pull/1736#discussion_r881239079 ## arrow/src/ipc/reader.rs: ## @@ -275,6 +275,120 @@ fn create_array( Ok((array, node_index, buffer_index)) } +/// Skip fields based on data types to advance `no

[GitHub] [arrow] kou commented on pull request #13184: ARROW-16602: [Dev] Use GitHub API to merge pull request

2022-05-24 Thread GitBox
kou commented on PR #13184: URL: https://github.com/apache/arrow/pull/13184#issuecomment-1136746053 We just need the `public_repo` scope.

[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1738: Support casting `Utf8` to `Boolean`

2022-05-24 Thread GitBox
HaoYang670 commented on code in PR #1738: URL: https://github.com/apache/arrow-rs/pull/1738#discussion_r881229487 ## arrow/src/compute/kernels/cast.rs: ## @@ -1661,6 +1660,34 @@ fn cast_string_to_timestamp_ns( Ok(Arc::new(array) as ArrayRef) } +/// Casts Utf8 to Boolean

[GitHub] [arrow-datafusion] kesavkolla commented on issue #2581: Introduce ProjectionMask To Allow Nested Projection Pushdown

2022-05-24 Thread GitBox
kesavkolla commented on issue #2581: URL: https://github.com/apache/arrow-datafusion/issues/2581#issuecomment-1136730678 My question is: will the projection go to nested levels? E.g., employee.departments[*].name, employee.departments[0].name

[GitHub] [arrow-datafusion] AssHero opened a new pull request, #2613: Support CREATE OR REPLACE TABLE

2022-05-24 Thread GitBox
AssHero opened a new pull request, #2613: URL: https://github.com/apache/arrow-datafusion/pull/2613 # Which issue does this PR close? Closes #2605 # Rationale for this change Replace the table content if the table already exists ❯ create table xx as values(1,2),(3,4); +--

[GitHub] [arrow-datafusion] Ted-Jiang opened a new pull request, #2612: [MINOR] remove datafusion-cli's ballista feature

2022-05-24 Thread GitBox
Ted-Jiang opened a new pull request, #2612: URL: https://github.com/apache/arrow-datafusion/pull/2612 # Which issue does this PR close? We currently do not support connecting to Ballista using datafusion-cli ``` datafusion-cli % cargo build --features ballista error: none of the selected package

[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #2591: Support for non equality predicates in `ON` clause of `LEFT`, `RIGHT, `and `FULL` joins

2022-05-24 Thread GitBox
Ted-Jiang commented on code in PR #2591: URL: https://github.com/apache/arrow-datafusion/pull/2591#discussion_r881201496 ## datafusion/core/src/physical_plan/hash_join.rs: ## @@ -791,6 +826,109 @@ fn build_join_indexes( } } +fn apply_join_filter( +left: &RecordBatch,

[GitHub] [arrow] zeroshade commented on pull request #13221: ARROW-16638: [Go][Parquet] Fix boolean column skip

2022-05-24 Thread GitBox
zeroshade commented on PR #13221: URL: https://github.com/apache/arrow/pull/13221#issuecomment-1136697951 Thanks @mdepero!

[GitHub] [arrow] zeroshade closed pull request #13221: ARROW-16638: [Go][Parquet] Fix boolean column skip

2022-05-24 Thread GitBox
zeroshade closed pull request #13221: ARROW-16638: [Go][Parquet] Fix boolean column skip URL: https://github.com/apache/arrow/pull/13221

[GitHub] [arrow] zeroshade commented on a diff in pull request #13221: ARROW-16638: [Go][Parquet] Fix boolean column skip

2022-05-24 Thread GitBox
zeroshade commented on code in PR #13221: URL: https://github.com/apache/arrow/pull/13221#discussion_r881194207 ## go/parquet/file/column_reader_types.gen.go.tmpl: ## @@ -36,11 +36,13 @@ func (cr *{{.Name}}ColumnChunkReader) Skip(nvalues int64) (int64, error) { vals, _,

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1736: Fix projection in IPC reader

2022-05-24 Thread GitBox
codecov-commenter commented on PR #1736: URL: https://github.com/apache/arrow-rs/pull/1736#issuecomment-1136694183 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1736?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-rs] iyupeng commented on pull request #1736: Fix projection in IPC reader

2022-05-24 Thread GitBox
iyupeng commented on PR #1736: URL: https://github.com/apache/arrow-rs/pull/1736#issuecomment-1136684581 Hi @viirya , thanks for your comment. A new test is added to IPC reader in my last commit.

[GitHub] [arrow-rs] MazterQyou commented on a diff in pull request #1738: Support casting `Utf8` to `Boolean`

2022-05-24 Thread GitBox
MazterQyou commented on code in PR #1738: URL: https://github.com/apache/arrow-rs/pull/1738#discussion_r881181334 ## arrow/src/compute/kernels/cast.rs: ## @@ -280,6 +280,8 @@ pub fn can_cast_types(from_type: &DataType, to_type: &DataType) -> bool { /// /// Behavior: /// * Bo

[GitHub] [arrow-rs] HaoYang670 commented on pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

2022-05-24 Thread GitBox
HaoYang670 commented on PR #1739: URL: https://github.com/apache/arrow-rs/pull/1739#issuecomment-1136611225 cc @alamb @tustvold @viirya Please help to review. Thank you!

[GitHub] [arrow] mdepero commented on a diff in pull request #13221: ARROW-16638: [Go][Parquet] Fix boolean column skip

2022-05-24 Thread GitBox
mdepero commented on code in PR #13221: URL: https://github.com/apache/arrow/pull/13221#discussion_r881114277 ## go/parquet/file/column_reader_types.gen.go: ## @@ -207,10 +207,11 @@ type BooleanColumnChunkReader struct { func (cr *BooleanColumnChunkReader) Skip(nvalues int64) (

[GitHub] [arrow] zeroshade commented on a diff in pull request #13221: ARROW-16638: [Go][Parquet] Fix boolean column skip

2022-05-24 Thread GitBox
zeroshade commented on code in PR #13221: URL: https://github.com/apache/arrow/pull/13221#discussion_r881107743 ## go/parquet/file/column_reader_types.gen.go: ## @@ -207,10 +207,11 @@ type BooleanColumnChunkReader struct { func (cr *BooleanColumnChunkReader) Skip(nvalues int64)

[GitHub] [arrow-rs] MazterQyou commented on a diff in pull request #1738: Support casting `Utf8` to `Boolean`

2022-05-24 Thread GitBox
MazterQyou commented on code in PR #1738: URL: https://github.com/apache/arrow-rs/pull/1738#discussion_r881091531 ## arrow/src/compute/kernels/cast.rs: ## @@ -280,6 +280,8 @@ pub fn can_cast_types(from_type: &DataType, to_type: &DataType) -> bool { /// /// Behavior: /// * Bo

[GitHub] [arrow] vibhatha commented on pull request #13078: ARROW-15590: [C++] Add support for joins to the Substrait consumer

2022-05-24 Thread GitBox
vibhatha commented on PR #13078: URL: https://github.com/apache/arrow/pull/13078#issuecomment-1136560779 @westonpace These two JIRAs [1] [2] will be helpful in finalizing the overall tasks associated with Substrait-Joins [1] https://issues.apache.org/jira/browse/ARROW-16485 [2] htt

[GitHub] [arrow] vibhatha commented on a diff in pull request #13078: ARROW-15590: [C++] Add support for joins to the Substrait consumer

2022-05-24 Thread GitBox
vibhatha commented on code in PR #13078: URL: https://github.com/apache/arrow/pull/13078#discussion_r881078028 ## cpp/src/arrow/engine/substrait/relation_internal.cc: ## @@ -225,6 +225,76 @@ Result FromProto(const substrait::Rel& rel, }); } +case substrait::Re

[GitHub] [arrow] vibhatha commented on pull request #13205: ARROW-16515: [C++] Adding a Close method to RecordBatchReader

2022-05-24 Thread GitBox
vibhatha commented on PR #13205: URL: https://github.com/apache/arrow/pull/13205#issuecomment-1136558312 @westonpace updated the PR.

[GitHub] [arrow-ballista] andygrove commented on a diff in pull request #41: MINOR: Improve developer docs

2022-05-24 Thread GitBox
andygrove commented on code in PR #41: URL: https://github.com/apache/arrow-ballista/pull/41#discussion_r881063740 ## docs/developer/architecture.md: ## @@ -36,24 +35,15 @@ stage cannot start until its child query stages have completed. Each query stage has one or more partiti

[GitHub] [arrow] github-actions[bot] commented on pull request #13228: ARROW-14790: [GLib] Fix a memory leak on creating GArrowDatum

2022-05-24 Thread GitBox
github-actions[bot] commented on PR #13228: URL: https://github.com/apache/arrow/pull/13228#issuecomment-1136535964 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #13228: ARROW-14790: [GLib] Fix a memory leak on creating GArrowDatum

2022-05-24 Thread GitBox
github-actions[bot] commented on PR #13228: URL: https://github.com/apache/arrow/pull/13228#issuecomment-1136535948 https://issues.apache.org/jira/browse/ARROW-14790

[GitHub] [arrow] vibhatha commented on pull request #13214: ARROW-15635: [C++] Support nested extension-id-registry

2022-05-24 Thread GitBox
vibhatha commented on PR #13214: URL: https://github.com/apache/arrow/pull/13214#issuecomment-1136535248 @rtpsw Is the PR associated with the correct JIRA?

[GitHub] [arrow-datafusion] andygrove closed pull request #2601: WIP: Move most logical plan optimizer rules to new `datafusion-optimizer` crate

2022-05-24 Thread GitBox
andygrove closed pull request #2601: WIP: Move most logical plan optimizer rules to new `datafusion-optimizer` crate URL: https://github.com/apache/arrow-datafusion/pull/2601

[GitHub] [arrow-datafusion] andygrove closed pull request #2602: WIP: Remove `ExecutionProps` from `OptimizerRule` trait

2022-05-24 Thread GitBox
andygrove closed pull request #2602: WIP: Remove `ExecutionProps` from `OptimizerRule` trait URL: https://github.com/apache/arrow-datafusion/pull/2602

[GitHub] [arrow-ballista] andygrove commented on pull request #41: MINOR: Improve developer docs

2022-05-24 Thread GitBox
andygrove commented on PR #41: URL: https://github.com/apache/arrow-ballista/pull/41#issuecomment-1136533963 @thinkharderdev @yahoNanJing I am planning on improving the documentation to try and make it easier for new contributors to get up to speed. This is just a small start but I would ap

[GitHub] [arrow-ballista] andygrove opened a new pull request, #41: MINOR: Improve developer docs

2022-05-24 Thread GitBox
andygrove opened a new pull request, #41: URL: https://github.com/apache/arrow-ballista/pull/41 # Which issue does this PR close? N/A # Rationale for this change This is a first pass of the developer docs to make them easier to find and to provide a better i

[GitHub] [arrow] wjones127 commented on a diff in pull request #13206: ARROW-15906: [C++][Python][R] By default, don't create or delete S3 buckets

2022-05-24 Thread GitBox
wjones127 commented on code in PR #13206: URL: https://github.com/apache/arrow/pull/13206#discussion_r881037233 ## cpp/src/arrow/filesystem/s3fs.h: ## @@ -130,6 +130,14 @@ struct ARROW_EXPORT S3Options { /// Whether OutputStream writes will be issued in the background, withou

[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1725: support `min_max_binary`

2022-05-24 Thread GitBox
HaoYang670 commented on code in PR #1725: URL: https://github.com/apache/arrow-rs/pull/1725#discussion_r881036282 ## arrow/src/compute/kernels/aggregate.rs: ## @@ -885,11 +893,45 @@ mod tests { assert!(max(&a).unwrap().is_nan()); } +#[test] +fn test_binar

[GitHub] [arrow-cookbook] davisusanibar commented on a diff in pull request #215: [Java] Adding examples about Dictionary-encoded Layout

2022-05-24 Thread GitBox
davisusanibar commented on code in PR #215: URL: https://github.com/apache/arrow-cookbook/pull/215#discussion_r881035299 ## java/source/io.rst: ## @@ -443,3 +443,109 @@ Reading Parquet File Please check :doc:`Dataset <./dataset>` + +Handling Data with Di

[GitHub] [arrow] wjones127 commented on pull request #13206: ARROW-15906: [C++][Python][R] By default, don't create or delete S3 buckets

2022-05-24 Thread GitBox
wjones127 commented on PR #13206: URL: https://github.com/apache/arrow/pull/13206#issuecomment-1136497947 > Could you send an e-mail to dev@ to confirm whether this default behavior change is acceptable? I can't decide this because I haven't used Apache Arrow to access S3 yet. Sure,

[GitHub] [arrow] glin commented on issue #13211: [R] Can't binary install arrow 8.0.0 from RStudio Public Package Maneger

2022-05-24 Thread GitBox
glin commented on issue #13211: URL: https://github.com/apache/arrow/issues/13211#issuecomment-1136496017 arrow 8.0.0 binaries should now be available for both focal and jammy, and thanks for reporting this. @nealrichardson Ooh, good to know about `LIBARROW_BINARY`, we might add that

[GitHub] [arrow-rs] ahmedriza commented on issue #1744: Parquet write failure (from record batches) when data is nested two levels deep

2022-05-24 Thread GitBox
ahmedriza commented on issue #1744: URL: https://github.com/apache/arrow-rs/issues/1744#issuecomment-1136494429 Cool @tustvold. I do recall the reader side error as well before version 14. Thanks a lot.

[GitHub] [arrow-rs] tustvold commented on issue #1744: Parquet write failure (from record batches) when data is nested two levels deep

2022-05-24 Thread GitBox
tustvold commented on issue #1744: URL: https://github.com/apache/arrow-rs/issues/1744#issuecomment-1136492215 This looks very similar to https://github.com/apache/arrow-rs/issues/1651 which fixed the read side, there is likely a similar issue on the write side. Thank you for the report, I'

[GitHub] [arrow-rs] ahmedriza commented on issue #1715: Why `Parquet` is a part of `Arrow`?

2022-05-24 Thread GitBox
ahmedriza commented on issue #1715: URL: https://github.com/apache/arrow-rs/issues/1715#issuecomment-1136487513 We've been using `arrow-rs` in place of a legacy Scala code base that uses `parquet-mr` and found that the `arrow-rs` is catching up quite nicely now.

[GitHub] [arrow-rs] ahmedriza opened a new issue, #1744: Parquet write failure when data is nested two levels deep

2022-05-24 Thread GitBox
ahmedriza opened a new issue, #1744: URL: https://github.com/apache/arrow-rs/issues/1744 **Describe the bug** Let me introduce the Schema of the data in an easily readable format (the Apache Spark pretty print format): ``` root |-- id: string (nullable = true) |-- prices: ar

[GitHub] [arrow-rs] viirya commented on a diff in pull request #1725: support `min_max_binary`

2022-05-24 Thread GitBox
viirya commented on code in PR #1725: URL: https://github.com/apache/arrow-rs/pull/1725#discussion_r880978957 ## arrow/src/compute/kernels/aggregate.rs: ## @@ -885,11 +893,45 @@ mod tests { assert!(max(&a).unwrap().is_nan()); } +#[test] +fn test_binary_mi

[GitHub] [arrow-rs] nevi-me commented on a diff in pull request #1738: Support casting `Utf8` to `Boolean`

2022-05-24 Thread GitBox
nevi-me commented on code in PR #1738: URL: https://github.com/apache/arrow-rs/pull/1738#discussion_r880971793 ## arrow/src/compute/kernels/cast.rs: ## @@ -280,6 +280,8 @@ pub fn can_cast_types(from_type: &DataType, to_type: &DataType) -> bool { /// /// Behavior: /// * Boole

[GitHub] [arrow-rs] viirya commented on pull request #1742: Improve integration test document to follow Arrow C++ repo CI

2022-05-24 Thread GitBox
viirya commented on PR #1742: URL: https://github.com/apache/arrow-rs/pull/1742#issuecomment-1136455196 cc @alamb

[GitHub] [arrow] ursabot commented on pull request #13225: ARROW-16643: [C++] Fix warnings for clang-14

2022-05-24 Thread GitBox
ursabot commented on PR #13225: URL: https://github.com/apache/arrow/pull/13225#issuecomment-1136452065 ['Python', 'R'] benchmarks have high level of regressions. [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/bbade04a7e4f4aa09f1bc11a54e63b95...9260031769da4ea6b2fbd9ee5efdd39d/)

[GitHub] [arrow] ursabot commented on pull request #13225: ARROW-16643: [C++] Fix warnings for clang-14

2022-05-24 Thread GitBox
ursabot commented on PR #13225: URL: https://github.com/apache/arrow/pull/13225#issuecomment-1136451944 Benchmark runs are scheduled for baseline = dd84c0f68c6f898e3a02bb0623500e3f165f80d4 and contender = 7adda73dc4ef6f28f8e5701eb4be1dc9526c0e1b. 7adda73dc4ef6f28f8e5701eb4be1dc9526c0e1b is

[GitHub] [arrow-datafusion] andygrove merged pull request #2611: Unpin Ballista following FileFormat Breaking Change

2022-05-24 Thread GitBox
andygrove merged PR #2611: URL: https://github.com/apache/arrow-datafusion/pull/2611

[GitHub] [arrow] kou commented on a diff in pull request #13206: ARROW-15906: [C++][Python][R] By default, don't create or delete S3 buckets

2022-05-24 Thread GitBox
kou commented on code in PR #13206: URL: https://github.com/apache/arrow/pull/13206#discussion_r880951359 ## cpp/src/arrow/filesystem/s3fs.h: ## @@ -130,6 +130,14 @@ struct ARROW_EXPORT S3Options { /// Whether OutputStream writes will be issued in the background, without blo

[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #215: [Java] Adding examples about Dictionary-encoded Layout

2022-05-24 Thread GitBox
lidavidm commented on code in PR #215: URL: https://github.com/apache/arrow-cookbook/pull/215#discussion_r880945093 ## java/source/io.rst: ## @@ -443,3 +443,109 @@ Reading Parquet File Please check :doc:`Dataset <./dataset>` + +Handling Data with Diction

[GitHub] [arrow-rs] viirya merged pull request #1743: Pin nightly version to bypass packed_simd build error

2022-05-24 Thread GitBox
viirya merged PR #1743: URL: https://github.com/apache/arrow-rs/pull/1743

[GitHub] [arrow-rs] viirya closed issue #1734: Latest nightly fails to build with feature simd

2022-05-24 Thread GitBox
viirya closed issue #1734: Latest nightly fails to build with feature simd URL: https://github.com/apache/arrow-rs/issues/1734

[GitHub] [arrow-rs] viirya commented on pull request #1743: Pin nightly version to bypass packed_simd build error

2022-05-24 Thread GitBox
viirya commented on PR #1743: URL: https://github.com/apache/arrow-rs/pull/1743#issuecomment-1136423514 Merging for unblocking other PRs. Thanks @sunchao

[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #215: [Java] Adding examples about Dictionary-encoded Layout

2022-05-24 Thread GitBox
lidavidm commented on code in PR #215: URL: https://github.com/apache/arrow-cookbook/pull/215#discussion_r880942640 ## java/source/io.rst: ## @@ -443,3 +443,109 @@ Reading Parquet File Please check :doc:`Dataset <./dataset>` + +Handling Data with Diction

[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #215: [Java] Adding examples about Dictionary-encoded Layout

2022-05-24 Thread GitBox
lidavidm commented on code in PR #215: URL: https://github.com/apache/arrow-cookbook/pull/215#discussion_r880941854 ## java/source/io.rst: ## @@ -443,3 +443,109 @@ Reading Parquet File Please check :doc:`Dataset <./dataset>` + +Handling Data with Diction

[GitHub] [arrow] github-actions[bot] commented on pull request #13227: ARROW-16553: [Java] Initial changes to configure GitHub nightly builds Jar as a repository to use from Maven/Gradle

2022-05-24 Thread GitBox
github-actions[bot] commented on PR #13227: URL: https://github.com/apache/arrow/pull/13227#issuecomment-1136416792 Revision: 1c9392265e60f4519f3e6e6564b598aed97d9d3e Submitted crossbow builds: [ursacomputing/crossbow @ actions-7a102b9b5e](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] davisusanibar commented on pull request #13227: ARROW-16553: [Java] Initial changes to configure GitHub nightly builds Jar as a repository to use from Maven/Gradle

2022-05-24 Thread GitBox
davisusanibar commented on PR #13227: URL: https://github.com/apache/arrow/pull/13227#issuecomment-1136415922 @github-actions crossbow submit java-jars

[GitHub] [arrow-datafusion] tustvold opened a new pull request, #2611: Unpin Ballista following FileFormat Breaking Change

2022-05-24 Thread GitBox
tustvold opened a new pull request, #2611: URL: https://github.com/apache/arrow-datafusion/pull/2611 https://github.com/apache/arrow-ballista/pull/40 has been merged

[GitHub] [arrow] jonkeane commented on a diff in pull request #13149: ARROW-16403:[R][CI] Create Crossbow task for R nightly builds

2022-05-24 Thread GitBox
jonkeane commented on code in PR #13149: URL: https://github.com/apache/arrow/pull/13149#discussion_r880819817 ## dev/tasks/macros.jinja: ## @@ -221,3 +222,79 @@ on: cp ${formula} $(brew --repository homebrew/core)/Formula/ done {% endmacro %} + +{%- macro githu

[GitHub] [arrow-cookbook] davisusanibar commented on a diff in pull request #215: [Java] Adding examples about Dictionary-encoded Layout

2022-05-24 Thread GitBox
davisusanibar commented on code in PR #215: URL: https://github.com/apache/arrow-cookbook/pull/215#discussion_r880897554 ## java/source/io.rst: ## @@ -443,3 +443,109 @@ Reading Parquet File Please check :doc:`Dataset <./dataset>` + +Handling Data with Di

[GitHub] [arrow] icexelloss commented on pull request #13214: ARROW-15635: [C++] Support nested extension-id-registry

2022-05-24 Thread GitBox
icexelloss commented on PR #13214: URL: https://github.com/apache/arrow/pull/13214#issuecomment-1136370366 @westonpace I wonder what your thoughts are about the changes here? Is this on the right track?

[GitHub] [arrow-cookbook] davisusanibar commented on a diff in pull request #215: [Java] Adding examples about Dictionary-encoded Layout

2022-05-24 Thread GitBox
davisusanibar commented on code in PR #215: URL: https://github.com/apache/arrow-cookbook/pull/215#discussion_r880894231 ## java/source/create.rst: ## @@ -70,6 +70,66 @@ Array of Varchar [one, two, three] +Dictionary-Encoded Array of Varchar +---

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2610: binary mathematical operators work with `NULL`

2022-05-24 Thread GitBox
alamb commented on code in PR #2610: URL: https://github.com/apache/arrow-datafusion/pull/2610#discussion_r880888491 ## datafusion/physical-expr/src/expressions/binary.rs: ## @@ -701,16 +701,58 @@ macro_rules! compute_bool_op { /// LEFT is array, RIGHT is scalar value macro_ru

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2591: Support for non equality predicates in `ON` clause of `LEFT`, `RIGHT, `and `FULL` joins

2022-05-24 Thread GitBox
alamb commented on code in PR #2591: URL: https://github.com/apache/arrow-datafusion/pull/2591#discussion_r880841593 ## datafusion/core/src/optimizer/filter_push_down.rs: ## @@ -1143,6 +1145,7 @@ mod tests { &right, JoinType::Inner,

[GitHub] [arrow] github-actions[bot] commented on pull request #13227: ARROW-16553: [Java] Initial changes to configure GitHub nightly builds Jar as a repository to use from Maven/Gradle

2022-05-24 Thread GitBox
github-actions[bot] commented on PR #13227: URL: https://github.com/apache/arrow/pull/13227#issuecomment-1136348628 https://issues.apache.org/jira/browse/ARROW-16553

[GitHub] [arrow] davisusanibar opened a new pull request, #13227: ARROW-16553: [Java] Initial changes to configure GitHub nightly builds Jar as a repository to use from Maven/Gradle

2022-05-24 Thread GitBox
davisusanibar opened a new pull request, #13227: URL: https://github.com/apache/arrow/pull/13227 On the Java side, we currently offer nightly-build JAR artifacts uploaded to the GitHub repository as assets. If a user decides to use them in a local project, they need to d

[GitHub] [arrow-datafusion] WinkerDu opened a new pull request, #2610: binary mathematical operator work with `NULL`

2022-05-24 Thread GitBox
WinkerDu opened a new pull request, #2610: URL: https://github.com/apache/arrow-datafusion/pull/2610 # Which issue does this PR close? Closes #2609. # Rationale for this change There is a bug when binary mathematical operators `+`, `-`, `*`, `/`, `%` work with litera

[GitHub] [arrow-datafusion] WinkerDu opened a new issue, #2609: Add, Minus, Multiply, divide, Modulo operator work with literal `NULL`

2022-05-24 Thread GitBox
WinkerDu opened a new issue, #2609: URL: https://github.com/apache/arrow-datafusion/issues/2609 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** There is a bug when binary mathematical operators `+`, `-`, `*`, `/`, `%` work with lite

[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #215: [Java] Adding examples about Dictionary-encoded Layout

2022-05-24 Thread GitBox
lidavidm commented on code in PR #215: URL: https://github.com/apache/arrow-cookbook/pull/215#discussion_r880860685 ## java/source/io.rst: ## @@ -443,3 +443,109 @@ Reading Parquet File Please check :doc:`Dataset <./dataset>` + +Handling Data with Diction

[GitHub] [arrow-cookbook] davisusanibar commented on a diff in pull request #215: [Java] Adding examples about Dictionary-encoded Layout

2022-05-24 Thread GitBox
davisusanibar commented on code in PR #215: URL: https://github.com/apache/arrow-cookbook/pull/215#discussion_r880858786 ## java/source/io.rst: ## @@ -443,3 +443,108 @@ Reading Parquet File Please check :doc:`Dataset <./dataset>` + +Dictionary-encoded La

[GitHub] [arrow] lidavidm commented on pull request #13226: ARROW-16267: [Java] Adding support to compile Java code with JDK 18

2022-05-24 Thread GitBox
lidavidm commented on PR #13226: URL: https://github.com/apache/arrow/pull/13226#issuecomment-1136330058 Ah, reasonable enough. Just wondering if there was a way to let the infrastructure do it for us.

[GitHub] [arrow-cookbook] davisusanibar commented on a diff in pull request #215: [Java] Adding examples about Dictionary-encoded Layout

2022-05-24 Thread GitBox
davisusanibar commented on code in PR #215: URL: https://github.com/apache/arrow-cookbook/pull/215#discussion_r880858576 ## java/source/create.rst: ## @@ -70,6 +70,62 @@ Array of Varchar [one, two, three] +In some scenarios could be more appropriate use `Dictionary-enco

[GitHub] [arrow] assignUser commented on pull request #13226: ARROW-16267: [Java] Adding support to compile Java code with JDK 18

2022-05-24 Thread GitBox
assignUser commented on PR #13226: URL: https://github.com/apache/arrow/pull/13226#issuecomment-1136326911 @lidavidm This is a Docker issue, as we use some of [these](https://hub.docker.com/_/maven) for the jobs and they don't have a `3.8.5-openjdk-latest` tag. So there is no easy way to ke

[GitHub] [arrow-cookbook] lidavidm commented on pull request #193: [Java] Clarify build dependencies

2022-05-24 Thread GitBox
lidavidm commented on PR #193: URL: https://github.com/apache/arrow-cookbook/pull/193#issuecomment-1136322685 Or actually I guess I can just edit the commit message before merge here so no worries. Thanks for updating this!

[GitHub] [arrow-cookbook] lidavidm merged pull request #193: [Java] Clarify build dependencies

2022-05-24 Thread GitBox
lidavidm merged PR #193: URL: https://github.com/apache/arrow-cookbook/pull/193

[GitHub] [arrow-cookbook] lidavidm closed issue #190: [Java] Clarify build requirements

2022-05-24 Thread GitBox
lidavidm closed issue #190: [Java] Clarify build requirements URL: https://github.com/apache/arrow-cookbook/issues/190

[GitHub] [arrow-cookbook] lidavidm opened a new issue, #216: [Java][Flight] Flight example appears to be flaky

2022-05-24 Thread GitBox
lidavidm opened a new issue, #216: URL: https://github.com/apache/arrow-cookbook/issues/216 It looks like ```java try { flightServer.start(); System.out.println("S1: Server (Location): Listening on port " + flightServer.getPort());

[GitHub] [arrow] lidavidm closed pull request #13225: ARROW-16643: [C++] Fix warnings for clang-14

2022-05-24 Thread GitBox
lidavidm closed pull request #13225: ARROW-16643: [C++] Fix warnings for clang-14 URL: https://github.com/apache/arrow/pull/13225

[GitHub] [arrow] lidavidm commented on pull request #13225: ARROW-16643: [C++] Fix warnings for clang-14

2022-05-24 Thread GitBox
lidavidm commented on PR #13225: URL: https://github.com/apache/arrow/pull/13225#issuecomment-1136319388 That AppVeyor issue should be fixed by https://github.com/apache/arrow/pull/13191

[GitHub] [arrow] lidavidm commented on pull request #13226: ARROW-16267: [Java] Adding support to compile Java code with JDK 18

2022-05-24 Thread GitBox
lidavidm commented on PR #13226: URL: https://github.com/apache/arrow/pull/13226#issuecomment-1136315860 Does GitHub offer a way to just specify "latest" so that we can automatically get 19, 20, ..., etc when they release?

[GitHub] [arrow-datafusion] alamb merged pull request #2607: MINOR: Rename benchmark crate

2022-05-24 Thread GitBox
alamb merged PR #2607: URL: https://github.com/apache/arrow-datafusion/pull/2607

[GitHub] [arrow-datafusion] alamb commented on pull request #2607: MINOR: Rename benchmark crate

2022-05-24 Thread GitBox
alamb commented on PR #2607: URL: https://github.com/apache/arrow-datafusion/pull/2607#issuecomment-1136314499 Thanks all!

[GitHub] [arrow] lidavidm commented on a diff in pull request #12692: ARROW-16005: [Java] Fix ArrayConsumer when using ArrowVectorIterator

2022-05-24 Thread GitBox
lidavidm commented on code in PR #12692: URL: https://github.com/apache/arrow/pull/12692#discussion_r880842894 ## java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/consumer/ArrayConsumer.java: ## @@ -90,13 +97,12 @@ public void consume(ResultSet resultSet) throws SQ

[GitHub] [arrow-ballista] andygrove merged pull request #40: Update with file format breaking change

2022-05-24 Thread GitBox
andygrove merged PR #40: URL: https://github.com/apache/arrow-ballista/pull/40

[GitHub] [arrow-cookbook] lidavidm commented on pull request #193: Toddfarmer/contributing java dependencies

2022-05-24 Thread GitBox
lidavidm commented on PR #193: URL: https://github.com/apache/arrow-cookbook/pull/193#issuecomment-1136305495 @toddfarmer mind updating the PR title to be a little more descriptive too?

[GitHub] [arrow-cookbook] lidavidm commented on pull request #193: Toddfarmer/contributing java dependencies

2022-05-24 Thread GitBox
lidavidm commented on PR #193: URL: https://github.com/apache/arrow-cookbook/pull/193#issuecomment-1136305145 @davisusanibar it looks like ```java try { flightServer.start(); System.out.println("S1: Server (Location): Listening on p

[GitHub] [arrow-rs] alamb commented on issue #1701: "out of order projection is not supported" after Fix Parquet Arrow Schema Inference

2022-05-24 Thread GitBox
alamb commented on issue #1701: URL: https://github.com/apache/arrow-rs/issues/1701#issuecomment-1136304413 Update: @tustvold has kindly offered to take a stab at preparing a datafusion PR to upgrade to latest arrow, prior to #1727

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1743: Pin nightly version to bypass packed_simd build error

2022-05-24 Thread GitBox
codecov-commenter commented on PR #1743: URL: https://github.com/apache/arrow-rs/pull/1743#issuecomment-1136298356 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1743?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-rs] viirya commented on pull request #1743: Pin nightly version to bypass packed_simd build error

2022-05-24 Thread GitBox
viirya commented on PR #1743: URL: https://github.com/apache/arrow-rs/pull/1743#issuecomment-1136293542 We can change back to the latest `nightly` once `packed_simd` publishes a new release with a fix. This is to unblock PRs.

[GitHub] [arrow-rs] viirya commented on pull request #1743: Pin nightly version to bypass packed_simd build error

2022-05-24 Thread GitBox
viirya commented on PR #1743: URL: https://github.com/apache/arrow-rs/pull/1743#issuecomment-1136292275 cc @sunchao

[GitHub] [arrow-rs] viirya commented on a diff in pull request #1738: Support casting `Utf8` to `Boolean`

2022-05-24 Thread GitBox
viirya commented on code in PR #1738: URL: https://github.com/apache/arrow-rs/pull/1738#discussion_r880824560 ## arrow/src/compute/kernels/cast.rs: ## @@ -280,6 +280,8 @@ pub fn can_cast_types(from_type: &DataType, to_type: &DataType) -> bool { /// /// Behavior: /// * Boolea

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1742: Improve integration test document to follow Arrow C++ repo CI

2022-05-24 Thread GitBox
codecov-commenter commented on PR #1742: URL: https://github.com/apache/arrow-rs/pull/1742#issuecomment-1136286755 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1742?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #193: Toddfarmer/contributing java dependencies

2022-05-24 Thread GitBox
lidavidm commented on code in PR #193: URL: https://github.com/apache/arrow-cookbook/pull/193#discussion_r880818125 ## java/CONTRIBUTING.rst: ## @@ -1,13 +1,27 @@ Building the Java Cookbook = - The Java cookbook uses the Sphinx documentation system.

[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #215: [Java] Adding examples about Dictionary-encoded Layout

2022-05-24 Thread GitBox
lidavidm commented on code in PR #215: URL: https://github.com/apache/arrow-cookbook/pull/215#discussion_r880809920 ## java/source/create.rst: ## @@ -70,6 +70,62 @@ Array of Varchar [one, two, three] +In some scenarios could be more appropriate use `Dictionary-encoded L
