[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #3916: Support for non-u64 types for Window Bound

2022-10-24 Thread GitBox
mustafasrepo commented on code in PR #3916: URL: https://github.com/apache/arrow-datafusion/pull/3916#discussion_r1002942852 ## datafusion/expr/src/window_frame.rs: ## @@ -252,103 +247,32 @@ mod tests { }; let result = WindowFrame::try_from(window_frame);

[GitHub] [arrow-rs] viirya opened a new issue, #2914: Add `freeze_with_dictionary` API to `MutableArrayData`

2022-10-24 Thread GitBox
viirya opened a new issue, #2914: URL: https://github.com/apache/arrow-rs/issues/2914 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** We use `MutableArrayData` to copy array data for certain operations. One issue we faced i

[GitHub] [arrow-rs] viirya opened a new pull request, #2915: Add `freeze_with_dictionary` API to `MutableArrayData`

2022-10-24 Thread GitBox
viirya opened a new pull request, #2915: URL: https://github.com/apache/arrow-rs/pull/2915 # Which issue does this PR close? Closes #2914. # Rationale for this change # What changes are included in this PR? # Are there any user-facing chang

[GitHub] [arrow-ballista] yahoNanJing opened a new pull request, #439: Add a feature for hdfs3

2022-10-24 Thread GitBox
yahoNanJing opened a new pull request, #439: URL: https://github.com/apache/arrow-ballista/pull/439 # Which issue does this PR close? Closes #419. # Rationale for this change # What changes are included in this PR? # Are there any user-facing chang

[GitHub] [arrow-ballista] yahoNanJing commented on pull request #439: Add a feature for hdfs3

2022-10-24 Thread GitBox
yahoNanJing commented on PR #439: URL: https://github.com/apache/arrow-ballista/pull/439#issuecomment-1288549186 Hi @yjshen and @andygrove, could you help review this PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [arrow-datafusion] xudong963 commented on a diff in pull request #3787: Join cardinality computation for cost-based nested join optimizations

2022-10-24 Thread GitBox
xudong963 commented on code in PR #3787: URL: https://github.com/apache/arrow-datafusion/pull/3787#discussion_r1003003630 ## datafusion/core/src/physical_plan/join_utils.rs: ## @@ -296,6 +299,154 @@ impl Clone for OnceFut { } } +/// A shared state between statistic aggre

[GitHub] [arrow-rs] jiacai2050 opened a new issue, #2916: Perf about ParquetRecordBatchStream vs ParquetRecordBatchReader

2022-10-24 Thread GitBox
jiacai2050 opened a new issue, #2916: URL: https://github.com/apache/arrow-rs/issues/2916 **Which part is this question about** API Usage & Perf **Describe your question** I created two benchmarks based on [example code](https://docs.rs/parquet/latest/parquet/arrow/async_reade

[GitHub] [arrow-rs] waitingkuo commented on a diff in pull request #2909: Add timezone abstraction

2022-10-24 Thread GitBox
waitingkuo commented on code in PR #2909: URL: https://github.com/apache/arrow-rs/pull/2909#discussion_r1003055425 ## arrow-array/src/timezone.rs: ## @@ -0,0 +1,306 @@ +use arrow_schema::ArrowError; +use chrono::format::{parse, Parsed, StrftimeItems}; +use chrono::FixedOffset; +

[GitHub] [arrow-ballista] fsnlla closed pull request #440: sc-5792: fix

2022-10-24 Thread GitBox
fsnlla closed pull request #440: sc-5792: fix URL: https://github.com/apache/arrow-ballista/pull/440

[GitHub] [arrow] rok commented on a diff in pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
rok commented on code in PR #14472: URL: https://github.com/apache/arrow/pull/14472#discussion_r1003087554 ## dev/tasks/java-jars/github.yml: ## @@ -48,47 +50,65 @@ jobs: {% endif %} build-cpp-macos: -name: Build C++ libraries macOS -runs-on: macos-latest +

[GitHub] [arrow] rok commented on a diff in pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
rok commented on code in PR #14472: URL: https://github.com/apache/arrow/pull/14472#discussion_r1003088388 ## dev/tasks/java-jars/github.yml: ## @@ -117,44 +142,43 @@ jobs: name: Build jar files runs-on: macos-latest needs: - - build-cpp-ubuntu - build

[GitHub] [arrow] rok commented on a diff in pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
rok commented on code in PR #14472: URL: https://github.com/apache/arrow/pull/14472#discussion_r1003090145 ## docker-compose.yml: ## @@ -1008,10 +1008,10 @@ services: - ${DOCKER_VOLUME_PREFIX}python-wheel-manylinux2014-ccache:/ccache:delegated command: ["pip

[GitHub] [arrow] rok commented on a diff in pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
rok commented on code in PR #14472: URL: https://github.com/apache/arrow/pull/14472#discussion_r1003092078 ## dev/tasks/java-jars/github.yml: ## @@ -48,47 +50,65 @@ jobs: {% endif %} build-cpp-macos: -name: Build C++ libraries macOS -runs-on: macos-latest +

[GitHub] [arrow] raulcd commented on pull request #14362: ARROW-17972: [CI] Update CUDA docker jobs

2022-10-24 Thread GitBox
raulcd commented on PR #14362: URL: https://github.com/apache/arrow/pull/14362#issuecomment-1288725296 > (This may be a problem of my environment...) I have tested locally and it worked correctly generating the wheel and all tests were successful for `ubuntu-cuda-python`: ```

[GitHub] [arrow] rok commented on a diff in pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
rok commented on code in PR #14472: URL: https://github.com/apache/arrow/pull/14472#discussion_r1003093325 ## dev/tasks/java-jars/github.yml: ## @@ -164,6 +188,18 @@ jobs: arrow/ci/scripts/java_full_build.sh \ $GITHUB_WORKSPACE/arrow \ $GITHU

[GitHub] [arrow] rok commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
rok commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1288726668 @github-actions crossbow submit java-jars

[GitHub] [arrow] fotiskoun opened a new issue, #14483: [C++ Lib][Ubuntu 20.04] libarrow-dev package not found

2022-10-24 Thread GitBox
fotiskoun opened a new issue, #14483: URL: https://github.com/apache/arrow/issues/14483 Hello, I am trying to install the C++ headers for arrow and parquet. For this reason I follow the instructions from this page https://arrow.apache.org/install/ I update, install the certifica

[GitHub] [arrow] raulcd commented on a diff in pull request #14362: ARROW-17972: [CI] Update CUDA docker jobs

2022-10-24 Thread GitBox
raulcd commented on code in PR #14362: URL: https://github.com/apache/arrow/pull/14362#discussion_r1003096657 ## docker-compose.yml: ## @@ -488,17 +488,36 @@ services: shm_size: *shm-size ulimits: *ulimits environment: &cuda-environment Review Comment: can we

[GitHub] [arrow] assignUser commented on pull request #14362: ARROW-17972: [CI] Update CUDA docker jobs

2022-10-24 Thread GitBox
assignUser commented on PR #14362: URL: https://github.com/apache/arrow/pull/14362#issuecomment-1288739123 To avoid the `None` version issue you have to set `SETUPTOOLS_SCM_PRETEND_VERSION=10.0.0`; the var is added to the env node and will be passed on to the build.

[GitHub] [arrow] assignUser commented on a diff in pull request #14362: ARROW-17972: [CI] Update CUDA docker jobs

2022-10-24 Thread GitBox
assignUser commented on code in PR #14362: URL: https://github.com/apache/arrow/pull/14362#discussion_r1003103085 ## docker-compose.yml: ## @@ -488,17 +488,36 @@ services: shm_size: *shm-size ulimits: *ulimits environment: &cuda-environment Review Comment: I w

[GitHub] [arrow] github-actions[bot] commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
github-actions[bot] commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1288805457 Revision: 1525d0550797df99184e91b65e77cb470257e8b7 Submitted crossbow builds: [ursacomputing/crossbow @ actions-d68fe64cda](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] github-actions[bot] commented on pull request #14484: ARROW-18131: [R] Correctly handle .data pronoun in group_by()

2022-10-24 Thread GitBox
github-actions[bot] commented on PR #14484: URL: https://github.com/apache/arrow/pull/14484#issuecomment-1288815090 https://issues.apache.org/jira/browse/ARROW-18131

[GitHub] [arrow] rtpsw commented on pull request #14385: ARROW-17980: [C++] As-of-Join Substrait extension

2022-10-24 Thread GitBox
rtpsw commented on PR #14385: URL: https://github.com/apache/arrow/pull/14385#issuecomment-1288843658 The commit history got messed up. I'll try to open a fresh PR.

[GitHub] [arrow] rtpsw closed pull request #14385: ARROW-17980: [C++] As-of-Join Substrait extension

2022-10-24 Thread GitBox
rtpsw closed pull request #14385: ARROW-17980: [C++] As-of-Join Substrait extension URL: https://github.com/apache/arrow/pull/14385

[GitHub] [arrow] fotiskoun commented on issue #14483: [C++ Lib][Ubuntu 20.04] libarrow-dev package not found

2022-10-24 Thread GitBox
fotiskoun commented on issue #14483: URL: https://github.com/apache/arrow/issues/14483#issuecomment-1288845357 There was an issue with my sources list, so my system could not correctly install apache-arrow-apt-source on Ubuntu focal. After updating my sources by adding the following f

[GitHub] [arrow] fotiskoun closed issue #14483: [C++ Lib][Ubuntu 20.04] libarrow-dev package not found

2022-10-24 Thread GitBox
fotiskoun closed issue #14483: [C++ Lib][Ubuntu 20.04] libarrow-dev package not found URL: https://github.com/apache/arrow/issues/14483

[GitHub] [arrow] rtpsw opened a new pull request, #14485: ARROW-17980: [C++] As-of-Join Substrait extension

2022-10-24 Thread GitBox
rtpsw opened a new pull request, #14485: URL: https://github.com/apache/arrow/pull/14485 Replacing https://github.com/apache/arrow/pull/14385

[GitHub] [arrow] github-actions[bot] commented on pull request #14485: ARROW-17980: [C++] As-of-Join Substrait extension

2022-10-24 Thread GitBox
github-actions[bot] commented on PR #14485: URL: https://github.com/apache/arrow/pull/14485#issuecomment-1288855233 https://issues.apache.org/jira/browse/ARROW-17980

[GitHub] [arrow] rok commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
rok commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1288857101 @github-actions crossbow submit java-jars

[GitHub] [arrow] rtpsw commented on a diff in pull request #14485: ARROW-17980: [C++] As-of-Join Substrait extension

2022-10-24 Thread GitBox
rtpsw commented on code in PR #14485: URL: https://github.com/apache/arrow/pull/14485#discussion_r1003180495 ## cpp/src/arrow/engine/substrait/test_plan_builder.cc: ## @@ -88,16 +91,18 @@ Result> CreateProject( // If it doesn't have a type then it's an enum const

[GitHub] [arrow-datafusion] HaoYang670 opened a new issue, #3934: Refactor `simplify_expressions` and `expr_simplifier`

2022-10-24 Thread GitBox
HaoYang670 opened a new issue, #3934: URL: https://github.com/apache/arrow-datafusion/issues/3934 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Currently, the items related to expression simplification are in `simplification_exp

[GitHub] [arrow-rs] waitingkuo commented on a diff in pull request #2909: Add timezone abstraction

2022-10-24 Thread GitBox
waitingkuo commented on code in PR #2909: URL: https://github.com/apache/arrow-rs/pull/2909#discussion_r1003230601 ## arrow-array/src/temporal_conversions.rs: ## @@ -187,6 +188,15 @@ pub fn as_datetime(v: i64) -> Option { } } +/// Converts an [`ArrowPrimitiveType`] to [

[GitHub] [arrow-rs] waitingkuo commented on pull request #2909: Add timezone abstraction

2022-10-24 Thread GitBox
waitingkuo commented on PR #2909: URL: https://github.com/apache/arrow-rs/pull/2909#issuecomment-1288931513 looks great to me, thank you @tustvold

[GitHub] [arrow] github-actions[bot] commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
github-actions[bot] commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1288942675 Revision: c10fafb77d2205e853984e5e1048ae2d418ce4b6 Submitted crossbow builds: [ursacomputing/crossbow @ actions-c7f0db7052](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow-rs] waitingkuo opened a new issue, #2917: Add timezone offset while displaying Timestamp with Timezone

2022-10-24 Thread GitBox
waitingkuo opened a new issue, #2917: URL: https://github.com/apache/arrow-rs/issues/2917 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** ```rust fn main() { let arr = TimestampSecondArray::from_vec(vec![0], Some

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #61: Proof-of-concept IPC reader/writer

2022-10-24 Thread GitBox
paleolimbot commented on code in PR #61: URL: https://github.com/apache/arrow-nanoarrow/pull/61#discussion_r1003251733 ## extensions/nanoarrow_ipc/src/nanoarrow_ipc/nanoarrow_ipc.h: ## @@ -0,0 +1,153 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more c

[GitHub] [arrow] icexelloss commented on pull request #14385: ARROW-17980: [C++] As-of-Join Substrait extension

2022-10-24 Thread GitBox
icexelloss commented on PR #14385: URL: https://github.com/apache/arrow/pull/14385#issuecomment-1288966778 @rtpsw You don't need to open a new PR - you can just force push to your branch ARROW-17980 and fix the commit history

[GitHub] [arrow] lidavidm commented on a diff in pull request #14151: ARROW-11776: [C++][Java] Support parquet write from ArrowReader to file

2022-10-24 Thread GitBox
lidavidm commented on code in PR #14151: URL: https://github.com/apache/arrow/pull/14151#discussion_r1003261476 ## java/dataset/src/test/java/org/apache/arrow/dataset/file/TestDatasetFileWriter.java: ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

[GitHub] [arrow] lidavidm commented on pull request #14151: ARROW-11776: [C++][Java] Support parquet write from ArrowReader to file

2022-10-24 Thread GitBox
lidavidm commented on PR #14151: URL: https://github.com/apache/arrow/pull/14151#issuecomment-1288970726 There are lint errors https://github.com/apache/arrow/actions/runs/3290191393/jobs/5426029066#step:6:7990 ``` [WARN] /arrow/java/dataset/src/main/java/org/apache/arrow/data

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #61: Proof-of-concept IPC reader/writer

2022-10-24 Thread GitBox
lidavidm commented on code in PR #61: URL: https://github.com/apache/arrow-nanoarrow/pull/61#discussion_r1003265839 ## extensions/nanoarrow_ipc/src/nanoarrow_ipc/nanoarrow_ipc.h: ## @@ -0,0 +1,153 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more cont

[GitHub] [arrow-nanoarrow] lidavidm commented on pull request #61: Proof-of-concept IPC reader/writer

2022-10-24 Thread GitBox
lidavidm commented on PR #61: URL: https://github.com/apache/arrow-nanoarrow/pull/61#issuecomment-1288977477 We could perhaps keep the flatcc runtime separate, so that you would just not include it if you're also using flatcc yourself, but otherwise it's right there and you don't need to fe

[GitHub] [arrow] rok commented on a diff in pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
rok commented on code in PR #14472: URL: https://github.com/apache/arrow/pull/14472#discussion_r1003268639 ## java/c/src/main/java/org/apache/arrow/c/jni/JniLoader.java: ## @@ -33,7 +33,7 @@ * The JniLoader for C Data Interface API's native implementation. */ public class J

[GitHub] [arrow] js8544 opened a new pull request, #14486: ARROW-18144: [C++] Improve JSONTypeError error message in testing

2022-10-24 Thread GitBox
js8544 opened a new pull request, #14486: URL: https://github.com/apache/arrow/pull/14486 If there is a type error, ArrayFromJSON returns an error message like "Invalid: Expected unsigned int or null, got JSON type 4", where JSON type 4 is a value of type rapidjson::Type enum. It is better
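
For context on the "JSON type 4" message quoted above: rapidjson's `Type` enum numbers its members from 0 to 6, so 4 corresponds to `kArrayType`. The following is a minimal sketch of turning that number into a readable name. It assumes a local mirror of the enum (so it builds without rapidjson) and two hypothetical helpers, `JsonTypeName` and `TypeErrorMessage`, neither of which is taken from the PR.

```cpp
#include <string>

// Local mirror of rapidjson::Type so this sketch compiles without rapidjson.
// The numeric values follow rapidjson's public header.
enum JsonType {
  kNullType = 0,
  kFalseType = 1,
  kTrueType = 2,
  kObjectType = 3,
  kArrayType = 4,
  kStringType = 5,
  kNumberType = 6
};

// Hypothetical helper (not from the PR): map an enum value to a readable name.
inline const char* JsonTypeName(int type) {
  switch (type) {
    case kNullType:   return "null";
    case kFalseType:  return "false";
    case kTrueType:   return "true";
    case kObjectType: return "object";
    case kArrayType:  return "array";
    case kStringType: return "string";
    case kNumberType: return "number";
    default:          return "unknown";
  }
}

// Hypothetical helper: build a message in the style of the one quoted above,
// but with the type name instead of the raw enum value.
inline std::string TypeErrorMessage(const std::string& expected, int got) {
  return "Expected " + expected + ", got JSON type " + JsonTypeName(got);
}
```

With these assumptions, `TypeErrorMessage("unsigned int or null", kArrayType)` names the offending type as `array` rather than `4`.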

[GitHub] [arrow-nanoarrow] jorisvandenbossche opened a new pull request, #62: [Python] Basic Array class wrapping C struct (with conversion to numpy)

2022-10-24 Thread GitBox
jorisvandenbossche opened a new pull request, #62: URL: https://github.com/apache/arrow-nanoarrow/pull/62 Currently still on top https://github.com/apache/arrow-nanoarrow/pull/52 This further explores possible Python bindings in cython. I added here an `Array` class that wraps an `Arr

[GitHub] [arrow-datafusion] andygrove merged pull request #3931: Refactor Expr::Cast to use a struct.

2022-10-24 Thread GitBox
andygrove merged PR #3931: URL: https://github.com/apache/arrow-datafusion/pull/3931

[GitHub] [arrow-datafusion] HaoYang670 opened a new issue, #3935: Update the Roadmap.md

2022-10-24 Thread GitBox
HaoYang670 opened a new issue, #3935: URL: https://github.com/apache/arrow-datafusion/issues/3935 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** ``` Having Ballista as part of the DataFusion codebase helps ensure that DataFusi

[GitHub] [arrow-datafusion] HaoYang670 commented on issue #3935: Update the Roadmap.md

2022-10-24 Thread GitBox
HaoYang670 commented on issue #3935: URL: https://github.com/apache/arrow-datafusion/issues/3935#issuecomment-1289001408 cc @andygrove.

[GitHub] [arrow-nanoarrow] jorisvandenbossche commented on pull request #52: [Python] Add basic Python package structure and build setup

2022-10-24 Thread GitBox
jorisvandenbossche commented on PR #52: URL: https://github.com/apache/arrow-nanoarrow/pull/52#issuecomment-1289001562 Would it be OK to merge this PR? (I am not sure we will eventually end up using this setup, but at least that allows some further experimentation with cleaner diffs)

[GitHub] [arrow] thisisnic merged pull request #14484: ARROW-18131: [R] Correctly handle .data pronoun in group_by()

2022-10-24 Thread GitBox
thisisnic merged PR #14484: URL: https://github.com/apache/arrow/pull/14484

[GitHub] [arrow-datafusion] alamb commented on pull request #3932: doc: fix doc about `CREATE TABLE IF NOT EXISTS`

2022-10-24 Thread GitBox
alamb commented on PR #3932: URL: https://github.com/apache/arrow-datafusion/pull/3932#issuecomment-1289004326 ❤️

[GitHub] [arrow-nanoarrow] jorisvandenbossche commented on issue #53: [Python] Useful scope of the python bindings?

2022-10-24 Thread GitBox
jorisvandenbossche commented on issue #53: URL: https://github.com/apache/arrow-nanoarrow/issues/53#issuecomment-1289007767 > Data structure: provide an R class that holds an Array/Schema/ArrayStream and releases it when it goes out of scope. Yes, that was the first thing I was planni

[GitHub] [arrow-datafusion] alamb commented on issue #3935: Update the Roadmap.md

2022-10-24 Thread GitBox
alamb commented on issue #3935: URL: https://github.com/apache/arrow-datafusion/issues/3935#issuecomment-1289015702 It would be great to update the roadmap

[GitHub] [arrow-datafusion] alamb opened a new issue, #3936: Error using `IN` list on dictionary encoded data

2022-10-24 Thread GitBox
alamb opened a new issue, #3936: URL: https://github.com/apache/arrow-datafusion/issues/3936 **Describe the bug** I am writing a query to select some values from a dictionary encoded string column `trace_id` and I get an error ``` select * from spans where trace_id IN ('187dcb

[GitHub] [arrow-datafusion] alamb commented on issue #3936: Error using `IN` list on dictionary encoded data

2022-10-24 Thread GitBox
alamb commented on issue #3936: URL: https://github.com/apache/arrow-datafusion/issues/3936#issuecomment-1289018830 I plan to work on this later in the week if no one beats me to it

[GitHub] [arrow] rok commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
rok commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1289021998 @github-actions crossbow submit java-jars

[GitHub] [arrow-nanoarrow] jorisvandenbossche merged pull request #52: [Python] Add basic Python package structure and build setup

2022-10-24 Thread GitBox
jorisvandenbossche merged PR #52: URL: https://github.com/apache/arrow-nanoarrow/pull/52

[GitHub] [arrow-nanoarrow] paleolimbot commented on issue #53: [Python] Useful scope of the python bindings?

2022-10-24 Thread GitBox
paleolimbot commented on issue #53: URL: https://github.com/apache/arrow-nanoarrow/issues/53#issuecomment-1289032211 Yeah, the fact that `some_r_schema$name <- "something else"` makes a deep copy of `some_r_schema` with my current design is probably not ideal...if it were just a classed R l

[GitHub] [arrow-rs] dependabot[bot] opened a new pull request, #2918: Update quick-xml requirement from 0.25.0 to 0.26.0

2022-10-24 Thread GitBox
dependabot[bot] opened a new pull request, #2918: URL: https://github.com/apache/arrow-rs/pull/2918 Updates the requirements on [quick-xml](https://github.com/tafia/quick-xml) to permit the latest version. Changelog Sourced from https://github.com/tafia/quick-xml/blob/master/Change

[GitHub] [arrow-rs] waitingkuo commented on pull request #2909: Add timezone abstraction

2022-10-24 Thread GitBox
waitingkuo commented on PR #2909: URL: https://github.com/apache/arrow-rs/pull/2909#issuecomment-1289053917 the return data types for `timestamp_[sometimeunit]_to_datetime` are changed to in #2894 https://github.com/apache/arrow-rs/pull/2909/files#diff-fe32c157c2dda897288cc93386e455

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #3884: Improve formatting of binary expressions

2022-10-24 Thread GitBox
andygrove commented on code in PR #3884: URL: https://github.com/apache/arrow-datafusion/pull/3884#discussion_r1003329721 ## datafusion/expr/src/expr.rs: ## @@ -267,6 +267,28 @@ impl BinaryExpr { } } +impl Display for BinaryExpr { +fn fmt(&self, f: &mut Formatter<'_>

[GitHub] [arrow-datafusion] jackwener commented on issue #3935: Update the Roadmap.md

2022-10-24 Thread GitBox
jackwener commented on issue #3935: URL: https://github.com/apache/arrow-datafusion/issues/3935#issuecomment-1289062910 We can also consider other parts; some content is outdated.

[GitHub] [arrow-datafusion] alamb opened a new pull request, #3937: Add test for querying predicate on dictionary

2022-10-24 Thread GitBox
alamb opened a new pull request, #3937: URL: https://github.com/apache/arrow-datafusion/pull/3937 Add (ignored) test for https://github.com/apache/arrow-datafusion/issues/3936

[GitHub] [arrow-datafusion] jackwener commented on issue #3936: Error using `IN` list on dictionary encoded data: `InList does not support datatype Dictionary(Int32, Utf8).`

2022-10-24 Thread GitBox
jackwener commented on issue #3936: URL: https://github.com/apache/arrow-datafusion/issues/3936#issuecomment-1289065648 related issue: #3766

[GitHub] [arrow-datafusion] waitingkuo commented on issue #3922: Internal error in CAST from Timestamp[us]

2022-10-24 Thread GitBox
waitingkuo commented on issue #3922: URL: https://github.com/apache/arrow-datafusion/issues/3922#issuecomment-1289094859 @ike560 Cast does support timestamp to INT64, you can try `CAST(tpep_pickup_datetime AS BIGINT)`

[GitHub] [arrow-datafusion] alamb opened a new issue, #3938: Predicate still has cast when comparing Timestamp(Nano, None) to a timestamp literal, so can't be pushed down or used for pruning

2022-10-24 Thread GitBox
alamb opened a new issue, #3938: URL: https://github.com/apache/arrow-datafusion/issues/3938 **Describe the bug** Comparing a Timestamp(Nanosecond, None) column to a timestamp literal is important for IOx and can be used to potentially prune significant amounts of data and pushed down to

[GitHub] [arrow-datafusion] alamb opened a new pull request, #3939: Add optimizer test for simplifying predicates on timestamps

2022-10-24 Thread GitBox
alamb opened a new pull request, #3939: URL: https://github.com/apache/arrow-datafusion/pull/3939 Add test for https://github.com/apache/arrow-datafusion/issues/3938

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #3939: Add optimizer test for simplifying predicates on timestamps

2022-10-24 Thread GitBox
alamb commented on code in PR #3939: URL: https://github.com/apache/arrow-datafusion/pull/3939#discussion_r1003364471 ## datafusion/optimizer/tests/integration-test.rs: ## @@ -87,11 +88,11 @@ fn case_when_aggregate() -> Result<()> { #[test] fn unsigned_target_type() -> Resul

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #3939: Add optimizer test for simplifying predicates on timestamps

2022-10-24 Thread GitBox
alamb commented on code in PR #3939: URL: https://github.com/apache/arrow-datafusion/pull/3939#discussion_r1003365789 ## datafusion/optimizer/tests/integration-test.rs: ## @@ -225,6 +226,38 @@ fn concat_ws_literals() -> Result<()> { Ok(()) } +#[test] +#[ignore] +// https

[GitHub] [arrow] github-actions[bot] commented on pull request #14486: ARROW-18144: [C++] Improve JSONTypeError error message in testing

2022-10-24 Thread GitBox
github-actions[bot] commented on PR #14486: URL: https://github.com/apache/arrow/pull/14486#issuecomment-1289101863 https://issues.apache.org/jira/browse/ARROW-18144

[GitHub] [arrow] vibhatha opened a new pull request, #14487: ARROW-18025: [C++][Python] SubstraitSinkConsumer should handle backpressure

2022-10-24 Thread GitBox
vibhatha opened a new pull request, #14487: URL: https://github.com/apache/arrow/pull/14487 **DO NOT REVIEW** Drafted for discussion

[GitHub] [arrow-datafusion] alamb closed issue #1750: Break datafusion crate into smaller crates

2022-10-24 Thread GitBox
alamb closed issue #1750: Break datafusion crate into smaller crates URL: https://github.com/apache/arrow-datafusion/issues/1750

[GitHub] [arrow-datafusion] alamb commented on issue #1750: Break datafusion crate into smaller crates

2022-10-24 Thread GitBox
alamb commented on issue #1750: URL: https://github.com/apache/arrow-datafusion/issues/1750#issuecomment-1289115550 Closing this ticket and we can track further splitting in follow on issues. Again 👏

[GitHub] [arrow] lidavidm commented on a diff in pull request #14389: ARROW-18014: [Java] Implement copy functions for vectors and Table

2022-10-24 Thread GitBox
lidavidm commented on code in PR #14389: URL: https://github.com/apache/arrow/pull/14389#discussion_r1003385474 ## java/vector/src/main/java/org/apache/arrow/vector/table/BaseTable.java: ## @@ -266,6 +270,48 @@ FieldVector getVector(int columnIndex) { return fieldVectors.ge

[GitHub] [arrow] pitrou commented on a diff in pull request #14486: ARROW-18144: [C++] Improve JSONTypeError error message in testing

2022-10-24 Thread GitBox
pitrou commented on code in PR #14486: URL: https://github.com/apache/arrow/pull/14486#discussion_r1003415982 ## cpp/src/arrow/ipc/json_simple.cc: ## @@ -63,10 +63,11 @@ using ::arrow::internal::checked_pointer_cast; namespace { constexpr auto kParseFlags = rj::kParseFullPre

[GitHub] [arrow] xhochy commented on pull request #14102: ARROW-17635: [Python][CI] Sync conda recipe with the arrow-cpp feedstock

2022-10-24 Thread GitBox
xhochy commented on PR #14102: URL: https://github.com/apache/arrow/pull/14102#issuecomment-1289180399 @github-actions crossbow submit conda-win-vs2019-py37-r41 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [arrow] github-actions[bot] commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
github-actions[bot] commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1289190515 Revision: 43c8e82352cfe7359b799728e0cbf48386ae2cbc Submitted crossbow builds: [ursacomputing/crossbow @ actions-d482219008](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] assignUser commented on pull request #14102: ARROW-17635: [Python][CI] Sync conda recipe with the arrow-cpp feedstock

2022-10-24 Thread GitBox
assignUser commented on PR #14102: URL: https://github.com/apache/arrow/pull/14102#issuecomment-1289203326 > > Do you want me to disable the PPC conda jobs somehow? > > I don't know. @assignUser What do you think? :shrug: I am not sure, I don't really know if this is important f

[GitHub] [arrow-rs] crepererum commented on a diff in pull request #2902: add `Array::as_any_arc`

2022-10-24 Thread GitBox
crepererum commented on code in PR #2902: URL: https://github.com/apache/arrow-rs/pull/2902#discussion_r1003454609 ## arrow-array/src/array/mod.rs: ## @@ -88,6 +88,31 @@ pub trait Array: std::fmt::Debug + Send + Sync { /// ``` fn as_any(&self) -> &dyn Any; +/// R

[GitHub] [arrow] github-actions[bot] commented on pull request #14487: ARROW-18025: [C++][Python] SubstraitSinkConsumer should handle backpressure

2022-10-24 Thread GitBox
github-actions[bot] commented on PR #14487: URL: https://github.com/apache/arrow/pull/14487#issuecomment-1289210146 https://issues.apache.org/jira/browse/ARROW-18025 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow] github-actions[bot] commented on pull request #14487: ARROW-18025: [C++][Python] SubstraitSinkConsumer should handle backpressure

2022-10-24 Thread GitBox
github-actions[bot] commented on PR #14487: URL: https://github.com/apache/arrow/pull/14487#issuecomment-1289210210 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'. -- This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [arrow-rs] crepererum commented on a diff in pull request #2902: add `Array::as_any_arc`

2022-10-24 Thread GitBox
crepererum commented on code in PR #2902: URL: https://github.com/apache/arrow-rs/pull/2902#discussion_r1003457471 ## arrow-array/src/array/dictionary_array.rs: ## @@ -520,6 +521,10 @@ impl Array for DictionaryArray { self } +fn as_any_arc(self: Arc) -> Optio

[GitHub] [arrow-datafusion] alamb opened a new issue, #3941: Generate runtime errors if the memory budget is exceeded [EPIC]

2022-10-24 Thread GitBox
alamb opened a new issue, #3941: URL: https://github.com/apache/arrow-datafusion/issues/3941 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** The basic challenge is that DataFusion can use an unbounded amount of memory for running

[GitHub] [arrow-rs] crepererum commented on a diff in pull request #2902: add `Array::as_any_arc`

2022-10-24 Thread GitBox
crepererum commented on code in PR #2902: URL: https://github.com/apache/arrow-rs/pull/2902#discussion_r1003459174 ## arrow-array/src/array/mod.rs: ## @@ -258,6 +283,10 @@ impl Array for ArrayRef { self.as_ref().as_any() } +fn as_any_arc(self: Arc) -> Option>

[GitHub] [arrow] js8544 commented on a diff in pull request #14486: ARROW-18144: [C++] Improve JSONTypeError error message in testing

2022-10-24 Thread GitBox
js8544 commented on code in PR #14486: URL: https://github.com/apache/arrow/pull/14486#discussion_r1003460820 ## cpp/src/arrow/ipc/json_simple.cc: ## @@ -63,10 +63,11 @@ using ::arrow::internal::checked_pointer_cast; namespace { constexpr auto kParseFlags = rj::kParseFullPre

[GitHub] [arrow-rs] crepererum commented on issue #2901: API to take back ownership of an ArrayRef

2022-10-24 Thread GitBox
crepererum commented on issue #2901: URL: https://github.com/apache/arrow-rs/issues/2901#issuecomment-1289215838 Would it be an option to remove `impl Array for T` for the two cases where `T` involves references? Looking at all the impls and how `Array` is usually used, these two look prett

[GitHub] [arrow-datafusion] alamb commented on issue #587: Optionally Limit memory used by DataFusion plan

2022-10-24 Thread GitBox
alamb commented on issue #587: URL: https://github.com/apache/arrow-datafusion/issues/587#issuecomment-1289216806 Added https://github.com/apache/arrow-datafusion/issues/3941 for the project of "error if memory limits are exceeded" -- This is an automated message from the Apache Git Serv

[GitHub] [arrow] pitrou commented on pull request #14102: ARROW-17635: [Python][CI] Sync conda recipe with the arrow-cpp feedstock

2022-10-24 Thread GitBox
pitrou commented on PR #14102: URL: https://github.com/apache/arrow/pull/14102#issuecomment-1289217436 Ok, let's disable PPC now. The goal should be to improve the state of conda recipes, not necessarily get all of them working. -- This is an automated message from the Apache Git Service.

[GitHub] [arrow-datafusion] xudong963 commented on issue #3941: Generate runtime errors if the memory budget is exceeded [EPIC]

2022-10-24 Thread GitBox
xudong963 commented on issue #3941: URL: https://github.com/apache/arrow-datafusion/issues/3941#issuecomment-1289230359 For the join operator, we have sort-merge join. When the memory budget is exceeded, should we try sort-merge join first? -- This is an automated message from the

[GitHub] [arrow-ballista] Dandandan opened a new pull request, #441: Reorder joins after resolving

2022-10-24 Thread GitBox
Dandandan opened a new pull request, #441: URL: https://github.com/apache/arrow-ballista/pull/441 # Which issue does this PR close? Part of #387 # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes

[GitHub] [arrow-datafusion] alamb opened a new pull request, #3942: Consolidate physical join code into `datafusion/core/src/physical_plan/joins`

2022-10-24 Thread GitBox
alamb opened a new pull request, #3942: URL: https://github.com/apache/arrow-datafusion/pull/3942 # Which issue does this PR close? Re https://github.com/apache/arrow-datafusion/issues/3941 # Rationale for this change I want to quickly find where all the join

[GitHub] [arrow] lwhite1 commented on a diff in pull request #14389: ARROW-18014: [Java] Implement copy functions for vectors and Table

2022-10-24 Thread GitBox
lwhite1 commented on code in PR #14389: URL: https://github.com/apache/arrow/pull/14389#discussion_r1003495170 ## java/vector/src/main/java/org/apache/arrow/vector/table/BaseTable.java: ## @@ -266,6 +270,48 @@ FieldVector getVector(int columnIndex) { return fieldVectors.get

[GitHub] [arrow] rok commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-10-24 Thread GitBox
rok commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1289261980 @github-actions crossbow submit java-jars -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

[GitHub] [arrow] pitrou opened a new pull request, #14488: ARROW-18141: [C++][Parquet] Avoid UB on unaligned load

2022-10-24 Thread GitBox
pitrou opened a new pull request, #14488: URL: https://github.com/apache/arrow/pull/14488 Some gcc versions (such as 6.3.0) may emit an aligned-only load instruction, but the Parquet writer can be called with unaligned buffers. -- This is an automated message from the Apache Git Service.

[GitHub] [arrow] lwhite1 commented on pull request #14389: ARROW-18014: [Java] Implement copy functions for vectors and Table

2022-10-24 Thread GitBox
lwhite1 commented on PR #14389: URL: https://github.com/apache/arrow/pull/14389#issuecomment-1289273879 I modified the documentation to reflect the correct Exception (It should be IllegalArgumentException.) This was mentioned incorrectly in many places in the Row class so the documentation

[GitHub] [arrow-datafusion] xudong963 commented on a diff in pull request #3939: Add optimizer test for simplifying predicates on timestamps

2022-10-24 Thread GitBox
xudong963 commented on code in PR #3939: URL: https://github.com/apache/arrow-datafusion/pull/3939#discussion_r1003509064 ## datafusion/optimizer/tests/integration-test.rs: ## @@ -87,11 +88,11 @@ fn case_when_aggregate() -> Result<()> { #[test] fn unsigned_target_type() -> R

[GitHub] [arrow] pitrou commented on pull request #14488: ARROW-18141: [C++][Parquet] Avoid UB on unaligned load

2022-10-24 Thread GitBox
pitrou commented on PR #14488: URL: https://github.com/apache/arrow/pull/14488#issuecomment-1289277427 The fix should probably be made in encoders as well (the various `Put` and `PutSpaced` methods). -- This is an automated message from the Apache Git Service. To respond to the message, p

[GitHub] [arrow-datafusion] xudong963 merged pull request #3939: Add optimizer test for simplifying predicates on timestamps

2022-10-24 Thread GitBox
xudong963 merged PR #3939: URL: https://github.com/apache/arrow-datafusion/pull/3939 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@

[GitHub] [arrow] github-actions[bot] commented on pull request #14102: ARROW-17635: [Python][CI] Sync conda recipe with the arrow-cpp feedstock

2022-10-24 Thread GitBox
github-actions[bot] commented on PR #14102: URL: https://github.com/apache/arrow/pull/14102#issuecomment-1289284892 Revision: 57e3dc9aa24d25a3931571dbc39a543f1140c9c6 Submitted crossbow builds: [ursacomputing/crossbow @ actions-ff8a79230b](https://github.com/ursacomputing/crossbow/bra
