[GitHub] [arrow] sjperkins commented on pull request #33805: GH-33804: [Python] Add support for manylinux_2_28 wheel

2023-03-31 Thread via GitHub
sjperkins commented on PR #33805: URL: https://github.com/apache/arrow/pull/33805#issuecomment-1491411942 > @sjperkins try rebasing and push -f to get a clean CI run that might actually pass. I think the PR needs to re-opened for this to work? Pushes to the fork don't seem to be refl

[GitHub] [arrow-datafusion] viirya merged pull request #5784: Incorrect row comparison for tpch queries in benchmarks

2023-03-31 Thread via GitHub
viirya merged PR #5784: URL: https://github.com/apache/arrow-datafusion/pull/5784 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arr

[GitHub] [arrow-datafusion] viirya closed issue #5783: Incorrect row comparison for tpch queries in benchmarks

2023-03-31 Thread via GitHub
viirya closed issue #5783: Incorrect row comparison for tpch queries in benchmarks URL: https://github.com/apache/arrow-datafusion/issues/5783

[GitHub] [arrow] jorisvandenbossche commented on pull request #33805: GH-33804: [Python] Add support for manylinux_2_28 wheel

2023-03-31 Thread via GitHub
jorisvandenbossche commented on PR #33805: URL: https://github.com/apache/arrow/pull/33805#issuecomment-1491423805 > Pushes to the fork don't seem to be reflecting here anymore. Because you (force) pushed while the PR was closed, it can't be reopened anymore .. Sorry! (unless you woul

[GitHub] [arrow] jorisvandenbossche commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
jorisvandenbossche commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491429386 @sjperkins sorry about closing all your PRs, and about the slow review here. I will try to get back to this one and https://github.com/apache/arrow/pull/34469 shortly. O

[GitHub] [arrow] amol- commented on pull request #33805: GH-33804: [Python] Add support for manylinux_2_28 wheel

2023-03-31 Thread via GitHub
amol- commented on PR #33805: URL: https://github.com/apache/arrow/pull/33805#issuecomment-1491430616 It's probably easier to just open a new PR, that wouldn't require any change on @sjperkins side and we could link the two PRs

[GitHub] [arrow] omrimallis opened a new pull request, #34511: GH-29105: [C++][Parquet] Relax schema checking when writing using StreamWriter

2023-03-31 Thread via GitHub
omrimallis opened a new pull request, #34511: URL: https://github.com/apache/arrow/pull/34511 ### Rationale for this change The converted type expected by StreamWriter when writing int and string values did not match the converted type typically set in the Parquet sche

[GitHub] [arrow] krcrouse opened a new pull request, #13126: ARROW-12526: Pre-generating pyarrow.compute and creating a docstring additions system for pyarrow functions

2023-03-31 Thread via GitHub
krcrouse opened a new pull request, #13126: URL: https://github.com/apache/arrow/pull/13126 This PR addresses both the JIRA issue cited (pre-generate pyarrow.compute) and also a dev thread that suggests creating the ability to add in python docs for functions that inherit from the Arrow C++

[GitHub] [arrow] jorisvandenbossche commented on pull request #13126: ARROW-12526: Pre-generating pyarrow.compute and creating a docstring additions system for pyarrow functions

2023-03-31 Thread via GitHub
jorisvandenbossche commented on PR #13126: URL: https://github.com/apache/arrow/pull/13126#issuecomment-1491436598 @krcrouse sorry for letting this slip my mind. I will try to take a look at the latest version and your last comment shortly.

[GitHub] [arrow] sjperkins opened a new pull request, #34818: GH-33804: [Python] Add support for manylinux_2_28 wheel

2023-03-31 Thread via GitHub
sjperkins opened a new pull request, #34818: URL: https://github.com/apache/arrow/pull/34818 Closes #33804 ### Rationale for this change At some point, it would be useful to support the new C++ ABI `_GLIBCXX_USE_CXX11_ABI=1` in pyarrow wheels, especially when m
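For context on the `manylinux_2_28` name in this PR title: under PEP 600, such platform tags have the form `manylinux_{glibc major}_{glibc minor}_{arch}`. The following is a minimal sketch, not code from the PR, that splits a tag into those parts; the helper name `parse_manylinux_tag` is hypothetical, and legacy aliases such as `manylinux2014_x86_64` are deliberately not handled.

```python
import re

# PEP 600 style tag: manylinux_<glibc major>_<glibc minor>_<arch>
_MANYLINUX_TAG = re.compile(r"^manylinux_(\d+)_(\d+)_(.+)$")


def parse_manylinux_tag(tag):
    """Return (glibc_major, glibc_minor, arch), or None if the tag
    is not a PEP 600 style manylinux tag (hypothetical helper)."""
    match = _MANYLINUX_TAG.match(tag)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


if __name__ == "__main__":
    print(parse_manylinux_tag("manylinux_2_28_x86_64"))
```

A wheel carrying such a tag declares that it expects at least that glibc version on the target system.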

[GitHub] [arrow] github-actions[bot] commented on pull request #34818: GH-33804: [Python] Add support for manylinux_2_28 wheel

2023-03-31 Thread via GitHub
github-actions[bot] commented on PR #34818: URL: https://github.com/apache/arrow/pull/34818#issuecomment-1491457264 * Closes: #33804

[GitHub] [arrow] sjperkins commented on pull request #33805: GH-33804: [Python] Add support for manylinux_2_28 wheel

2023-03-31 Thread via GitHub
sjperkins commented on PR #33805: URL: https://github.com/apache/arrow/pull/33805#issuecomment-1491457347 > It's probably easier to just open a new PR, that wouldn't require any change on @sjperkins side and we could link the two PRs Done in #34818

[GitHub] [arrow] sjperkins commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
sjperkins commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491459193 > @sjperkins sorry about closing all your PRs, and about the slow review here. I will try to get back to this one and #34469 shortly. No problem! I assumed that the upcoming Arrow

[GitHub] [arrow] sjperkins closed pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
sjperkins closed pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types URL: https://github.com/apache/arrow/pull/34483

[GitHub] [arrow] sjperkins commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
sjperkins commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491459870 > Yes, the UI didn't seem to give me the option to reopen the PR. But I do have the ability to close the PR..

[GitHub] [arrow-datafusion] HaoYang670 opened a new issue, #5808: `decorrelate_where_in` reports error when optimizing `limit subquery`

2023-03-31 Thread via GitHub
HaoYang670 opened a new issue, #5808: URL: https://github.com/apache/arrow-datafusion/issues/5808 ### Describe the bug `decorrelate_where_in` currently only supports `Predicate` as the top level plan in the sub-queries, otherwise it will return an error: https://github.com/apache/a

[GitHub] [arrow-datafusion] HaoYang670 commented on issue #5808: `decorrelate_where_in` reports error when optimizing `limit subquery`

2023-03-31 Thread via GitHub
HaoYang670 commented on issue #5808: URL: https://github.com/apache/arrow-datafusion/issues/5808#issuecomment-1491467615 Hi @avantgardnerio @mingmwang, could you please help to take a look if you have time?

[GitHub] [arrow] sjperkins commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
sjperkins commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491468130 > Yes, the UI didn't seem to give me the option to reopen the PR. OK, this does seem possible if I close the PR. Apologies I responded late yesterday evening, so I may have confla

[GitHub] [arrow] liujiajun commented on pull request #34818: GH-33804: [Python] Add support for manylinux_2_28 wheel

2023-03-31 Thread via GitHub
liujiajun commented on PR #34818: URL: https://github.com/apache/arrow/pull/34818#issuecomment-1491472978 Hi I am new to Arrow and I would like to ask a dumb question here: From my understanding, the current situation (before this PR) is - pyarrow is compiled with the old ABI -

[GitHub] [arrow] kou commented on pull request #34818: GH-33804: [Python] Add support for manylinux_2_28 wheel

2023-03-31 Thread via GitHub
kou commented on PR #34818: URL: https://github.com/apache/arrow/pull/34818#issuecomment-1491474119 @github-actions crossbow submit wheel-manylinux*

[GitHub] [arrow] sjperkins commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
sjperkins commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491477330 @jorisvandenbossche Regarding this issue and #33997, they're somewhat exploratory because they dynamically generate Python types. I'm interested in whether these strategies would

[GitHub] [arrow] github-actions[bot] commented on pull request #34818: GH-33804: [Python] Add support for manylinux_2_28 wheel

2023-03-31 Thread via GitHub
github-actions[bot] commented on PR #34818: URL: https://github.com/apache/arrow/pull/34818#issuecomment-1491482316 Revision: cf616099529de1390cb8711e9e798464b7be57a0 Submitted crossbow builds: [ursacomputing/crossbow @ actions-948abf799c](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] rok commented on pull request #14341: GH-32863: [C++][Parquet] Add DELTA_BYTE_ARRAY encoder to Parquet writer

2023-03-31 Thread via GitHub
rok commented on PR #14341: URL: https://github.com/apache/arrow/pull/14341#issuecomment-1491492079 Thanks, @mapleFU. I used your suggestion.

[GitHub] [arrow] thisisnic merged pull request #34710: GH-33998: [R] Update vignettes to reference the new open_*_dataset functions

2023-03-31 Thread via GitHub
thisisnic merged PR #34710: URL: https://github.com/apache/arrow/pull/34710

[GitHub] [arrow] wgtmac commented on pull request #34054: GH-34053: [C++][Parquet] Write parquet page index

2023-03-31 Thread via GitHub
wgtmac commented on PR #34054: URL: https://github.com/apache/arrow/pull/34054#issuecomment-1491505218 Gentle ping @emkornfield :)

[GitHub] [arrow] thisisnic commented on a diff in pull request #34798: GH-15247: [R] Error when trying to save a data.frame with NULL column names

2023-03-31 Thread via GitHub
thisisnic commented on code in PR #34798: URL: https://github.com/apache/arrow/pull/34798#discussion_r1154168985 ## r/R/csv.R: ## @@ -782,12 +782,16 @@ write_csv_arrow <- function(x, tryCatch( x <- as_record_batch_reader(x), error = function(e) { -abor

[GitHub] [arrow] sjperkins commented on pull request #34818: GH-33804: [Python] Add support for manylinux_2_28 wheel

2023-03-31 Thread via GitHub
sjperkins commented on PR #34818: URL: https://github.com/apache/arrow/pull/34818#issuecomment-1491510145 > Then, how come pyarrow is able to use arrow without a runtime error? The pyarrow binary wheel packages the arrow shared object libraries built with the ABI, and I believe auditw

[GitHub] [arrow] kou merged pull request #34765: GH-14917: [C++] Error out when GTest is compiled with a C++ standard lower than 17

2023-03-31 Thread via GitHub
kou merged PR #34765: URL: https://github.com/apache/arrow/pull/34765

[GitHub] [arrow] thisisnic commented on pull request #34798: GH-15247: [R] Error when trying to save a data.frame with NULL column names

2023-03-31 Thread via GitHub
thisisnic commented on PR #34798: URL: https://github.com/apache/arrow/pull/34798#issuecomment-1491524011 Added checking for `write_dataset()`. Already implemented for `write_ipc_stream()` as that would call `as_arrow_table()` internally, so rather than testing that too, I moved the tests t

[GitHub] [arrow-datafusion] berkaysynnada commented on pull request #5803: Minor: clean up timestamp arithmetic tests

2023-03-31 Thread via GitHub
berkaysynnada commented on PR #5803: URL: https://github.com/apache/arrow-datafusion/pull/5803#issuecomment-1491526050 > @berkaysynnada I would appreciate a review of this proposal if you have some time Thank you for solving the interval issue in the tests. I think when we can handl

[GitHub] [arrow] omrimallis commented on pull request #34511: GH-29105: [C++][Parquet] Relax schema checking when writing using StreamWriter

2023-03-31 Thread via GitHub
omrimallis commented on PR #34511: URL: https://github.com/apache/arrow/pull/34511#issuecomment-1491527724 @mapleFU Sure, feel free to update it.

[GitHub] [arrow-datafusion] crepererum commented on issue #5789: SQL case, This feature is not implemented: Physical plan does not support logical expression EXISTS ()

2023-03-31 Thread via GitHub
crepererum commented on issue #5789: URL: https://github.com/apache/arrow-datafusion/issues/5789#issuecomment-1491531423 To me this looks like a bug. Who's trying to push down / apply a sub-query predicate to a parquet file read? Shouldn't the logical optimizer remove these kinds of express

[GitHub] [arrow] sjperkins commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
sjperkins commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491537307 > For instance, {from,to}_numpy() and {from,to}_pandas could then be added to these types. Some further context to motivate for this: It would be useful to efficiently convert ne

[GitHub] [arrow] collimarco commented on issue #34801: gem install red-arrow fails on MacOS / Ruby 3.2.1

2023-03-31 Thread via GitHub
collimarco commented on issue #34801: URL: https://github.com/apache/arrow/issues/34801#issuecomment-1491539118 @kou Thanks. I am closing this issue and have moved the other topic here: https://github.com/apache/arrow/issues/34819

[GitHub] [arrow-datafusion] berkaysynnada commented on pull request #5764: Support timestamp and interval arithmetic

2023-03-31 Thread via GitHub
berkaysynnada commented on PR #5764: URL: https://github.com/apache/arrow-datafusion/pull/5764#issuecomment-1491548025 > > > @alamb Thanks for the support. I add these issues to my to-do's and will open the PRs as I progress. > > Thanks @berkaysynnada -- can you be sp

[GitHub] [arrow-datafusion] Jefffrey commented on issue #5657: Request for documentation for compressed CSV/JSON support

2023-03-31 Thread via GitHub
Jefffrey commented on issue #5657: URL: https://github.com/apache/arrow-datafusion/issues/5657#issuecomment-1491561571 Specific issue seems to be in this function: https://github.com/apache/arrow-datafusion/blob/667f19ebad216b7592af5a91b70a24fb21c3bb64/datafusion/core/src/datasource/

[GitHub] [arrow] kou commented on issue #34801: gem install red-arrow fails on MacOS / Ruby 3.2.1

2023-03-31 Thread via GitHub
kou commented on issue #34801: URL: https://github.com/apache/arrow/issues/34801#issuecomment-1491565839 Thanks. I've reopened this. I want to use this issue to remove the need for the `PKG_CONFIG_PATH=$(brew --prefix [email protected])/lib/pkgconfig ...` workaround.

[GitHub] [arrow-datafusion] dependabot[bot] commented on pull request #5752: Update ctor requirement from 0.1.22 to 0.2.0

2023-03-31 Thread via GitHub
dependabot[bot] commented on PR #5752: URL: https://github.com/apache/arrow-datafusion/pull/5752#issuecomment-1491589354 A newer version of ctor exists, but since this PR has been edited by someone other than Dependabot I haven't updated it. You'll get a PR for the updated version as norma

[GitHub] [arrow] kou commented on issue #34819: [Ruby] Add easy to use API for match_substring and related functions

2023-03-31 Thread via GitHub
kou commented on issue #34819: URL: https://github.com/apache/arrow/issues/34819#issuecomment-1491603815 FYI: We can implement `slicer['message'].match_substring('example')` API by improving https://github.com/apache/arrow/blob/68dc40a10bc752d8b577f57976873d087714d486/ruby/red-arrow/lib/arr

[GitHub] [arrow] jorisvandenbossche closed pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
jorisvandenbossche closed pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types URL: https://github.com/apache/arrow/pull/34483

[GitHub] [arrow] jorisvandenbossche commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
jorisvandenbossche commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491646664 Can you reopen it now? (it might depend on whether you closed it yourself)

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
jorisvandenbossche commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154281153 ## cpp/src/arrow/CMakeLists.txt: ## @@ -805,6 +806,10 @@ if(ARROW_IPC) add_arrow_test(extension_type_test) endif() +if(ARROW_JSON) + add_arrow_test(cano

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
jorisvandenbossche commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154282392 ## cpp/src/arrow/extension/fixed_shape_tensor_test.cc: ## @@ -0,0 +1,220 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

[GitHub] [arrow] eitsupi commented on issue #34812: [Packaging][Python] Use self-hosted arm64 runner instead of Travis CI for arm64 wheels

2023-03-31 Thread via GitHub
eitsupi commented on issue #34812: URL: https://github.com/apache/arrow/issues/34812#issuecomment-1491669788 > @eitsupi Are you interested in this? Sure. Can I try to work on adding `dev/tasks/python-wheels/github.linux.arm64.yml`?

[GitHub] [arrow] sjperkins commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
sjperkins commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491670384 > Can you reopen it now? (it might depend on whether you closed it yourself) Ah I cannot. I think it does depend on the closer.

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
jorisvandenbossche commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154285815 ## cpp/src/arrow/extension/fixed_shape_tensor_test.cc: ## @@ -0,0 +1,220 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
jorisvandenbossche commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154288777 ## cpp/src/arrow/extension/fixed_shape_tensor_test.cc: ## @@ -0,0 +1,220 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

[GitHub] [arrow] thisisnic merged pull request #34798: GH-15247: [R] Error when trying to save a data.frame with NULL column names

2023-03-31 Thread via GitHub
thisisnic merged PR #34798: URL: https://github.com/apache/arrow/pull/34798

[GitHub] [arrow] jorisvandenbossche commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
jorisvandenbossche commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491681707 OK, that's good to know, thanks for checking!

[GitHub] [arrow] jorisvandenbossche commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
jorisvandenbossche commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491702391 > I'm interested in whether these strategies would be a useful way of exposing pure C++ Extension Types in Python. I am not a huge fan of the idea of creating classes on

[GitHub] [arrow-datafusion] mingmwang commented on issue #5808: `decorrelate_where_in` reports error when optimizing `limit subquery`

2023-03-31 Thread via GitHub
mingmwang commented on issue #5808: URL: https://github.com/apache/arrow-datafusion/issues/5808#issuecomment-1491736320 Sure, I will take a look. It is tricky to support decorrelating correlated In/Exists subqueries that contain `Limit`/`OrderBy` clauses. I remember SparkSQL will report er

[GitHub] [arrow-rs] tustvold merged pull request #3985: Fix typos

2023-03-31 Thread via GitHub
tustvold merged PR #3985: URL: https://github.com/apache/arrow-rs/pull/3985

[GitHub] [arrow-rs] tustvold merged pull request #3986: fix: remove unused type parameters.

2023-03-31 Thread via GitHub
tustvold merged PR #3986: URL: https://github.com/apache/arrow-rs/pull/3986

[GitHub] [arrow-rs] tustvold merged pull request #3984: Prepare object_store 0.5.6

2023-03-31 Thread via GitHub
tustvold merged PR #3984: URL: https://github.com/apache/arrow-rs/pull/3984

[GitHub] [arrow-datafusion] mingmwang commented on issue #5808: `decorrelate_where_in` reports error when optimizing `limit subquery`

2023-03-31 Thread via GitHub
mingmwang commented on issue #5808: URL: https://github.com/apache/arrow-datafusion/issues/5808#issuecomment-1491755517 It is tricky because a subquery can be thought of as a specific kind of nested loop join; the join condition is very specific and contains limit, the de-correlation pr

[GitHub] [arrow-datafusion] mingmwang commented on issue #5791: SQL case scalar_subquery logical_paln unexpected Aggregate: groupBy=[[col]]

2023-03-31 Thread via GitHub
mingmwang commented on issue #5791: URL: https://github.com/apache/arrow-datafusion/issues/5791#issuecomment-1491761743 No, why do you think it is a bug? I think the behavior is expected.

[GitHub] [arrow-datafusion] izveigor opened a new pull request, #5809: Minor: remove typed_min_max_batch_decimal128

2023-03-31 Thread via GitHub
izveigor opened a new pull request, #5809: URL: https://github.com/apache/arrow-datafusion/pull/5809 # Which issue does this PR close? Follow on to https://github.com/apache/arrow-rs/issues/1010 # Rationale for this change This function is not necessary now. # What cha

[GitHub] [arrow-rs] tustvold merged pull request #3987: Revert workspace links for object_store

2023-03-31 Thread via GitHub
tustvold merged PR #3987: URL: https://github.com/apache/arrow-rs/pull/3987

[GitHub] [arrow-rs] tustvold opened a new pull request, #3987: Revert workspace links for object_store

2023-03-31 Thread via GitHub
tustvold opened a new pull request, #3987: URL: https://github.com/apache/arrow-rs/pull/3987 # Which issue does this PR close? Closes #. # Rationale for this change These cause the verification scripts for the release tarballs to fail, as the workspace

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3987: Revert workspace links for object_store

2023-03-31 Thread via GitHub
tustvold commented on code in PR #3987: URL: https://github.com/apache/arrow-rs/pull/3987#discussion_r1154346839 ## object_store/Cargo.toml: ## @@ -18,12 +18,12 @@ [package] name = "object_store" version = "0.5.6" -edition = { workspace = true } +edition = "2021" license = "

[GitHub] [arrow-rs] tustvold commented on pull request #3987: Revert workspace links for object_store

2023-03-31 Thread via GitHub
tustvold commented on PR #3987: URL: https://github.com/apache/arrow-rs/pull/3987#issuecomment-1491762454 I'm going to get this in as it should be non-controversial and is blocking the release

[GitHub] [arrow-datafusion] zhzy0077 opened a new issue, #5810: Arithmetic overflow in aggregate_statistics::AggregateStatistics when MAX-MIN overflows

2023-03-31 Thread via GitHub
zhzy0077 opened a new issue, #5810: URL: https://github.com/apache/arrow-datafusion/issues/5810 ### Describe the bug When running statistics, selectivity is calculated by the distance between selected range and total range. See analyze_expr_scalar_comparison. While calculating

[GitHub] [arrow-datafusion] zhzy0077 opened a new pull request, #5811: Scalar arithmetic should return error when overflows.

2023-03-31 Thread via GitHub
zhzy0077 opened a new pull request, #5811: URL: https://github.com/apache/arrow-datafusion/pull/5811 # Which issue does this PR close? Closes #5810. # Rationale for this change Repro in the bug itself. ScalarValue returns an error for many arithmetic errors,
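The idea in this PR title, returning an error instead of silently overflowing when computing something like MAX - MIN, can be sketched as follows. This is an illustrative sketch only, not DataFusion's code: Python integers do not overflow, so signed 64-bit bounds are emulated explicitly, and the names `checked_sub_i64`, `range_width`, and `ArithmeticOverflow` are hypothetical.

```python
# Bounds of a signed 64-bit integer, emulated because Python ints are unbounded.
I64_MIN = -(2**63)
I64_MAX = 2**63 - 1


class ArithmeticOverflow(Exception):
    """Raised when a result does not fit in a signed 64-bit integer."""


def checked_sub_i64(lhs, rhs):
    """Subtract rhs from lhs, raising instead of wrapping on i64 overflow."""
    result = lhs - rhs
    if not I64_MIN <= result <= I64_MAX:
        raise ArithmeticOverflow(f"{lhs} - {rhs} overflows i64")
    return result


def range_width(min_value, max_value):
    """Return max - min, or None when the width cannot be represented.

    A caller estimating selectivity could treat None as "unknown"
    rather than aborting (an assumption made for this sketch).
    """
    try:
        return checked_sub_i64(max_value, min_value)
    except ArithmeticOverflow:
        return None


if __name__ == "__main__":
    print(range_width(0, 5))
    print(range_width(I64_MIN, I64_MAX))
```

The design choice illustrated here is that the arithmetic helper reports the overflow, and the caller decides whether to propagate the error or fall back to an "unknown" estimate.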

[GitHub] [arrow] wgtmac opened a new pull request, #34822: GH-34821: [DOC][ORC] Update documentation for ORC

2023-03-31 Thread via GitHub
wgtmac opened a new pull request, #34822: URL: https://github.com/apache/arrow/pull/34822 ### Rationale for this change The documentation of ORC is out of date. We now support the union and timestamp_instant types. It should be updated to reflect these changes. ### What chan

[GitHub] [arrow] github-actions[bot] commented on pull request #34822: GH-34821: [DOC][ORC] Update documentation for ORC

2023-03-31 Thread via GitHub
github-actions[bot] commented on PR #34822: URL: https://github.com/apache/arrow/pull/34822#issuecomment-1491775973 * Closes: #34821

[GitHub] [arrow] github-actions[bot] commented on pull request #34822: GH-34821: [DOC][ORC] Update documentation for ORC

2023-03-31 Thread via GitHub
github-actions[bot] commented on PR #34822: URL: https://github.com/apache/arrow/pull/34822#issuecomment-1491776011 :warning: GitHub issue #34821 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow] wgtmac commented on pull request #34822: GH-34821: [DOC][ORC] Update documentation for ORC

2023-03-31 Thread via GitHub
wgtmac commented on PR #34822: URL: https://github.com/apache/arrow/pull/34822#issuecomment-1491784429 Please take a look, thanks! @wjones127 @westonpace

[GitHub] [arrow] lidavidm commented on pull request #34815: GH-34778: [Java] Add hook to further configure the ServerBuilder in FlightServer

2023-03-31 Thread via GitHub
lidavidm commented on PR #34815: URL: https://github.com/apache/arrow/pull/34815#issuecomment-1491787477 Wow, I forgot about that. Yes, docs + any fixes necessary to make it work better would be very welcome.

[GitHub] [arrow] lidavidm commented on pull request #34817: Add Session management messages, Location URI path accessors

2023-03-31 Thread via GitHub
lidavidm commented on PR #34817: URL: https://github.com/apache/arrow/pull/34817#issuecomment-1491793674 I believe we discussed using headers as the base/fallback implementation; how does this integrate into that?

[GitHub] [arrow] sjperkins commented on pull request #34483: GH-33801: [Python] Generate on the fly Python Extension Types wrapping C++ Extension Types

2023-03-31 Thread via GitHub
sjperkins commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491797111 > I am not a huge fan of the idea of creating classes on the fly .. Also, does this give something more useful than wrapping it just in a base class as is done now? (because right now t

[GitHub] [arrow] ursabot commented on pull request #34710: GH-33998: [R] Update vignettes to reference the new open_*_dataset functions

2023-03-31 Thread via GitHub
ursabot commented on PR #34710: URL: https://github.com/apache/arrow/pull/34710#issuecomment-1491798277 Benchmark runs are scheduled for baseline = fa4b17057570530bba913b92daeba186acb13cba and contender = 5dde5d4e615ef65375762aa46e95a3237b2f4eb5. 5dde5d4e615ef65375762aa46e95a3237b2f4eb5 is

[GitHub] [arrow] rok commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
rok commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154375140 ## cpp/src/arrow/extension/fixed_shape_tensor_test.cc: ## @@ -0,0 +1,220 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agr

[GitHub] [arrow-datafusion] waitingkuo commented on issue #5750: `date_trunc` always returns `Timestamp(Nanosecond, None)`

2023-03-31 Thread via GitHub
waitingkuo commented on issue #5750: URL: https://github.com/apache/arrow-datafusion/issues/5750#issuecomment-1491804327 hi @Weijun-H sure please take it, thank you -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow] rok commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
rok commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154376804 ## cpp/src/arrow/CMakeLists.txt: ## @@ -805,6 +806,10 @@ if(ARROW_IPC) add_arrow_test(extension_type_test) endif() +if(ARROW_JSON) + add_arrow_test(canonical_extension

[GitHub] [arrow] rok commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
rok commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154378653 ## cpp/src/arrow/extension/fixed_shape_tensor_test.cc: ## @@ -0,0 +1,220 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agr

[GitHub] [arrow-datafusion] mingmwang commented on issue #5547: Improve the performance of COUNT DISTINCT queries for high cardinality groups

2023-03-31 Thread via GitHub
mingmwang commented on issue #5547: URL: https://github.com/apache/arrow-datafusion/issues/5547#issuecomment-1491810117 > SparkSQL's `Expand` approach will expand the number of rows (more data copies for all the agg + group by columns) that flow into the AggregateExec. DataFusion's approac

[GitHub] [arrow] rok commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
rok commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154380586 ## cpp/src/arrow/extension/fixed_shape_tensor_test.cc: ## @@ -0,0 +1,220 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agr

[GitHub] [arrow-datafusion] Jefffrey commented on pull request #5785: Planner: normalize_ident only when enable_ident_normalization is enabled

2023-03-31 Thread via GitHub
Jefffrey commented on PR #5785: URL: https://github.com/apache/arrow-datafusion/pull/5785#issuecomment-1491812127 > While working on this PR, I also noticed that there are places not related to the planner which also normalize names such as column. I wonder if it's valuable to also make th

[GitHub] [arrow-datafusion] mingmwang commented on pull request #5770: improve Filter pushdown to Join

2023-03-31 Thread via GitHub
mingmwang commented on PR #5770: URL: https://github.com/apache/arrow-datafusion/pull/5770#issuecomment-1491823259 > @mingmwang Could you share any performance numbers for the improvements for the affected queries? Sure, will do. Unfortunately, the performance improvement is just a l

[GitHub] [arrow] rok commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
rok commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154391217 ## cpp/src/arrow/CMakeLists.txt: ## @@ -805,6 +806,10 @@ if(ARROW_IPC) add_arrow_test(extension_type_test) endif() +if(ARROW_JSON) + add_arrow_test(canonical_extension

[GitHub] [arrow-datafusion] mingmwang commented on a diff in pull request #5770: improve Filter pushdown to Join

2023-03-31 Thread via GitHub
mingmwang commented on code in PR #5770: URL: https://github.com/apache/arrow-datafusion/pull/5770#discussion_r1154391387 ## datafusion/core/src/physical_plan/planner.rs: ## @@ -952,21 +952,21 @@ impl DefaultPhysicalPlanner { let join_filter = match filter

[GitHub] [arrow] pinduzera commented on issue #34779: [R] write_* methods can't have socketConnection as a sink

2023-03-31 Thread via GitHub
pinduzera commented on issue #34779: URL: https://github.com/apache/arrow/issues/34779#issuecomment-1491827431 As an addition: when trying to use `write_ipc_stream(iris, conn)` I receive the following error; I am not sure if it is a restriction of Arrow or R sockets, the err

[GitHub] [arrow-datafusion] mingmwang commented on a diff in pull request #5770: improve Filter pushdown to Join

2023-03-31 Thread via GitHub
mingmwang commented on code in PR #5770: URL: https://github.com/apache/arrow-datafusion/pull/5770#discussion_r1154395711 ## benchmarks/expected-plans/q17.txt: ## @@ -1,55 +1,49 @@ -+---+

[GitHub] [arrow] lidavidm commented on a diff in pull request #34817: Add Session management messages, Location URI path accessors

2023-03-31 Thread via GitHub
lidavidm commented on code in PR #34817: URL: https://github.com/apache/arrow/pull/34817#discussion_r1154391450 ## format/FlightSql.proto: ## @@ -1842,6 +1842,94 @@ message ActionCancelQueryResult { CancelResult result = 1; } +/* + * Request message for the "Close Session"

[GitHub] [arrow] paleolimbot commented on a diff in pull request #34798: GH-15247: [R] Error when trying to save a data.frame with NULL column names

2023-03-31 Thread via GitHub
paleolimbot commented on code in PR #34798: URL: https://github.com/apache/arrow/pull/34798#discussion_r1154404361 ## r/R/csv.R: ## @@ -782,12 +782,16 @@ write_csv_arrow <- function(x, tryCatch( x <- as_record_batch_reader(x), error = function(e) { -ab

[GitHub] [arrow-datafusion] Dandandan commented on a diff in pull request #5770: improve Filter pushdown to Join

2023-03-31 Thread via GitHub
Dandandan commented on code in PR #5770: URL: https://github.com/apache/arrow-datafusion/pull/5770#discussion_r1154410295 ## benchmarks/expected-plans/q17.txt: ## @@ -1,55 +1,49 @@ -+---+

[GitHub] [arrow] mapleFU commented on pull request #34511: GH-29105: [C++][Parquet] Relax schema checking when writing using StreamWriter

2023-03-31 Thread via GitHub
mapleFU commented on PR #34511: URL: https://github.com/apache/arrow/pull/34511#issuecomment-1491856122 By the way, I'm curious why you are fixing this. When reading from a stale Parquet file, the Parquet schema is parsed from the old file, so it has a stale `ConvertedType`.

[GitHub] [arrow-datafusion] alamb opened a new issue, #5812: Blog post with DataFusion Jan - April 2023

2023-03-31 Thread via GitHub
alamb opened a new issue, #5812: URL: https://github.com/apache/arrow-datafusion/issues/5812 ### Is your feature request related to a problem or challenge? We have had good luck writing up quarterly updates for DataFusion, most recently: 1. https://arrow.apache.org/blog/2023/01/19/

[GitHub] [arrow-datafusion] alamb commented on issue #5812: Blog post with DataFusion Jan - April 2023

2023-03-31 Thread via GitHub
alamb commented on issue #5812: URL: https://github.com/apache/arrow-datafusion/issues/5812#issuecomment-1491864723 Some suggested content; new docs / architecture presentation: #5499 -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [arrow] thisisnic commented on a diff in pull request #34798: GH-15247: [R] Error when trying to save a data.frame with NULL column names

2023-03-31 Thread via GitHub
thisisnic commented on code in PR #34798: URL: https://github.com/apache/arrow/pull/34798#discussion_r1154427242 ## r/R/csv.R: ## @@ -782,12 +782,16 @@ write_csv_arrow <- function(x, tryCatch( x <- as_record_batch_reader(x), error = function(e) { -abor

[GitHub] [arrow] mapleFU commented on pull request #34511: GH-29105: [C++][Parquet] Relax schema checking when writing using StreamWriter

2023-03-31 Thread via GitHub
mapleFU commented on PR #34511: URL: https://github.com/apache/arrow/pull/34511#issuecomment-1491879112 And maybe the logic could be referenced from `parquet-mr/parquet-arrow/src/main/java/org/apache/parquet/arrow/schema/SchemaConverter.java`. If you think we need it, I can try to port it --

[GitHub] [arrow] paleolimbot commented on a diff in pull request #34798: GH-15247: [R] Error when trying to save a data.frame with NULL column names

2023-03-31 Thread via GitHub
paleolimbot commented on code in PR #34798: URL: https://github.com/apache/arrow/pull/34798#discussion_r1154443220 ## r/R/csv.R: ## @@ -782,12 +782,16 @@ write_csv_arrow <- function(x, tryCatch( x <- as_record_batch_reader(x), error = function(e) { -ab

[GitHub] [arrow] thisisnic commented on a diff in pull request #34798: GH-15247: [R] Error when trying to save a data.frame with NULL column names

2023-03-31 Thread via GitHub
thisisnic commented on code in PR #34798: URL: https://github.com/apache/arrow/pull/34798#discussion_r1154453775 ## r/R/csv.R: ## @@ -782,12 +782,16 @@ write_csv_arrow <- function(x, tryCatch( x <- as_record_batch_reader(x), error = function(e) { -abor

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
jorisvandenbossche commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154453696 ## cpp/src/arrow/CMakeLists.txt: ## @@ -805,6 +806,10 @@ if(ARROW_IPC) add_arrow_test(extension_type_test) endif() +if(ARROW_JSON) + add_arrow_test(cano

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
jorisvandenbossche commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154454212 ## cpp/src/arrow/CMakeLists.txt: ## @@ -805,6 +806,10 @@ if(ARROW_IPC) add_arrow_test(extension_type_test) endif() +if(ARROW_JSON) + add_arrow_test(cano

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
jorisvandenbossche commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154456089 ## cpp/src/arrow/extension/fixed_shape_tensor_test.cc: ## @@ -0,0 +1,220 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

[GitHub] [arrow] jorisvandenbossche commented on a diff in pull request #8510: GH-15483: [C++] Add a Fixed Shape Tensor canonical ExtensionType

2023-03-31 Thread via GitHub
jorisvandenbossche commented on code in PR #8510: URL: https://github.com/apache/arrow/pull/8510#discussion_r1154458270 ## cpp/src/arrow/extension/fixed_shape_tensor_test.cc: ## @@ -0,0 +1,220 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

[GitHub] [arrow] github-actions[bot] commented on pull request #34825: GH-34775: [R] arrow_table: as.data.frame() sometimes returns a tbl and sometimes a data.frame

2023-03-31 Thread via GitHub
github-actions[bot] commented on PR #34825: URL: https://github.com/apache/arrow/pull/34825#issuecomment-1491918762 * Closes: #34775 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c
