[GitHub] [arrow-rs] liukun4515 commented on pull request #2360: speed up the decimal256 validation based on bytes comparison and add benchmark test

2022-08-10 Thread GitBox
liukun4515 commented on PR #2360: URL: https://github.com/apache/arrow-rs/pull/2360#issuecomment-1211577166 > > I find some weird things about decimal128/i128, so I removed the refactor of decimal128 and just left decimal256. PTAL > > What weird thing did you encounter? @liukun4515

[GitHub] [arrow-ballista] andygrove merged pull request #123: Using tokio tracing for log file

2022-08-10 Thread GitBox
andygrove merged PR #123: URL: https://github.com/apache/arrow-ballista/pull/123 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arro

[GitHub] [arrow-ballista] andygrove closed issue #122: Using tokio tracing for log file

2022-08-10 Thread GitBox
andygrove closed issue #122: Using tokio tracing for log file URL: https://github.com/apache/arrow-ballista/issues/122

[GitHub] [arrow-ballista] dependabot[bot] commented on pull request #125: Update ahash requirement from 0.7 to 0.8

2022-08-10 Thread GitBox
dependabot[bot] commented on PR #125: URL: https://github.com/apache/arrow-ballista/pull/125#issuecomment-1211546104 The following labels could not be found: `auto-dependencies`.

[GitHub] [arrow-ballista] dependabot[bot] opened a new pull request, #125: Update ahash requirement from 0.7 to 0.8

2022-08-10 Thread GitBox
dependabot[bot] opened a new pull request, #125: URL: https://github.com/apache/arrow-ballista/pull/125 Updates the requirements on [ahash](https://github.com/tkaitchuck/ahash) to permit the latest version. Release notes Sourced from https://github.com/tkaitchuck/ahash/releases

[GitHub] [arrow] milesgranger commented on a diff in pull request #13821: ARROW-13763: [Python] Close files in ParquetFile & ParquetDatasetPiece

2022-08-10 Thread GitBox
milesgranger commented on code in PR #13821: URL: https://github.com/apache/arrow/pull/13821#discussion_r943083951 ## python/pyarrow/tests/parquet/test_parquet_file.py: ## @@ -277,3 +278,77 @@ def test_pre_buffer(pre_buffer): buf.seek(0) pf = pq.ParquetFile(buf, pre_bu

[GitHub] [arrow] milesgranger commented on a diff in pull request #13821: ARROW-13763: [Python] Close files in ParquetFile & ParquetDatasetPiece

2022-08-10 Thread GitBox
milesgranger commented on code in PR #13821: URL: https://github.com/apache/arrow/pull/13821#discussion_r943083763 ## python/pyarrow/tests/parquet/test_parquet_file.py: ## @@ -277,3 +278,77 @@ def test_pre_buffer(pre_buffer): buf.seek(0) pf = pq.ParquetFile(buf, pre_bu

[GitHub] [arrow] milesgranger commented on a diff in pull request #13821: ARROW-13763: [Python] Close files in ParquetFile & ParquetDatasetPiece

2022-08-10 Thread GitBox
milesgranger commented on code in PR #13821: URL: https://github.com/apache/arrow/pull/13821#discussion_r943083115 ## python/pyarrow/tests/parquet/test_parquet_file.py: ## @@ -277,3 +278,77 @@ def test_pre_buffer(pre_buffer): buf.seek(0) pf = pq.ParquetFile(buf, pre_bu

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #3102: Remove offset if its zero

2022-08-10 Thread GitBox
codecov-commenter commented on PR #3102: URL: https://github.com/apache/arrow-datafusion/pull/3102#issuecomment-1211535809 # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/3102?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow] github-actions[bot] commented on pull request #13851: ARROW-14999: [C++] Don't check field name in ListType Equals()

2022-08-10 Thread GitBox
github-actions[bot] commented on PR #13851: URL: https://github.com/apache/arrow/pull/13851#issuecomment-1211532812 https://issues.apache.org/jira/browse/ARROW-14999

[GitHub] [arrow] kou commented on issue #13847: Arrow::Table.save(path_to_file.parquet) changes data types within Schema.

2022-08-10 Thread GitBox
kou commented on issue #13847: URL: https://github.com/apache/arrow/issues/13847#issuecomment-1211527576 Could you provide a sample Ruby script that reproduces this case?

[GitHub] [arrow-rs] Ted-Jiang opened a new issue, #2406: Support peek_next_page and skip_next_page in InMemoryPageReader

2022-08-10 Thread GitBox
Ted-Jiang opened a new issue, #2406: URL: https://github.com/apache/arrow-rs/issues/2406 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** When I was implementing a benchmark using `skip_records`, I got ``` Benchmarking arrow_array_

[GitHub] [arrow-datafusion] turbo1912 opened a new pull request, #3102: Remove offset if its zero

2022-08-10 Thread GitBox
turbo1912 opened a new pull request, #3102: URL: https://github.com/apache/arrow-datafusion/pull/3102 # Which issue does this PR close? Closes #2584 # Rationale for this change If offset is 0, it should be a no-op and can be optimized out. Questions to the reviewer?
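
The rationale in this PR (an offset of 0 is a no-op and can be optimized out) can be illustrated with a small sketch. This is not DataFusion's optimizer code: the `Scan` and `Offset` plan nodes and the `remove_zero_offset` function are hypothetical names invented for this example, written in Python only to show the shape of the rewrite.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Scan:
    """Hypothetical leaf node: produces every row of a table."""
    table: str


@dataclass(frozen=True)
class Offset:
    """Hypothetical node: skips the first `skip` rows produced by `input`."""
    skip: int
    input: "Plan"


Plan = Union[Scan, Offset]


def remove_zero_offset(plan: Plan) -> Plan:
    """Rewrite the plan bottom-up, dropping every Offset whose skip is 0."""
    if isinstance(plan, Offset):
        child = remove_zero_offset(plan.input)
        # OFFSET 0 skips nothing, so the node is equivalent to its input.
        return child if plan.skip == 0 else Offset(plan.skip, child)
    return plan
```

Applied to `Offset(0, Scan("t"))` the rule returns the bare scan, while an `Offset` with a non-zero `skip` is kept and only its input is rewritten.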

[GitHub] [arrow-rs] HaoYang670 commented on issue #2402: Support casting from utf8 to binary

2022-08-10 Thread GitBox
HaoYang670 commented on issue #2402: URL: https://github.com/apache/arrow-rs/issues/2402#issuecomment-1211515412 I guess this should be straightforward by changing the data type: ```rust let binary_array_data = utf8_array_data.into_builder() .datatype(DataType::Binary) .b

[GitHub] [arrow-rs] HaoYang670 commented on pull request #2360: speed up the decimal256 validation based on bytes comparison and add benchmark test

2022-08-10 Thread GitBox
HaoYang670 commented on PR #2360: URL: https://github.com/apache/arrow-rs/pull/2360#issuecomment-1211510499 > I find some weird things about decimal128/i128, so I removed the refactor of decimal128 and just left decimal256. PTAL What weird thing did you encounter? @liukun4515

[GitHub] [arrow] ursabot commented on pull request #13849: MINOR: [C++] Add check for result of MakeExecNode

2022-08-10 Thread GitBox
ursabot commented on PR #13849: URL: https://github.com/apache/arrow/pull/13849#issuecomment-1211509826 ['Python', 'R'] benchmarks have high level of regressions. [test-mac-arm](https://conbench.ursa.dev/compare/runs/d86fa0d7908849818a970120b1b8a6f2...1a23d48f829e49108b1877c7675690c4/)

[GitHub] [arrow] ursabot commented on pull request #13849: MINOR: [C++] Add check for result of MakeExecNode

2022-08-10 Thread GitBox
ursabot commented on PR #13849: URL: https://github.com/apache/arrow/pull/13849#issuecomment-1211509730 Benchmark runs are scheduled for baseline = cdb5b2019f6723cb37127487c91daccbf9d238d4 and contender = b7c94e2f330b628ac3d3c92972ad691c0ec454d4. b7c94e2f330b628ac3d3c92972ad691c0ec454d4 is

[GitHub] [arrow-rs] Ted-Jiang commented on a diff in pull request #2393: Implement Skip for DeltaBitPackDecoder

2022-08-10 Thread GitBox
Ted-Jiang commented on code in PR #2393: URL: https://github.com/apache/arrow-rs/pull/2393#discussion_r943051200 ## parquet/src/encodings/decoding.rs: ## @@ -736,8 +736,47 @@ where } fn skip(&mut self, num_values: usize) -> Result { -let mut buffer = vec![T::

[GitHub] [arrow-rs] liukun4515 commented on a diff in pull request #2405: Make the API of `fn Decimal:new` be consistent with `fn Decimal:try_new_bytes` and add length bound for `Decimal::raw_value`

2022-08-10 Thread GitBox
liukun4515 commented on code in PR #2405: URL: https://github.com/apache/arrow-rs/pull/2405#discussion_r943051047 ## arrow/src/array/array_decimal.rs: ## @@ -123,6 +123,8 @@ impl BasicDecimalArray { self.raw_value_data_ptr().offset(pos as isize),

[GitHub] [arrow-rs] liukun4515 commented on a diff in pull request #2405: Make the API of `fn Decimal:new` be consistent with `fn Decimal:try_new_bytes` and add length bound for `Decimal::raw_value`

2022-08-10 Thread GitBox
liukun4515 commented on code in PR #2405: URL: https://github.com/apache/arrow-rs/pull/2405#discussion_r943050808 ## arrow/src/util/decimal.rs: ## @@ -114,17 +113,17 @@ impl BasicDecimal { /// Creates a decimal value from precision, scale, and bytes. /// /// Safet

[GitHub] [arrow-rs] Ted-Jiang commented on a diff in pull request #2393: Implement Skip for DeltaBitPackDecoder

2022-08-10 Thread GitBox
Ted-Jiang commented on code in PR #2393: URL: https://github.com/apache/arrow-rs/pull/2393#discussion_r943050633 ## parquet/src/encodings/decoding.rs: ## @@ -736,8 +736,47 @@ where } fn skip(&mut self, num_values: usize) -> Result { -let mut buffer = vec![T::

[GitHub] [arrow-rs] liukun4515 commented on pull request #1855: support compression for IPC

2022-08-10 Thread GitBox
liukun4515 commented on PR #1855: URL: https://github.com/apache/arrow-rs/pull/1855#issuecomment-1211487001 Thanks to @alamb for the cooperation. I will close this PR.

[GitHub] [arrow-rs] liukun4515 closed pull request #1855: support compression for IPC

2022-08-10 Thread GitBox
liukun4515 closed pull request #1855: support compression for IPC URL: https://github.com/apache/arrow-rs/pull/1855

[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #2405: Make the API of `fn Decimal:new` be consistent with `fn Decimal:try_new_bytes` and add length bound for `Decimal::raw_value`

2022-08-10 Thread GitBox
HaoYang670 commented on code in PR #2405: URL: https://github.com/apache/arrow-rs/pull/2405#discussion_r943043549 ## arrow/src/util/decimal.rs: ## @@ -114,17 +113,17 @@ impl BasicDecimal { /// Creates a decimal value from precision, scale, and bytes. /// /// Safet

[GitHub] [arrow-rs] liukun4515 commented on a diff in pull request #2369: support compression for IPC with revamped feature flags

2022-08-10 Thread GitBox
liukun4515 commented on code in PR #2369: URL: https://github.com/apache/arrow-rs/pull/2369#discussion_r943044711 ## arrow/src/ipc/writer.rs: ## @@ -1096,29 +1210,56 @@ fn write_array_data( offset, data_ref.len(), data_ref.null_

[GitHub] [arrow] github-actions[bot] commented on pull request #13838: ARROW-17382: [C++] open_dataset doesn't ignore BOM in csv file when header's with quotes

2022-08-10 Thread GitBox
github-actions[bot] commented on PR #13838: URL: https://github.com/apache/arrow/pull/13838#issuecomment-1211476773 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #13838: ARROW-17382: [C++] open_dataset doesn't ignore BOM in csv file when header's with quotes

2022-08-10 Thread GitBox
github-actions[bot] commented on PR #13838: URL: https://github.com/apache/arrow/pull/13838#issuecomment-1211476764 https://issues.apache.org/jira/browse/ARROW-17382

[GitHub] [arrow-datafusion] avantgardnerio commented on issue #166: TPC-H Query 15

2022-08-10 Thread GitBox
avantgardnerio commented on issue #166: URL: https://github.com/apache/arrow-datafusion/issues/166#issuecomment-1211474461 > Removing the check for empty columns allows the view to be generated @DaltonModlin if the check is removed, do the other tests still pass? If so, perhaps a new

[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #2405: Make the API of `fn Decimal:new` be consistent with `fn Decimal:try_new_bytes` and add length bound for `Decimal::raw_value`

2022-08-10 Thread GitBox
HaoYang670 commented on code in PR #2405: URL: https://github.com/apache/arrow-rs/pull/2405#discussion_r943040551 ## arrow/src/array/array_decimal.rs: ## @@ -123,6 +123,8 @@ impl BasicDecimalArray { self.raw_value_data_ptr().offset(pos as isize),

[GitHub] [arrow-rs] liukun4515 commented on a diff in pull request #2405: Make the API of `fn Decimal:new` be consistent with `fn Decimal:try_new_bytes` and add length bound for `Decimal::raw_value`

2022-08-10 Thread GitBox
liukun4515 commented on code in PR #2405: URL: https://github.com/apache/arrow-rs/pull/2405#discussion_r943039839 ## arrow/src/array/array_decimal.rs: ## @@ -123,6 +123,8 @@ impl BasicDecimalArray { self.raw_value_data_ptr().offset(pos as isize),

[GitHub] [arrow-rs] HaoYang670 opened a new pull request, #2405: Make the API of `fn Decimal:new` be consistent with `fn Decimal:try_new_bytes`

2022-08-10 Thread GitBox
HaoYang670 opened a new pull request, #2405: URL: https://github.com/apache/arrow-rs/pull/2405 Signed-off-by: remzi <1371656737...@gmail.com> # Which issue does this PR close? None. # Rationale for this change `fn Decimal:new` and `fn Decimal:try_new_bytes` should have sam

[GitHub] [arrow] jacques-n commented on a diff in pull request #13492: RFC: [C++][Java][FlightRPC] Substrait, transaction, cancellation for Flight SQL

2022-08-10 Thread GitBox
jacques-n commented on code in PR #13492: URL: https://github.com/apache/arrow/pull/13492#discussion_r943026063 ## format/FlightSql.proto: ## @@ -1450,8 +1512,72 @@ message ActionClosePreparedStatementRequest { bytes prepared_statement_handle = 1; } +/* + * Request message

[GitHub] [arrow] westonpace merged pull request #13849: MINOR: [C++] Add check for result of MakeExecNode

2022-08-10 Thread GitBox
westonpace merged PR #13849: URL: https://github.com/apache/arrow/pull/13849

[GitHub] [arrow] lidavidm commented on a diff in pull request #13492: RFC: [C++][Java][FlightRPC] Substrait, transaction, cancellation for Flight SQL

2022-08-10 Thread GitBox
lidavidm commented on code in PR #13492: URL: https://github.com/apache/arrow/pull/13492#discussion_r943019347 ## format/FlightSql.proto: ## @@ -1450,8 +1512,72 @@ message ActionClosePreparedStatementRequest { bytes prepared_statement_handle = 1; } +/* + * Request message

[GitHub] [arrow] paleolimbot commented on pull request #13267: ARROW-16690: [R][FlightRPC] Additional max_chunksize parameter in do_put method

2022-08-10 Thread GitBox
paleolimbot commented on PR #13267: URL: https://github.com/apache/arrow/pull/13267#issuecomment-1211436831 Awesome! The test looks great. I think ignoring `max_chunksize` with a warning is the way to go (you could error too...I think either is fine and I haven't actually used this function

[GitHub] [arrow] drin commented on issue #13850: Confusion about the Arrow Formats Categorization

2022-08-10 Thread GitBox
drin commented on issue #13850: URL: https://github.com/apache/arrow/issues/13850#issuecomment-1211435109 Note that the first paragraph of the sections you highlighted in red describes the differences between the two, but essentially the file format additionally writes a footer: > What

[GitHub] [arrow] paleolimbot commented on a diff in pull request #13267: ARROW-16690: [R][FlightRPC] Additional max_chunksize parameter in do_put method

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #13267: URL: https://github.com/apache/arrow/pull/13267#discussion_r943016194 ## r/tests/testthat/test-python-flight.R: ## @@ -37,6 +37,16 @@ if (process_is_running("demo_flight_server")) { regexp = 'data must be a "data.frame", "Table"

[GitHub] [arrow] paleolimbot commented on a diff in pull request #13267: ARROW-16690: [R][FlightRPC] Additional max_chunksize parameter in do_put method

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #13267: URL: https://github.com/apache/arrow/pull/13267#discussion_r943015673 ## r/R/flight.R: ## @@ -72,6 +74,8 @@ flight_put <- function(client, data, path, overwrite = TRUE) { writer <- client$do_put(descriptor_for_path(path), py_data$sc

[GitHub] [arrow] drin commented on issue #13850: Confusion about the Arrow Formats Categorization

2022-08-10 Thread GitBox
drin commented on issue #13850: URL: https://github.com/apache/arrow/issues/13850#issuecomment-1211432886 So RecordBatchFileWriter writes the IPC file format, and RecordBatchStreamWriter writes the IPC stream format.
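
To make the file-versus-stream distinction in this thread concrete, here is a minimal sketch that tells the two apart by inspecting raw bytes. It assumes the layout given in the Arrow columnar format specification: the IPC file format starts with the magic string `ARROW1` padded to 8 bytes and ends with a footer, an int32 footer size, and the magic string again. The helper names `looks_like_ipc_file` and `footer_length` are made up for this example, and reading the footer size as little-endian is an assumption.

```python
import struct

ARROW_MAGIC = b"ARROW1"  # magic string from the Arrow columnar format spec


def looks_like_ipc_file(data: bytes) -> bool:
    """Return True if `data` has the leading and trailing magic of the file format."""
    # Smallest possible layout: magic + 2 padding bytes, int32 footer size, magic.
    min_len = len(ARROW_MAGIC) + 2 + 4 + len(ARROW_MAGIC)
    return (
        len(data) >= min_len
        and data.startswith(ARROW_MAGIC)
        and data.endswith(ARROW_MAGIC)
    )


def footer_length(data: bytes) -> int:
    """Read the int32 footer size stored just before the trailing magic."""
    if not looks_like_ipc_file(data):
        raise ValueError("not an Arrow IPC file (missing ARROW1 magic)")
    # Assumption: the footer size is stored as a little-endian int32.
    (size,) = struct.unpack("<i", data[-10:-6])
    return size
```

Bytes produced by a stream writer carry neither the magic string nor the footer, so `looks_like_ipc_file` returns False for them. In practice a library reader such as pyarrow's `pyarrow.ipc.open_file` or `pyarrow.ipc.open_stream` does the actual parsing; the sketch only shows where the footer lives.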

[GitHub] [arrow] jacques-n commented on a diff in pull request #13492: RFC: [C++][Java][FlightRPC] Substrait, transaction, cancellation for Flight SQL

2022-08-10 Thread GitBox
jacques-n commented on code in PR #13492: URL: https://github.com/apache/arrow/pull/13492#discussion_r943014694 ## format/FlightSql.proto: ## @@ -1450,8 +1512,72 @@ message ActionClosePreparedStatementRequest { bytes prepared_statement_handle = 1; } +/* + * Request message

[GitHub] [arrow] drin commented on issue #13850: Confusion about the Arrow Formats Categorization

2022-08-10 Thread GitBox
drin commented on issue #13850: URL: https://github.com/apache/arrow/issues/13850#issuecomment-1211432568 The IPC format refers to a memory format used by Arrow. Most file writers will write the IPC file format, which is the IPC format but with slight adjustments for tuning for files and fi

[GitHub] [arrow] liusitan opened a new issue, #13850: Confusion about the Arrow Formats Categorization

2022-08-10 Thread GitBox
liusitan opened a new issue, #13850: URL: https://github.com/apache/arrow/issues/13850 I am looking at https://arrow.apache.org/cookbook/py/io.html#memory-mapping-arrow-arrays-from-disk Recently I have been trying to implement the Arrow file format for memory-mapped reads. However, I don't k

[GitHub] [arrow] arjunsr1 commented on issue #13847: Arrow::Table.save(path_to_file.parquet) changes data types within Schema.

2022-08-10 Thread GitBox
arjunsr1 commented on issue #13847: URL: https://github.com/apache/arrow/issues/13847#issuecomment-1211418093 @drin , I was unable to use the cast method to change the date64 column to type :timestamp or Arrow::TimeUnit::MILLI. It doesn't allow conversion to TimeUnit. However, I believe tha

[GitHub] [arrow] drin opened a new pull request, #13849: MINOR: [C++] Add check for result of MakeExecNode

2022-08-10 Thread GitBox
drin opened a new pull request, #13849: URL: https://github.com/apache/arrow/pull/13849 ARROW-16894 added benchmarks for asof join node. This PR adds a call to `ASSERT_OK` to satisfy the warning that the `Result` returned by `MakeExecNode` was not checked.

[GitHub] [arrow-datafusion] andygrove opened a new pull request, #3101: [WIP] Make `Like` a top-level `Expr` and add SQL support for `ILike` and `SimilarTo`

2022-08-10 Thread GitBox
andygrove opened a new pull request, #3101: URL: https://github.com/apache/arrow-datafusion/pull/3101 # Which issue does this PR close? Closes https://github.com/apache/arrow-datafusion/issues/3099 # Rationale for this change I would like to support queries u

[GitHub] [arrow] github-actions[bot] commented on pull request #13848: ARROW-17381: [C++][Acero] Centralize error handling in ExecPlan

2022-08-10 Thread GitBox
github-actions[bot] commented on PR #13848: URL: https://github.com/apache/arrow/pull/13848#issuecomment-1211359361 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #13848: ARROW-17381: [C++][Acero] Centralize error handling in ExecPlan

2022-08-10 Thread GitBox
github-actions[bot] commented on PR #13848: URL: https://github.com/apache/arrow/pull/13848#issuecomment-1211359332 https://issues.apache.org/jira/browse/ARROW-17381

[GitHub] [arrow] ursabot commented on pull request #13487: ARROW-8991: [C++] Add new scalar compute function

2022-08-10 Thread GitBox
ursabot commented on PR #13487: URL: https://github.com/apache/arrow/pull/13487#issuecomment-1211322825 Benchmark runs are scheduled for baseline = 4d931ff1c0f5661a9b134c514555c8d769001759 and contender = 737387bf983bdf33a567d042b460c213cfdc03c6. Results will be available as each benchmark

[GitHub] [arrow] drin commented on pull request #13487: ARROW-8991: [C++] Add new scalar compute function

2022-08-10 Thread GitBox
drin commented on PR #13487: URL: https://github.com/apache/arrow/pull/13487#issuecomment-1211322762 @ursabot please benchmark lang=C++

[GitHub] [arrow-rs] roeap commented on pull request #2374: feat: add token provider authorization to azure store

2022-08-10 Thread GitBox
roeap commented on PR #2374: URL: https://github.com/apache/arrow-rs/pull/2374#issuecomment-1211322592 > There do appear to be failing lints and tests btw All tests and lints should be passing now. > Very exciting, btw if you haven't seen #2352 that moves away from rusoto for S

[GitHub] [arrow] kou commented on a diff in pull request #13841: MINOR: [Docs] Fix sphinx rst syntax in Integration page

2022-08-10 Thread GitBox
kou commented on code in PR #13841: URL: https://github.com/apache/arrow/pull/13841#discussion_r942934473 ## docs/source/format/Integration.rst: ## @@ -455,14 +455,16 @@ Gold File Integration Tests ~~~ Pre-generated json and arrow IPC files (both file

[GitHub] [arrow-rs] roeap commented on a diff in pull request #2374: feat: add token provider authorization to azure store

2022-08-10 Thread GitBox
roeap commented on code in PR #2374: URL: https://github.com/apache/arrow-rs/pull/2374#discussion_r942933533 ## object_store/src/azure.rs: ## @@ -277,79 +277,95 @@ impl ObjectStore for MicrosoftAzure { } async fn get(&self, location: &Path) -> Result { -let b

[GitHub] [arrow] ursabot commented on pull request #13613: ARROW-15582: [C++] Add support for registering standard Substrait functions

2022-08-10 Thread GitBox
ursabot commented on PR #13613: URL: https://github.com/apache/arrow/pull/13613#issuecomment-1211309587 ['Python', 'R'] benchmarks have high level of regressions. [test-mac-arm](https://conbench.ursa.dev/compare/runs/62345b2dfb5d41f890c2990b35cde93f...d86fa0d7908849818a970120b1b8a6f2/)

[GitHub] [arrow] ursabot commented on pull request #13613: ARROW-15582: [C++] Add support for registering standard Substrait functions

2022-08-10 Thread GitBox
ursabot commented on PR #13613: URL: https://github.com/apache/arrow/pull/13613#issuecomment-1211309395 Benchmark runs are scheduled for baseline = ae071bbce5cb61c7553912f051d55ff5c7b45d7a and contender = cdb5b2019f6723cb37127487c91daccbf9d238d4. cdb5b2019f6723cb37127487c91daccbf9d238d4 is

[GitHub] [arrow] drin commented on issue #13847: Arrow::Table.save(path_to_file.parquet) changes data types within Schema.

2022-08-10 Thread GitBox
drin commented on issue #13847: URL: https://github.com/apache/arrow/issues/13847#issuecomment-1211298934 no problem, good luck!

[GitHub] [arrow] arjunsr1 commented on issue #13847: Arrow::Table.save(path_to_file.parquet) changes data types within Schema.

2022-08-10 Thread GitBox
arjunsr1 commented on issue #13847: URL: https://github.com/apache/arrow/issues/13847#issuecomment-1211297338 Oh, wow! Thanks for the speedy reply @drin. As for the lack of parquet support, it's unfortunate :( I'm going to try using the `cast` method that was added in red-arrow 9.0.0

[GitHub] [arrow] drin commented on issue #13847: Arrow::Table.save(path_to_file.parquet) changes data types within Schema.

2022-08-10 Thread GitBox
drin commented on issue #13847: URL: https://github.com/apache/arrow/issues/13847#issuecomment-1211293475 Hello! It looks like there's a relevant JIRA issue: [ARROW-9502](https://issues.apache.org/jira/browse/ARROW-9502). So, an answer to one of your questions seems to be that parquet d

[GitHub] [arrow] arjunsr1 opened a new issue, #13847: Arrow::Table.save(path_to_file.parquet) changes data types within Schema.

2022-08-10 Thread GitBox
arjunsr1 opened a new issue, #13847: URL: https://github.com/apache/arrow/issues/13847 Hello @kou - I had previously written about an issue of timestamp [ms] turning into timestamp [s] when loading parquet file data into an arrow table. This time, I have a similar question, but this time, i

[GitHub] [arrow] kou commented on a diff in pull request #13845: ARROW-16993: [C++] cmake: `cannot create imported target "Boost::headers"`

2022-08-10 Thread GitBox
kou commented on code in PR #13845: URL: https://github.com/apache/arrow/pull/13845#discussion_r942914597 ## cpp/cmake_modules/ThirdpartyToolchain.cmake: ## @@ -1019,12 +1019,13 @@ if(ARROW_USE_BOOST) # Find static boost headers and libs set(Boost_USE_STATIC_LIBS ON)

[GitHub] [arrow-rs] andygrove commented on pull request #2404: Fix DoPutUpdateResult

2022-08-10 Thread GitBox
andygrove commented on PR #2404: URL: https://github.com/apache/arrow-rs/pull/2404#issuecomment-1211280199 It would be nice if we could eventually add some tests for this

[GitHub] [arrow-rs] andygrove commented on pull request #2404: Fix DoPutUpdateResult

2022-08-10 Thread GitBox
andygrove commented on PR #2404: URL: https://github.com/apache/arrow-rs/pull/2404#issuecomment-1211279123 @wangfenjin fyi

[GitHub] [arrow] github-actions[bot] commented on pull request #13846: ARROW-16993: [C++] Don't find Boost components if they aren't needed

2022-08-10 Thread GitBox
github-actions[bot] commented on PR #13846: URL: https://github.com/apache/arrow/pull/13846#issuecomment-1211278721 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #13846: ARROW-16993: [C++] Don't find Boost components if they aren't needed

2022-08-10 Thread GitBox
github-actions[bot] commented on PR #13846: URL: https://github.com/apache/arrow/pull/13846#issuecomment-1211278693 https://issues.apache.org/jira/browse/ARROW-16993

[GitHub] [arrow-datafusion] waitingkuo commented on issue #2979: to_timestamp timeunit to be consistent with postgresql's

2022-08-10 Thread GitBox
waitingkuo commented on issue #2979: URL: https://github.com/apache/arrow-datafusion/issues/2979#issuecomment-1211278322 @alamb after spending time figuring out how PostgreSQL works, I submitted a proposal here: #3100. We probably need to deal with datatype `date / time / time

[GitHub] [arrow-datafusion] waitingkuo opened a new issue, #3100: [EPIC] Proposal for Date/Time enhancement

2022-08-10 Thread GitBox
waitingkuo opened a new issue, #3100: URL: https://github.com/apache/arrow-datafusion/issues/3100 # !!! Please correct me if I'm wrong !!! # Intro - This change breaks some original behaviors. - UTC+8 is the local time zone in my examples. - As our goal is to be

[GitHub] [arrow] kou commented on pull request #12914: ARROW-2034: [C++] Filesystem implementation for Azure Blob Storage

2022-08-10 Thread GitBox
kou commented on PR #12914: URL: https://github.com/apache/arrow/pull/12914#issuecomment-1211263139 OK. I'll wait for your review for at least a few weeks.

[GitHub] [arrow] thatstatsguy commented on pull request #13267: ARROW-16690: [R][FlightRPC] Additional max_chunksize parameter in do_put method

2022-08-10 Thread GitBox
thatstatsguy commented on PR #13267: URL: https://github.com/apache/arrow/pull/13267#issuecomment-1211263114 @paleolimbot thanks for such a detailed review and prompts on how to resolve it - you were very kind :) I've updated the linter, unit tests and rerun the devtools::document().

[GitHub] [arrow-rs] avantgardnerio commented on issue #2403: Flight SQL Server sends incorrect response for `DoPutUpdateResult`

2022-08-10 Thread GitBox
avantgardnerio commented on issue #2403: URL: https://github.com/apache/arrow-rs/issues/2403#issuecomment-1211259431 Tagging @tustvold and @andygrove , thank you both for your help!

[GitHub] [arrow-rs] avantgardnerio opened a new pull request, #2404: Fix DoPutUpdateResult

2022-08-10 Thread GitBox
avantgardnerio opened a new pull request, #2404: URL: https://github.com/apache/arrow-rs/pull/2404 # Which issue does this PR close? Closes #2403. # Rationale for this change My company would like to have JDBC/ODBC support work so that we can develop a market offering o

[GitHub] [arrow-rs] avantgardnerio opened a new issue, #2403: Flight SQL Server sends incorrect response for `DoPutUpdateResult`

2022-08-10 Thread GitBox
avantgardnerio opened a new issue, #2403: URL: https://github.com/apache/arrow-rs/issues/2403 **Describe the bug** The Flight SQL Server incorrectly does an `as_any()` and sends a protobuf enum instead of the message itself. This is incompatible with the JDBC implementation. (As usual

[GitHub] [arrow] thatstatsguy commented on a diff in pull request #13267: ARROW-16690: [R][FlightRPC] Additional max_chunksize parameter in do_put method

2022-08-10 Thread GitBox
thatstatsguy commented on code in PR #13267: URL: https://github.com/apache/arrow/pull/13267#discussion_r942880618 ## r/R/flight.R: ## @@ -72,6 +74,8 @@ flight_put <- function(client, data, path, overwrite = TRUE) { writer <- client$do_put(descriptor_for_path(path), py_data$s

[GitHub] [arrow] thatstatsguy commented on a diff in pull request #13267: ARROW-16690: [R][FlightRPC] Additional max_chunksize parameter in do_put method

2022-08-10 Thread GitBox
thatstatsguy commented on code in PR #13267: URL: https://github.com/apache/arrow/pull/13267#discussion_r942879894 ## r/R/flight.R: ## @@ -72,6 +74,8 @@ flight_put <- function(client, data, path, overwrite = TRUE) { writer <- client$do_put(descriptor_for_path(path), py_data$s

[GitHub] [arrow] thatstatsguy commented on a diff in pull request #13267: ARROW-16690: [R][FlightRPC] Additional max_chunksize parameter in do_put method

2022-08-10 Thread GitBox
thatstatsguy commented on code in PR #13267: URL: https://github.com/apache/arrow/pull/13267#discussion_r942878470 ## r/R/flight.R: ## @@ -56,9 +56,11 @@ flight_disconnect <- function(client) { #' @param overwrite logical: if `path` exists on `client` already, should we #' rep

[GitHub] [arrow] lidavidm commented on pull request #13492: RFC: [C++][Java][FlightRPC] Substrait, transaction, cancellation for Flight SQL

2022-08-10 Thread GitBox
lidavidm commented on PR #13492: URL: https://github.com/apache/arrow/pull/13492#issuecomment-1211208782 I need to figure out the AppVeyor failure here, but this should generally be ready. -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [arrow] github-actions[bot] commented on pull request #13845: ARROW-16993: [C++] cmake: `cannot create imported target "Boost::headers"`

2022-08-10 Thread GitBox
github-actions[bot] commented on PR #13845: URL: https://github.com/apache/arrow/pull/13845#issuecomment-1211195439 Revision: 57bdb36118500382d8802309cef7a3eee0c20927 Submitted crossbow builds: [ursacomputing/crossbow @ actions-28f9432361](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] assignUser commented on pull request #13845: ARROW-16993: [C++] cmake: `cannot create imported target "Boost::headers"`

2022-08-10 Thread GitBox
assignUser commented on PR #13845: URL: https://github.com/apache/arrow/pull/13845#issuecomment-1211193361 @github-actions crossbow submit -g cpp r-binary-packages -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

[GitHub] [arrow] github-actions[bot] commented on pull request #13845: ARROW-16993: [C++] cmake: `cannot create imported target "Boost::headers"`

2022-08-10 Thread GitBox
github-actions[bot] commented on PR #13845: URL: https://github.com/apache/arrow/pull/13845#issuecomment-1211190245 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'. -- This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [arrow] github-actions[bot] commented on pull request #13845: ARROW-16993: [C++] cmake: `cannot create imported target "Boost::headers"`

2022-08-10 Thread GitBox
github-actions[bot] commented on PR #13845: URL: https://github.com/apache/arrow/pull/13845#issuecomment-1211190219 https://issues.apache.org/jira/browse/ARROW-16993 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow] wjones127 commented on issue #13803: Get the table schema as json?

2022-08-10 Thread GitBox
wjones127 commented on issue #13803: URL: https://github.com/apache/arrow/issues/13803#issuecomment-1211179812 FWIW, there is already an implementation of JSON roundtripping of schemas in arrow-rs. So if we did want to standardize, could simply propose using their format. https://gi

[GitHub] [arrow] westonpace merged pull request #13613: ARROW-15582: [C++] Add support for registering standard Substrait functions

2022-08-10 Thread GitBox
westonpace merged PR #13613: URL: https://github.com/apache/arrow/pull/13613 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.ap

[GitHub] [arrow] westonpace commented on pull request #13661: [C++][DONOTMERGE] Use -O2 instead of -O3 for RELEASE builds

2022-08-10 Thread GitBox
westonpace commented on PR #13661: URL: https://github.com/apache/arrow/pull/13661#issuecomment-1211158437 @pitrou I got the results here: https://conbench.ursa.dev/compare/runs/b724609840e242afbf4e1e26682afbe3...b742cce58407420db4da8e461604a1db/ There were no significant changes (one

[GitHub] [arrow] westonpace closed pull request #13829: [C++][DONOTMERGE] Rebased clone of #13661 created for benchmarking purposes

2022-08-10 Thread GitBox
westonpace closed pull request #13829: [C++][DONOTMERGE] Rebased clone of #13661 created for benchmarking purposes URL: https://github.com/apache/arrow/pull/13829 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [arrow] ManManson commented on pull request #13804: ARROW-17318: [C++][Dataset] Support async streaming interface for getting fragments in Dataset

2022-08-10 Thread GitBox
ManManson commented on PR #13804: URL: https://github.com/apache/arrow/pull/13804#issuecomment-1211143906 Rebased and force-pushed the branch. Unfortunately, I cannot attach a link to the diff because rebasing broke it. Changes summary: * Added `utils/async_generator_fwd.h` * `

[GitHub] [arrow-datafusion] waitingkuo commented on a diff in pull request #3098: Hash binary values

2022-08-10 Thread GitBox
waitingkuo commented on code in PR #3098: URL: https://github.com/apache/arrow-datafusion/pull/3098#discussion_r942800753 ## datafusion/core/src/physical_plan/hash_utils.rs: ## @@ -529,6 +531,26 @@ pub fn create_hashes<'a>( multi_col );

[GitHub] [arrow-datafusion] andygrove closed issue #2896: serde_json requires that either `std` (default) or `alloc` feature is enabled

2022-08-10 Thread GitBox
andygrove closed issue #2896: serde_json requires that either `std` (default) or `alloc` feature is enabled URL: https://github.com/apache/arrow-datafusion/issues/2896 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

[GitHub] [arrow-datafusion] andygrove commented on issue #3097: DataFusion 12.0.0 Release

2022-08-10 Thread GitBox
andygrove commented on issue #3097: URL: https://github.com/apache/arrow-datafusion/issues/3097#issuecomment-1211131399 > should be arrow-datafusion 12.0.0? Thanks, yes! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow-datafusion] waitingkuo commented on issue #3048: Follow-up on Clickbench benchmark

2022-08-10 Thread GitBox
waitingkuo commented on issue #3048: URL: https://github.com/apache/arrow-datafusion/issues/3048#issuecomment-1211131371 close again :D -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

[GitHub] [arrow-datafusion] waitingkuo closed issue #3048: Follow-up on Clickbench benchmark

2022-08-10 Thread GitBox
waitingkuo closed issue #3048: Follow-up on Clickbench benchmark URL: https://github.com/apache/arrow-datafusion/issues/3048 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [arrow-datafusion] andygrove opened a new issue, #3099: Add support for Postgres SIMILAR TO syntax

2022-08-10 Thread GitBox
andygrove opened a new issue, #3099: URL: https://github.com/apache/arrow-datafusion/issues/3099 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Add support for Postgres SIMILAR TO syntax **Describe the solution you'd like**

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #3098: Hash binary values

2022-08-10 Thread GitBox
codecov-commenter commented on PR #3098: URL: https://github.com/apache/arrow-datafusion/pull/3098#issuecomment-126686 # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/3098?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow-cookbook] mystic-lama commented on pull request #245: Adding License to files

2022-08-10 Thread GitBox
mystic-lama commented on PR #245: URL: https://github.com/apache/arrow-cookbook/pull/245#issuecomment-1211108334 @lidavidm let's take r package in a different PR. I shall try to get the setup working locally. Updated title accordingly -- This is an automated message from the Apache Git Se

[GitHub] [arrow-cookbook] lidavidm commented on pull request #245: [WIP] [DO NOT MERGE] Adding License to files

2022-08-10 Thread GitBox
lidavidm commented on PR #245: URL: https://github.com/apache/arrow-cookbook/pull/245#issuecomment-1211082650 No clue really. Might've been a flake or something -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [arrow-datafusion] Dandandan opened a new pull request, #3098: Hash binary values

2022-08-10 Thread GitBox
Dandandan opened a new pull request, #3098: URL: https://github.com/apache/arrow-datafusion/pull/3098 # Which issue does this PR close? Closes #3050 # Rationale for this change Currently we cannot group by binary values # What changes are included in thi

[GitHub] [arrow-rs] Dandandan opened a new issue, #2402: Support casting from utf8 to binary

2022-08-10 Thread GitBox
Dandandan opened a new issue, #2402: URL: https://github.com/apache/arrow-rs/issues/2402 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Casting from utf8 to binary is missing. **Describe the solution you'd like**

[GitHub] [arrow-datafusion] waitingkuo commented on issue #3048: Follow-up on Clickbench benchmark

2022-08-10 Thread GitBox
waitingkuo commented on issue #3048: URL: https://github.com/apache/arrow-datafusion/issues/3048#issuecomment-1211054858 I finally found the root cause of most of the issues. Their full dataset and partial dataset have different schemas; I submitted an issue there: ClickHouse/ClickBench#18

[GitHub] [arrow] bkietz commented on pull request #13784: ARROW-17290: [C++] Add order-comparisons for numeric scalars

2022-08-10 Thread GitBox
bkietz commented on PR #13784: URL: https://github.com/apache/arrow/pull/13784#issuecomment-1211041269 If we don't currently or in the near future require support for comparison of `Scalar`s in the way this PR provides, then I'd say we should close this PR and mark the JIRA wontfix. -- T

[GitHub] [arrow-datafusion] Dandandan commented on issue #3050: Cannot GROUP BY Binary

2022-08-10 Thread GitBox
Dandandan commented on issue #3050: URL: https://github.com/apache/arrow-datafusion/issues/3050#issuecomment-1211039296 I am currently working on this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #2374: feat: add token provider authorization to azure store

2022-08-10 Thread GitBox
tustvold commented on code in PR #2374: URL: https://github.com/apache/arrow-rs/pull/2374#discussion_r942702801 ## object_store/src/azure.rs: ## @@ -277,79 +277,92 @@ impl ObjectStore for MicrosoftAzure { } async fn get(&self, location: &Path) -> Result { -le

[GitHub] [arrow-rs] tustvold commented on pull request #2374: feat: add token provider authorization to azure store

2022-08-10 Thread GitBox
tustvold commented on PR #2374: URL: https://github.com/apache/arrow-rs/pull/2374#issuecomment-1211017441 There do appear to be failing lints and tests btw -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above
