[GitHub] [arrow] mapleFU opened a new issue, #14884: [CI] R install resource may got `404`

2022-12-07 Thread GitBox
mapleFU opened a new issue, #14884: URL: https://github.com/apache/arrow/issues/14884 ### Describe the bug, including details regarding any error messages, version, and platform. I found CI for `R` failed, the text looks like: ``` Found the following (possibly) invalid

[GitHub] [arrow] lquerel opened a new issue, #14883: [Go] arrow.ipc.writer.compressBodyBuffers leaks memory during compression phase

2022-12-07 Thread GitBox
lquerel opened a new issue, #14883: URL: https://github.com/apache/arrow/issues/14883 ### Describe the bug, including details regarding any error messages, version, and platform. All my unit tests work with the CheckedAllocator (the one used in the Arrow Go library) but when I added

[GitHub] [arrow] code1704 opened a new issue, #14882: How to do arrow table group by and split?

2022-12-07 Thread GitBox
code1704 opened a new issue, #14882: URL: https://github.com/apache/arrow/issues/14882 ### Describe the usage question you have. Please include as many useful details as possible. How to group arrow table items and split into tables? ``` g = table.group_by("a")

[GitHub] [arrow-julia] quinnj closed issue #327: DST ambiguities in ZonedDateTime not supported

2022-12-07 Thread GitBox
quinnj closed issue #327: DST ambiguities in ZonedDateTime not supported URL: https://github.com/apache/arrow-julia/issues/327 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow] mattwarkentin closed issue #14880: Best practices for handling larger than memory data

2022-12-07 Thread GitBox
mattwarkentin closed issue #14880: Best practices for handling larger than memory data URL: https://github.com/apache/arrow/issues/14880

[GitHub] [arrow] mattwarkentin opened a new issue, #14880: Best practices for handling larger than memory data

2022-12-07 Thread GitBox
mattwarkentin opened a new issue, #14880: URL: https://github.com/apache/arrow/issues/14880 ### Describe the usage question you have. Please include as many useful details as possible. Hi, I am wondering if someone from the Arrow team could offer some guidance on best

[GitHub] [arrow] zeroshade opened a new issue, #14876: [Go] Address Crashes for empty C Data arrays with nil buffers

2022-12-07 Thread GitBox
zeroshade opened a new issue, #14876: URL: https://github.com/apache/arrow/issues/14876 ### Describe the bug, including details regarding any error messages, version, and platform. Following up from #14805: Go's `cdata` package needs to address handling nil data buffers for 0 length

[GitHub] [arrow] zeroshade opened a new issue, #14875: [Python][C++] C Data Interface incorrect validate failures

2022-12-07 Thread GitBox
zeroshade opened a new issue, #14875: URL: https://github.com/apache/arrow/issues/14875 ### Describe the bug, including details regarding any error messages, version, and platform. Spinning off from #14814: When testing round trips of empty arrays between Python and Go using

[GitHub] [arrow] gf2121 opened a new issue, #14873: [Java] DictionaryEncoder can decode without building a DictionaryHashTable

2022-12-07 Thread GitBox
gf2121 opened a new issue, #14873: URL: https://github.com/apache/arrow/issues/14873 ### Describe the enhancement requested Today DictionaryEncoder always forces the building of a DictionaryHashTable in the constructor. It can be avoided in scenarios where only decoding is required.

[GitHub] [arrow] DavZim opened a new issue, #14872: [R] arrow returns wrong variable content when multiple group_by/summarise statements are used

2022-12-07 Thread GitBox
DavZim opened a new issue, #14872: URL: https://github.com/apache/arrow/issues/14872 ### Describe the bug, including details regarding any error messages, version, and platform. When collecting a query with multiple group_by + summarise statements, one variable gets wrongly assigned

[GitHub] [arrow] jandom closed issue #14871: pq.ParquetDataset usage with moto3 mocks?

2022-12-07 Thread GitBox
jandom closed issue #14871: pq.ParquetDataset usage with moto3 mocks? URL: https://github.com/apache/arrow/issues/14871

[GitHub] [arrow] jandom opened a new issue, #14871: pq.ParquetDataset usage with moto3 mocks?

2022-12-07 Thread GitBox
jandom opened a new issue, #14871: URL: https://github.com/apache/arrow/issues/14871 ### Describe the usage question you have. Please include as many useful details as possible. Hi there, I'm trying to mock some S3 objects, to write a test exercising a pd.Dataset, this

[GitHub] [arrow] pitrou closed issue #14870: [C++][Parquet] Support min_value and max_value Statistics

2022-12-07 Thread GitBox
pitrou closed issue #14870: [C++][Parquet] Support min_value and max_value Statistics URL: https://github.com/apache/arrow/issues/14870

[GitHub] [arrow] pitrou opened a new issue, #14870: [C++][Parquet] Support min_value and max_value Statistics

2022-12-07 Thread GitBox
pitrou opened a new issue, #14870: URL: https://github.com/apache/arrow/issues/14870 ### Describe the enhancement requested The `Statistics` structure in Parquet files provides two ways of specifying lower and upper bounds for a data page: * `min` and `max` are legacy fields for

[jira] [Created] (ARROW-18427) [C++] Support negative tolerance in `AsofJoinNode`

2022-12-07 Thread Yaron Gvili (Jira)
Yaron Gvili created ARROW-18427: --- Summary: [C++] Support negative tolerance in `AsofJoinNode` Key: ARROW-18427 URL: https://issues.apache.org/jira/browse/ARROW-18427 Project: Apache Arrow

[GitHub] [arrow] lukester1975 opened a new issue, #14869: [C++] arrow.pc should have -DARROW_STATIC for Windows static builds

2022-12-07 Thread GitBox
lukester1975 opened a new issue, #14869: URL: https://github.com/apache/arrow/issues/14869 ### Describe the enhancement requested Without, the generated pc file is insufficient (at least without "manually" defining ARROW_STATIC, which is unpleasant). Quick hack fix:

[GitHub] [arrow] youngfn closed issue #14853: [C++][Streaming execution] can't write data after hash_distinct

2022-12-07 Thread GitBox
youngfn closed issue #14853: [C++][Streaming execution] can't write data after hash_distinct URL: https://github.com/apache/arrow/issues/14853

[GitHub] [arrow] westonpace opened a new issue, #14866: [C++] Remove internal GroupBy implementation

2022-12-07 Thread GitBox
westonpace opened a new issue, #14866: URL: https://github.com/apache/arrow/issues/14866 ### Describe the enhancement requested Currently there are two ways to compute a group by. The supported way is to use an aggregate node in an exec plan. The second (internal) way is to use the