[jira] [Created] (ARROW-17383) RHEL8/Centos Repo has a bad repomd.xml file

2022-08-10 Thread Justin Gerry (Jira)
Justin Gerry created ARROW-17383: Summary: RHEL8/Centos Repo has a bad repomd.xml file Key: ARROW-17383 URL: https://issues.apache.org/jira/browse/ARROW-17383 Project: Apache Arrow Issue

[jira] [Created] (ARROW-17382) [C++] open_dataset doesn't ignore BOM in csv file when header's with quotes

2022-08-10 Thread ZMZ91 (Jira)
ZMZ91 created ARROW-17382: Summary: [C++] open_dataset doesn't ignore BOM in csv file when header's with quotes Key: ARROW-17382 URL: https://issues.apache.org/jira/browse/ARROW-17382 Project: Apache Arrow
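
The symptom named in ARROW-17382 can be illustrated outside Arrow. The sketch below uses Python's standard `csv` module, not Arrow's CSV reader, and the sample bytes and the `read_header` helper are made up for illustration. It only shows why a leading byte order mark can end up attached to a quoted first header field, and how decoding with `utf-8-sig` avoids that.

```python
import csv
import io

# Hypothetical sample: a UTF-8 file that starts with a BOM and has a quoted header.
raw_bytes = b'\xef\xbb\xbf"a","b"\n1,2\n'


def read_header(data, encoding):
    """Decode the bytes with the given encoding and return the first CSV row."""
    text = data.decode(encoding)
    return next(csv.reader(io.StringIO(text)))


# Plain UTF-8 keeps the BOM as U+FEFF in front of the opening quote,
# so the first field no longer starts with a quote character.
header_with_bom = read_header(raw_bytes, "utf-8")

# "utf-8-sig" strips a leading BOM before the CSV parser sees the text.
header_clean = read_header(raw_bytes, "utf-8-sig")
```

Whether Arrow's reader fails in the same way is not confirmed here; the issue itself is the place to check.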

[GitHub] [arrow-nanoarrow] paleolimbot commented on pull request #17: Buffer element appenders

2022-08-10 Thread GitBox
paleolimbot commented on PR #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17#issuecomment-1211464135 Ok - this is a first pass at #8 that implements the functions needed to make "build by buffer" a thing. The `ArrowBufferAppendInt8()` family of functions helps make code that

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #17: Buffer element appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17#discussion_r943029428 ## src/nanoarrow/buffer_test.cc: ## @@ -160,3 +160,31 @@ TEST(BufferTest, BufferTestError) { ArrowBufferReset(); } + +TEST(BufferTest,

[jira] [Created] (ARROW-17381) [C++] Centralize Errors in ExecPlan

2022-08-10 Thread Sasha Krassovsky (Jira)
Sasha Krassovsky created ARROW-17381: Summary: [C++] Centralize Errors in ExecPlan Key: ARROW-17381 URL: https://issues.apache.org/jira/browse/ARROW-17381 Project: Apache Arrow Issue

[jira] [Created] (ARROW-17380) Tag record batches with start_byte and end_byte information

2022-08-10 Thread Ziheng Wang (Jira)
Ziheng Wang created ARROW-17380: Summary: Tag record batches with start_byte and end_byte information Key: ARROW-17380 URL: https://issues.apache.org/jira/browse/ARROW-17380 Project: Apache Arrow

[GitHub] [arrow-adbc] lidavidm commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-10 Thread GitBox
lidavidm commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1211236715 Returning strings from a C API is a bit annoying and I'm not sure whether this is preferable, or if we want to go with an ODBC-style API (pass a caller-allocated buffer and length
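
As a rough illustration of the ODBC-style option mentioned in this comment (the caller allocates a buffer and passes its length), here is a minimal sketch. It is written in Python with a `bytearray` standing in for a C buffer; `get_string_option` and the sample value are hypothetical and are not part of ADBC or ODBC.

```python
def get_string_option(value, buffer):
    """Copy `value` into the caller's buffer, NUL-terminated and truncated to fit.

    Returns the number of bytes needed for the full value plus the
    terminator, so the caller can detect truncation and retry.
    """
    needed = len(value) + 1
    if len(buffer) > 0:
        n = min(len(value), len(buffer) - 1)
        buffer[:n] = value[:n]  # same-length slice assignment keeps the buffer size
        buffer[n] = 0           # terminator, as a C string would have
    return needed


# Hypothetical caller: try a small buffer first, then grow it if it was too small.
value = b"postgresql"
buf = bytearray(4)
needed = get_string_option(value, buf)
if needed > len(buf):
    buf = bytearray(needed)
    get_string_option(value, buf)
result = bytes(buf[:-1])  # drop the terminator
```

The trade-off this sketch shows is that the callee never allocates, at the cost of a possible second call when the first buffer is too small.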

[GitHub] [arrow-adbc] lidavidm commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-10 Thread GitBox
lidavidm commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1211235028 Punting on paramstyle and last inserted ID, but adding row count and current catalog: ```diff commit 50b2e40d727c0a51029d7f5506c0696b3a19a3b9 Author: David Li Date:

[GitHub] [arrow-adbc] zeroshade commented on issue #60: [Format] Retrieve expected param binding information

2022-08-10 Thread GitBox
zeroshade commented on issue #60: URL: https://github.com/apache/arrow-adbc/issues/60#issuecomment-1211193521 Seems good to me! :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [arrow-adbc] lidavidm commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-10 Thread GitBox
lidavidm commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1211184131 So looking into it - rowcount is easy to bind, but hard to support (lots of things don't support it or only support it for inserts) - that's OK. Flight SQL only exposes it for

[GitHub] [arrow-adbc] lidavidm merged pull request #58: [C][Python] Add options to control append vs create for bulk ingest

2022-08-10 Thread GitBox
lidavidm merged PR #58: URL: https://github.com/apache/arrow-adbc/pull/58

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #17: Buffer element appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17#discussion_r942812080 ## src/nanoarrow/buffer_test.cc: ## @@ -160,3 +160,31 @@ TEST(BufferTest, BufferTestError) { ArrowBufferReset(); } + +TEST(BufferTest,

[GitHub] [arrow-nanoarrow] codecov-commenter commented on pull request #17: Buffer element appenders

2022-08-10 Thread GitBox
codecov-commenter commented on PR #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17#issuecomment-1211146044 #

[GitHub] [arrow-nanoarrow] paleolimbot opened a new pull request, #17: Buffer element appenders

2022-08-10 Thread GitBox
paleolimbot opened a new pull request, #17: URL: https://github.com/apache/arrow-nanoarrow/pull/17 It turns out this is really annoying to do otherwise! Declaring a variable of an appropriate type gets verbose when switching on type, and it sounds like these functions might be useful for

[GitHub] [arrow-adbc] lidavidm commented on issue #60: [Format] Retrieve expected param binding information

2022-08-10 Thread GitBox
lidavidm commented on issue #60: URL: https://github.com/apache/arrow-adbc/issues/60#issuecomment-1211134073 How about this? The parameters are always encoded as a schema, but unknown types are represented as just NullType. Avoids having lots of optional things/multiple calls.

[jira] [Created] (ARROW-17379) [C++][Docs] Create tutorial content for Acero

2022-08-10 Thread Kae Suarez (Jira)
Kae Suarez created ARROW-17379: Summary: [C++][Docs] Create tutorial content for Acero Key: ARROW-17379 URL: https://issues.apache.org/jira/browse/ARROW-17379 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-17378) [C++][Docs] Create tutorial content for basic Flight usage

2022-08-10 Thread Kae Suarez (Jira)
Kae Suarez created ARROW-17378: Summary: [C++][Docs] Create tutorial content for basic Flight usage Key: ARROW-17378 URL: https://issues.apache.org/jira/browse/ARROW-17378 Project: Apache Arrow

[jira] [Created] (ARROW-17377) [C++][Docs] Create tutorial content for basic Arrow, file access, compute, and datasets

2022-08-10 Thread Kae Suarez (Jira)
Kae Suarez created ARROW-17377: Summary: [C++][Docs] Create tutorial content for basic Arrow, file access, compute, and datasets Key: ARROW-17377 URL: https://issues.apache.org/jira/browse/ARROW-17377

[jira] [Created] (ARROW-17376) [C++][Docs] Create a 10 minutes to Arrow article

2022-08-10 Thread Kae Suarez (Jira)
Kae Suarez created ARROW-17376: Summary: [C++][Docs] Create a 10 minutes to Arrow article Key: ARROW-17376 URL: https://issues.apache.org/jira/browse/ARROW-17376 Project: Apache Arrow Issue

[jira] [Created] (ARROW-17375) [C++][Docs] Create installation instructional documentation in Sphinx

2022-08-10 Thread Kae Suarez (Jira)
Kae Suarez created ARROW-17375: Summary: [C++][Docs] Create installation instructional documentation in Sphinx Key: ARROW-17375 URL: https://issues.apache.org/jira/browse/ARROW-17375 Project: Apache

[GitHub] [arrow-adbc] lidavidm commented on issue #60: [Format] Retrieve expected param binding information

2022-08-10 Thread GitBox
lidavidm commented on issue #60: URL: https://github.com/apache/arrow-adbc/issues/60#issuecomment-1211081338 That makes sense.

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942753583 ## src/nanoarrow/array_inline.h: ## @@ -0,0 +1,246 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

[GitHub] [arrow-adbc] zeroshade commented on issue #60: [Format] Retrieve expected param binding information

2022-08-10 Thread GitBox
zeroshade commented on issue #60: URL: https://github.com/apache/arrow-adbc/issues/60#issuecomment-1211012442 at a minimum it would be good to be able to at least know the *number* of expected inputs even if the schema isn't knowable. Maybe having two values? an integer indicating

[GitHub] [arrow-adbc] lidavidm commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-10 Thread GitBox
lidavidm commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1210994922 CC @pitrou, @hannes, @krlmlr if you have opinions here? @lwhite1 had the same feedback about executeQuery/execute in Java last month. So for consistency a query method

[GitHub] [arrow-adbc] lidavidm commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-10 Thread GitBox
lidavidm commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1210989539 Also, possibly the driver manager could define execute-with-result-set and execute-with-rows-affected in terms of the generic execute + generic getters to retrieve the affected

[GitHub] [arrow-adbc] lidavidm commented on issue #61: [Format] Simplify Execute and Query interface

2022-08-10 Thread GitBox
lidavidm commented on issue #61: URL: https://github.com/apache/arrow-adbc/issues/61#issuecomment-1210987487 I think it may still have sense to have a generic Execute to ease compatibility with APIs that do not differentiate between the types of queries (and note JDBC has all three!), but

[GitHub] [arrow-adbc] lidavidm commented on issue #60: [Format] Retrieve expected param binding information

2022-08-10 Thread GitBox
lidavidm commented on issue #60: URL: https://github.com/apache/arrow-adbc/issues/60#issuecomment-1210983818 Flight SQL provides this. I think this makes sense, but yeah, something like `SELECT ?, ?` is going to be dubious. I don't know if there's a great way of indicating that, though.

[GitHub] [arrow-adbc] lidavidm commented on issue #59: Provide a "just query" method

2022-08-10 Thread GitBox
lidavidm commented on issue #59: URL: https://github.com/apache/arrow-adbc/issues/59#issuecomment-1210982209 I think this makes sense to provide as a convenience, but maybe not as the only method. The separate Statement object still lets us configure any options in an ABI-compatible way

[GitHub] [arrow-adbc] zeroshade opened a new issue, #61: Simplify Execute and Query interface

2022-08-10 Thread GitBox
zeroshade opened a new issue, #61: URL: https://github.com/apache/arrow-adbc/issues/61 Rather than the separate `Execute` / `GetStream` functions, it might be better to follow something similar to FlightSQL's interface or Go's `database/sql` API. Have two functions: * Execute

[GitHub] [arrow-adbc] zeroshade commented on issue #59: Provide a "just query" method

2022-08-10 Thread GitBox
zeroshade commented on issue #59: URL: https://github.com/apache/arrow-adbc/issues/59#issuecomment-1210917005 With this, it might make sense for the `AdbcStatement` object to *only* represent a prepared statement and place the `Prepare` method on the Connection rather than on the

[GitHub] [arrow-adbc] zeroshade opened a new issue, #60: Retrieve expected param binding information

2022-08-10 Thread GitBox
zeroshade opened a new issue, #60: URL: https://github.com/apache/arrow-adbc/issues/60 If available, it would be great to be able to retrieve any information about parameter binding that is available. Some potential information that *might* be available: * Number of expected

[GitHub] [arrow-adbc] zeroshade opened a new issue, #59: Provide a "just query" method

2022-08-10 Thread GitBox
zeroshade opened a new issue, #59: URL: https://github.com/apache/arrow-adbc/issues/59 For the common case of executing a single SQL string, let's have a method on the connection object for executing the query directly without the need for an intermediate Statement object

[GitHub] [arrow-adbc] zeroshade commented on issue #55: [Format] Minor gaps with existing APIs

2022-08-10 Thread GitBox
zeroshade commented on issue #55: URL: https://github.com/apache/arrow-adbc/issues/55#issuecomment-1210860838 Two more gaps to add: * Retrieve the last inserted id for inserts into an auto-increment table * Retrieve the number of rows affected by the last query (number inserted /
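
For readers unfamiliar with the two values named in this comment, the sketch below shows how an existing API exposes them. It uses Python's standard `sqlite3` module, not ADBC, and the table and column names are made up for illustration; nothing here reflects how ADBC would surface these values.

```python
import sqlite3

# In-memory database with a hypothetical auto-increment table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")

# Last inserted id: exposed on the cursor after an INSERT.
cur = conn.execute("INSERT INTO items (name) VALUES (?)", ("first",))
last_id = cur.lastrowid

conn.execute("INSERT INTO items (name) VALUES (?)", ("second",))

# Rows affected: exposed on the cursor after an UPDATE (here, both rows change).
cur = conn.execute("UPDATE items SET name = name || '!'")
rows_affected = cur.rowcount

conn.close()
```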

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942472594 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942470780 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#issuecomment-1210685336 Just a note that I'm reworking this interface based on some thoughts after working with this for a day or so: - Instead of copying all the buffer/bitmap methods for the

[jira] [Created] (ARROW-17374) R Arrow install fails with SNAPPY_LIB-NOTFOUND

2022-08-10 Thread Shane Brennan (Jira)
Shane Brennan created ARROW-17374: Summary: R Arrow install fails with SNAPPY_LIB-NOTFOUND Key: ARROW-17374 URL: https://issues.apache.org/jira/browse/ARROW-17374 Project: Apache Arrow

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942447804 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942446454 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942446356 ## src/nanoarrow/array_inline.h: ## @@ -0,0 +1,146 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942442675 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942441450 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942439274 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942437669 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942434862 ## src/nanoarrow/utils_inline.h: ## @@ -26,6 +26,114 @@ extern "C" { #endif +static inline void ArrowLayoutInit(struct ArrowLayout* layout, +

[GitHub] [arrow-nanoarrow] paleolimbot commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
paleolimbot commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942431188 ## src/nanoarrow/typedefs_inline.h: ## @@ -166,6 +166,24 @@ enum ArrowType { NANOARROW_TYPE_INTERVAL_MONTH_DAY_NANO }; +/// \brief Functional types of

[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #16: Implement array appenders

2022-08-10 Thread GitBox
lidavidm commented on code in PR #16: URL: https://github.com/apache/arrow-nanoarrow/pull/16#discussion_r942417008 ## src/nanoarrow/nanoarrow.h: ## @@ -508,9 +512,31 @@ void ArrowArraySetValidityBitmap(struct ArrowArray* array, struct ArrowBitmap* b ArrowErrorCode

[jira] [Created] (ARROW-17373) copying dataset and immediately writing the copy to a different location fails

2022-08-10 Thread Egill Axfjord Fridgeirsson (Jira)
Egill Axfjord Fridgeirsson created ARROW-17373: Summary: copying dataset and immediately writing the copy to a different location fails Key: ARROW-17373 URL:

[jira] [Created] (ARROW-17372) Arrow parquet go is missing Power (ppc64le) specific utils implementations

2022-08-10 Thread Marvin Giessing (Jira)
Marvin Giessing created ARROW-17372: Summary: Arrow parquet go is missing Power (ppc64le) specific utils implementations Key: ARROW-17372 URL: https://issues.apache.org/jira/browse/ARROW-17372

[jira] [Created] (ARROW-17371) [R] Remove as.factor to dictionary_encode mapping

2022-08-10 Thread Nicola Crane (Jira)
Nicola Crane created ARROW-17371: Summary: [R] Remove as.factor to dictionary_encode mapping Key: ARROW-17371 URL: https://issues.apache.org/jira/browse/ARROW-17371 Project: Apache Arrow

[jira] [Created] (ARROW-17370) [C++] Add limit to SplitString()

2022-08-10 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-17370: Summary: [C++] Add limit to SplitString() Key: ARROW-17370 URL: https://issues.apache.org/jira/browse/ARROW-17370 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-17369) [CI][Java] Automatically extract dependent library versions from pom.xml for s390x

2022-08-10 Thread Kazuaki Ishizaki (Jira)
Kazuaki Ishizaki created ARROW-17369: Summary: [CI][Java] Automatically extract dependent library versions from pom.xml for s390x Key: ARROW-17369 URL: https://issues.apache.org/jira/browse/ARROW-17369