Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-12-05 Thread via GitHub


paleolimbot commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3617230265

   Agreed (that's what I have always liked about schema negotiation).
   
   Should I put that bit back into this PR or is that a separate documentation 
update?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-12-04 Thread via GitHub


lidavidm commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3615500567

   My concern with options is that I'd want them to be reasonably consistent 
across drivers. Maybe it doesn't have to be part of this function, but I also 
don't want every driver to have its own slightly different way of choosing 
between string/string_view/large_string.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-11-19 Thread via GitHub


paleolimbot commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3555491528

   I can add that bit back in too...whatever is less confusing! We can also 
experiment with schema negotiation afterward (I don't think it affects the 
behaviour of the function but I could be missing something). I think options 
are orthogonal...options affect the guessing of the schema in the absence of a 
request and we could have those, too.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-11-19 Thread via GitHub


lidavidm commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3555270014

   Well, maybe we want to cover cases like explicitly choosing between 
string/large_string/string_view (in principle the driver would have picked one, 
but maybe the user wants one of the other types and this can avoid a later 
conversion). Or is the thought that we need to add driver-specific flags for 
all of that? @CurtHagenlocher had commented before that with JDBC/ODBC, you get 
to pick the type you want when you get the data - that's obviously too much 
flexibility for an Arrow stream, but I don't see why we should limit this to 
only cases where it is ambiguous to the driver. (In the first place, the 
underlying data isn't Arrow much of the time; even if it is, ideally the 
request would be pushed back to the backend.) 
   
   The driver could certainly ignore requests to cast, of course, especially if 
it's actually just going to be a cast internally.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-11-12 Thread via GitHub


paleolimbot commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3521890311

   > Is the expectation that the driver itself would cast/convert the types as 
they come back from the source?
   
   Yes, it's intended as guidance where a driver would otherwise have to guess. 
I wrote it up here as best-effort (same as PyCapsule `requested_schema`)...the 
other choice would be that the driver is required to produce that schema. I 
don't happen think that's particularly useful (whatever schema is produced by 
the output stream is always the stream that should be used to interpret the 
data), but for the immediate use cases (SQLite and Postgres drivers) it would 
also not be a problem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-11-11 Thread via GitHub


zeroshade commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3519905815

   Mostly I'm just trying to cut down the number of extra functions to create. 
If the idea is that we wouldn't need to create a new function to handle 
multiple result sets, then I'm okay with that. And that just leaves my first 
question


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-11-11 Thread via GitHub


lidavidm commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3519320223

   Wouldn't you just call this in between NextResultSetSchema and 
NextResultSet? We'd just generalize NextResultSetSchema.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-11-11 Thread via GitHub


zeroshade commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3518831086

   two questions:
   
   1. Is the expectation that the driver itself would cast/convert the types as 
they come back from the source? 
   2. Since we're also working on and discussing handling multiple result sets, 
how would this work/handle the case of multiple result sets? Would we need an 
entirely new function once we add support for multiple result sets to handle 
this for that case? Is there any way we can adapt this function so that it can 
work in both cases somehow instead of needing to create an entirely new 
function afterwards?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-11-11 Thread via GitHub


paleolimbot commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3518246126

   I think the reason for adding this would be for a user who executed and 
collected a query but got an error along the lines of "can't append string to 
column of type int64", which is what would happen today if someone issued a 
query with the sqllite driver and first few thousand values were null.
   
   The second reason is for people who already know the arrow type they want, 
perhaps because they're using a tool like ibis that already knows or because 
they executed the query and the driver guessed something they didn't like or 
that was inconsistent with a previous query against the same data.
   
   I should probably take out the negotiation bit since it seems to be 
distracting from the intent...it's orthogonal to options, which would only 
affect the default guessing (but still have no way to specifically specify type 
parameters or specific problematic columns).
   
   (These are all roughly the same reason why a requested schema is provided 
for a csv reader.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-11-11 Thread via GitHub


amoeba commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3518006425

   This looks pretty good. On one hand, it's not the most ergonomic way to 
handle this but it's the most regular with the existing API. Am I right that 
the most convenient way to use this API may be to first call 
`AdbcStatementExecuteSchema` (and hope it's implemented), then modify the 
returned schema to my desire, call `AdbcStatementRequestSchema` with the 
modified schema, and then execute my statement?
   
   Another question, should it be spelled out that the schema can be partial or 
not?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-10-27 Thread via GitHub


paleolimbot commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3454380931

   > I wonder if we also want to somehow allow things like "read 'field_name' 
as string view, but leave other fields alone" or "read all 'database_type' as 
string view, but leave other fields alone"
   
   Sure, or some version of an option get/set with of a list of supported or 
preferred types. Happy to take out the negotiation bit if that's confusing (the 
main purpose here is to help the driver make better choices for loose or 
row-based typing).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(c/include/arrow-adbc): Add AdbcStatementRequestSchema [arrow-adbc]

2025-10-27 Thread via GitHub


lidavidm commented on PR #3623:
URL: https://github.com/apache/arrow-adbc/pull/3623#issuecomment-3453816347

   Thanks for this!
   
   I wonder if we also want to somehow allow things like "read 'field_name' as 
string view, but leave other fields alone" or "read all 'database_type' as 
string view, but leave other fields alone"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]