[jira] [Commented] (CALCITE-2322) Add fetch size support to connection url and JDBC statement

Julian Hyde (Jira) Tue, 29 Jun 2021 14:52:05 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371654#comment-17371654
 ]

Julian Hyde commented on CALCITE-2322:
--------------------------------------

I agree with [~vlsi] that it is more useful to "fetch N bytes" than "fetch N 
rows". (Efficiency is about doing a reasonable amount of work per network 
round-trip and per batch of rows, and since the scarce resources are network 
packet size and CPU cache, things are best measured in bytes.)

@vlsi wrote:
{quote}I don't really know how to add a byte-sized property side by side with 
fetch_row_count.{quote}

I think the ideal strategy will be a mix of rows and bytes, something like 
"fetch 100 rows or 64K bytes, whichever is larger, but in any case not more 
than 1024K bytes". That strategy wouldn't make much sense if decomposed into 4 
properties ("minRows", "maxRows", "minBytes", "maxBytes") because someone might 
forget to set one property and end up with a default that contradicts the other 
properties. So I would have a new property called "fetchPolicy", which combines 
them all. (The format of "fetchPolicy" is TBD, but you can think of it as a 
JSON string containing various numbers.)
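
To make the combined rule concrete, here is a minimal sketch in Java. It is only one reading of "100 rows or 64K bytes, whichever is larger, but not more than 1024K bytes": a batch keeps growing until both the row target and the byte target are met, and is cut off at the byte cap. The class name FetchPolicy, its fields and the method batchIsFull are hypothetical and do not exist in Avatica; the format of the real property is still TBD.

```java
/**
 * Hypothetical sketch of a combined fetch policy. None of these names
 * exist in Calcite or Avatica; they are assumptions for illustration.
 */
final class FetchPolicy {
  private final int minRows;   // e.g. 100 rows
  private final long minBytes; // e.g. 64K bytes
  private final long maxBytes; // e.g. 1024K bytes, a hard cap

  FetchPolicy(int minRows, long minBytes, long maxBytes) {
    // Validating all the numbers together is the point of a single
    // policy: one value cannot silently contradict the others.
    if (minRows < 1 || minBytes < 0 || maxBytes < minBytes) {
      throw new IllegalArgumentException("inconsistent fetch policy");
    }
    this.minRows = minRows;
    this.minBytes = minBytes;
    this.maxBytes = maxBytes;
  }

  /** Returns whether the batch should be sent without adding more rows. */
  boolean batchIsFull(int rowsSoFar, long bytesSoFar) {
    if (bytesSoFar >= maxBytes) {
      return true; // hard cap, regardless of row count
    }
    // "Whichever is larger": both targets must have been reached.
    return rowsSoFar >= minRows && bytesSoFar >= minBytes;
  }
}
```

With new FetchPolicy(100, 64 * 1024, 1024 * 1024), a batch of 100 narrow rows totalling 10K bytes keeps growing until it reaches 64K bytes, while a batch of very wide rows stops at 1024K bytes even if it holds fewer than 100 rows.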

I'm fine with Vladimir's suggestion "fetch_size_rows".

I don't like the "default_" prefix because I think of "fetch_size_rows" as a 
property that can be set at one level (in this case, when the connection is 
created) and overridden at other levels (say by setting a property during the 
life of the connection, or by overriding for a particular statement). There are 
many other properties, e.g. "locale", "timeZone", that could be set in a 
similar way. Using the same property name throughout, without "default" prefix, 
makes it easier to override at different points in the lifecycle.
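
As a rough illustration of the override idea, the sketch below looks up the same key, "fetch_size_rows", at each level and takes the most specific value that is set. The class FetchSizeResolver, the three Properties arguments and the lookup order are assumptions for illustration, not existing Avatica code; the fallback of 100 is the hard-coded default mentioned in the issue description.

```java
import java.util.Properties;

/** Hypothetical sketch; this class is not part of Calcite or Avatica. */
final class FetchSizeResolver {
  /** Same property name at every level, with no "default_" prefix. */
  static final String KEY = "fetch_size_rows";

  /** Built-in fallback: the hard-coded 100 rows from the issue description. */
  static final int BUILT_IN_FETCH_SIZE_ROWS = 100;

  /** Resolves the fetch size, most specific level first. */
  static int resolve(Properties statementProps,
      Properties sessionProps,
      Properties connectionProps) {
    Properties[] levels = {statementProps, sessionProps, connectionProps};
    for (Properties level : levels) {
      String value = level.getProperty(KEY);
      if (value != null) {
        return Integer.parseInt(value.trim());
      }
    }
    return BUILT_IN_FETCH_SIZE_ROWS;
  }
}
```

A value set when the connection is created applies until a later setting on the connection or on a particular statement overrides it, and other properties such as "locale" or "timeZone" could be resolved the same way.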

> Add fetch size support to connection url and JDBC statement
> -----------------------------------------------------------
>
>                 Key: CALCITE-2322
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2322
>             Project: Calcite
>          Issue Type: Improvement
>          Components: avatica, core
>    Affects Versions: 1.11.0
>            Reporter: Kevin Minder
>            Priority: Major
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Currently the remote driver defaults to a hard-coded fetch size of 100 rows.  
> When a connection is operating in HTTP mode having such a small fetch size 
> can add enormous overhead.  This is especially true if TLS connections are 
> used and made worse if each connection flows through multiple proxies.  
> Consider that 100K rows returned 100 rows at a time will make 1K HTTP POST 
> requests.  One might say that nobody should ever do that but some tools like 
> Spotfire may end up doing this.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
