[ https://issues.apache.org/jira/browse/KUDU-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

DawnZhang updated KUDU-1985:
----------------------------
    Description: 
Dear Kudu developers,

A STRING field costs at least 16 bytes in the row data when scan results are 
transferred. I read the source code and found that the cpptype for STRING is 
kudu::Slice (which contains offset and size info), so each value always takes 
16 bytes in the row data sidecar.

When a scan result contains many short or null strings, transferring it over 
the network can be slow (compared with scanning Parquet).
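
To make the overhead concrete, here is a rough sketch of the arithmetic. It is 
not Kudu's actual serialization code. It assumes a 64-bit build where the 
Slice-like cell is 8 bytes of offset plus 8 bytes of size, treats null values 
as empty strings, and compares against a hypothetical varint length-prefixed 
layout that Kudu does not implement today. The names FixedCellBytes, 
VarintBytes and LengthPrefixedBytes are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Fixed per-cell cost reported in this issue: one Slice-like (offset, size)
// pair per STRING cell. Assumption: 8 bytes each on a 64-bit build.
constexpr std::size_t kSliceCellBytes = 16;

// Simplified model of the current layout: 16 bytes per cell in the row data
// plus the string payload itself. Null cells are modeled as empty strings.
std::size_t FixedCellBytes(const std::vector<std::string>& values) {
  std::size_t total = 0;
  for (const std::string& v : values) {
    total += kSliceCellBytes + v.size();
  }
  return total;
}

// Number of bytes a base-128 varint needs to encode n.
std::size_t VarintBytes(std::uint64_t n) {
  std::size_t bytes = 1;
  while (n >= 0x80) {
    n >>= 7;
    ++bytes;
  }
  return bytes;
}

// Hypothetical alternative (not an existing Kudu format): each value is
// written as a varint length followed by the payload.
std::size_t LengthPrefixedBytes(const std::vector<std::string>& values) {
  std::size_t total = 0;
  for (const std::string& v : values) {
    total += VarintBytes(v.size()) + v.size();
  }
  return total;
}
```

For the three values "", "a" and "ok", FixedCellBytes gives 51 bytes 
(3 * 16 + 3) while LengthPrefixedBytes gives 6 bytes (3 * 1 + 3), so under 
these assumptions almost all of the transferred bytes are per-cell overhead.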

Do you have any plans to optimize this?



  was:
Dear Kudu developers,

A STRING field costs at least 16 bytes in the row data when scan results are 
transferred. The cpptype for STRING is kudu::Slice (which contains offset and 
size info) and always takes 16 bytes in the row data sidecar.

When a scan result contains many short or null strings, transferring it over 
the network can be slow (compared with scanning Parquet).

Do you have any plans to optimize this?




> optimize result transferring performance for scanning short/null STRING values
> ------------------------------------------------------------------------------
>
>                 Key: KUDU-1985
>                 URL: https://issues.apache.org/jira/browse/KUDU-1985
>             Project: Kudu
>          Issue Type: Wish
>          Components: client, tserver
>            Reporter: DawnZhang
>
> Dear Kudu developers,
> A STRING field costs at least 16 bytes in the row data when scan results are 
> transferred. I read the source code and found that the cpptype for STRING is 
> kudu::Slice (which contains offset and size info), so each value always takes 
> 16 bytes in the row data sidecar.
> When a scan result contains many short or null strings, transferring it over 
> the network can be slow (compared with scanning Parquet).
> Do you have any plans to optimize this?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
