[ 
https://issues.apache.org/jira/browse/IMPALA-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787132#comment-17787132
 ] 

Csaba Ringhofer commented on IMPALA-12373:
------------------------------------------

>I think we don't need NULL termination so we can store actually 11 chars with 
>libc++'s technique.
In case the StringValue is inside a tuple, it may be possible to store even 
more chars in the tuple.

The idea is to reserve some bytes before the StringValue. If the last bit 
marks it as a small string but the length is > 11, then we could assume that 
the string starts len - 11 bytes before the address of the StringValue. This 
could speed things up a bit when we have stats about avg/max string length, as 
the number of extra bytes could be chosen to minimize waste.
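
A rough sketch of how this could look, to make the address arithmetic concrete. 
Everything here is hypothetical and not Impala code: the names StoreSmall, 
LoadSmall, kSlotSize, kInlineCap and kSmallBit are made up, the slot is treated 
as 12 raw bytes, and the top bit of the last byte is assumed to be the "small" 
flag with the length in the remaining 7 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical layout: a 12-byte slot (8-byte ptr + 4-byte len) whose last
// byte doubles as a tag. Not the actual StringValue implementation.
constexpr std::size_t kSlotSize = 12;
constexpr std::size_t kInlineCap = kSlotSize - 1;  // 11 chars fit in the slot
constexpr std::uint8_t kSmallBit = 0x80;           // assumed "small" flag

// Stores `s` so that it ends at slot[kInlineCap - 1]. Chars beyond the first
// 11 spill into the `reserved` bytes that the tuple keeps before the slot.
// Returns false if the string does not fit and needs the regular ptr/len form.
bool StoreSmall(unsigned char* slot, std::size_t reserved, const std::string& s) {
  if (s.size() > kInlineCap + reserved || s.size() > 0x7F) return false;
  std::size_t spill = s.size() > kInlineCap ? s.size() - kInlineCap : 0;
  std::memcpy(slot - spill, s.data(), s.size());
  slot[kSlotSize - 1] = static_cast<unsigned char>(kSmallBit | s.size());
  return true;
}

// Reads a small string back: if len > 11, it starts len - 11 bytes before
// the slot address, otherwise it starts at the slot itself.
std::string LoadSmall(const unsigned char* slot) {
  std::uint8_t tag = slot[kSlotSize - 1];
  assert((tag & kSmallBit) != 0);
  std::size_t len = tag & 0x7F;
  std::size_t spill = len > kInlineCap ? len - kInlineCap : 0;
  return std::string(reinterpret_cast<const char*>(slot - spill), len);
}
```

With 4 reserved bytes before the slot, strings of up to 15 chars would stay 
inside the tuple in this sketch; longer ones would fall back to ptr/len.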

> Implement Small String Optimization for StringValue
> ---------------------------------------------------
>
>                 Key: IMPALA-12373
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12373
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: Performance
>         Attachments: small_string.cpp
>
>
> Implement Small String Optimization for StringValue.
> Current memory layout of StringValue is:
> {noformat}
>   char* ptr;  // 8 byte
>   int len;    // 4 byte
> {noformat}
> For small strings with size up to 8 we could store the string contents in the 
> bytes of the 'ptr'. Something like this:
> {noformat}
>   union {
>     char* ptr;
>     char small_buf[sizeof(ptr)];
>   };
>   int len;
> {noformat}
> Many C++ string implementations use the {{Small String Optimization}} to 
> speed up work with small strings. For example:
> {code:java}
> Microsoft STL, libstdc++, libc++, Boost, Folly.{code}
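
A minimal sketch of the union layout from the description, assuming that 'len' 
alone decides which union member is valid (len <= sizeof(ptr) means the bytes 
are stored inline). The names SmallStringValue, Make, IsSmall and Data are 
hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical struct mirroring the proposed layout, not the real StringValue.
struct SmallStringValue {
  union {
    const char* ptr;                       // used when len > sizeof(ptr)
    char small_buf[sizeof(const char*)];   // used when len <= sizeof(ptr)
  };
  std::int32_t len;

  // Copies short strings into the pointer bytes, otherwise keeps the pointer.
  static SmallStringValue Make(const char* data, std::int32_t n) {
    assert(n >= 0);
    SmallStringValue v;
    v.ptr = nullptr;
    v.len = n;
    if (v.IsSmall()) {
      std::memcpy(v.small_buf, data, static_cast<std::size_t>(n));
    } else {
      v.ptr = data;
    }
    return v;
  }

  bool IsSmall() const { return len <= static_cast<std::int32_t>(sizeof(small_buf)); }

  // Returns the start of the characters regardless of representation.
  const char* Data() const { return IsSmall() ? small_buf : ptr; }
};
```

Note that without packing, a typical 64-bit ABI pads this struct to 16 bytes, 
so the 12-byte layout shown in the description would need a packed struct.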


