Andrey Mashenkov created IGNITE-14743: -----------------------------------------
             Summary: Support large Tuples.
                 Key: IGNITE-14743
                 URL: https://issues.apache.org/jira/browse/IGNITE-14743
             Project: Ignite
          Issue Type: Improvement
            Reporter: Andrey Mashenkov
             Fix For: 3.0


For now, TupleAssembler writes offsets for varlen columns as a 2-byte {{short}} type. This implicitly restricts the total key/value size to 64 kB.

Let's use a 4-byte {{int}} type for offsets in large tuples. Possible approaches:
# Just use ints for all offsets; this increases the memory overhead for Rows.
# Pre-calculate the potential row size during SchemaDescriptor initialization and keep 'offset_size' in the schema. Since unlimited varlen types (the default) are common, users will end up with a 4-byte offset size in most cases.
# Pre-calculate the exact tuple size for each row and use row flags. This requires analyzing the Tuple data, which we already do to detect nulls and non-null varlen values. Strings may be a headache, as we would have to analyze each char for an accurate tuple size calculation.
# Pre-calculate the tuple size, skipping char analysis.

The adaptive offset_size approaches allow us to use 1-, 2-, or 4-byte numbers (byte, short, int) for offsets.
Collations for String columns could be introduced and used as a hint, but we would then need to check the collation for every char on write.
'Large keys' is an unwanted case; maybe a solution for values only will be enough.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
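As a rough illustration of the adaptive offset_size idea, the sketch below picks the smallest width (1, 2, or 4 bytes) that can hold a pre-calculated maximum tuple size and writes offsets with it. The class and method names here are hypothetical, not the actual Ignite 3 TupleAssembler API:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: choose an offset width from the (pre-calculated)
// maximum possible tuple size, then write varlen-column offsets with it.
public class AdaptiveOffsets {
    /** Smallest unsigned width (1, 2 or 4 bytes) that can hold {@code maxTupleSize}. */
    static int offsetSize(long maxTupleSize) {
        if (maxTupleSize <= 0xFF) return Byte.BYTES;    // fits an unsigned byte
        if (maxTupleSize <= 0xFFFF) return Short.BYTES; // fits an unsigned short (current limit)
        return Integer.BYTES;                           // large tuple: 4-byte int offsets
    }

    /** Writes one vartable offset using the chosen width. */
    static void writeOffset(ByteBuffer buf, int offset, int offSize) {
        switch (offSize) {
            case Byte.BYTES: buf.put((byte) offset); break;
            case Short.BYTES: buf.putShort((short) offset); break;
            default: buf.putInt(offset);
        }
    }

    public static void main(String[] args) {
        System.out.println(offsetSize(200));     // small tuple -> 1
        System.out.println(offsetSize(60_000));  // fits today's 2-byte short -> 2
        System.out.println(offsetSize(100_000)); // large tuple -> 4
    }
}
```

A flag in the row header (approach 3) would record which width was used so the reader can decode the vartable.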