[
https://issues.apache.org/jira/browse/PHOENIX-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16299450#comment-16299450
]
Vincent Poon commented on PHOENIX-4382:
---------------------------------------
- TINYINT or BYTE(1) should work fine since I explicitly check for length == 2,
but we can add a test case to make sure
- If there are multiple null values, it actually makes things less ambiguous -
that's the hack I used. For example, to distinguish {separatorByte, 2}
(representing two nulls) from a real value with the same bytes, we check the
previous value: if the bytes represent nulls, the previous value should have a
length of 0 (we know the length by checking the offsets). You only get the
problem with a single null because there is no prior zero-length value to check.
- yes, BYTE(2) would have the same issue.
- yes, variable length types should be fine
- good point, agreed, probably not worth the effort
- As for the subject, that would be accurate AFTER the patch. Again, before my
patch, any column value starting with the separator byte would be broken.
That's why I found large portions of e.g. the BigInt range being returned as
null, even though those values are more than 2 bytes.
I think we can list that as a known issue with V1 on the storage schemes
webpage. It's limited to just two specific two-byte fixed-length values,
though.
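
A rough sketch of the previous-length check described in the second bullet. This
is one reading of the comment, not the code in the patch: SEPARATOR is a
placeholder value, the NullRunCheck class, its Verdict enum, and the offsets
layout (one start offset per element plus an end sentinel) are all hypothetical,
and treating a zero-length element as the earlier part of a null run is an
assumption made for the illustration.

```java
final class NullRunCheck {
    // Placeholder for illustration; not necessarily the byte Phoenix uses.
    static final byte SEPARATOR = (byte) 0xFF;

    enum Verdict { VALUE, NULL, AMBIGUOUS }

    // Length of element i, derived from consecutive offsets.
    // offsets holds one start offset per element plus a trailing end offset.
    static int length(int[] offsets, int i) {
        return offsets[i + 1] - offsets[i];
    }

    static Verdict classify(byte[] data, int[] offsets, int i) {
        int len = length(offsets, i);
        if (len == 0) {
            return Verdict.NULL;  // assumed: earlier member of a null run
        }
        if (len != 2 || data[offsets[i]] != SEPARATOR) {
            return Verdict.VALUE; // only two-byte elements can look like a null marker
        }
        if (i > 0 && length(offsets, i - 1) == 0) {
            return Verdict.NULL;  // previous element has length 0, so this is a null run
        }
        return Verdict.AMBIGUOUS; // lone {separator, n}: no prior null length to check
    }

    public static void main(String[] args) {
        // Elements: value {1,2}, null, null, value {3,4}
        byte[] data = { 1, 2, SEPARATOR, 2, 3, 4 };
        int[] offsets = { 0, 2, 2, 4, 6 };
        for (int i = 0; i < offsets.length - 1; i++) {
            System.out.println(i + " -> " + classify(data, offsets, i));
        }
    }
}
```

In this model a wider value that merely starts with the separator byte is
classified as a value because of the length == 2 check, while a lone two-byte
element starting with the separator stays ambiguous, which is the remaining
known issue mentioned above.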
> Immutable table SINGLE_CELL_ARRAY_WITH_OFFSETS values starting with separator
> byte return null in query results
> ---------------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-4382
> URL: https://issues.apache.org/jira/browse/PHOENIX-4382
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Vincent Poon
> Assignee: Vincent Poon
> Attachments: PHOENIX-4382.v1.master.patch,
> PHOENIX-4382.v2.master.patch, UpsertBigValuesIT.java
>
>
> For immutable tables, upsert of some values like Short.MAX_VALUE results in a
> null value in query resultsets. Mutable tables are not affected. I tried
> with BigInt and got the same problem.
> For Short, the breaking point seems to be 32512.
> This is happening because of the way we serialize nulls. For nulls, we write
> out [separatorByte, #_of_nulls]. However, some data values, like
> Short.MAX_VALUE, start with separatorByte, so we can't distinguish between a
> null and these values. Currently the code assumes it's a null when it sees a
> leading separatorByte, hence the incorrect query results.
> See the attached tests: testShort(), testBigInt()
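
The ambiguity described in the report can be sketched as follows. This is an
illustration only, not Phoenix code: SEPARATOR is a placeholder value picked for
the demo, and encodeNulls and looksLikeNull are hypothetical helpers that mimic
the [separatorByte, #_of_nulls] layout and the leading-byte check described
above.

```java
import java.util.Arrays;

final class NullMarkerCollision {
    // Placeholder for illustration; not necessarily the byte Phoenix uses.
    static final byte SEPARATOR = (byte) 0xFF;

    // Hypothetical helper: a run of nulls written as [separatorByte, #_of_nulls].
    static byte[] encodeNulls(int count) {
        return new byte[] { SEPARATOR, (byte) count };
    }

    // The assumption described in the report: a leading separator byte means null.
    static boolean looksLikeNull(byte[] element) {
        return element.length > 0 && element[0] == SEPARATOR;
    }

    public static void main(String[] args) {
        byte[] twoNulls = encodeNulls(2);
        byte[] twoByteValue = { SEPARATOR, 2 };                     // a real fixed-width value
        byte[] eightByteValue = { SEPARATOR, 0, 0, 0, 0, 0, 0, 1 }; // a wider real value

        // The null marker and the two-byte value are byte-for-byte identical.
        System.out.println(Arrays.equals(twoNulls, twoByteValue));
        // The leading-byte check also flags values of any width as null.
        System.out.println(looksLikeNull(twoByteValue));
        System.out.println(looksLikeNull(eightByteValue));
    }
}
```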