[
https://issues.apache.org/jira/browse/PHOENIX-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117549#comment-14117549
]
James Taylor commented on PHOENIX-1227:
---------------------------------------
I agree, this shouldn't be allowed. There should be a check for
MD5(v).getDataType().isCoercibleTo(v.getDataType()). If that check fails, we
should throw an exception. You could put an explicit CAST in the query if the
conversion is really needed.
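
As a rough illustration of the kind of check meant here, the following is a
minimal, self-contained sketch. The SqlType enum, its coercion rules, and
checkUpsertSelect are hypothetical stand-ins invented for this example; they
are not Phoenix's actual type classes or compiler code.

```java
public class CoercionCheckSketch {

    // Hypothetical, heavily simplified type model (NOT Phoenix's real types).
    enum SqlType {
        INTEGER, BIGINT, BINARY, VARBINARY;

        // Assumed rules for the sketch: a type coerces to itself,
        // INTEGER widens to BIGINT, BINARY coerces to VARBINARY.
        boolean isCoercibleTo(SqlType target) {
            if (this == target) {
                return true;
            }
            switch (this) {
                case INTEGER:
                    return target == BIGINT;
                case BINARY:
                    return target == VARBINARY;
                default:
                    return false;
            }
        }
    }

    // Rejects an upsert select whose selected expression type cannot be
    // coerced to the target column type, instead of writing raw bytes.
    static void checkUpsertSelect(SqlType selected, SqlType targetColumn) {
        if (!selected.isCoercibleTo(targetColumn)) {
            throw new IllegalArgumentException(
                "Type mismatch: " + selected + " cannot be coerced to " + targetColumn);
        }
    }

    public static void main(String[] args) {
        // An INTEGER expression into a BIGINT column passes the check.
        checkUpsertSelect(SqlType.INTEGER, SqlType.BIGINT);

        // A BINARY expression (such as an MD5 result) into an INTEGER
        // column is rejected rather than stored verbatim.
        try {
            checkUpsertSelect(SqlType.BINARY, SqlType.INTEGER);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only that the check runs before any bytes are
written, so a mismatch surfaces as an error at compile time of the statement
rather than as corrupt data.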
> Upsert select of binary data doesn't always correctly coerce data into
> correct format
> -------------------------------------------------------------------------------------
>
> Key: PHOENIX-1227
> URL: https://issues.apache.org/jira/browse/PHOENIX-1227
> Project: Phoenix
> Issue Type: Bug
> Reporter: Gabriel Reid
>
> If you run an upsert select statement that selects a binary value and writes
> a numerical value (or probably other types as well), you can end up with
> invalid binary values stored in HBase.
> For example, in something like this if v is an {{INTEGER}} column:
> {code}UPSERT INTO MYTABLE (v) SELECT MD5(v) FROM MYTABLE{code}
> the literal 16-byte binary values from the MD5 function will be added
> verbatim into the field v.
> This is a really big problem if v is the key field, as it can even lead to
> multiple keys with what appear to be the same value. This happens if there
> are multiple (invalid) row keys that begin with the same 4 bytes, as only the
> first 4 bytes of the key will be shown when selecting data from the column,
> but the different full-length values of the row keys will lead to multiple
> records.
> Somewhat related to this, a statement like the following (with a constant
> binary value) will fail immediately due to datatype mismatch:
> {code}UPSERT INTO MYTABLE (v) SELECT MD5(1) FROM MYTABLE{code}
> It seems that the first expression above should probably fail in the same way
> as the expression with the constant binary value (or neither of them should
> fail). Obviously there shouldn't be any invalid values being written into
> HBase.
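
The duplicate-key symptom described in the quoted report can be reproduced
outside of Phoenix with plain byte arrays. The sketch below is an assumption-
laden illustration: it decodes the first 4 bytes as a plain big-endian int,
which is not necessarily how Phoenix serializes INTEGER values, and the class
and method names are made up for the example.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class KeyPrefixSketch {

    // Reads only the first 4 bytes of a key as a big-endian int, mimicking
    // a reader that assumes the key is a 4-byte integer (assumed encoding).
    static int decodeFirstFourBytes(byte[] key) {
        return ByteBuffer.wrap(key, 0, 4).getInt();
    }

    // Builds a 16-byte key with a fixed 4-byte prefix and a varying last byte.
    static byte[] sixteenByteKey(byte lastByte) {
        byte[] key = new byte[16];
        key[0] = 0x00;
        key[1] = 0x00;
        key[2] = 0x01;
        key[3] = 0x02;
        key[15] = lastByte;
        return key;
    }

    public static void main(String[] args) {
        byte[] keyA = sixteenByteKey((byte) 0x0A);
        byte[] keyB = sixteenByteKey((byte) 0x0B);

        // Both keys look like the same integer when only 4 bytes are read...
        System.out.println(decodeFirstFourBytes(keyA) == decodeFirstFourBytes(keyB));

        // ...but they are different byte sequences, so a byte-ordered store
        // would keep them as two separate rows.
        System.out.println(Arrays.equals(keyA, keyB));
    }
}
```

Two 16-byte values that share a 4-byte prefix are indistinguishable to a
reader expecting a 4-byte key, which matches the "multiple keys with what
appear to be the same value" behaviour in the report.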