Samuel Yuan created HIVE-4322:
---------------------------------

             Summary: SkewedInfo in Metastore Thrift API cannot be deserialized in Python
                 Key: HIVE-4322
                 URL: https://issues.apache.org/jira/browse/HIVE-4322
             Project: Hive
          Issue Type: Bug
          Components: Metastore, Thrift API
    Affects Versions: 0.11.0
            Reporter: Samuel Yuan
            Assignee: Samuel Yuan
            Priority: Minor


The Thrift-generated Python code fails to deserialize a Thrift object whenever 
a complex type is used as a map key, because mutable Python objects such as 
lists are unhashable by default. See 
https://issues.apache.org/jira/browse/THRIFT-162 for related discussion.
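
The failure can be reproduced without Thrift at all. The following is a 
minimal sketch in plain Python; the helper name store and the sample key and 
value are hypothetical and only illustrate that a list is rejected as a dict 
key while an equivalent tuple is accepted:

```python
def store(mapping, key, value):
    """Insert key -> value; return the error text instead of raising."""
    try:
        mapping[key] = value
        return None
    except TypeError as exc:
        # Raised when the key type has no usable __hash__.
        return str(exc)


# Hypothetical sample data, not taken from a real metastore.
list_error = store({}, ["2012", "US"], "/some/location")
tuple_error = store({}, ("2012", "US"), "/some/location")

print("list key: ", list_error)
print("tuple key:", tuple_error)
```

The list key produces a TypeError mentioning an unhashable type, which is the 
same class of error the generated deserializer runs into when it builds the 
map.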

The SkewedInfo struct contains a map that uses a list as its key, which breaks 
the Python Thrift interface. It is not possible to customize the mapping from 
Thrift types to Python types; if it were, Thrift lists could be mapped to 
Python tuples. Instead, the proposed workaround wraps the list inside a new 
struct. Wrapping alone does not fix the problem, but it allows Python clients 
to define a hash function for the struct class, e.g.:

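# Hash a SkewedValueList by the contents of the list it wraps.
def skewed_value_list_hash(self):
    return hash(tuple(self.skewedValueList))

SkewedValueList.__hash__ = skewed_value_list_hash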

In practice a more efficient hash might be defined that does not involve 
copying the list. The advantage of wrapping the list inside a struct is that 
the client does not have to define the hash on the list itself, which would 
change the behaviour of lists everywhere else in the code.
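
As a rough sketch of what such a hash could look like, the example below folds 
the element hashes together instead of building a tuple. The SkewedValueList 
class here is only a stand-in written for this example: it assumes the 
generated struct exposes the wrapped list as skewedValueList and compares by 
attribute values, which may not match the final generated code. The constants 
17 and 31 and the sample data are arbitrary choices.

```python
class SkewedValueList(object):
    """Stand-in for the Thrift-generated struct (assumed shape)."""

    def __init__(self, skewedValueList=None):
        self.skewedValueList = skewedValueList

    def __eq__(self, other):
        return (isinstance(other, self.__class__)
                and self.__dict__ == other.__dict__)

    def __ne__(self, other):
        return not (self == other)


def skewed_value_list_hash(self):
    # Combine element hashes one at a time; no copy of the list is made.
    result = 17
    for value in self.skewedValueList or ():
        result = (31 * result + hash(value)) & 0xFFFFFFFFFFFFFFFF
    return result


# Client-side patch: only the wrapper becomes hashable, plain lists do not.
SkewedValueList.__hash__ = skewed_value_list_hash

# Hypothetical sample data.
location_map = {SkewedValueList(["2012", "US"]): "/some/location"}
lookup = location_map[SkewedValueList(["2012", "US"])]

try:
    hash(["2012", "US"])
    lists_still_unhashable = False
except TypeError:
    lists_still_unhashable = True
```

Because the hash depends on the list contents, a wrapped list should not be 
mutated while the struct is in use as a map key.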
