[ https://issues.apache.org/jira/browse/SPARK-34816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kent Yao resolved SPARK-34816.
------------------------------
    Fix Version/s: 3.2.0
       Resolution: Fixed

> Support for Parquet unsigned LogicalTypes
> -----------------------------------------
>
>                 Key: SPARK-34816
>                 URL: https://issues.apache.org/jira/browse/SPARK-34816
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Kent Yao
>            Assignee: Kent Yao
>            Priority: Major
>             Fix For: 3.2.0
>
>
> Parquet supports several unsigned data types. Here is the related definition in parquet.thrift:
> {code:java}
> /**
>  * Common types used by frameworks(e.g. hive, pig) using parquet. This helps map
>  * between types in those frameworks to the base types in parquet. This is only
>  * metadata and not needed to read or write the data.
>  */
> /**
>  * An unsigned integer value.
>  *
>  * The number describes the maximum number of meaningful data bits in
>  * the stored value. 8, 16 and 32 bit values are stored using the
>  * INT32 physical type. 64 bit values are stored using the INT64
>  * physical type.
>  *
>  */
> UINT_8 = 11;
> UINT_16 = 12;
> UINT_32 = 13;
> UINT_64 = 14;
> {code}
> Spark does not support unsigned data types. In SPARK-10113, we added an exception with a clear message for them. Their value ranges are:
> UINT_8: [0, 255]
> UINT_16: [0, 65535]
> UINT_32: [0, 4294967295]
> UINT_64: [0, 18446744073709551615]
> Unsigned types may be used to produce smaller in-memory representations of the data. If the stored value is larger than the maximum allowed by INT32 or INT64, the behavior is undefined.
> In this ticket, we read them as higher-precision signed types instead.
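As a minimal sketch (not part of the ticket itself), this is roughly what reading such a file looks like from the Spark SQL side once the change is in place. The file path is hypothetical, and the exact signed types chosen for each unsigned annotation reflect my understanding of the change (roughly UINT_8 -> ShortType, UINT_16 -> IntegerType, UINT_32 -> LongType, UINT_64 -> DecimalType(20, 0)); the actual mapping may differ.

{code:scala}
import org.apache.spark.sql.SparkSession

object ReadUnsignedParquet {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("read-parquet-unsigned-demo")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical path to a Parquet file written by another engine
    // (e.g. Arrow or Impala) with UINT_8/16/32/64-annotated columns.
    val df = spark.read.parquet("/tmp/unsigned.parquet")

    // With this change the unsigned columns should surface as wider signed
    // types instead of failing, e.g. UINT_32 as LongType and UINT_64 as a
    // 20-digit decimal (assumed mapping, see note above).
    df.printSchema()
    df.show()

    spark.stop()
  }
}
{code}

Widening UINT_64 to a 20-digit decimal is the natural choice here, since the largest unsigned 64-bit value, 18446744073709551615, does not fit in a signed LongType.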