[ 
https://issues.apache.org/jira/browse/PHOENIX-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Toth resolved PHOENIX-6321.
----------------------------------
    Resolution: Duplicate

> Array of Shorts/Smallint returned as Array of Integers
> ------------------------------------------------------
>
>                 Key: PHOENIX-6321
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6321
>             Project: Phoenix
>          Issue Type: Bug
>          Components: spark-connector
>    Affects Versions: 5.0.0
>            Reporter: Alvaro Fernandez
>            Priority: Major
>
> When using the Spark connector to read a Phoenix table with at least one 
> column defined as an Array of Shorts (SMALLINT ARRAY), the resulting Dataset 
> infers the schema as an Array of Integers.
> I believe this is due to the following code:
> phoenix/phoenix-spark/src/main/scala/org/apache/phoenix/spark/PhoenixRDD.scala:182
> case t if t.isInstanceOf[PSmallintArray] || t.isInstanceOf[PUnsignedSmallintArray] =>
>   ArrayType(IntegerType, containsNull = true)
>  
> phoenix-connectors/phoenix-spark-base/src/main/scala/org/apache/phoenix/spark/SparkSchemaUtil.scala:82
> case t if t.isInstanceOf[PSmallintArray] || t.isInstanceOf[PUnsignedSmallintArray] =>
>   ArrayType(IntegerType, containsNull = true)
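>  
> A possible fix, sketched here only as an untested suggestion and assuming the 
> rest of the mapping stays as it is, would be to return ShortType for these 
> array types:
> case t if t.isInstanceOf[PSmallintArray] || t.isInstanceOf[PUnsignedSmallintArray] =>
>   // map SMALLINT ARRAY columns to arrays of Spark shorts rather than integers
>   ArrayType(ShortType, containsNull = true)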
>  
> Subsequent attempts to programmatically cast the values to Shorts fail with 
> a ClassCastException (illustrated below).
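>  
> For illustration, a read along the following lines triggers it (the table, 
> column, and connection settings are hypothetical; a SparkSession named 
> spark is assumed to be in scope):
> // MY_TABLE is assumed to have a SMALLINT ARRAY column named MY_SHORT_ARRAY
> val df = spark.read
>   .format("org.apache.phoenix.spark")
>   .option("table", "MY_TABLE")
>   .option("zkUrl", "localhost:2181")
>   .load()
> // the inferred schema reports ArrayType(IntegerType, true) for MY_SHORT_ARRAY
> val values = df.head().getAs[Seq[Short]]("MY_SHORT_ARRAY")
> // the boxed elements are really Integers, so using one as a Short fails with
> // java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Short
> val first: Short = values.head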
> It is also impossible to supply the original schema through a 
> DataFrameReader (as sketched after this paragraph); that attempt fails with: 
> "org.apache.spark.sql.AnalysisException: org.apache.phoenix.spark does not 
> allow user-specified schemas.;"
> As far as I know, this makes it impossible to work with tables containing 
> this kind of data type.
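>  
> The obvious workaround of passing the desired schema explicitly, sketched 
> below with the same hypothetical names, is what produces that 
> AnalysisException:
> import org.apache.spark.sql.types.{ArrayType, ShortType, StructField, StructType}
> val schema = StructType(Seq(
>   StructField("MY_SHORT_ARRAY", ArrayType(ShortType), nullable = true)))
> spark.read
>   .format("org.apache.phoenix.spark")
>   .schema(schema)   // rejected: user-specified schemas are not allowed
>   .option("table", "MY_TABLE")
>   .option("zkUrl", "localhost:2181")
>   .load()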
> Is there any reason for this code to interpret SmallInts/Shorts as 
> Integers?
> Thanks
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
