[
https://issues.apache.org/jira/browse/PHOENIX-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Istvan Toth reassigned PHOENIX-7377:
------------------------------------
Assignee: rejeb ben rejeb
> phoenix5-spark dataframe issue with schema inference
> ----------------------------------------------------
>
> Key: PHOENIX-7377
> URL: https://issues.apache.org/jira/browse/PHOENIX-7377
> Project: Phoenix
> Issue Type: Bug
> Components: connectors, spark-connector
> Reporter: rejeb ben rejeb
> Assignee: rejeb ben rejeb
> Priority: Major
>
> The fix for PHOENIX-4981 introduced a breaking change in the way the
> schema is inferred.
> In previous versions of the connector, columns in a non-default column
> family were mapped to "columnName" in the DataFrame. Now they are mapped to
> "columnFamily.columnName".
> No unit tests cover this case: all tests use tables with the
> default column family "0".
> The change was made in this [pull
> request|https://github.com/apache/phoenix/pull/402] (the project has since
> been moved to another git repo):
> * In previous versions, the code used `ColumnInfo.getDisplayName` to define
> the name of the column in the DataFrame.
> * In the new class SparkSchemaUtil, the method used is
> `ColumnInfo.getColumnName`, which returns the column name as
> `columnFamilyName.columnName`.
> The pull request is related to ticket PHOENIX-4981, but the change is not
> documented.
> This change breaks jobs that read from tables with a non-default column
> family.
> The spark3 connector has the same issue, since the code was duplicated from
> the spark2 module to the spark3 module.
> Because the V1 API was modified to use the same method to resolve the schema,
> it has the same behavior. It should not: its classes are now deprecated
> and should not contain a breaking change.
>
> *Resolution proposal:*
> The best way to fix the issue is to add a property that makes both options
> available for mapping the names of columns in a non-default column family.
> The issue is in the spark connector, and its resolution will not have side
> effects on other phoenix-connectors such as phoenix5-hive.