Petar Vasiljevic created SPARK-50092:
----------------------------------------

             Summary: Fix PostgreSQL connector behaviour for multidimensional 
arrays
                 Key: SPARK-50092
                 URL: https://issues.apache.org/jira/browse/SPARK-50092
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Petar Vasiljevic
             Fix For: 4.0.0


PR [https://github.com/apache/spark/pull/46006] introduced a bug. That PR 
fixed the behaviour of the PostgreSQL connector for multidimensional arrays, 
since previously all arrays were mapped to 1D arrays.

However, it breaks the following scenario:
 * A user has a table t1 in Postgres and runs a CTAS (CREATE TABLE AS SELECT) 
command to create table t2 with the same data.
 * PR 46006 resolves the dimensionality of a column by reading the attndims 
column of the pg_attribute table.
 * This query returns the correct dimensionality for table t1, but for table 
t2, which was created via CTAS, it always returns 0. As a result, every array 
column is mapped to a 0-D array, which is just the element type itself (for 
example, int).
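
The failure mode in the last step can be illustrated with a small sketch. The 
function name wrap_in_array and the string notation for types are hypothetical 
and chosen only for illustration; this is not the connector's actual code. The 
sketch nests the element type once per reported dimension, so a reported 
dimensionality of 0 yields the bare element type.

```python
def wrap_in_array(element_type: str, ndims: int) -> str:
    """Nest element_type inside ndims levels of array<...>.

    Hypothetical helper: it only mimics how a reported dimensionality
    could drive the resulting column type.
    """
    result = element_type
    for _ in range(ndims):
        result = f"array<{result}>"
    return result


# Dimensionality reported correctly (table t1 in the scenario above).
print(wrap_in_array("int", 2))
# Dimensionality reported as 0 (table t2 created via CTAS).
print(wrap_in_array("int", 0))
```

With ndims equal to 0 the loop body never runs, so the array column collapses 
to its element type.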

As a solution, we can call the array_ndims function on the given column, which 
returns the number of dimensions of the array values stored in it. This also 
works for tables created via CTAS.
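
A minimal sketch of what such a lookup might look like is shown here. The 
helper names (quote_ident, array_ndims_query, resolve_ndims), the LIMIT 1 
sampling, and the fallback to 1 dimension are assumptions made for 
illustration; the issue does not specify how the query is issued or what 
happens when no value is available.

```python
from typing import Optional


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def array_ndims_query(table: str, column: str) -> str:
    """Build a query that reads the dimensionality from one stored value."""
    return (
        f"SELECT array_ndims({quote_ident(column)}) "
        f"FROM {quote_ident(table)} LIMIT 1"
    )


def resolve_ndims(query_result: Optional[int], default: int = 1) -> int:
    """Use the queried dimensionality, or `default` when none came back.

    Assumption: falling back to 1 for an empty table or a NULL result is
    an illustrative choice, not something the issue specifies.
    """
    return query_result if query_result is not None else default


print(array_ndims_query("t2", "col"))
```

Because array_ndims inspects a stored value rather than catalog metadata, the 
result does not depend on how the table was created. The trade-off is that an 
empty table gives no row to inspect, which is why the sketch takes an explicit 
default.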



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
