Hi,

I came across strange behavior when reading PostgreSQL columns of type numeric[] with Spark 2.3.2 (against PostgreSQL 10.4 and 9.6.9). Consider the following table definition:
    create table test1 ( v numeric[], d numeric );
    insert into test1 values('{1111.222,2222.332}', 222.4555);

When reading the table into a DataFrame, I get the following schema:

    root
     |-- v: array (nullable = true)
     |    |-- element: decimal(0,0) (containsNull = true)
     |-- d: decimal(38,18) (nullable = true)

Notice that precision and scale were specified for neither column, yet for the array element both came back as 0, while the plain numeric column got the defaults (38,18). Later, when I try to read the DataFrame's contents, I get the following error:

    java.lang.IllegalArgumentException: requirement failed: Decimal precision 4 exceeds max precision 0
        at scala.Predef$.require(Predef.scala:224)
        at org.apache.spark.sql.types.Decimal.set(Decimal.scala:114)
        at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:453)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$16$$anonfun$apply$6$$anonfun$apply$7.apply(JdbcUtils.scala:474)
        ...

In this case I would expect the array elements to be read as decimal(38,18) and no error. Should this be considered a bug? Is there a workaround other than changing the column's array type definition to include an explicit precision and scale?

Best regards,
Alexey
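For completeness, this is roughly how the table is read; the JDBC URL, user and password below are placeholders, not the real values:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("numeric-array-repro").getOrCreate()

    // Read test1 over JDBC (PostgreSQL JDBC driver on the classpath).
    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://localhost:5432/testdb")
      .option("dbtable", "test1")
      .option("user", "test")
      .option("password", "test")
      .load()

    df.printSchema()   // array element type comes back as decimal(0,0)
    df.show()          // fails with "Decimal precision 4 exceeds max precision 0"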
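One possible workaround that occurred to me, though I have not confirmed it is the intended approach, is to push an explicit cast down to PostgreSQL by reading from a subquery instead of the table, so that the driver sees a concrete precision and scale. A sketch (the precision/scale in the cast and the alias are my own guesses):

    // Same options as above; only dbtable changes. Whether the driver then
    // reports the element precision/scale correctly is not verified.
    val dfCast = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://localhost:5432/testdb")
      .option("dbtable", "(select v::numeric(38,18)[] as v, d from test1) as t")
      .option("user", "test")
      .option("password", "test")
      .load()

    dfCast.printSchema()   // hoping for v: array<decimal(38,18)>
    dfCast.show()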