[ 
https://issues.apache.org/jira/browse/ARROW-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Farmer reassigned ARROW-16427:
-----------------------------------

    Assignee: Todd Farmer

> [Java] jdbcToArrowVectors / sqlToArrowVectorIterator fails to handle variable 
> decimal precision / scale
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-16427
>                 URL: https://issues.apache.org/jira/browse/ARROW-16427
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 7.0.0
>            Reporter: Jonathan Swenson
>            Assignee: Todd Farmer
>            Priority: Major
>              Labels: JDBC, Java
>
> When a JDBC driver returns a Numeric type that doesn't exactly align with 
> what is in the JDBC metadata, jdbcToArrowVectors / sqlToArrowVectorIterator 
> fails to process the result (failing on serializing the the value into the 
> BigDecimalVector). 
> It appears as though this is because JDBC drivers can return BigDecimal / 
> Numeric values that are different between the metadata and not consistent 
> between each of the rows. 
> Is there a recommended course of action to represent a variable precision / 
> scale decimal vector? In any case it does not seem possible to convert JDBC 
> data with the built in utilities that uses these numeric types when they come 
> in this form. 
> It seems like both the Oracle and the Postgres JDBC driver also returns 
> metadata with a 0,0 precision / scale when values in the result set have 
> different (and varied) precision / scale. 
> An example: 
> Against postgres, running a simple SQL query that produces numeric types can 
> lead to a JDBC result set with BigDecimal values with variable decimal 
> precision/scale. 
> {code:java}
> SELECT value FROM (
>   SELECT 1000000000000000.01 AS "value" 
>   UNION SELECT 1000000000300.0000001
> ) a {code}
>  
> The postgres JDBC adapter produces a result set that looks like the 
> following: 
>  
> || ||value||precision||scale||
> |metadata|N/A|0|0|
> |row 1|1000000000000000.01|18|2|
> |row 2|1000000000300.0000001|20|7|
>  
> Even a result set that returns a single value may Numeric values with 
> precision / scale that do not match the precision / scale in the 
> ResultSetMetadata. 
>  
> {code:java}
> SELECT AVG(one) from (
>   SELECT 1000000000000000.01 as "one" 
>   UNION select 1000000000300.0000001
> ) a {code}
> produces a result set that looks like this
>  
> || ||value||precision||scale||
> |metadata|N/A|0|0|
> |row 1|500500000000150.0050001|22|7|
>  
> When processing the result set using the simple jdbcToArrowVectors (or 
> sqlToArrowVectorIterator) this fails to set the values extracted from the 
> result set into the the DecimalVector
>  
> {code:java}
> val calendar = JdbcToArrowUtils.getUtcCalendar()
> val schema = JdbcToArrowUtils.jdbcToArrowSchema(rs.metaData, calendar)
> val root = VectorSchemaRoot.create(schema, RootAllocator())
> val vectors = JdbcToArrowUtils.jdbcToArrowVectors(rs, root, calendar) {code}
> Error:
>  
> {code:java}
> Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 0, 
> length: 1 (expected: range(0, 0))
>     at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
>     at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
>     at org.apache.arrow.memory.ArrowBuf.getByte(ArrowBuf.java:507)
>     at org.apache.arrow.vector.BitVectorHelper.setBit(BitVectorHelper.java:85)
>     at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:354)
>     at 
> org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
>     at 
> org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
>     at 
> org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:369)
>     at 
> org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:321)
>  {code}
>  
> using `sqlToArrowVectorIterator` also fails with an error trying to set data 
> into the vector: (requires a little bit of trickery to force creation of the 
> package private configuration)
>  
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Error occurred while 
> getting next schema root.
>     at 
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:179)
>     at 
> com.acme.dataformat.ArrowResultSetProcessor.processResultSet(ArrowResultSetProcessor.kt:31)
>     at com.acme.AppKt.main(App.kt:54)
>     at com.acme.AppKt.main(App.kt)
> Caused by: java.lang.RuntimeException: Error occurred while consuming data.
>     at 
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:121)
>     at 
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:153)
>     at 
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:175)
>     ... 3 more
> Caused by: java.lang.UnsupportedOperationException: BigDecimal scale must 
> equal that in the Arrow vector: 7 != 0
>     at 
> org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale(DecimalUtility.java:95)
>     at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:355)
>     at 
> org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
>     at 
> org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
>     at 
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:113)
>     ... 5 more {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

Reply via email to