[ https://issues.apache.org/jira/browse/ARROW-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Farmer reassigned ARROW-16427:
-----------------------------------

    Assignee: Todd Farmer

> [Java] jdbcToArrowVectors / sqlToArrowVectorIterator fails to handle variable decimal precision / scale
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-16427
>                 URL: https://issues.apache.org/jira/browse/ARROW-16427
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 7.0.0
>            Reporter: Jonathan Swenson
>            Assignee: Todd Farmer
>            Priority: Major
>              Labels: JDBC, Java
>
> When a JDBC driver returns a Numeric type that doesn't exactly align with what is in the JDBC metadata, jdbcToArrowVectors / sqlToArrowVectorIterator fails to process the result (failing when serializing the value into the DecimalVector).
> It appears this is because JDBC drivers can return BigDecimal / Numeric values whose precision / scale differ from the metadata and are not consistent from row to row.
> Is there a recommended course of action to represent a variable precision / scale decimal vector? In any case, it does not seem possible to convert JDBC data with the built-in utilities when numeric values come in this form.
> It seems that both the Oracle and the Postgres JDBC drivers return metadata with a 0,0 precision / scale when values in the result set have different (and varied) precision / scale.
> An example:
> Against Postgres, running a simple SQL query that produces numeric types can lead to a JDBC result set whose BigDecimal values have variable decimal precision / scale.
> {code:java}
> SELECT value FROM (
>   SELECT 1000000000000000.01 AS "value"
>   UNION SELECT 1000000000300.0000001
> ) a
> {code}
> The Postgres JDBC driver produces a result set that looks like the following:
>
> || ||value||precision||scale||
> |metadata|N/A|0|0|
> |row 1|1000000000000000.01|18|2|
> |row 2|1000000000300.0000001|20|7|
>
> Even a result set that returns a single value may contain Numeric values with precision / scale that do not match the precision / scale in the ResultSetMetaData.
> {code:java}
> SELECT AVG(one) FROM (
>   SELECT 1000000000000000.01 AS "one"
>   UNION SELECT 1000000000300.0000001
> ) a
> {code}
> produces a result set that looks like this:
>
> || ||value||precision||scale||
> |metadata|N/A|0|0|
> |row 1|500500000000150.0050001|22|7|
>
> When processing the result set using the simple jdbcToArrowVectors (or sqlToArrowVectorIterator), this fails to set the values extracted from the result set into the DecimalVector.
> {code:java}
> val calendar = JdbcToArrowUtils.getUtcCalendar()
> val schema = JdbcToArrowUtils.jdbcToArrowSchema(rs.metaData, calendar)
> val root = VectorSchemaRoot.create(schema, RootAllocator())
> val vectors = JdbcToArrowUtils.jdbcToArrowVectors(rs, root, calendar)
> {code}
> Error:
> {code:java}
> Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
> 	at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
> 	at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
> 	at org.apache.arrow.memory.ArrowBuf.getByte(ArrowBuf.java:507)
> 	at org.apache.arrow.vector.BitVectorHelper.setBit(BitVectorHelper.java:85)
> 	at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:354)
> 	at org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
> 	at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
> 	at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:369)
> 	at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:321)
> {code}
> Using `sqlToArrowVectorIterator` also fails with an error trying to set data into the vector (this requires a bit of trickery to force creation of the package-private configuration):
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Error occurred while getting next schema root.
> 	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:179)
> 	at com.acme.dataformat.ArrowResultSetProcessor.processResultSet(ArrowResultSetProcessor.kt:31)
> 	at com.acme.AppKt.main(App.kt:54)
> 	at com.acme.AppKt.main(App.kt)
> Caused by: java.lang.RuntimeException: Error occurred while consuming data.
> 	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:121)
> 	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:153)
> 	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:175)
> 	... 3 more
> Caused by: java.lang.UnsupportedOperationException: BigDecimal scale must equal that in the Arrow vector: 7 != 0
> 	at org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale(DecimalUtility.java:95)
> 	at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:355)
> 	at org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
> 	at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
> 	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:113)
> 	... 5 more
> {code}


--
This message was sent by Atlassian Jira
(v8.20.7#820007)
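The per-row precision / scale figures quoted in the issue follow directly from how `java.math.BigDecimal` computes `precision()` and `scale()`, and the "BigDecimal scale must equal that in the Arrow vector" error implies the values must share one scale before they can land in a fixed-scale DecimalVector. A minimal sketch using only the JDK, no Arrow dependency; the class name and the rescale-to-widest-scale step are illustrative, not part of the Arrow JDBC adapter:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalScaleDemo {
    public static void main(String[] args) {
        // The two values from the example query. Each BigDecimal carries its
        // own precision/scale, while ResultSetMetaData reported 0/0 for both.
        BigDecimal row1 = new BigDecimal("1000000000000000.01");
        BigDecimal row2 = new BigDecimal("1000000000300.0000001");

        System.out.println(row1.precision() + "/" + row1.scale()); // 18/2
        System.out.println(row2.precision() + "/" + row2.scale()); // 20/7

        // One possible client-side workaround (an assumption, not the
        // adapter's behavior): rescale every value in a batch to the widest
        // scale seen before writing into a fixed-scale vector. Widening the
        // scale never discards digits, so RoundingMode.UNNECESSARY is safe.
        int targetScale = Math.max(row1.scale(), row2.scale()); // 7
        BigDecimal rescaled = row1.setScale(targetScale, RoundingMode.UNNECESSARY);
        System.out.println(rescaled); // 1000000000000000.0100000
    }
}
```

After rescaling, both rows share scale 7, which would satisfy the check in `DecimalUtility.checkPrecisionAndScale` for a vector declared with that scale; the open question in the issue remains how the adapter should pick that scale when the driver metadata reports 0,0.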