Hello,
I'm repeating my Slack messages here.
I'm experimenting with ingesting JDBC data into Parquet, i.e. reproducing
what spotify/dbeam does, using:

   - JdbcIO.readRows()
   - AvroUtils.getAvroSchema(beamRows.getSchema())
   - AvroUtils.schemaCoder(avroSchema)
   - AvroUtils.getRowToGenericRecordFunction(avroSchema)

Here are the issues I observed:
- DECIMAL(21,2) can't be handled because the scale parameter (2) is lost.
org.apache.avro.Conversions.DecimalConversion.validate() throws
AvroTypeException("Cannot encode decimal with scale 2 as scale 0 without
rounding")

   - It can be fixed by patching the Beam Row schema with
   FieldType.logicalType(FixedPrecisionNumeric.of(Integer.MAX_VALUE, 2)),
   which should then be passed to the Avro schema as
   LogicalTypes.decimal(Integer.MAX_VALUE, ((RowWithStorage)
   (field.getType().getLogicalType()).getArgument()).getValue("scale"
   )).addToSchema(Schema.create(Schema.Type.BYTES))
   This might not be the best approach. I noticed
   https://github.com/apache/beam/issues/21226 and
   https://github.com/apache/beam/issues/20978, which might be related.
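
To illustrate the failure mode with plain JDK classes only (no Beam or
Avro needed): I assume the Avro check amounts to asking whether the value
can be moved to the schema's scale without rounding, which is what
BigDecimal.setScale(..., RoundingMode.UNNECESSARY) tests. The class and
method names below are made up for this sketch and are not part of Beam
or Avro.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical demo class, JDK only; it does not call Avro or Beam.
class DecimalScaleDemo {

    // Returns true when `value` can be represented at `targetScale`
    // without any rounding.
    static boolean fitsScale(BigDecimal value, int targetScale) {
        try {
            value.setScale(targetScale, RoundingMode.UNNECESSARY);
            return true;
        } catch (ArithmeticException e) {
            // Thrown by setScale when rounding would be required.
            return false;
        }
    }

    public static void main(String[] args) {
        // Example value such as a DECIMAL(21,2) column might hold.
        BigDecimal fromJdbc = new BigDecimal("123.45");

        // Scale 0 is what the schema ends up with once the scale is lost.
        System.out.println(fitsScale(fromJdbc, 0)); // false
        // Scale 2 is what the column actually declares.
        System.out.println(fitsScale(fromJdbc, 2)); // true
    }
}
```

If that assumption holds, a value like 123.00 would still pass at scale 0,
so the problem may only show up on rows with non-zero fractional digits.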

- INT16 is kept as-is in the Beam schema, but it becomes a 32-bit INT in
Avro while the runtime value is still a java.lang.Short, which causes
ClassCastException: class java.lang.Short cannot be cast to class
java.lang.Integer (java.lang.Short and java.lang.Integer are in module
java.base of loader 'bootstrap')
        at org.apache.beam.sdk.extensions.avro.schemas.utils.AvroUtils.convertAvroFieldStrict(AvroUtils.java:1299)

I suppose this method could accept a Number and then call intValue().
WDYT?
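
A minimal JDK-only sketch of the difference, not the actual AvroUtils
code: the class and the two helper methods are hypothetical and only
contrast a strict Integer cast with going through Number.

```java
// Hypothetical demo class; it does not reproduce AvroUtils internals.
class ShortToIntDemo {

    // Strict cast: throws ClassCastException if the runtime value
    // is a Short rather than an Integer.
    static Integer strictCast(Object value) {
        return (Integer) value;
    }

    // Proposed alternative: accept any Number and widen via intValue().
    static Integer viaNumber(Object value) {
        return ((Number) value).intValue();
    }

    public static void main(String[] args) {
        // The runtime value carried by an INT16 field is a Short.
        Object fromRow = Short.valueOf((short) 42);

        System.out.println(viaNumber(fromRow)); // 42

        try {
            strictCast(fromRow);
        } catch (ClassCastException e) {
            System.out.println("strict cast failed");
        }
    }
}
```

Going through Number would also accept Byte or Long values, so a real
change might want to check the expected source type first.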

-- 
Sincerely yours
Mikhail Khludnev
