[ 
https://issues.apache.org/jira/browse/AVRO-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230697#comment-17230697
 ] 

Werner Daehn commented on AVRO-2952:
------------------------------------

Extended the test case with Avro logical type conversions in both directions: 
from Java types to the raw type, and from the raw type to the defined Java type.

[Documented|https://github.com/apache/avro/blob/47df1a6d3fbd144c8efcae7d1d19a57e1c2bd751/lang/java/avro/ConversionTable.md] 
the supported conversion options and their footnotes.
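
As a minimal sketch of what "both directions" means for one logical type, the 
snippet below round-trips a timestamp-millis value, which Avro stores as a long 
holding milliseconds since the Unix epoch. It uses only java.time. The class and 
method names are hypothetical and are not the API of Avro or of this patch.

```java
import java.time.Instant;

// Hypothetical helper, not part of Avro or of the AVRO-2952 patch.
// It only illustrates the two conversion directions for timestamp-millis.
final class TimestampMillisRoundTrip {

    // Java type -> raw type: a long with milliseconds since the Unix epoch.
    static long toRawType(Instant value) {
        return value.toEpochMilli();
    }

    // Raw type -> Java type.
    static Instant toJavaType(long raw) {
        return Instant.ofEpochMilli(raw);
    }

    public static void main(String[] args) {
        Instant original = Instant.parse("2020-11-12T10:15:30.123Z");
        long raw = toRawType(original);
        Instant restored = toJavaType(raw);
        System.out.println(raw + " -> " + restored);
        System.out.println("round trip equal: " + original.equals(restored));
    }
}
```

The round trip is exact only for values with millisecond precision; an Instant 
carrying microseconds or nanoseconds would be truncated on the way to the raw 
type.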

> Logical Types and Conversions enhancements
> ------------------------------------------
>
>                 Key: AVRO-2952
>                 URL: https://issues.apache.org/jira/browse/AVRO-2952
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.10.1
>            Reporter: Werner Daehn
>            Priority: Critical
>
> *Summary*:
>  * Added a method *Field.getDataType()* which returns an object with common 
> data type related methods. The most important are the methods that call the 
> converters.
>  * Added trivial LogicalTypes to allow better database integration, e.g. the 
> LogicalType VARCHAR(10) is a STRING that carries the information that the 
> payload contains only ASCII characters, up to a maximum of 10.
> *Example*:
> The user has a record with one ENUM field and a Java enum class. 
> Instead of manually converting the Java string into a GenericEnumSymbol, the 
> user can call the convertToRawType method of the AvroDataType class.
> (f is the Field to be set)
> {{testRecord.put(f.name(), 
> f.getDataType().convertToRawType(myEnum.male.name()));}}
> {{f.getDataType().convertToRawType()}} does all the conversion. I 
> considered adding that conversion to the put() method itself but feared 
> side effects, so the user has to invoke convertToRawType() explicitly.
> *Reasoning*:
> I have been working with Avro (Kafka) for two years now and have implemented 
> improvements around Logical Types. I merged these into the Avro code with 
> zero side effects: they are pure additions, with no breaking changes for 
> other Avro users but a great help for them.
> Imagine you connect two databases via Kafka using Avro as the message payload.
>  # The first problem you will face is that RawTypes and LogicalTypes are 
> handled differently. For LogicalTypes there are conversion functions that 
> provide metadata (e.g. getConvertedType returns that a Java Instant is the 
> best data type for a timestamp-millis) plus conversion logic. For raw types 
> there is no such thing. A Boolean can be provided as true, "TRUE", 1,...
>  # The second problem is the lack of getObject()/setObject() methods similar 
> to JDBC. The result is endless switch-case lists to call the correct 
> methods, in every single project, for every user.
>  # Number three is the usage of the Converters as such. The intended usage is 
> to add converters to the GenericData so that the reader/writer uses the best 
> suited converter. What I have seen most people do, however, is use the 
> converters manually and assign the raw value directly. While adding 
> converters is still possible, the conversion at GenericRecord.put() and 
> GenericRecord.get() is easy now.
>  # For a data exchange format like Avro, it is important to carry as much 
> metadata as possible. For example, seen purely from Avro's side, a STRING 
> data type is just fine. But 99% of the string data types in a database are 
> VARCHAR(length) and NVARCHAR(length). While putting an ASCII String of 
> length 10 into a STRING is no problem, on the consumer side the only 
> matching data type is an NCLOB, the worst choice for a database. LogicalTypes 
> provide a nice way to carry such metadata, e.g. a LogicalType VARCHAR(10) 
> backed by a String. These Logical Types do not have any conversion, they 
> just exist for the metadata. You have such a thing already with the UUID 
> LogicalType.
>  
> *Changes*:
>  * A new package logicaltypes was created. It includes all new LogicalTypes 
> and the AvroDataType implementations for the various raw data types.
>  * The existing LogicalTypes are unchanged. The corresponding classes in the 
> logicaltype package just extend them.
>  * For that, some LogicalType fields needed to be made public.
>  * The LogicalTypes return the more detailed logicaltype.* classes.
>  * A test class was created.
>  
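
The first point of the reasoning quoted above notes that a Boolean can arrive 
as true, "TRUE", 1 and so on. A minimal sketch of what such a lenient raw-type 
conversion could look like follows. It uses only the Java standard library; the 
class and method names are hypothetical, and the accepted spellings are an 
assumption for illustration, not the behaviour of the patch.

```java
import java.util.Locale;

// Hypothetical helper, not part of Avro or of the AVRO-2952 patch.
final class LenientBoolean {

    // Normalizes common representations of a boolean to the raw Java boolean.
    // Accepted inputs (an assumption for this sketch): Boolean, any Number
    // (zero is false, everything else is true), and the strings
    // "true"/"false"/"1"/"0" in any letter case.
    static boolean toRawBoolean(Object value) {
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        if (value instanceof Number) {
            return ((Number) value).doubleValue() != 0.0;
        }
        if (value instanceof CharSequence) {
            String s = value.toString().trim().toLowerCase(Locale.ROOT);
            if (s.equals("true") || s.equals("1")) {
                return true;
            }
            if (s.equals("false") || s.equals("0")) {
                return false;
            }
        }
        // Anything else, including null, is rejected rather than guessed.
        throw new IllegalArgumentException("Cannot convert to boolean: " + value);
    }

    public static void main(String[] args) {
        System.out.println(toRawBoolean(true));
        System.out.println(toRawBoolean("TRUE"));
        System.out.println(toRawBoolean(1));
        System.out.println(toRawBoolean("0"));
    }
}
```

Rejecting unknown inputs with an exception, rather than defaulting to false, 
keeps a malformed source value from silently turning into valid-looking data.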



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
