Thank you Mike for taking the time to reply to this,

I looked at your code and applied the AVRO-457 patch you did. Indeed, you fixed a very similar problem. In your case, XML Schema delivers and expects BigDecimals, so you mapped them to ByteBuffer as specified in http://avro.apache.org/docs/1.7.7/spec.html#Decimal.

As for the SpecificCompiler, I ended up creating a custom compiler:

    import org.apache.avro.Schema;
    import org.apache.avro.Schema.Type;
    import org.apache.avro.compiler.specific.SpecificCompiler;
    import org.codehaus.jackson.JsonNode;

    /**
     * Temporary workaround for the lack of support for BigDecimal in the Avro
     * SpecificCompiler.
     * <p/>
     * The record.vm template is customized to expose BigDecimal getters and setters.
     */
    public class CustomSpecificCompiler extends SpecificCompiler {

        private static final String TEMPLATES_PATH = "/com/legstar/avro/generator/specific/templates/java/classic/";

        public CustomSpecificCompiler(Schema schema) {
            super(schema);
            setTemplateDir(TEMPLATES_PATH);
        }

        /**
         * In the case of BigDecimals there is an internal Java type (ByteBuffer)
         * and an external Java type for getters/setters.
         *
         * @param schema the field schema
         * @return the Java type exposed by getters/setters
         */
        public String externalJavaType(Schema schema) {
            return isBigDecimal(schema) ? "java.math.BigDecimal"
                    : super.javaType(schema);
        }

        /** Tests whether a field is to be externalized as a BigDecimal. */
        public static boolean isBigDecimal(Schema schema) {
            if (Type.BYTES == schema.getType()) {
                JsonNode logicalTypeNode = schema.getJsonProp("logicalType");
                if (logicalTypeNode != null
                        && "decimal".equals(logicalTypeNode.asText())) {
                    return true;
                }
            }
            return false;
        }

    }

And then changed the record.vm Velocity template like this (`<` lines are my customized template, `>` lines are the original):

    72c72
    < public ${this.mangle($schema.getName())}(#foreach($field in $schema.getFields())${this.externalJavaType($field.schema())} ${this.mangle($field.name())}#if($velocityCount < $schema.getFields().size()), #end#end) {
    ---
    > public ${this.mangle($schema.getName())}(#foreach($field in $schema.getFields())${this.javaType($field.schema())} ${this.mangle($field.name())}#if($velocityCount < $schema.getFields().size()), #end#end) {
    74c74
    < ${this.generateSetMethod($schema, $field)}(${this.mangle($field.name())});
    ---
    > this.${this.mangle($field.name())} = ${this.mangle($field.name())};
    110,113c110
    < public ${this.externalJavaType($field.schema())} ${this.generateGetMethod($schema, $field)}() {
    < #if ($this.isBigDecimal($field.schema()))
    < return new java.math.BigDecimal(new java.math.BigInteger(${this.mangle($field.name())}.array()), $field.schema().getJsonProp("scale"));
    < #else
    ---
    > public ${this.javaType($field.schema())} ${this.generateGetMethod($schema, $field)}() {
    115d111
    < #end
    124,127c120
    < public void ${this.generateSetMethod($schema, $field)}(${this.externalJavaType($field.schema())} value) {
    < #if ($this.isBigDecimal($field.schema()))
    < this.${this.mangle($field.name(), $schema.isError())} = java.nio.ByteBuffer.wrap(value.unscaledValue().toByteArray());
    < #else
    ---
    > public void ${this.generateSetMethod($schema, $field)}(${this.javaType($field.schema())} value) {
    129d121
    < #end
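For reference, the conversion the customized template emits is just the two's-complement encoding from the spec. A minimal, stdlib-only sketch of the round trip (with a hard-coded scale of 2 standing in for the schema's "scale" property):

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class DecimalCodec {

    /** Encode as the Avro "decimal" logical type: the two's-complement
     *  big-endian bytes of the unscaled value. */
    static ByteBuffer toBytes(BigDecimal value) {
        return ByteBuffer.wrap(value.unscaledValue().toByteArray());
    }

    /** Decode using the scale that would normally be read from the schema. */
    static BigDecimal fromBytes(ByteBuffer buf, int scale) {
        byte[] bytes = new byte[buf.remaining()];
        buf.duplicate().get(bytes); // duplicate() so the buffer position is untouched
        return new BigDecimal(new BigInteger(bytes), scale);
    }

    public static void main(String[] args) {
        BigDecimal amount = new BigDecimal("-12345.67"); // precision 7, scale 2
        BigDecimal restored = fromBytes(toBytes(amount), 2);
        System.out.println(restored); // prints -12345.67
    }
}
```
Note that this encodes negative values correctly too, since BigInteger.toByteArray() and the byte[] constructor both use two's complement.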

This fixes the issue for me but is not a good long-term solution. In particular, the builder part of the generated Specific class still exposes ByteBuffer instead of BigDecimal, which is inconsistent.

More generally, it seems to me a better solution would be to extend the "java-class" trick so that more complex conversions can occur between the Avro type and the Java type exposed by Specific classes. Right now, the Java type must be castable from the Avro type, which is limiting.
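To make that concrete, here is one shape such an extension could take: a two-way conversion hook between the internal and external types. Everything below (names included) is a hypothetical sketch, not an existing Avro API:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class ConversionSketch {

    /** Hypothetical extension point: a two-way mapping between the Avro-internal
     *  type and the Java type exposed on generated Specific classes. */
    public interface JavaTypeConversion<A, J> {
        J fromAvro(A internal);
        A toAvro(J external);
    }

    /** What a decimal implementation might look like (ByteBuffer <-> BigDecimal). */
    public static class DecimalConversion
            implements JavaTypeConversion<ByteBuffer, BigDecimal> {

        private final int scale; // would come from the schema's "scale" property

        public DecimalConversion(int scale) {
            this.scale = scale;
        }

        public BigDecimal fromAvro(ByteBuffer internal) {
            byte[] bytes = new byte[internal.remaining()];
            internal.duplicate().get(bytes);
            return new BigDecimal(new BigInteger(bytes), scale);
        }

        public ByteBuffer toAvro(BigDecimal external) {
            return ByteBuffer.wrap(external.unscaledValue().toByteArray());
        }
    }

    public static void main(String[] args) {
        JavaTypeConversion<ByteBuffer, BigDecimal> conv = new DecimalConversion(2);
        BigDecimal amount = new BigDecimal("12345.67");
        // Round trip through the internal representation.
        System.out.println(conv.fromAvro(conv.toAvro(amount)).equals(amount));
    }
}
```
With something like this registered per logical type, both the generated getters/setters and GenericData.deepCopy could go through the same conversion, so Generic and Specific records would stay consistent.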

Anyway, thanks again for your great insight.

Fady




On 11/11/2014 05:06, Michael Pigott wrote:
Hi Fady,
Properly handling BigDecimal types in Java is still an open question. AVRO-1402 [1] added BigDecimal types to the Avro spec, but the Java support is an open ticket under AVRO-1497 [2]. When I added BigDecimal support to AVRO-457 (XML <-> Avro support), I added support for the Avro decimal logical type using Java BigDecimals. You can see the conversion code [3] as well as the writer [4] and reader [5] code in my GitHub repository, or download the patch in AVRO-457 [6] and look for BigDecimal in the Utils.java, XmlDatumWriter.java, and XmlDatumReader.java files, respectively.

Good luck!
Mike

[1] https://issues.apache.org/jira/browse/AVRO-1402
[2] https://issues.apache.org/jira/browse/AVRO-1497
[3] https://github.com/mikepigott/xml-to-avro/blob/master/avro-to-xml/src/main/java/org/apache/avro/xml/Utils.java#L537
[4] https://github.com/mikepigott/xml-to-avro/blob/master/avro-to-xml/src/main/java/org/apache/avro/xml/XmlDatumWriter.java#L1150
[5] https://github.com/mikepigott/xml-to-avro/blob/master/avro-to-xml/src/main/java/org/apache/avro/xml/XmlDatumReader.java#L998
[6] https://issues.apache.org/jira/browse/AVRO-457

On Sat Nov 08 2014 at 4:11:32 AM Fady <f...@legsem.com> wrote:

    Hello,

    I am working on a project that aims at converting Mainframe data to
    Avro records (https://github.com/legsem/legstar.avro).

    Mainframe data often contains Decimal types. For these, I would like
    the corresponding Avro records to expose BigDecimal fields.

    Initially, I followed the recommendation here:
    http://avro.apache.org/docs/1.7.7/spec.html#Decimal. My schema
    contains for instance:

         {
           "name":"transactionAmount",
           "type":{
             "type":"bytes",
             "logicalType":"decimal",
             "precision":7,
             "scale":2
           }
         }

    This works fine but the Avro Specific record produced by the
    SpecificCompiler exposes a ByteBuffer for that field.

       @Deprecated public java.nio.ByteBuffer transactionAmount;

    Not what I want.

    I tried this alternative:

         {
           "name":"transactionAmount",
           "type":{
             "type":"string",
             "java-class":"java.math.BigDecimal",
             "logicalType":"decimal",
             "precision":7,
             "scale":2
            }
          }

    Now the SpecificCompiler produces the result I need:

       @Deprecated public java.math.BigDecimal transactionAmount;

    There are 2 problems though:

    1. It is less efficient to serialize/deserialize a BigDecimal from a
    string rather than the 2's complement.

    2. The Specific Record obtained this way cannot be populated using a
    deep copy from a Generic Record.

    To clarify the second point:

    When I convert the mainframe data I do something like:

             GenericRecord genericRecord = new GenericData.Record(schema);
             ... populate genericRecord from Mainframe data ...
             return (D) SpecificData.get().deepCopy(schema, genericRecord);

    This fails with:

         java.lang.ClassCastException: java.lang.String cannot be cast to java.math.BigDecimal
             at legstar.avro.test.specific.cusdat.Transaction.put(Transaction.java:47)
             at org.apache.avro.generic.GenericData.setField(GenericData.java:573)
             at org.apache.avro.generic.GenericData.setField(GenericData.java:590)
             at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:972)
             at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:926)
             at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:970)
             at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:970)


    This is because the code in the Specific record assumes the value
    received is already a BigDecimal:

         case 1: transactionAmount = (java.math.BigDecimal)value$; break;

    In other words, the java-class trick produces the right interface for
    Specific classes, but the internal data types are not consistent with
    the GenericRecord derived from the same schema.

    So my question is: what would be a better approach for Specific
    classes to expose BigDecimal fields?

