Thank you, Mike, for taking the time to reply to this.
I looked at your code and applied the AVRO-457 patch you did. Indeed, you
fixed a very similar problem. In your case, XMLSchema delivers and
expects BigDecimals, so you mapped those to ByteBuffer as specified in
http://avro.apache.org/docs/1.7.7/spec.html#Decimal.
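For reference, the mapping that section of the spec describes is the two's-complement encoding of the unscaled value, with the scale taken from the schema. A minimal round-trip sketch using only the JDK (no Avro classes involved) looks like this:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class DecimalBytes {

    /** Encode a BigDecimal as the spec's two's-complement unscaled bytes. */
    static ByteBuffer toBytes(BigDecimal value) {
        return ByteBuffer.wrap(value.unscaledValue().toByteArray());
    }

    /** Decode the bytes back, using the scale declared in the schema. */
    static BigDecimal fromBytes(ByteBuffer buf, int scale) {
        byte[] bytes = new byte[buf.remaining()];
        buf.duplicate().get(bytes);
        return new BigDecimal(new BigInteger(bytes), scale);
    }

    public static void main(String[] args) {
        BigDecimal amount = new BigDecimal("123.45"); // scale 2
        BigDecimal back = fromBytes(toBytes(amount), 2);
        System.out.println(back); // prints 123.45
    }
}
```

Note the scale is not carried in the bytes themselves; it must come from the schema's "scale" property, which is why both sides need to agree on it.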
As for the SpecificCompiler, I ended up creating a custom compiler:
import org.apache.avro.Schema;
import org.apache.avro.Schema.Type;
import org.apache.avro.compiler.specific.SpecificCompiler;
import org.codehaus.jackson.JsonNode;

/**
 * Temporary workaround for the lack of support for BigDecimal in the
 * Avro SpecificCompiler.
 * <p/>
 * The record.vm template is customized to expose BigDecimal getters
 * and setters.
 */
public class CustomSpecificCompiler extends SpecificCompiler {

    private static final String TEMPLATES_PATH =
            "/com/legstar/avro/generator/specific/templates/java/classic/";

    public CustomSpecificCompiler(Schema schema) {
        super(schema);
        setTemplateDir(TEMPLATES_PATH);
    }

    /**
     * In the case of BigDecimals there is an internal java type (ByteBuffer)
     * and an external java type for getters/setters.
     *
     * @param schema the field schema
     * @return the field java type
     */
    public String externalJavaType(Schema schema) {
        return isBigDecimal(schema) ? "java.math.BigDecimal"
                : super.javaType(schema);
    }

    /** Tests whether a field is to be externalized as a BigDecimal. */
    public static boolean isBigDecimal(Schema schema) {
        if (Type.BYTES == schema.getType()) {
            JsonNode logicalTypeNode = schema.getJsonProp("logicalType");
            if (logicalTypeNode != null
                    && "decimal".equals(logicalTypeNode.asText())) {
                return true;
            }
        }
        return false;
    }
}
And then changed the record.vm Velocity template like this:
72c72
< public ${this.mangle($schema.getName())}(#foreach($field in $schema.getFields())${this.externalJavaType($field.schema())} ${this.mangle($field.name())}#if($velocityCount < $schema.getFields().size()), #end#end) {
---
> public ${this.mangle($schema.getName())}(#foreach($field in $schema.getFields())${this.javaType($field.schema())} ${this.mangle($field.name())}#if($velocityCount < $schema.getFields().size()), #end#end) {
74c74
< ${this.generateSetMethod($schema, $field)}(${this.mangle($field.name())});
---
> this.${this.mangle($field.name())} = ${this.mangle($field.name())};
110,113c110
< public ${this.externalJavaType($field.schema())} ${this.generateGetMethod($schema, $field)}() {
< #if ($this.isBigDecimal($field.schema()))
< return new java.math.BigDecimal(new java.math.BigInteger(${this.mangle($field.name())}.array()), $field.schema().getJsonProp("scale"));
< #else
---
> public ${this.javaType($field.schema())} ${this.generateGetMethod($schema, $field)}() {
115d111
< #end
124,127c120
< public void ${this.generateSetMethod($schema, $field)}(${this.externalJavaType($field.schema())} value) {
< #if ($this.isBigDecimal($field.schema()))
< this.${this.mangle($field.name(), $schema.isError())} = java.nio.ByteBuffer.wrap(value.unscaledValue().toByteArray());
< #else
---
> public void ${this.generateSetMethod($schema, $field)}(${this.javaType($field.schema())} value) {
129d121
< #end
This fixes the issue for me but is not a good long-term solution.
In particular, the builder part of the generated Specific class still
exposes ByteBuffer instead of BigDecimal, which is inconsistent.
More generally, it seems to me a better solution would be to extend the
"java-class" trick so that more complex conversions can occur between
the Avro type and the Java type exposed by Specific classes. Right now,
the Java type must be castable from the Avro type, which is limiting.
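To illustrate what such an extension could look like (hypothetical names, not an existing Avro API), the generated code could delegate to a pluggable converter between the internal Avro representation and the exposed Java type, instead of relying on a cast:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

/**
 * Hypothetical conversion hook between an Avro-internal type A and the
 * Java type J exposed by generated Specific classes. Sketch only; this
 * interface does not exist in Avro 1.7.
 */
interface LogicalTypeConverter<A, J> {
    J fromAvro(A avroValue);
    A toAvro(J javaValue);
}

/** Example implementation: bytes (two's complement) <-> BigDecimal. */
class DecimalConverter implements LogicalTypeConverter<ByteBuffer, BigDecimal> {

    private final int scale; // taken from the schema's "scale" property

    DecimalConverter(int scale) {
        this.scale = scale;
    }

    public BigDecimal fromAvro(ByteBuffer avroValue) {
        byte[] bytes = new byte[avroValue.remaining()];
        avroValue.duplicate().get(bytes);
        return new BigDecimal(new BigInteger(bytes), scale);
    }

    public ByteBuffer toAvro(BigDecimal javaValue) {
        return ByteBuffer.wrap(javaValue.unscaledValue().toByteArray());
    }

    public static void main(String[] args) {
        DecimalConverter converter = new DecimalConverter(2);
        System.out.println(converter.fromAvro(converter.toAvro(new BigDecimal("0.01"))));
    }
}
```

With a hook like this, the Specific class could expose BigDecimal while keeping ByteBuffer internally, which is roughly the shape of what AVRO-1497 is asking for on the Java side.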
Anyway, thanks again for your great insight.
Fady
On 11/11/2014 05:06, Michael Pigott wrote:
Hi Fady,
Properly handling BigDecimal types in Java is still an open
question. AVRO-1402 [1] added BigDecimal types to the Avro spec, but
the Java support is an open ticket under AVRO-1497 [2]. When I added
BigDecimal support to AVRO-457 (XML <-> Avro support), I added support
for the Avro decimal logical type using Java BigDecimals. You can see
the conversion code [3] as well as the writer [4] and reader [5] code
in my GitHub repository, or download the patch in AVRO-457 [6] and
look for BigDecimal in the Utils.java, XmlDatumWriter.java, and
XmlDatumReader.java files, respectively.
Good luck!
Mike
[1] https://issues.apache.org/jira/browse/AVRO-1402
[2] https://issues.apache.org/jira/browse/AVRO-1497
[3] https://github.com/mikepigott/xml-to-avro/blob/master/avro-to-xml/src/main/java/org/apache/avro/xml/Utils.java#L537
[4] https://github.com/mikepigott/xml-to-avro/blob/master/avro-to-xml/src/main/java/org/apache/avro/xml/XmlDatumWriter.java#L1150
[5] https://github.com/mikepigott/xml-to-avro/blob/master/avro-to-xml/src/main/java/org/apache/avro/xml/XmlDatumReader.java#L998
[6] https://issues.apache.org/jira/browse/AVRO-457
On Sat Nov 08 2014 at 4:11:32 AM Fady <f...@legsem.com> wrote:
Hello,
I am working on a project that aims at converting Mainframe data
to Avro
records (https://github.com/legsem/legstar.avro).
Mainframe data often contains Decimal types. For these, I would
like the
corresponding Avro records to expose BigDecimal fields.
Initially, I followed the recommendation here:
http://avro.apache.org/docs/1.7.7/spec.html#Decimal. My schema
contains
for instance:
{
  "name": "transactionAmount",
  "type": {
    "type": "bytes",
    "logicalType": "decimal",
    "precision": 7,
    "scale": 2
  }
}
This works fine but the Avro Specific record produced by the
SpecificCompiler exposes a ByteBuffer for that field.
@Deprecated public java.nio.ByteBuffer transactionAmount;
Not what I want.
I tried this alternative:
{
  "name": "transactionAmount",
  "type": {
    "type": "string",
    "java-class": "java.math.BigDecimal",
    "logicalType": "decimal",
    "precision": 7,
    "scale": 2
  }
}
Now the SpecificCompiler produces the result I need:
@Deprecated public java.math.BigDecimal transactionAmount;
There are two problems though:
1. It is less efficient to serialize/deserialize a BigDecimal from a
string rather than from the two's complement bytes.
2. The Specific record obtained this way cannot be populated using a
deep copy from a Generic record.
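On the first point, the size difference is easy to see with plain JDK calls: the unscaled two's-complement form of 123.45 fits in 2 bytes (12345 = 0x3039), while its string form takes 6:

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;

public class DecimalSize {
    public static void main(String[] args) {
        BigDecimal amount = new BigDecimal("123.45");
        // Spec encoding: two's complement of the unscaled value (12345).
        int packed = amount.unscaledValue().toByteArray().length;
        // String encoding: one byte per character in UTF-8.
        int asString = amount.toString().getBytes(StandardCharsets.UTF_8).length;
        System.out.println(packed + " vs " + asString); // prints 2 vs 6
    }
}
```

The gap grows with precision, since the packed form needs roughly one byte per 2.4 decimal digits versus one byte per digit (plus sign and decimal point) for the string.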
To clarify the second point:
When I convert the mainframe data I do something like:
GenericRecord genericRecord = new GenericData.Record(schema);
... populate genericRecord from Mainframe data ...
return (D) SpecificData.get().deepCopy(schema, genericRecord);
This fails with:

java.lang.ClassCastException: java.lang.String cannot be cast to java.math.BigDecimal
    at legstar.avro.test.specific.cusdat.Transaction.put(Transaction.java:47)
    at org.apache.avro.generic.GenericData.setField(GenericData.java:573)
    at org.apache.avro.generic.GenericData.setField(GenericData.java:590)
    at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:972)
    at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:926)
    at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:970)
    at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:970)
This is because the code in the Specific record assumes the value
received is already a BigDecimal:

case 1: transactionAmount = (java.math.BigDecimal)value$; break;
In other words, the java-class trick produces the right interface for
Specific classes but the internal data types are not consistent
with the
GenericRecord derived from the same schema.
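One way to paper over this inconsistency by hand (a sketch of what a patched put() could do, not code any compiler generates today) is to normalize whatever representation arrives instead of casting blindly:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

/**
 * Sketch of a tolerant coercion for a decimal field's put(): accepts the
 * BigDecimal, CharSequence (java-class over a string schema) or ByteBuffer
 * (bytes schema with logicalType decimal) representations that Generic or
 * Specific code may hand over.
 */
public class TolerantPut {

    static final int SCALE = 2; // from the schema's "scale" property

    static BigDecimal coerceDecimal(Object value) {
        if (value instanceof BigDecimal) {
            return (BigDecimal) value;
        }
        if (value instanceof CharSequence) { // GenericData passes Utf8/String
            return new BigDecimal(value.toString());
        }
        if (value instanceof ByteBuffer) {   // spec's two's-complement bytes
            ByteBuffer buf = ((ByteBuffer) value).duplicate();
            byte[] bytes = new byte[buf.remaining()];
            buf.get(bytes);
            return new BigDecimal(new BigInteger(bytes), SCALE);
        }
        throw new ClassCastException(
                "Cannot coerce " + value.getClass() + " to BigDecimal");
    }
}
```

This would let deepCopy succeed regardless of which internal type the source record carries, at the cost of hand-editing every generated class, which is exactly why a compiler-level conversion hook would be preferable.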
So my question is: what would be a better approach for Specific
classes
to expose BigDecimal fields?