[ https://issues.apache.org/jira/browse/AVRO-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hagen Weiße updated AVRO-4033:
------------------------------
Description:
Currently, avrogencpp generates a separate class for every union type it
encounters in the schema, even if a class representing the exact same union
has already been generated.
Example Schema:
{code:json}
{
  "type": "record",
  "doc": "Top level Doc.",
  "name": "RootRecord",
  "fields": [
    {
      "name": "nullable_string_1",
      "doc": "mylong field doc.",
      "type": [
        "null",
        "string"
      ]
    },
    {
      "name": "nullable_string_2",
      "doc": "mylong field doc.",
      "type": [
        "null",
        "string"
      ]
    },
    {
      "name": "nullable_string_3",
      "doc": "mylong field doc.",
      "type": [
        "null",
        "string"
      ]
    }
  ]
}
{code}
The generated RootRecord will look like this:
{code:c++}
struct RootRecord {
    typedef _union_test_json_Union__0__ nullable_string_1_t;
    typedef _union_test_json_Union__1__ nullable_string_2_t;
    typedef _union_test_json_Union__2__ nullable_string_3_t;
    nullable_string_1_t nullable_string_1;
    nullable_string_2_t nullable_string_2;
    nullable_string_3_t nullable_string_3;
    RootRecord() :
        nullable_string_1(nullable_string_1_t()),
        nullable_string_2(nullable_string_2_t()),
        nullable_string_3(nullable_string_3_t())
        { }
};
{code}
This leads to a lot of redundant code, especially for common union types
(e.g. a union of null and string).
To solve this, avrogencpp could track the names of the union types it
generates and filter out duplicates.
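The deduplication idea could be sketched roughly as follows. This is only an illustration, not the actual avrogencpp code: UnionRegistry, nameFor and the key format are hypothetical names chosen for the example, and branches are assumed to be identified by plain type names (nested arrays, maps or records would need a fuller canonical key).

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of avrogencpp: remembers which union
// shapes already have a generated class and hands back the same name.
class UnionRegistry {
public:
    explicit UnionRegistry(std::string prefix) : prefix_(std::move(prefix)) {}

    // Returns the class name for a union with the given branch types.
    // isNew is set to true only the first time a shape is seen, so the
    // caller knows whether the class definition still has to be emitted.
    std::string nameFor(const std::vector<std::string>& branches, bool& isNew) {
        // Branch order is significant in Avro unions, so the key keeps it.
        // '|' cannot appear in an Avro name, which makes it a safe separator.
        std::string key;
        for (const std::string& b : branches) {
            key += b;
            key += '|';
        }
        auto it = names_.find(key);
        if (it != names_.end()) {
            isNew = false;
            return it->second;  // reuse the class generated earlier
        }
        isNew = true;
        // Number the class by how many distinct shapes came before it.
        std::string name =
            prefix_ + "_Union__" + std::to_string(names_.size()) + "__";
        names_.emplace(key, name);
        return name;
    }

    // Number of distinct union shapes seen so far.
    std::size_t size() const { return names_.size(); }

private:
    std::string prefix_;
    std::map<std::string, std::string> names_;  // canonical key -> class name
};
```

With a registry like this, all three fields in the example schema would resolve to the same class name, and the union class would only need to be emitted once, when isNew comes back true.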
> Remove redundant union classes generated by avrogencpp
> ------------------------------------------------------
>
> Key: AVRO-4033
> URL: https://issues.apache.org/jira/browse/AVRO-4033
> Project: Apache Avro
> Issue Type: Improvement
> Components: c++
> Reporter: Hagen Weiße
> Priority: Major
> Labels: c++