[jira] [Updated] (AVRO-3897) Disallow invalid namespace in fully qualified name for Rust SDK
[ https://issues.apache.org/jira/browse/AVRO-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated AVRO-3897:
---------------------------------
    Description:
Currently, the Rust SDK accepts the following fully qualified names in Name::new.
{code}
Name::new("ns.0.record1")
Name::new("ns..record1")
{code}
Both should be rejected according to the specification: in the first, the namespace component "0" does not start with [A-Za-z_]; in the second, the namespace contains an empty component.
https://avro.apache.org/docs/1.11.1/specification/#names
{code}
The name portion of the fullname of named types, record field names, and enum symbols must:
    start with [A-Za-z_]
    subsequently contain only [A-Za-z0-9_]
{code}
{code}
The null namespace may not be used in a dot-separated sequence of names. So the grammar for a namespace is:
    <empty> | <name>[(<dot><name>)*]
{code}

  was:
Currently, the Rust SDK allows the following fully qualified names with Name::new.
{code}
Name::new("ns.0.record1")
Name::new("ns..record1")
{code}
But they should be disallowed according to the specification.

> Disallow invalid namespace in fully qualified name for Rust SDK
> ---------------------------------------------------------------
>
>                 Key: AVRO-3897
>                 URL: https://issues.apache.org/jira/browse/AVRO-3897
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: rust
>            Reporter: Kousuke Saruta
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the Rust SDK accepts the following fully qualified names in Name::new.
> {code}
> Name::new("ns.0.record1")
> Name::new("ns..record1")
> {code}
> Both should be rejected according to the specification: in the first, the namespace component "0" does not start with [A-Za-z_]; in the second, the namespace contains an empty component.
> https://avro.apache.org/docs/1.11.1/specification/#names
> {code}
> The name portion of the fullname of named types, record field names, and enum symbols must:
>     start with [A-Za-z_]
>     subsequently contain only [A-Za-z0-9_]
> {code}
> {code}
> The null namespace may not be used in a dot-separated sequence of names. So the grammar for a namespace is:
>     <empty> | <name>[(<dot><name>)*]
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
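The naming rule quoted in AVRO-3897 can be sketched as a small standalone check. This is only an illustration using the standard library, not the validation code of the Rust SDK. The function names `is_valid_component` and `is_valid_fullname` are hypothetical, and accepting a single leading dot as the null-namespace marker is an assumption based on the discussion in AVRO-3830.

```rust
/// Returns true if `s` is a valid Avro name component:
/// it starts with [A-Za-z_] and continues with [A-Za-z0-9_] only.
/// (Hypothetical helper, not part of the Rust SDK.)
fn is_valid_component(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false, // empty component or invalid first character
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Returns true if `fullname` is a dot-separated sequence of valid components.
/// Assumption: one leading dot is allowed and means "null namespace".
fn is_valid_fullname(fullname: &str) -> bool {
    let rest = fullname.strip_prefix('.').unwrap_or(fullname);
    !rest.is_empty() && rest.split('.').all(is_valid_component)
}
```

With this sketch, `is_valid_fullname("ns.0.record1")` and `is_valid_fullname("ns..record1")` are both false, because "0" and the empty string fail the component check.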
[jira] [Updated] (AVRO-3862) Add aliases and doc methods to Schema in Rust SDK
[ https://issues.apache.org/jira/browse/AVRO-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated AVRO-3862: - Priority: Minor (was: Major) > Add aliases and doc methods to Schema in Rust SDK > - > > Key: AVRO-3862 > URL: https://issues.apache.org/jira/browse/AVRO-3862 > Project: Apache Avro > Issue Type: Improvement > Components: rust >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Priority: Minor > > Named types (Record, Enum and Fixed) have common attributes {*}name{*}, > *aliases* and {*}doc{*}. > We have already have *fn name* in Schema so it's nice to have *fn aliases* > and *fn doc* too. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (AVRO-3851) Validate default value for record fields and enums on parsing
[ https://issues.apache.org/jira/browse/AVRO-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated AVRO-3851: - Affects Version/s: 1.12.0 > Validate default value for record fields and enums on parsing > - > > Key: AVRO-3851 > URL: https://issues.apache.org/jira/browse/AVRO-3851 > Project: Apache Avro > Issue Type: Improvement > Components: rust >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Priority: Major > > Currently, default values for record fields are not validated on parsing > except for union type fields. > Similarly, default values for enum are not also validated on parsing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (AVRO-3850) Don't publish Cargo.lock
[ https://issues.apache.org/jira/browse/AVRO-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved AVRO-3850. -- Resolution: Not A Problem Close for now. See https://github.com/apache/avro/pull/2476#issuecomment-1704772007 > Don't publish Cargo.lock > > > Key: AVRO-3850 > URL: https://issues.apache.org/jira/browse/AVRO-3850 > Project: Apache Avro > Issue Type: Improvement > Components: rust >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Priority: Minor > > Currently, Cargo.lock is published but it should not be because all the > crates are libraries. > https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (AVRO-3847) Record field doesn't accept default value if field type is union and the type of default value is pre-defined name
[ https://issues.apache.org/jira/browse/AVRO-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated AVRO-3847:
---------------------------------
    Affects Version/s: 1.12.0

> Record field doesn't accept default value if field type is union and the type of default value is pre-defined name
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-3847
>                 URL: https://issues.apache.org/jira/browse/AVRO-3847
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: rust
>    Affects Versions: 1.12.0
>            Reporter: Kousuke Saruta
>            Priority: Major
>
> Given a schema such as the following:
> {code}
> {
>     "name": "record1",
>     "type": "record",
>     "fields": [
>         {
>             "name": "f1",
>             "type": {
>                 "name": "record2",
>                 "type": "record",
>                 "fields": [
>                     {
>                         "name": "f1_1",
>                         "type": "int"
>                     }
>                 ]
>             }
>         }, {
>             "name": "f2",
>             "type": ["record2", "int"],
>             "default": {
>                 "f1_1": 100
>             }
>         }
>     ]
> }
> {code}
> The type of the field f2 is a union of record2 and int, and the default value is a record2 value. record2 is referenced by name because it is already defined in field f1.
> The current Rust binding doesn't accept such schemas and raises an error message like the following.
> {code}
> Error: One union type Ref must match the `default`'s value type Map
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (AVRO-3846) Race condition can happen among serde tests
[ https://issues.apache.org/jira/browse/AVRO-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated AVRO-3846:
---------------------------------
    Description:
Sometimes one of the tests named avro_3747* fails.
You can easily reproduce this issue with cargo test avro_3747.
These tests are run concurrently by cargo test, and they load and store the same atomic variable, so this looks like a race condition.

  was:
Sometimes one of tests named avro_3747 fails.
You can easily reproduce this issue by cargo test avro_3747.
These tests are run concurrently by Cargo test and those tests load/store the same atomic variable so This seems race condition

> Race condition can happen among serde tests
> -------------------------------------------
>
>                 Key: AVRO-3846
>                 URL: https://issues.apache.org/jira/browse/AVRO-3846
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: rust
>    Affects Versions: 1.12.0
>            Reporter: Kousuke Saruta
>            Priority: Major
>
> Sometimes one of the tests named avro_3747* fails.
> You can easily reproduce this issue with cargo test avro_3747.
> These tests are run concurrently by cargo test, and they load and store the same atomic variable, so this looks like a race condition.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
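One way to avoid this kind of race is to serialize access to the shared atomic behind a lock that every affected test takes. The sketch below only illustrates that idea with the standard library and is not the fix applied in the Avro repository. `FLAG`, `TEST_LOCK` and `with_flag` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

// Hypothetical global flag that several tests read and write.
static FLAG: AtomicBool = AtomicBool::new(false);
// Hypothetical lock that serializes every test touching FLAG.
static TEST_LOCK: Mutex<()> = Mutex::new(());

/// Runs `body` with FLAG set to `value`, then restores the previous value.
/// The lock is held for the whole call, so concurrent tests cannot interleave.
/// Note: if `body` panics, the previous value is not restored.
fn with_flag<T>(value: bool, body: impl FnOnce() -> T) -> T {
    // A poisoned lock (an earlier test panicked) is still usable here.
    let _guard = TEST_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    let previous = FLAG.swap(value, Ordering::SeqCst);
    let result = body();
    FLAG.store(previous, Ordering::SeqCst);
    result
}
```

Each test would wrap its body in `with_flag(...)`. Running the tests on a single thread with `cargo test -- --test-threads=1` is another option, at the cost of slower test runs.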
[jira] [Updated] (AVRO-3846) Race condition can happen among serde tests
[ https://issues.apache.org/jira/browse/AVRO-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated AVRO-3846: - Description: Sometimes one of tests named avro_3747 fails. You can easily reproduce this issue by cargo test avro_3747. These tests are run concurrently by Cargo test and those tests load/store the same atomic variable so This seems race condition was: Sometimes one of tests named avro_3747 fails. These tests are run concurrently by Cargo test and those tests load/store the same atomic variable so This seems race condition > Race condition can happen among serde tests > --- > > Key: AVRO-3846 > URL: https://issues.apache.org/jira/browse/AVRO-3846 > Project: Apache Avro > Issue Type: Bug > Components: rust >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Priority: Major > > Sometimes one of tests named avro_3747 fails. > You can easily reproduce this issue by cargo test avro_3747. > These tests are run concurrently by Cargo test and those tests load/store the > same atomic variable so This seems race condition -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (AVRO-3830) Handle namespace properly if a name starts with dot
[ https://issues.apache.org/jira/browse/AVRO-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758680#comment-17758680 ] Kousuke Saruta commented on AVRO-3830: -- [~stestagg] Hmm, will you fix that issue by yourself? > Handle namespace properly if a name starts with dot > --- > > Key: AVRO-3830 > URL: https://issues.apache.org/jira/browse/AVRO-3830 > Project: Apache Avro > Issue Type: Bug >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0, 1.11.3 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The specification says about the name and namespace like as follows. > ??The empty string may also be used as a namespace to indicate the null > namespace?? > ??If the name specified contains a dot, then it is assumed to be a fullname, > and any namespace also specified is ignored?? > According to this specification, if a name in a name field starts with a dot, > it's considered that the namespace is null and the corresponding namespace > field should be ignored. > For example, given the following schema. > {code} > { > "name": ".record1", > "namespace": "ns1", > "type": "record", > "fields": [] > } > {code} > The name and namespace should be "record1" and null respectively. > But the namespace is considered as "ns1" in the current Rust binding . -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (AVRO-3830) Handle namespace properly if a name starts with dot
[ https://issues.apache.org/jira/browse/AVRO-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758139#comment-17758139 ]

Kousuke Saruta commented on AVRO-3830:
--------------------------------------

[~stestagg]
{code}
The null namespace may not be used in a dot-separated sequence of names. So the grammar for a namespace is:
    <empty> | <name>[(<dot><name>)*]
{code}
This rule is about the namespace itself, whereas this ticket is about the namespace portion of a fullname. A namespace is allowed to be empty, so I think the proposed behavior follows the specification.

> Handle namespace properly if a name starts with dot
> ---------------------------------------------------
>
>                 Key: AVRO-3830
>                 URL: https://issues.apache.org/jira/browse/AVRO-3830
>             Project: Apache Avro
>          Issue Type: Bug
>    Affects Versions: 1.12.0
>            Reporter: Kousuke Saruta
>            Assignee: Kousuke Saruta
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0, 1.11.3
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The specification says the following about the name and namespace.
> ??The empty string may also be used as a namespace to indicate the null namespace??
> ??If the name specified contains a dot, then it is assumed to be a fullname, and any namespace also specified is ignored??
> According to this specification, if the value of a name field starts with a dot, the namespace is considered to be null and the corresponding namespace field should be ignored.
> For example, given the following schema:
> {code}
> {
>     "name": ".record1",
>     "namespace": "ns1",
>     "type": "record",
>     "fields": []
> }
> {code}
> The name and namespace should be "record1" and null respectively.
> But the current Rust binding treats the namespace as "ns1".

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (AVRO-3841) Align the specification of the way to encode NaN to the actual implementations
[ https://issues.apache.org/jira/browse/AVRO-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated AVRO-3841:
---------------------------------
    Summary: Align the specification of the way to encode NaN to the actual implementations  (was: Align the specification of encoding NaN to the actual implementations)

> Align the specification of the way to encode NaN to the actual implementations
> ------------------------------------------------------------------------------
>
>                 Key: AVRO-3841
>                 URL: https://issues.apache.org/jira/browse/AVRO-3841
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: spec
>    Affects Versions: 1.12.0
>            Reporter: Kousuke Saruta
>            Priority: Minor
>
> The specification describes the encoding of float/double as follows.
> {code}
> a float is written as 4 bytes. The float is converted into a 32-bit integer using a method equivalent to Java’s floatToIntBits and then encoded in little-endian format.
> a double is written as 8 bytes. The double is converted into a 64-bit integer using a method equivalent to Java’s doubleToLongBits and then encoded in little-endian format.
> {code}
> But the actual implementation in Java uses floatToRawIntBits/doubleToRawLongBits rather than floatToIntBits/doubleToLongBits.
> They differ in how they encode NaN.
> floatToIntBits/doubleToLongBits collapse every NaN to a single canonical bit pattern, so they don't distinguish between NaN and -NaN, but floatToRawIntBits/doubleToRawLongBits preserve the bits and do.
> I confirmed that all the implementations distinguish between NaN and -NaN.
> So, I think it's better to modify the specification.
> Java
> {code}
>   public static int encodeFloat(float f, byte[] buf, int pos) {
>     final int bits = Float.floatToRawIntBits(f);
>     buf[pos + 3] = (byte) (bits >>> 24);
>     buf[pos + 2] = (byte) (bits >>> 16);
>     buf[pos + 1] = (byte) (bits >>> 8);
>     buf[pos] = (byte) (bits);
>     return 4;
>   }
>   public static int encodeDouble(double d, byte[] buf, int pos) {
>     final long bits = Double.doubleToRawLongBits(d);
>     int first = (int) (bits & 0xFFFFFFFF);
>     int second = (int) ((bits >>> 32) & 0xFFFFFFFF);
>     // the compiler seems to execute this order the best, likely due to
>     // register allocation -- the lifetime of constants is minimized.
>     buf[pos] = (byte) (first);
>     buf[pos + 4] = (byte) (second);
>     buf[pos + 5] = (byte) (second >>> 8);
>     buf[pos + 1] = (byte) (first >>> 8);
>     buf[pos + 2] = (byte) (first >>> 16);
>     buf[pos + 6] = (byte) (second >>> 16);
>     buf[pos + 7] = (byte) (second >>> 24);
>     buf[pos + 3] = (byte) (first >>> 24);
>     return 8;
>   }
> {code}
> Rust
> {code}
> Value::Float(x) => buffer.extend_from_slice(&x.to_le_bytes()),
> Value::Double(x) => buffer.extend_from_slice(&x.to_le_bytes()),
> {code}
> Python
> {code}
>     def write_float(self, datum: float) -> None:
>         """
>         A float is written as 4 bytes.
>         The float is converted into a 32-bit integer using a method equivalent to
>         Java's floatToIntBits and then encoded in little-endian format.
>         """
>         self.write(STRUCT_FLOAT.pack(datum))
>     def write_double(self, datum: float) -> None:
>         """
>         A double is written as 8 bytes.
>         The double is converted into a 64-bit integer using a method equivalent to
>         Java's doubleToLongBits and then encoded in little-endian format.
>         """
>         self.write(STRUCT_DOUBLE.pack(datum))
> {code}
> C
> {code}
> static int write_float(avro_writer_t writer, const float f)
> {
> #if AVRO_PLATFORM_IS_BIG_ENDIAN
>
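The difference between the raw and the canonicalizing conversion in AVRO-3841 can be illustrated in Rust with the standard library only. In this sketch, `raw_encode` keeps the bit pattern, which is what `to_le_bytes` does, while `canonical_encode` is a hypothetical stand-in for the behavior of Java's `floatToIntBits`, which maps every NaN to 0x7fc00000. Neither function is taken from the Avro code base.

```rust
/// Encodes a float as 4 little-endian bytes, preserving the exact bit pattern
/// (including the sign and payload of a NaN).
fn raw_encode(x: f32) -> [u8; 4] {
    x.to_le_bytes()
}

/// Hypothetical counterpart that first collapses every NaN to the canonical
/// pattern 0x7fc00000, as Java's Float.floatToIntBits does.
fn canonical_encode(x: f32) -> [u8; 4] {
    let bits: u32 = if x.is_nan() { 0x7fc0_0000 } else { x.to_bits() };
    bits.to_le_bytes()
}
```

For a NaN built with `f32::from_bits(0xffc0_0000)` (sign bit set), `raw_encode` keeps the sign bit in the last byte while `canonical_encode` drops it. For ordinary values the two functions agree.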
[jira] [Updated] (AVRO-3841) Align the specification of encoding NaN to the actual implementations
[ https://issues.apache.org/jira/browse/AVRO-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated AVRO-3841: - Issue Type: Improvement (was: Bug) > Align the specification of encoding NaN to the actual implementations > - > > Key: AVRO-3841 > URL: https://issues.apache.org/jira/browse/AVRO-3841 > Project: Apache Avro > Issue Type: Improvement > Components: spec >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Priority: Minor > > The specification says about the way to encode float/double like as follows. > {code} > a float is written as 4 bytes. The float is converted into a 32-bit integer > using a method equivalent to Java’s floatToIntBits and then encoded in > little-endian format. > a double is written as 8 bytes. The double is converted into a 64-bit integer > using a method equivalent to Java’s doubleToLongBits and then encoded in > little-endian format. > {code} > But the actual implementation in Java uses > floatToRawIntBits/doubleToRawLongBits rather than > floatToIntBits/doubleToLongBits. > The they are different in the way to encode NaN. > floatToIntBits/doubleToLongBits doesn't distinguish between NaN and -NaN but > floatToRawIntBits/doubleToRawLongBits does. > I confirmed all the implementation distinguish between NaN and -NaN. > So, I think it's better to modify the specification. > Java > {code} > public static int encodeFloat(float f, byte[] buf, int pos) { > final int bits = Float.floatToRawIntBits(f); > buf[pos + 3] = (byte) (bits >>> 24); > buf[pos + 2] = (byte) (bits >>> 16); > buf[pos + 1] = (byte) (bits >>> 8); > buf[pos] = (byte) (bits); > return 4; > } > public static int encodeDouble(double d, byte[] buf, int pos) { > final long bits = Double.doubleToRawLongBits(d); > int first = (int) (bits & 0x); > int second = (int) ((bits >>> 32) & 0x); > // the compiler seems to execute this order the best, likely due to > // register allocation -- the lifetime of constants is minimized. 
> buf[pos] = (byte) (first); > buf[pos + 4] = (byte) (second); > buf[pos + 5] = (byte) (second >>> 8); > buf[pos + 1] = (byte) (first >>> 8); > buf[pos + 2] = (byte) (first >>> 16); > buf[pos + 6] = (byte) (second >>> 16); > buf[pos + 7] = (byte) (second >>> 24); > buf[pos + 3] = (byte) (first >>> 24); > return 8; > } > {code} > Rust > {code} > Value::Float(x) => buffer.extend_from_slice(_le_bytes()), > Value::Double(x) => buffer.extend_from_slice(_le_bytes()), > {code} > Python > {code} > def write_float(self, datum: float) -> None: > > """ > > A float is written as 4 bytes. > > The float is converted into a 32-bit integer using a method > equivalent to > Java's floatToIntBits and then encoded in little-endian format. > > """ > > self.write(STRUCT_FLOAT.pack(datum)) > def write_double(self, datum: float) -> None: > > """ > > A double is written as 8 bytes. > > The double is converted into a 64-bit integer using a method > equivalent to > Java's doubleToLongBits and then encoded in little-endian format. > > """ > > self.write(STRUCT_DOUBLE.pack(datum)) > {code} > C > {code} > static int write_float(avro_writer_t writer, const float f) > { > #if AVRO_PLATFORM_IS_BIG_ENDIAN > uint8_t buf[4]; > #endif > union { > float f; > int32_t i; > } v; > v.f = f; > #if
[jira] [Updated] (AVRO-3837) Disallow invalid namespaces for the Rust binding
[ https://issues.apache.org/jira/browse/AVRO-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated AVRO-3837:
---------------------------------
    Description:
The current Rust binding rejects invalid namespaces if they appear in a name field.
{code}
{
    "name": "ns1.invalid-ns.record1",
    "type": "record",
    "fields": []
}
{code}
But if an invalid namespace appears in a namespace field, the Rust binding accepts it.
{code}
{
    "name": "record1",
    "namespace": "ns1.invalid-ns",
    "type": "record",
    "fields": []
}
{code}

  was:
The current Rust binding doesn't accept invalid namespaces if such namespaces are in a name field.
{code}
{
    "name": "ns1.invalid-ns.record1",
    "type": "record"
    "fields": []
}
{code}
But, even if a invalid namespace is in a namespace field, the Rust binding accept such namespaces.
{code}
    "name": "record1",
    "namespace": "ns1.invalid-ns",
    "type": "record",
    "fields": []
}
{code}

> Disallow invalid namespaces for the Rust binding
> ------------------------------------------------
>
>                 Key: AVRO-3837
>                 URL: https://issues.apache.org/jira/browse/AVRO-3837
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: rust
>    Affects Versions: 1.12.0
>            Reporter: Kousuke Saruta
>            Priority: Major
>
> The current Rust binding rejects invalid namespaces if they appear in a name field.
> {code}
> {
>     "name": "ns1.invalid-ns.record1",
>     "type": "record",
>     "fields": []
> }
> {code}
> But if an invalid namespace appears in a namespace field, the Rust binding accepts it.
> {code}
> {
>     "name": "record1",
>     "namespace": "ns1.invalid-ns",
>     "type": "record",
>     "fields": []
> }
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (AVRO-3837) Disallow invalid namespaces for the Rust binding
[ https://issues.apache.org/jira/browse/AVRO-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated AVRO-3837: - Description: The current Rust binding doesn't accept invalid namespaces if such namespaces are in a name field. {code} { "name": "ns1.invalid-ns.record1", "type": "record" "fields": [] } {code} But, even if a invalid namespace is in a namespace field, the Rust binding accept such namespaces. {code} "name": "record1", "namespace": "ns1.invalid-ns", "type": "record", "fields": [] } {code} was: The current Rust binding doesn't accept invalid namespaces if such namespaces are in name field. {code} { "name": "ns1.invalid-ns.record1", "type": "record" "fields": [] } {code} But if a invalid namespace in namespace field doesn't validate. {code} "name": "record1", "namespace": "ns1.invalid-ns", "type": "record", "fields": [] } {code} > Disallow invalid namespaces for the Rust binding > > > Key: AVRO-3837 > URL: https://issues.apache.org/jira/browse/AVRO-3837 > Project: Apache Avro > Issue Type: Bug > Components: rust >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Priority: Major > > The current Rust binding doesn't accept invalid namespaces if such namespaces > are in a name field. > {code} > { > "name": "ns1.invalid-ns.record1", > "type": "record" > "fields": [] > } > {code} > But, even if a invalid namespace is in a namespace field, the Rust binding > accept such namespaces. > {code} > "name": "record1", > "namespace": "ns1.invalid-ns", > "type": "record", > "fields": [] > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (AVRO-3837) Disallow invalid namespaces for the Rust binding
[ https://issues.apache.org/jira/browse/AVRO-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated AVRO-3837: - Summary: Disallow invalid namespaces for the Rust binding (was: Disallow invalid namespace for the Rust binding) > Disallow invalid namespaces for the Rust binding > > > Key: AVRO-3837 > URL: https://issues.apache.org/jira/browse/AVRO-3837 > Project: Apache Avro > Issue Type: Bug > Components: rust >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Priority: Major > > The current Rust binding doesn't accept invalid namespaces if such namespaces > are in name field. > {code} > { > "name": "ns1.invalid-ns.record1", > "type": "record" > "fields": [] > } > {code} > But if a invalid namespace in namespace field doesn't validate. > {code} > "name": "record1", > "namespace": "ns1.invalid-ns", > "type": "record", > "fields": [] > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (AVRO-3830) Handle namespace properly if a name starts with dot
[ https://issues.apache.org/jira/browse/AVRO-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated AVRO-3830: - Summary: Handle namespace properly if a name starts with dot (was: Handle namespace property if a name starts with dot) > Handle namespace properly if a name starts with dot > --- > > Key: AVRO-3830 > URL: https://issues.apache.org/jira/browse/AVRO-3830 > Project: Apache Avro > Issue Type: Bug >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Priority: Major > > The specification says about the name and namespace like as follows. > ??The empty string may also be used as a namespace to indicate the null > namespace?? > ??If the name specified contains a dot, then it is assumed to be a fullname, > and any namespace also specified is ignored?? > According to this specification, if a name in a name field starts with a dot, > it's considered that the namespace is null and the corresponding namespace > field should be ignored. > For example, given the following schema. > {code} > { > "name": ".record1", > "namespace": "ns1", > "type": "record", > "fields": [] > } > {code} > The name and namespace should be "record1" and null respectively. > But the namespace is considered as "ns1" in the current Rust binding . -- This message was sent by Atlassian Jira (v8.20.10#820010)
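The two specification sentences quoted in AVRO-3830 can be read as a small resolution rule. The sketch below is one possible reading using only the standard library. `resolve_name` is a hypothetical helper, not the SDK's API, and it does no validation of the components.

```rust
/// Splits the `name` and `namespace` attributes of a schema into
/// (simple name, effective namespace). `None` stands for the null namespace.
/// (Hypothetical helper for illustration only.)
fn resolve_name(name: &str, namespace: Option<&str>) -> (String, Option<String>) {
    match name.rsplit_once('.') {
        // The name contains a dot, so it is a fullname and the
        // `namespace` attribute is ignored. ".record1" yields an empty
        // namespace part, which means the null namespace.
        Some((ns, simple)) => {
            let ns = if ns.is_empty() { None } else { Some(ns.to_string()) };
            (simple.to_string(), ns)
        }
        // No dot: use the `namespace` attribute; "" also means null namespace.
        None => (
            name.to_string(),
            namespace.filter(|ns| !ns.is_empty()).map(str::to_string),
        ),
    }
}
```

Under this reading, `resolve_name(".record1", Some("ns1"))` gives the name "record1" with a null namespace, which is the outcome the ticket expects.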
[jira] [Updated] (AVRO-3830) Handle namespace property if a name starts with dot
[ https://issues.apache.org/jira/browse/AVRO-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated AVRO-3830: - Affects Version/s: 1.12.0 > Handle namespace property if a name starts with dot > --- > > Key: AVRO-3830 > URL: https://issues.apache.org/jira/browse/AVRO-3830 > Project: Apache Avro > Issue Type: Bug >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Priority: Major > > The specification says about the name and namespace like as follows. > ??The empty string may also be used as a namespace to indicate the null > namespace?? > ??If the name specified contains a dot, then it is assumed to be a fullname, > and any namespace also specified is ignored?? > According to this specification, if a name in a name field starts with a dot, > it's considered that the namespace is null and the corresponding namespace > field should be ignored. > For example, given the following schema. > {code} > { > "name": ".record1", > "namespace": "ns1", > "type": "record", > "fields": [] > } > {code} > The name and namespace should be "record1" and null respectively. > But the namespace is considered as "ns1" in the current Rust binding . -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (AVRO-3827) Disallow duplicate field names
[ https://issues.apache.org/jira/browse/AVRO-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated AVRO-3827:
---------------------------------
    Description:
If a schema contains a record in which two or more fields have the same name, the schema should be rejected.
{code:java}
{
    "name": "my_schema",
    "type": "record",
    "fields": [
        {
            "name": "f1",
            "type": {
                "name": "a",
                "type": "record",
                "fields": []
            }
        }, {
            "name": "f1",
            "type": {
                "name": "b",
                "type": "record",
                "fields": []
            }
        }
    ]
}
{code}
But the current Rust binding accepts it.

  was:
If a schema contains a record and some of its fields have the same field name, such schema should not be allowed.
{code}
{
    "name": "my_schema",
    "type": "record",
    "fields": [
        {
            "name": "f1",
            "type": {
                "name": "a",
                "type": "record",
                "fields": []
            }
        } {
            "name": "f1",
            "type": {
                "name": "b",
                "type": "record",
                "fields": []
            }
        }
    ]
}
{code}
But the current Rust binding accept.

> Disallow duplicate field names
> ------------------------------
>
>                 Key: AVRO-3827
>                 URL: https://issues.apache.org/jira/browse/AVRO-3827
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: rust
>    Affects Versions: 1.12.0
>            Reporter: Kousuke Saruta
>            Priority: Major
>
> If a schema contains a record in which two or more fields have the same name, the schema should be rejected.
> {code:java}
> {
>     "name": "my_schema",
>     "type": "record",
>     "fields": [
>         {
>             "name": "f1",
>             "type": {
>                 "name": "a",
>                 "type": "record",
>                 "fields": []
>             }
>         }, {
>             "name": "f1",
>             "type": {
>                 "name": "b",
>                 "type": "record",
>                 "fields": []
>             }
>         }
>     ]
> }
> {code}
> But the current Rust binding accepts it.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
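The duplicate check requested in AVRO-3827 amounts to looking for a repeated name among the fields of one record. A minimal sketch with a `HashSet` follows. `first_duplicate` is a hypothetical helper that works on plain field names and is not how the Rust binding represents record fields.

```rust
use std::collections::HashSet;

/// Returns the first field name that occurs more than once, if any.
/// (Hypothetical helper for illustration only.)
fn first_duplicate<'a>(field_names: &[&'a str]) -> Option<&'a str> {
    let mut seen = HashSet::new();
    // `insert` returns false when the value was already in the set.
    field_names.iter().copied().find(|name| !seen.insert(*name))
}
```

A parser could call such a check once per record and return an error when it yields `Some(name)`. For the schema in the ticket, the field names are `["f1", "f1"]`, so the check reports "f1".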
[jira] [Updated] (AVRO-3825) Disallow invalid namespaces
[ https://issues.apache.org/jira/browse/AVRO-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated AVRO-3825:
---------------------------------
    Description:
According to the specification, each dot-separated portion of a namespace should match [a-zA-Z_][a-zA-Z0-9_]*.
[https://avro.apache.org/docs/1.11.1/specification/#names]
{code:java}
The name portion of the fullname of named types, record field names, and enum symbols must:
    start with [A-Za-z_]
    subsequently contain only [A-Za-z0-9_]
A namespace is a dot-separated sequence of such names. The empty string may also be used as a namespace to indicate the null namespace. Equality of names (including field names and enum symbols) as well as fullnames is case-sensitive.
The null namespace may not be used in a dot-separated sequence of names. So the grammar for a namespace is:
    <empty> | <name>[(<dot><name>)*]
{code}

  was:
According to the specification, each portion of a namespace separated by dot should be [a-z,A-Z,_][a-z,A-Z,0-9_].
[https://avro.apache.org/docs/1.11.1/specification/#names]
{code:java}
The name portion of the fullname of named types, record field names, and enum symbols must:
    start with [A-Za-z_]
    subsequently contain only [A-Za-z0-9_]
A namespace is a dot-separated sequence of such names. The empty string may also be used as a namespace to indicate the null namespace. Equality of names (including field names and enum symbols) as well as fullnames is case-sensitive.
The null namespace may not be used in a dot-separated sequence of names. So the grammar for a namespace is:
    <empty> | <name>[(<dot><name>)*]
{code}

> Disallow invalid namespaces
> ---------------------------
>
>                 Key: AVRO-3825
>                 URL: https://issues.apache.org/jira/browse/AVRO-3825
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.12.0
>            Reporter: Kousuke Saruta
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> According to the specification, each dot-separated portion of a namespace should match [a-zA-Z_][a-zA-Z0-9_]*.
> [https://avro.apache.org/docs/1.11.1/specification/#names]
> {code:java}
> The name portion of the fullname of named types, record field names, and enum symbols must:
>     start with [A-Za-z_]
>     subsequently contain only [A-Za-z0-9_]
> A namespace is a dot-separated sequence of such names. The empty string may also be used as a namespace to indicate the null namespace. Equality of names (including field names and enum symbols) as well as fullnames is case-sensitive.
> The null namespace may not be used in a dot-separated sequence of names. So the grammar for a namespace is:
>     <empty> | <name>[(<dot><name>)*]
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (AVRO-3825) Disallow invalid namespaces
[ https://issues.apache.org/jira/browse/AVRO-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated AVRO-3825: - Summary: Disallow invalid namespaces (was: Disallow invalid namespace) > Disallow invalid namespaces > --- > > Key: AVRO-3825 > URL: https://issues.apache.org/jira/browse/AVRO-3825 > Project: Apache Avro > Issue Type: Bug > Components: java >Affects Versions: 1.12.0 >Reporter: Kousuke Saruta >Priority: Major > > According to the specification, each portion of a namespace separated by dot > should be [a-z,A-Z,_][a-z,A-Z,0-9_]. > [https://avro.apache.org/docs/1.11.1/specification/#names] > {code:java} > The name portion of the fullname of named types, record field names, and enum > symbols must: > start with [A-Za-z_] > subsequently contain only [A-Za-z0-9_] > A namespace is a dot-separated sequence of such names. The empty string may > also be used as a namespace to indicate the null namespace. Equality of names > (including field names and enum symbols) as well as fullnames is > case-sensitive. > The null namespace may not be used in a dot-separated sequence of names. So > the grammar for a namespace is: >| [()*] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (AVRO-3823) Show helpful error messages
[ https://issues.apache.org/jira/browse/AVRO-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17751208#comment-17751208 ]

Kousuke Saruta commented on AVRO-3823:
--------------------------------------

[~mgrigorov] Oh, I see. I didn't know that anyhow works well with thiserror (and I noticed both were created by the same author).

> Show helpful error messages
> ---------------------------
>
>                 Key: AVRO-3823
>                 URL: https://issues.apache.org/jira/browse/AVRO-3823
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: rust
>    Affects Versions: 1.12.0
>            Reporter: Kousuke Saruta
>            Assignee: Kousuke Saruta
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0, 1.11.3
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The current Rust binding doesn't show helpful error messages.
> The error types themselves are implemented with helpful messages.
> This is an example.
> {code:java}
> #[error("No `name` field")]
> GetNameField,
> {code}
> But those error messages are not shown.
> When we try to parse an invalid schema that has no name field, we expect to get "No `name` field", but the actual output is "GetNameField", which makes it difficult for users to resolve the problem.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
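The symptom in AVRO-3823 matches the difference between Rust's `Debug` and `Display` output for an error type. That this is the cause is an assumption, since the ticket does not say so. The sketch below reproduces both outputs with the standard library only. `SchemaError` is a hypothetical stand-in for the binding's error enum, and the hand-written `Display` impl plays the role that the `#[error("...")]` attribute of thiserror plays in the ticket.

```rust
use std::fmt;

/// Hypothetical stand-in for the binding's error type.
#[derive(Debug)]
enum SchemaError {
    GetNameField,
}

// Hand-written equivalent of `#[error("No `name` field")]`.
impl fmt::Display for SchemaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SchemaError::GetNameField => write!(f, "No `name` field"),
        }
    }
}

impl std::error::Error for SchemaError {}
```

Formatting the value with `{:?}` prints the variant name, while `{}` prints the human-readable message. Code that reports errors to users generally wants the `{}` form.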
[jira] [Updated] (AVRO-3812) Handle null namespace properly for canonicalized schema representation
[ https://issues.apache.org/jira/browse/AVRO-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated AVRO-3812:
---------------------------------
    Summary: Handle null namespace properly for canonicalized schema representation  (was: Handle null namespace properly)

> Handle null namespace properly for canonicalized schema representation
> ----------------------------------------------------------------------
>
>                 Key: AVRO-3812
>                 URL: https://issues.apache.org/jira/browse/AVRO-3812
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: rust
>    Affects Versions: 1.12.0
>            Reporter: Kousuke Saruta
>            Priority: Major
>
> Consider the following schema, in which every namespace is "".
> {code}
> {
>   "namespace": "",
>   "type": "record",
>   "name": "my_schema",
>   "fields": [
>     {
>       "name": "a",
>       "type": {
>         "type": "enum",
>         "name": "my_enum",
>         "namespace": "",
>         "symbols": ["a", "b"]
>       }
>     }, {
>       "name": "b",
>       "type": {
>         "type": "fixed",
>         "name": "my_fixed",
>         "namespace": "",
>         "size": 10
>       }
>     }
>   ]
> }
> {code}
> If we canonicalize this schema with the following code
> {code}
> let schema = Schema::parse_str(schema_str).unwrap().canonical_form();
> println!("{schema}");
> {code}
> we get the following result.
> {code}
> {"name":".my_schema","type":"record","fields":[{"name":"a","type":{"name":".my_enum","type":"enum","symbols":["a","b"]}},{"name":"b","type":{"name":".my_fixed","type":"fixed","size":10}}]}
> {code}
> But the names .my_schema, .my_enum and .my_fixed should not start with a dot.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
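The leading dot reported in AVRO-3812 is what an unconditional "namespace + dot + name" join produces when the namespace is empty. A minimal sketch of a join that treats the empty string like a missing namespace is shown below. `fullname` is a hypothetical helper, not the function used by `canonical_form` in the Rust binding.

```rust
/// Builds the fullname used in a canonical form.
/// An absent or empty namespace is the null namespace, so no dot is emitted.
/// (Hypothetical helper for illustration only.)
fn fullname(name: &str, namespace: Option<&str>) -> String {
    match namespace {
        Some(ns) if !ns.is_empty() => format!("{ns}.{name}"),
        _ => name.to_string(),
    }
}
```

With this rule, `fullname("my_schema", Some(""))` yields "my_schema" rather than ".my_schema".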