[jira] [Updated] (AVRO-3897) Disallow invalid namespace in fully qualified name for Rust SDK

2023-11-01 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3897:
-
Description: 
Currently, the Rust SDK allows the following fully qualified names with 
Name::new.

{code}
Name::new("ns.0.record1")
Name::new("ns..record1")
{code}

But they should be disallowed according to the specification.
https://avro.apache.org/docs/1.11.1/specification/#names

{code}
The name portion of the fullname of named types, record field names, and enum 
symbols must:

start with [A-Za-z_]
subsequently contain only [A-Za-z0-9_]
{code}
{code}
The null namespace may not be used in a dot-separated sequence of names. So the 
grammar for a namespace is:

  <empty> | <name>[(<dot><name>)*]
{code}

  was:
Currently, the Rust SDK allows the following fully qualified names with 
Name::new.

{code}
Name::new("ns.0.record1")
Name::new("ns..record1")
{code}

But they should be disallowed according to the specification.


> Disallow invalid namespace in fully qualified name for Rust SDK
> ---
>
> Key: AVRO-3897
> URL: https://issues.apache.org/jira/browse/AVRO-3897
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Kousuke Saruta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the Rust SDK allows the following fully qualified names with 
> Name::new.
> {code}
> Name::new("ns.0.record1")
> Name::new("ns..record1")
> {code}
> But they should be disallowed according to the specification.
> https://avro.apache.org/docs/1.11.1/specification/#names
> {code}
> The name portion of the fullname of named types, record field names, and enum 
> symbols must:
> start with [A-Za-z_]
> subsequently contain only [A-Za-z0-9_]
> {code}
> {code}
> The null namespace may not be used in a dot-separated sequence of names. So 
> the grammar for a namespace is:
>   <empty> | <name>[(<dot><name>)*]
> {code}
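
The rule quoted above can be sketched as a small standalone check. This is only an illustration, not the Rust SDK's code: is_valid_name_component and is_valid_fullname are hypothetical helpers, and allowing a single leading dot (the null namespace) is an assumption.

```rust
/// Returns true if `s` starts with [A-Za-z_] and then contains only [A-Za-z0-9_].
fn is_valid_name_component(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false, // empty component or invalid first character
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Hypothetical helper (not part of the Rust SDK): checks every
/// dot-separated component of a fullname.
fn is_valid_fullname(fullname: &str) -> bool {
    // Assumption: one leading dot is tolerated and denotes the null namespace.
    let rest = fullname.strip_prefix('.').unwrap_or(fullname);
    rest.split('.').all(is_valid_name_component)
}
```

With this check, both "ns.0.record1" (a component starting with a digit) and "ns..record1" (an empty component) are rejected.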



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (AVRO-3862) Add aliases and doc methods to Schema in Rust SDK

2023-09-20 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3862:
-
Priority: Minor  (was: Major)

> Add aliases and doc methods to Schema in Rust SDK
> -
>
> Key: AVRO-3862
> URL: https://issues.apache.org/jira/browse/AVRO-3862
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Minor
>
> Named types (Record, Enum and Fixed) have common attributes {*}name{*}, 
> *aliases* and {*}doc{*}.
> We already have *fn name* in Schema, so it would be nice to have *fn aliases* 
> and *fn doc* too.





[jira] [Updated] (AVRO-3851) Validate default value for record fields and enums on parsing

2023-09-04 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3851:
-
Affects Version/s: 1.12.0

> Validate default value for record fields and enums on parsing
> -
>
> Key: AVRO-3851
> URL: https://issues.apache.org/jira/browse/AVRO-3851
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> Currently, default values for record fields are not validated on parsing 
> except for union type fields.
> Similarly, default values for enums are not validated on parsing either.





[jira] [Resolved] (AVRO-3850) Don't publish Cargo.lock

2023-09-04 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta resolved AVRO-3850.
--
Resolution: Not A Problem

Close for now.
See https://github.com/apache/avro/pull/2476#issuecomment-1704772007

> Don't publish Cargo.lock
> 
>
> Key: AVRO-3850
> URL: https://issues.apache.org/jira/browse/AVRO-3850
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Minor
>
> Currently, Cargo.lock is published but it should not be because all the 
> crates are libraries.
> https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html





[jira] [Updated] (AVRO-3847) Record field doesn't accept default value if field type is union and the type of default value is pre-defined name

2023-08-26 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3847:
-
Affects Version/s: 1.12.0

> Record field doesn't accept default value if field type is union and the type 
> of default value is pre-defined name
> --
>
> Key: AVRO-3847
> URL: https://issues.apache.org/jira/browse/AVRO-3847
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> Suppose we have a schema like the following.
> {code}
> {
>   "name": "record1",
>   "type": "record",
>   "fields": [
>     {
>       "name": "f1",
>       "type": {
>         "name": "record2",
>         "type": "record",
>         "fields": [
>           {
>             "name": "f1_1",
>             "type": "int"
>           }
>         ]
>       }
>     },  {
>       "name": "f2",
>       "type": ["record2", "int"],
>       "default": {
>         "f1_1": 100
>       }
>     }
>   ]
> }
> {code}
> The type of field f2 is a union of record2 and int, and the default value 
> is a record2 value; record2 is defined earlier in the schema.
> The current Rust binding doesn't accept such schemas and raises an error 
> message like the following.
> {code}
> Error: One union type Ref must match the `default`'s value type Map
> {code}





[jira] [Updated] (AVRO-3846) Race condition can happen among serde tests

2023-08-26 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3846:
-
Description: 
Sometimes one of the tests named avro_3747* fails.
You can easily reproduce this issue by running cargo test avro_3747.
These tests are run concurrently by cargo test, and they load/store the 
same atomic variable, so this seems to be a race condition.

  was:
Sometimes one of tests named avro_3747 fails.
You can easily reproduce this issue by cargo test avro_3747.
These tests are run concurrently by Cargo test and those tests load/store the 
same atomic variable so This seems race condition


> Race condition can happen among serde tests
> ---
>
> Key: AVRO-3846
> URL: https://issues.apache.org/jira/browse/AVRO-3846
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> Sometimes one of the tests named avro_3747* fails.
> You can easily reproduce this issue by running cargo test avro_3747.
> These tests are run concurrently by cargo test, and they load/store the 
> same atomic variable, so this seems to be a race condition.
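
One common way to keep concurrently running tests from interfering through a shared atomic is to serialize access with a lock. The sketch below is only an illustration and not the fix adopted for this issue: HUMAN_READABLE, FLAG_LOCK and with_flag are hypothetical names standing in for the shared variable and a test helper.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

// Hypothetical global flag standing in for the atomic variable shared by the tests.
static HUMAN_READABLE: AtomicBool = AtomicBool::new(false);
// Lock that every test touching the flag takes first.
static FLAG_LOCK: Mutex<()> = Mutex::new(());

/// Sets the flag to `value`, runs `body`, then restores the previous value,
/// all while holding the lock so that no other test can change the flag.
fn with_flag<T>(value: bool, body: impl FnOnce() -> T) -> T {
    // Recover the guard even if an earlier test panicked while holding the lock.
    let _guard = FLAG_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    let previous = HUMAN_READABLE.swap(value, Ordering::SeqCst);
    let result = body();
    HUMAN_READABLE.store(previous, Ordering::SeqCst);
    result
}
```

Each test would wrap its body in with_flag, so the load/store pairs of different tests can no longer interleave. Note that the flag is not restored if body panics.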





[jira] [Updated] (AVRO-3846) Race condition can happen among serde tests

2023-08-26 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3846:
-
Description: 
Sometimes one of the tests named avro_3747 fails.
You can easily reproduce this issue by running cargo test avro_3747.
These tests are run concurrently by cargo test, and they load/store the 
same atomic variable, so this seems to be a race condition.

  was:
Sometimes one of tests named avro_3747 fails.
These tests are run concurrently by Cargo test and those tests load/store the 
same atomic variable so This seems race condition


> Race condition can happen among serde tests
> ---
>
> Key: AVRO-3846
> URL: https://issues.apache.org/jira/browse/AVRO-3846
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> Sometimes one of the tests named avro_3747 fails.
> You can easily reproduce this issue by running cargo test avro_3747.
> These tests are run concurrently by cargo test, and they load/store the 
> same atomic variable, so this seems to be a race condition.





[jira] [Commented] (AVRO-3830) Handle namespace properly if a name starts with dot

2023-08-24 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758680#comment-17758680
 ] 

Kousuke Saruta commented on AVRO-3830:
--

[~stestagg]
Hmm, will you fix that issue yourself?

> Handle namespace properly if a name starts with dot
> ---
>
> Key: AVRO-3830
> URL: https://issues.apache.org/jira/browse/AVRO-3830
> Project: Apache Avro
>  Issue Type: Bug
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.3
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The specification says the following about names and namespaces.
> ??The empty string may also be used as a namespace to indicate the null 
> namespace??
> ??If the name specified contains a dot, then it is assumed to be a fullname, 
> and any namespace also specified is ignored??
> According to this specification, if a name in a name field starts with a dot, 
> it's considered that the namespace is null and the corresponding namespace 
> field should be ignored.
> For example, given the following schema.
> {code}
> {
>   "name":  ".record1",
>   "namespace": "ns1",
>   "type": "record",
>   "fields": []
> }
> {code}
> The name and namespace should be "record1" and null, respectively.
> But the current Rust binding treats the namespace as "ns1".
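
The expected resolution can be sketched as follows. This is a minimal illustration under the two rules quoted above, not the binding's implementation; resolve_name is a hypothetical helper.

```rust
/// Hypothetical helper: resolves (name, namespace) from the "name" and
/// "namespace" attributes of a schema.
fn resolve_name(name: &str, namespace: Option<&str>) -> (String, Option<String>) {
    match name.rfind('.') {
        // The name contains a dot: treat it as a fullname and ignore `namespace`.
        Some(idx) => {
            let ns = &name[..idx];
            let simple = &name[idx + 1..];
            // An empty prefix (".record1") means the null namespace.
            let ns = if ns.is_empty() { None } else { Some(ns.to_string()) };
            (simple.to_string(), ns)
        }
        // No dot: use the namespace attribute; "" also denotes the null namespace.
        None => (
            name.to_string(),
            namespace.filter(|ns| !ns.is_empty()).map(str::to_string),
        ),
    }
}
```

For the schema above, resolve_name(".record1", Some("ns1")) yields the name "record1" with no namespace.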





[jira] [Commented] (AVRO-3830) Handle namespace properly if a name starts with dot

2023-08-23 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758139#comment-17758139
 ] 

Kousuke Saruta commented on AVRO-3830:
--

[~stestagg]

{code}
The null namespace may not be used in a dot-separated sequence of names. So the 
grammar for a namespace is:

  <empty> | <name>[(<dot><name>)*]
{code}

This grammar is about the namespace, while the problem this ticket discusses is 
about the namespace portion of a fullname.
A namespace is allowed to be empty, so I think this follows the specification.


> Handle namespace properly if a name starts with dot
> ---
>
> Key: AVRO-3830
> URL: https://issues.apache.org/jira/browse/AVRO-3830
> Project: Apache Avro
>  Issue Type: Bug
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.3
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The specification says the following about names and namespaces.
> ??The empty string may also be used as a namespace to indicate the null 
> namespace??
> ??If the name specified contains a dot, then it is assumed to be a fullname, 
> and any namespace also specified is ignored??
> According to this specification, if a name in a name field starts with a dot, 
> it's considered that the namespace is null and the corresponding namespace 
> field should be ignored.
> For example, given the following schema.
> {code}
> {
>   "name":  ".record1",
>   "namespace": "ns1",
>   "type": "record",
>   "fields": []
> }
> {code}
> The name and namespace should be "record1" and null, respectively.
> But the current Rust binding treats the namespace as "ns1".





[jira] [Updated] (AVRO-3841) Align the specification of the way to encode NaN to the actual implementations

2023-08-23 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3841:
-
Summary: Align the specification of the way to encode NaN to the actual 
implementations  (was: Align the specification of encoding NaN to the actual 
implementations)

> Align the specification of the way to encode NaN to the actual implementations
> --
>
> Key: AVRO-3841
> URL: https://issues.apache.org/jira/browse/AVRO-3841
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Minor
>
> The specification says the following about how to encode float/double.
> {code}
> a float is written as 4 bytes. The float is converted into a 32-bit integer 
> using a method equivalent to Java’s floatToIntBits and then encoded in 
> little-endian format.
> a double is written as 8 bytes. The double is converted into a 64-bit integer 
> using a method equivalent to Java’s doubleToLongBits and then encoded in 
> little-endian format.
> {code}
> But the actual implementation in Java uses 
> floatToRawIntBits/doubleToRawLongBits rather than 
> floatToIntBits/doubleToLongBits.
> They differ in the way they encode NaN.
> floatToIntBits/doubleToLongBits don't distinguish between NaN and -NaN, but 
> floatToRawIntBits/doubleToRawLongBits do.
> I confirmed that all the implementations distinguish between NaN and -NaN.
> So, I think it's better to modify the specification.
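
The difference between the two conversions can be sketched in Rust as follows. This is only an illustration: float_to_int_bits and float_to_raw_int_bits are hypothetical helpers that imitate the Java methods of the same names, and 0x7fc00000 is the canonical NaN that Java's floatToIntBits is documented to return.

```rust
/// Imitates Java's Float.floatToIntBits: every NaN collapses to 0x7fc00000.
fn float_to_int_bits(f: f32) -> u32 {
    if f.is_nan() {
        0x7fc0_0000
    } else {
        f.to_bits()
    }
}

/// Imitates Java's Float.floatToRawIntBits: the bit pattern is kept as-is.
fn float_to_raw_int_bits(f: f32) -> u32 {
    f.to_bits()
}

/// Little-endian encoding of the raw bits, the same bytes that
/// `f32::to_le_bytes` produces.
fn encode_float_raw(f: f32) -> [u8; 4] {
    float_to_raw_int_bits(f).to_le_bytes()
}
```

A NaN with the sign bit set, such as f32::from_bits(0xffc00000), keeps its sign through float_to_raw_int_bits and encode_float_raw but loses it through float_to_int_bits.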
> Java
> {code}
>   public static int encodeFloat(float f, byte[] buf, int pos) {
>     final int bits = Float.floatToRawIntBits(f);
>     buf[pos + 3] = (byte) (bits >>> 24);
>     buf[pos + 2] = (byte) (bits >>> 16);
>     buf[pos + 1] = (byte) (bits >>> 8);
>     buf[pos] = (byte) (bits);
>     return 4;
>   }
>   public static int encodeDouble(double d, byte[] buf, int pos) {
>     final long bits = Double.doubleToRawLongBits(d);
>     int first = (int) (bits & 0xFFFFFFFF);
>     int second = (int) ((bits >>> 32) & 0xFFFFFFFF);
>     // the compiler seems to execute this order the best, likely due to
>     // register allocation -- the lifetime of constants is minimized.
>     buf[pos] = (byte) (first);
>     buf[pos + 4] = (byte) (second);
>     buf[pos + 5] = (byte) (second >>> 8);
>     buf[pos + 1] = (byte) (first >>> 8);
>     buf[pos + 2] = (byte) (first >>> 16);
>     buf[pos + 6] = (byte) (second >>> 16);
>     buf[pos + 7] = (byte) (second >>> 24);
>     buf[pos + 3] = (byte) (first >>> 24);
>     return 8;
>   }
> {code}
> Rust
> {code}
> Value::Float(x) => buffer.extend_from_slice(&x.to_le_bytes()),
> Value::Double(x) => buffer.extend_from_slice(&x.to_le_bytes()),
> {code}
> Python
> {code}
> def write_float(self, datum: float) -> None:
>     """
>     A float is written as 4 bytes.
>     The float is converted into a 32-bit integer using a method equivalent to
>     Java's floatToIntBits and then encoded in little-endian format.
>     """
>     self.write(STRUCT_FLOAT.pack(datum))
> def write_double(self, datum: float) -> None:
>     """
>     A double is written as 8 bytes.
>     The double is converted into a 64-bit integer using a method equivalent to
>     Java's doubleToLongBits and then encoded in little-endian format.
>     """
>     self.write(STRUCT_DOUBLE.pack(datum))
> {code}
> C
> {code}
> static int write_float(avro_writer_t writer, const float f)
> {
> #if AVRO_PLATFORM_IS_BIG_ENDIAN
> 

[jira] [Updated] (AVRO-3841) Align the specification of encoding NaN to the actual implementations

2023-08-23 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3841:
-
Issue Type: Improvement  (was: Bug)

> Align the specification of encoding NaN to the actual implementations
> -
>
> Key: AVRO-3841
> URL: https://issues.apache.org/jira/browse/AVRO-3841
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Minor
>
> The specification says the following about how to encode float/double.
> {code}
> a float is written as 4 bytes. The float is converted into a 32-bit integer 
> using a method equivalent to Java’s floatToIntBits and then encoded in 
> little-endian format.
> a double is written as 8 bytes. The double is converted into a 64-bit integer 
> using a method equivalent to Java’s doubleToLongBits and then encoded in 
> little-endian format.
> {code}
> But the actual implementation in Java uses 
> floatToRawIntBits/doubleToRawLongBits rather than 
> floatToIntBits/doubleToLongBits.
> They differ in the way they encode NaN.
> floatToIntBits/doubleToLongBits don't distinguish between NaN and -NaN, but 
> floatToRawIntBits/doubleToRawLongBits do.
> I confirmed that all the implementations distinguish between NaN and -NaN.
> So, I think it's better to modify the specification.
> Java
> {code}
>   public static int encodeFloat(float f, byte[] buf, int pos) {
>     final int bits = Float.floatToRawIntBits(f);
>     buf[pos + 3] = (byte) (bits >>> 24);
>     buf[pos + 2] = (byte) (bits >>> 16);
>     buf[pos + 1] = (byte) (bits >>> 8);
>     buf[pos] = (byte) (bits);
>     return 4;
>   }
>   public static int encodeDouble(double d, byte[] buf, int pos) {
>     final long bits = Double.doubleToRawLongBits(d);
>     int first = (int) (bits & 0xFFFFFFFF);
>     int second = (int) ((bits >>> 32) & 0xFFFFFFFF);
>     // the compiler seems to execute this order the best, likely due to
>     // register allocation -- the lifetime of constants is minimized.
>     buf[pos] = (byte) (first);
>     buf[pos + 4] = (byte) (second);
>     buf[pos + 5] = (byte) (second >>> 8);
>     buf[pos + 1] = (byte) (first >>> 8);
>     buf[pos + 2] = (byte) (first >>> 16);
>     buf[pos + 6] = (byte) (second >>> 16);
>     buf[pos + 7] = (byte) (second >>> 24);
>     buf[pos + 3] = (byte) (first >>> 24);
>     return 8;
>   }
> {code}
> Rust
> {code}
> Value::Float(x) => buffer.extend_from_slice(&x.to_le_bytes()),
> Value::Double(x) => buffer.extend_from_slice(&x.to_le_bytes()),
> {code}
> Python
> {code}
> def write_float(self, datum: float) -> None:
>     """
>     A float is written as 4 bytes.
>     The float is converted into a 32-bit integer using a method equivalent to
>     Java's floatToIntBits and then encoded in little-endian format.
>     """
>     self.write(STRUCT_FLOAT.pack(datum))
> def write_double(self, datum: float) -> None:
>     """
>     A double is written as 8 bytes.
>     The double is converted into a 64-bit integer using a method equivalent to
>     Java's doubleToLongBits and then encoded in little-endian format.
>     """
>     self.write(STRUCT_DOUBLE.pack(datum))
> {code}
> C
> {code}
> static int write_float(avro_writer_t writer, const float f)
> {
> #if AVRO_PLATFORM_IS_BIG_ENDIAN
> uint8_t buf[4];
> #endif
> union {
> float f;
> int32_t i;
> } v;
> v.f = f;
> #if 

[jira] [Updated] (AVRO-3837) Disallow invalid namespaces for the Rust binding

2023-08-19 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3837:
-
Description: 
The current Rust binding doesn't accept invalid namespaces if such namespaces 
are in a name field.

{code}
{
  "name": "ns1.invalid-ns.record1",
  "type": "record",
  "fields": []
}
{code}

But if an invalid namespace is in a namespace field, the Rust binding 
accepts it.

{code}
{
  "name": "record1",
  "namespace": "ns1.invalid-ns",
  "type": "record",
  "fields": []
}
{code}

  was:
The current Rust binding doesn't accept invalid namespaces if such namespaces 
are in a name field.

{code}
{
  "name": "ns1.invalid-ns.record1",
  "type": "record"
  "fields": []
}
{code}

But, even if a invalid namespace is in a namespace field, the Rust binding 
accept such namespaces.

{code}
  "name": "record1",
  "namespace": "ns1.invalid-ns",
  "type": "record",
  "fields": []
}
{code}


> Disallow invalid namespaces for the Rust binding
> 
>
> Key: AVRO-3837
> URL: https://issues.apache.org/jira/browse/AVRO-3837
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> The current Rust binding doesn't accept invalid namespaces if such namespaces 
> are in a name field.
> {code}
> {
>   "name": "ns1.invalid-ns.record1",
>   "type": "record",
>   "fields": []
> }
> {code}
> But if an invalid namespace is in a namespace field, the Rust binding 
> accepts it.
> {code}
> {
>   "name": "record1",
>   "namespace": "ns1.invalid-ns",
>   "type": "record",
>   "fields": []
> }
> {code}





[jira] [Updated] (AVRO-3837) Disallow invalid namespaces for the Rust binding

2023-08-19 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3837:
-
Description: 
The current Rust binding doesn't accept invalid namespaces if such namespaces 
are in a name field.

{code}
{
  "name": "ns1.invalid-ns.record1",
  "type": "record",
  "fields": []
}
{code}

But if an invalid namespace is in a namespace field, the Rust binding 
accepts it.

{code}
  "name": "record1",
  "namespace": "ns1.invalid-ns",
  "type": "record",
  "fields": []
}
{code}

  was:
The current Rust binding doesn't accept invalid namespaces if such namespaces 
are in name field.

{code}
{
  "name": "ns1.invalid-ns.record1",
  "type": "record"
  "fields": []
}
{code}

But if a invalid namespace in namespace field doesn't validate.

{code}
  "name": "record1",
  "namespace": "ns1.invalid-ns",
  "type": "record",
  "fields": []
}
{code}


> Disallow invalid namespaces for the Rust binding
> 
>
> Key: AVRO-3837
> URL: https://issues.apache.org/jira/browse/AVRO-3837
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> The current Rust binding doesn't accept invalid namespaces if such namespaces 
> are in a name field.
> {code}
> {
>   "name": "ns1.invalid-ns.record1",
>   "type": "record",
>   "fields": []
> }
> {code}
> But if an invalid namespace is in a namespace field, the Rust binding 
> accepts it.
> {code}
>   "name": "record1",
>   "namespace": "ns1.invalid-ns",
>   "type": "record",
>   "fields": []
> }
> {code}





[jira] [Updated] (AVRO-3837) Disallow invalid namespaces for the Rust binding

2023-08-19 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3837:
-
Summary: Disallow invalid namespaces for the Rust binding  (was: Disallow 
invalid namespace for the Rust binding)

> Disallow invalid namespaces for the Rust binding
> 
>
> Key: AVRO-3837
> URL: https://issues.apache.org/jira/browse/AVRO-3837
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> The current Rust binding doesn't accept invalid namespaces if such namespaces 
> are in a name field.
> {code}
> {
>   "name": "ns1.invalid-ns.record1",
>   "type": "record",
>   "fields": []
> }
> {code}
> But an invalid namespace in a namespace field is not validated.
> {code}
>   "name": "record1",
>   "namespace": "ns1.invalid-ns",
>   "type": "record",
>   "fields": []
> }
> {code}





[jira] [Updated] (AVRO-3830) Handle namespace properly if a name starts with dot

2023-08-11 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3830:
-
Summary: Handle namespace properly if a name starts with dot  (was: Handle 
namespace property if a name starts with dot)

> Handle namespace properly if a name starts with dot
> ---
>
> Key: AVRO-3830
> URL: https://issues.apache.org/jira/browse/AVRO-3830
> Project: Apache Avro
>  Issue Type: Bug
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> The specification says the following about names and namespaces.
> ??The empty string may also be used as a namespace to indicate the null 
> namespace??
> ??If the name specified contains a dot, then it is assumed to be a fullname, 
> and any namespace also specified is ignored??
> According to this specification, if a name in a name field starts with a dot, 
> it's considered that the namespace is null and the corresponding namespace 
> field should be ignored.
> For example, given the following schema.
> {code}
> {
>   "name":  ".record1",
>   "namespace": "ns1",
>   "type": "record",
>   "fields": []
> }
> {code}
> The name and namespace should be "record1" and null, respectively.
> But the current Rust binding treats the namespace as "ns1".





[jira] [Updated] (AVRO-3830) Handle namespace property if a name starts with dot

2023-08-11 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3830:
-
Affects Version/s: 1.12.0

> Handle namespace property if a name starts with dot
> ---
>
> Key: AVRO-3830
> URL: https://issues.apache.org/jira/browse/AVRO-3830
> Project: Apache Avro
>  Issue Type: Bug
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> The specification says the following about names and namespaces.
> ??The empty string may also be used as a namespace to indicate the null 
> namespace??
> ??If the name specified contains a dot, then it is assumed to be a fullname, 
> and any namespace also specified is ignored??
> According to this specification, if a name in a name field starts with a dot, 
> it's considered that the namespace is null and the corresponding namespace 
> field should be ignored.
> For example, given the following schema.
> {code}
> {
>   "name":  ".record1",
>   "namespace": "ns1",
>   "type": "record",
>   "fields": []
> }
> {code}
> The name and namespace should be "record1" and null, respectively.
> But the current Rust binding treats the namespace as "ns1".





[jira] [Updated] (AVRO-3827) Disallow duplicate field names

2023-08-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3827:
-
Description: 
If a schema contains a record and some of its fields have the same field name, 
such a schema should not be allowed.
{code:java}
{
  "name": "my_schema",
  "type": "record",
  "fields": [
{
  "name": "f1",
  "type": {
"name": "a",
"type": "record",
"fields": []
  }
},  {
  "name": "f1",
  "type": {
"name": "b",
"type": "record",
"fields": []
  }
}
  ]
 }
{code}
But the current Rust binding accepts it.

  was:
If a schema contains a record and some of its fields have the same field name, 
such schema should not be allowed.

{code}
{
  "name": "my_schema",
  "type": "record",
  "fields": [
{
  "name": "f1",
  "type": {
"name": "a",
"type": "record",
"fields": []
  }
}  {
  "name": "f1",
  "type": {
"name": "b",
"type": "record",
"fields": []
  }
}
  ]
 }
{code}

But the current Rust binding accept.


> Disallow duplicate field names
> --
>
> Key: AVRO-3827
> URL: https://issues.apache.org/jira/browse/AVRO-3827
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> If a schema contains a record and some of its fields have the same field 
> name, such a schema should not be allowed.
> {code:java}
> {
>   "name": "my_schema",
>   "type": "record",
>   "fields": [
> {
>   "name": "f1",
>   "type": {
> "name": "a",
> "type": "record",
> "fields": []
>   }
> },  {
>   "name": "f1",
>   "type": {
> "name": "b",
> "type": "record",
> "fields": []
>   }
> }
>   ]
>  }
> {code}
> But the current Rust binding accepts it.
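
A duplicate check of this kind can be sketched with a HashSet. This is only an illustration and not the binding's parser code; find_duplicate_field is a hypothetical helper that takes the field names of one record in declaration order.

```rust
use std::collections::HashSet;

/// Hypothetical check: returns the first field name that occurs more than once.
fn find_duplicate_field<'a>(field_names: &[&'a str]) -> Option<&'a str> {
    let mut seen = HashSet::new();
    // `insert` returns false when the name was already present.
    field_names.iter().copied().find(|name| !seen.insert(*name))
}
```

For the schema above, the field names are "f1" and "f1", so the check reports "f1" and the schema could be rejected with an error naming that field.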





[jira] [Updated] (AVRO-3825) Disallow invalid namespaces

2023-08-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3825:
-
Description: 
According to the specification, each portion of a namespace separated by dots 
should match [A-Za-z_][A-Za-z0-9_]*.
[https://avro.apache.org/docs/1.11.1/specification/#names]
{code:java}
The name portion of the fullname of named types, record field names, and enum 
symbols must:

start with [A-Za-z_]
subsequently contain only [A-Za-z0-9_]

A namespace is a dot-separated sequence of such names. The empty string may 
also be used as a namespace to indicate the null namespace. Equality of names 
(including field names and enum symbols) as well as fullnames is case-sensitive.

The null namespace may not be used in a dot-separated sequence of names. So the 
grammar for a namespace is:

  <empty> | <name>[(<dot><name>)*]

{code}

  was:
According to the specification, each portion of a namespace separated by dot 
should be [a-z,A-Z,_][a-z,A-Z,0-9_].
[https://avro.apache.org/docs/1.11.1/specification/#names]
{code:java}
The name portion of the fullname of named types, record field names, and enum 
symbols must:

start with [A-Za-z_]
subsequently contain only [A-Za-z0-9_]

A namespace is a dot-separated sequence of such names. The empty string may 
also be used as a namespace to indicate the null namespace. Equality of names 
(including field names and enum symbols) as well as fullnames is case-sensitive.

The null namespace may not be used in a dot-separated sequence of names. So the 
grammar for a namespace is:

  <empty> | <name>[(<dot><name>)*]

{code}


> Disallow invalid namespaces
> ---
>
> Key: AVRO-3825
> URL: https://issues.apache.org/jira/browse/AVRO-3825
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> According to the specification, each portion of a namespace separated by dots 
> should match [A-Za-z_][A-Za-z0-9_]*.
> [https://avro.apache.org/docs/1.11.1/specification/#names]
> {code:java}
> The name portion of the fullname of named types, record field names, and enum 
> symbols must:
> start with [A-Za-z_]
> subsequently contain only [A-Za-z0-9_]
> A namespace is a dot-separated sequence of such names. The empty string may 
> also be used as a namespace to indicate the null namespace. Equality of names 
> (including field names and enum symbols) as well as fullnames is 
> case-sensitive.
> The null namespace may not be used in a dot-separated sequence of names. So 
> the grammar for a namespace is:
>   <empty> | <name>[(<dot><name>)*]
> {code}





[jira] [Updated] (AVRO-3825) Disallow invalid namespaces

2023-08-09 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3825:
-
Summary: Disallow invalid namespaces  (was: Disallow invalid namespace)

> Disallow invalid namespaces
> ---
>
> Key: AVRO-3825
> URL: https://issues.apache.org/jira/browse/AVRO-3825
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> According to the specification, each portion of a namespace separated by dot 
> should be [a-z,A-Z,_][a-z,A-Z,0-9_].
> [https://avro.apache.org/docs/1.11.1/specification/#names]
> {code:java}
> The name portion of the fullname of named types, record field names, and enum 
> symbols must:
> start with [A-Za-z_]
> subsequently contain only [A-Za-z0-9_]
> A namespace is a dot-separated sequence of such names. The empty string may 
> also be used as a namespace to indicate the null namespace. Equality of names 
> (including field names and enum symbols) as well as fullnames is 
> case-sensitive.
> The null namespace may not be used in a dot-separated sequence of names. So 
> the grammar for a namespace is:
>   <empty> | <name>[(<dot><name>)*]
> {code}





[jira] [Commented] (AVRO-3823) Show helpful error messages

2023-08-04 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17751208#comment-17751208
 ] 

Kousuke Saruta commented on AVRO-3823:
--

[~mgrigorov] Oh, I see. I didn't know anyhow works well with thiserror (and I 
noticed both were created by the same author).

> Show helpful error messages
> ---
>
> Key: AVRO-3823
> URL: https://issues.apache.org/jira/browse/AVRO-3823
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.3
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The current Rust binding doesn't show helpful error messages.
> Actually, error types are implemented with helpful error messages.
> This is an example.
> {code:java}
> #[error("No `name` field")] 
> GetNameField,  
> {code}
> But those error messages are not shown.
> If we try to parse an invalid schema that contains no name field, we expect to 
> get "No `name` field", but the actual message is "GetNameField", which makes it 
> difficult for users to resolve the problem.
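
In Rust, a gap like this can appear when an error is printed through its Debug representation instead of its Display representation. Whether that is the cause in the binding is an assumption; the sketch below only shows the two outputs side by side, using a hypothetical SchemaError type with a hand-written Display in place of the thiserror attribute.

```rust
use std::fmt;

// Hypothetical stand-in for one variant of the binding's error type.
#[derive(Debug)]
enum SchemaError {
    GetNameField,
}

// Hand-written equivalent of `#[error("No `name` field")]`.
impl fmt::Display for SchemaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SchemaError::GetNameField => write!(f, "No `name` field"),
        }
    }
}

impl std::error::Error for SchemaError {}
```

Formatting SchemaError::GetNameField with "{}" uses Display and yields the helpful message, while "{:?}" uses the derived Debug and yields only the variant name.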





[jira] [Updated] (AVRO-3812) Handle null namespace properly for canonicalized schema representation

2023-07-23 Thread Kousuke Saruta (Jira)


 [ 
https://issues.apache.org/jira/browse/AVRO-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated AVRO-3812:
-
Summary: Handle null namespace properly for canonicalized schema 
representation  (was: Handle null namespace properly)

> Handle null namespace properly for canonicalized schema representation
> --
>
> Key: AVRO-3812
> URL: https://issues.apache.org/jira/browse/AVRO-3812
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Affects Versions: 1.12.0
>Reporter: Kousuke Saruta
>Priority: Major
>
> Consider the following schema, which contains namespaces of "".
> {code}
> {
>  "namespace": "",
>  "type": "record",
>  "name": "my_schema",
>  "fields": [
>{
>  "name": "a",
>  "type": {
>"type": "enum",
>"name": "my_enum",
>"namespace": "",
>"symbols": ["a", "b"]
>  }
>},  {
>  "name": "b",
>  "type": {
>"type": "fixed",
>"name": "my_fixed",
>"namespace": "",
>"size": 10
>  }
>}
>  ]
> }
> {code}
> If we try to canonicalize this schema with the following code
> {code}
> let schema = Schema::parse_str(schema_str).unwrap().canonical_form();
> println!("{schema}");
> {code}
> We get the following result.
> {code}
> {"name":".my_schema","type":"record","fields":[{"name":"a","type":{"name":".my_enum","type":"enum","symbols":["a","b"]}},{"name":"b","type":{"name":".my_fixed","type":"fixed","size":10}}]}
> {code}
> But .my_schema, .my_enum and .my_fixed should not start with a dot.
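
The expected behavior can be sketched as a small helper that builds the fullname written into the canonical form. This is only an illustration, not the binding's code; canonical_fullname is a hypothetical function, and treating both a missing namespace and "" as the null namespace is the assumption being shown.

```rust
/// Hypothetical helper: builds the fullname used in the canonical form.
fn canonical_fullname(name: &str, namespace: Option<&str>) -> String {
    match namespace {
        // A non-empty namespace is joined to the name with a dot.
        Some(ns) if !ns.is_empty() => format!("{}.{}", ns, name),
        // Both a missing namespace and "" denote the null namespace: no dot.
        _ => name.to_string(),
    }
}
```

With this rule, a type named my_schema with namespace "" is written as "my_schema" rather than ".my_schema".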


