[jira] [Commented] (AVRO-4023) Move `IdlUtils` into `org.apache.avro.idl`

2024-07-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868341#comment-17868341
 ] 

ASF subversion and git services commented on AVRO-4023:
---

Commit e24de3c47dde433e4312f4c6b867e7f60c04c1b6 in avro's branch 
refs/heads/main from Fokko Driesprong
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=e24de3c47 ]

AVRO-4023: Move `IdlUtils` into `org.apache.avro.idl` (#3039)

Noticed this when building the distributions.

I think it makes sense to just move it into the `idl` namespace.

This package hasn't been released before, so we don't break any
existing APIs.

```
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:3.8.0:jar (module-javadocs) on project avro-idl: MavenReportException: Error while generating Javadoc:
[ERROR] Exit code: 2
[ERROR] error: No source files for package org.apache.avro.util
[ERROR] 1 error
[ERROR] Command line was: /usr/lib/jvm/java-21-openjdk-arm64/bin/javadoc -J-Duser.language= -J-Duser.country= -Xdoclint:none @options @packages
[ERROR]
[ERROR] Refer to the generated Javadoc files in '/home/fokko.driesprong/avro/lang/java/idl/target/apidocs' dir.
[ERROR]
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <args> -rf :avro-idl
```

> Move `IdlUtils` into `org.apache.avro.idl`
> --
>
> Key: AVRO-4023
> URL: https://issues.apache.org/jira/browse/AVRO-4023
> Project: Apache Avro
>  Issue Type: Improvement
>Affects Versions: 1.11.3
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4022) Revive docker image

2024-07-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868310#comment-17868310
 ] 

ASF subversion and git services commented on AVRO-4022:
---

Commit fc9380bd217c62e358b9020c46ade4e484a3e304 in avro's branch 
refs/heads/main from Fokko Driesprong
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=fc9380bd2 ]

AVRO-4022: Add docker CI (#3037)

* Second try

* Add OpenSSL 1.1

* Fix the toolchain

* Move to pre-compiled OpenSSL and switch to MaybeXS for perl

* Check if there are more issues

* We need distutils

* Add cargo to PATH

* Cleanup

* Bump to Java11

Co-authored-by: Martin Grigorov 

-

Co-authored-by: Martin Grigorov 

> Revive docker image
> ---
>
> Key: AVRO-4022
> URL: https://issues.apache.org/jira/browse/AVRO-4022
> Project: Apache Avro
>  Issue Type: Improvement
>Affects Versions: 1.11.3
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Commented] (AVRO-4021) Pass in build architecture explicitly

2024-07-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867861#comment-17867861
 ] 

ASF subversion and git services commented on AVRO-4021:
---

Commit bb86dd0fe1d604b9f741697e424040094b5a02bf in avro's branch 
refs/heads/main from Fokko Driesprong
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=bb86dd0fe ]

AVRO-4021: Pass in BUILDPLATFORM explicitly (#3026)



> Pass in build architecture explicitly
> -
>
> Key: AVRO-4021
> URL: https://issues.apache.org/jira/browse/AVRO-4021
> Project: Apache Avro
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867292#comment-17867292
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit cc2bcdc952a71074310ba2a86b53f5f4e2c5229f in avro's branch 
refs/heads/branch-1.11 from Romain Leroux
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=cc2bcdc95 ]

AVRO-3631: [Rust] Efficient (de)serialization for optional bytes (#3029)

(cherry picked from commit 5f0776b358fd10fa6062d042e513219d158b525e)


> Fix serialization of structs containing Fixed fields
> 
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Rik Heijdens
>Assignee: Martin Tzvetanov Grigorov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Consider the following minimal Avro Schema:
> {noformat}
> {
>   "type": "record",
>   "name": "TestStructFixedField",
>   "fields": [
>     {
>       "name": "field",
>       "type": {
>         "name": "field",
>         "type": "fixed",
>         "size": 6
>       }
>     }
>   ]
> }
> {noformat}
> In Rust, I might represent this schema with the following struct:
> {noformat}
> #[derive(Debug, Serialize, Deserialize)]
> struct TestStructFixedField {
>     field: [u8; 6]
> }
> {noformat}
> I would then expect to be able to use `apache_avro::to_avro_datum()` to 
> convert an instance of `TestStructFixedField` into a `Vec<u8>` using an 
> instance of `Schema` initialized from the schema listed above.
> However, this fails because the `Value` produced by `apache_avro::to_value()` 
> represents `field` as a `Value::Array<Vec<Value::Int>>` rather than a 
> `Value::Fixed<6, Vec<u8>>`, which does not pass schema validation.
> I believe that there are two options to fix this:
> 1. Allow `Value::Array<Vec<Value::Int>>` to pass validation if the array has 
> the expected length, and none of the contents of the array are out-of-range 
> for `u8`. If we go down this route, the implementation of `to_avro_datum()` 
> will have to take care of converting `Value::Int` to `u8` when converting into 
> bytes.
> 2. Update `apache_avro::to_value()` such that fixed length arrays are 
> converted into `Value::Fixed<6, Vec<u8>>` rather than 
> `Value::Array<Vec<Value::Int>>`.
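
As a rough illustration of option 1 in the description above, the sketch below shows the kind of check and conversion it describes. It is a toy model under stated assumptions: the `Value` enum here is a simplified stand-in written for this example, not the `apache_avro` type, and `array_to_fixed` is a hypothetical helper name, not a function of the crate.

```rust
use std::convert::TryFrom;

// Toy stand-in for an Avro value (assumption: NOT apache_avro's `Value`).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
    Array(Vec<Value>),
    Fixed(usize, Vec<u8>),
}

// Hypothetical helper: accept `value` as a fixed of `size` bytes.
// An array of ints is accepted only if its length equals `size` and
// every element fits into a u8; anything else is rejected with `None`.
fn array_to_fixed(value: &Value, size: usize) -> Option<Value> {
    match value {
        // Already a fixed of the right size: pass it through unchanged.
        Value::Fixed(n, bytes) if *n == size && bytes.len() == size => Some(value.clone()),
        // Array of the expected length: convert each Int to a byte.
        Value::Array(items) if items.len() == size => {
            let bytes = items
                .iter()
                .map(|item| match item {
                    Value::Int(i) => u8::try_from(*i).ok(),
                    _ => None,
                })
                .collect::<Option<Vec<u8>>>()?;
            Some(Value::Fixed(size, bytes))
        }
        _ => None,
    }
}
```

With this model, an array of six in-range ints converts to a six-byte fixed, while an array of the wrong length, or one holding a value such as 256 or -1, is rejected. Option 2 would instead make the serializer emit the fixed representation directly, so no such conversion would be needed at validation time.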





[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867291#comment-17867291
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 5f0776b358fd10fa6062d042e513219d158b525e in avro's branch 
refs/heads/main from Romain Leroux
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=5f0776b35 ]

AVRO-3631: [Rust] Efficient (de)serialization for optional bytes (#3029)



> Fix serialization of structs containing Fixed fields
> 
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Rik Heijdens
>Assignee: Martin Tzvetanov Grigorov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>


[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867011#comment-17867011
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 34bc5587070152a97cf28cb6ebe4cd5c0b11ecc3 in avro's branch 
refs/heads/branch-1.11 from Romain Leroux
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=34bc55870 ]

AVRO-3631: [Rust] More efficient (de)serialization using serde_bytes (#3027)

* AVRO-3631: [Rust] More efficient (de)serialization using serde_bytes

* AVRO-3631: [Rust] Bump MSRV to 1.73.0

* Remove env var that is default since Rust 1.70.0

-

Co-authored-by: Martin Grigorov 
(cherry picked from commit 17e7994bf0bd63938e7d3cb0f25ffbd2d51424b0)


> Fix serialization of structs containing Fixed fields
> 
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Rik Heijdens
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h
>  Remaining Estimate: 0h
>


[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867009#comment-17867009
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 17e7994bf0bd63938e7d3cb0f25ffbd2d51424b0 in avro's branch 
refs/heads/main from Romain Leroux
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=17e7994bf ]

AVRO-3631: [Rust] More efficient (de)serialization using serde_bytes (#3027)

* AVRO-3631: [Rust] More efficient (de)serialization using serde_bytes

* AVRO-3631: [Rust] Bump MSRV to 1.73.0

* Remove env var that is default since Rust 1.70.0

-

Co-authored-by: Martin Grigorov 

> Fix serialization of structs containing Fixed fields
> 
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Rik Heijdens
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h
>  Remaining Estimate: 0h
>


[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866718#comment-17866718
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 841003bacdeee6da8c6710c981dc40084d7c8c67 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=841003bac ]

AVRO-3631: Add more test cases

Signed-off-by: Martin Tzvetanov Grigorov 


> Fix serialization of structs containing Fixed fields
> 
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Rik Heijdens
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>


[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866722#comment-17866722
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit e20a51f9409bf298d4142bd5b72aed3700fdb66d in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=e20a51f94 ]

AVRO-3631: Rebase to latest master and fix any problems

Signed-off-by: Martin Tzvetanov Grigorov 


> Fix serialization of structs containing Fixed fields
> 
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Rik Heijdens
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>


[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866727#comment-17866727
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 0cb3370239460c12c77ae663aa6f1e97e67635d2 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=0cb337023 ]

AVRO-3631: rebase to latest main

Signed-off-by: Martin Tzvetanov Grigorov 


> Fix serialization of structs containing Fixed fields
> 
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Rik Heijdens
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>


[jira] [Commented] (AVRO-3531) GenericDatumReader in multithread lead to infinite loop cause misused of IdentityHashMap

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866721#comment-17866721
 ] 

ASF subversion and git services commented on AVRO-3531:
---

Commit 0fafbb2f5276fa25b4b3296b94e9752c08bdaa2d in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=0fafbb2f5 ]

AVRO-3531: Code formatting

Signed-off-by: Martin Tzvetanov Grigorov 


> GenericDatumReader in multithread lead to infinite loop cause misused of 
> IdentityHashMap
> 
>
> Key: AVRO-3531
> URL: https://issues.apache.org/jira/browse/AVRO-3531
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.11.0
>Reporter: tansion
>Assignee: Christophe Le Saec
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.11.1
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Hi, 
> I am working on a Java project that uses Kafka with Avro 
> serialization/deserialization in a messaging platform.
> In the production environment, we hit a serious issue in the deserialization 
> process. The GenericDatumReader somehow gets into an infinite loop, 
> and this happens occasionally.
> When the issue happens, the thread stack looks like this:
>  
> {code:java}
> "DmqFixedRateConsumer-Thread-17" #453 daemon prio=5 os_prio=0 
> tid=0x7f2ae1832800 nid=0xef49 runnable [0x7f2a743fc000]
>    java.lang.Thread.State: RUNNABLE
>     at java.util.IdentityHashMap.get(IdentityHashMap.java:337)
>     at 
> org.apache.avro.generic.GenericDatumReader.getStringClass(GenericDatumReader.java:503)
>     at 
> org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:454)
>     at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:191)
>     at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at 
> org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:291)
>     at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at 
> org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at 
> org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:291)
>     at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at 
> org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at 
> org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:291)
>     at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at 
> org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at 
> org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:291)
>     at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at 
> org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at 
> org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:291)
>     at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at 
> org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at 
> 

[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866720#comment-17866720
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit d577394281d6c1e17c52e38464ebf80f0c490d4b in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=d57739428 ]

AVRO-3631: Use official serde_bytes crate

Signed-off-by: Martin Tzvetanov Grigorov 


> Fix serialization of structs containing Fixed fields
> 
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Rik Heijdens
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>


[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866725#comment-17866725
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 5d373f2e005ce371eab05b844a0c7d46744a8390 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=5d373f2e0 ]

AVRO-3631: [Rust] Rebase the PR to latest `main`

Signed-off-by: Martin Tzvetanov Grigorov 


> Fix serialization of structs containing Fixed fields
> 
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Rik Heijdens
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>


[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866719#comment-17866719
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit aa317ade9d93a1182cd33649119ed6dc1eadbf15 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=aa317ade9 ]

AVRO-3631: Fix clippy issues

Signed-off-by: Martin Tzvetanov Grigorov 


> Fix serialization of structs containing Fixed fields
> 
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Rik Heijdens
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Consider the following minimal Avro Schema:
> {noformat}
> {
> "type": "record",
> "name": "TestStructFixedField",
> "fields": [
> {
> "name": "field",
> "type": {
> "name": "field",
> "type": "fixed",
> "size": 6
> }
> }
> ]
> }
> {noformat}
> In Rust, I might represent this schema with the following struct:
> {noformat}
> #[derive(Debug, Serialize, Deserialize)]
> struct TestStructFixedField {
> field: [u8; 6]
> }
> {noformat}
> I would then expect to be able to use `apache_avro::to_avro_datum()` to 
> convert an instance of `TestStructFixedField` into a `Vec<u8>` using an 
> instance of `Schema` initialized from the schema listed above.
> However, this fails because the `Value` produced by `apache_avro::to_value()` 
> represents `field` as a `Value::Array<Vec<Value::Int>>` rather than a 
> `Value::Fixed<6, Vec<u8>>`, which does not pass schema validation.
> I believe that there are two options to fix this:
> 1. Allow `Value::Array<Vec<Value::Int>>` to pass validation if the array has 
> the expected length, and none of the contents of the array are out-of-range 
> for u8. If we go down this route, the implementation of `to_avro_datum()` 
> will have to take care of converting Value::Int to u8 when converting into 
> bytes.
> 2. Update `apache_avro::to_value()` such that fixed length arrays are 
> converted into `Value::Fixed<6, Vec<u8>>` rather than 
> `Value::Array<Vec<Value::Int>>`.
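
The two options in the description can be sketched with a small std-only model. This is only an illustration under stated assumptions: the `Value` enum below is a simplified, hypothetical stand-in for the crate's value type, and the helper names `array_to_fixed` and `fixed_from_array` are made up for this sketch; they are not part of the apache_avro API.

```rust
// Hypothetical, simplified stand-in for the Avro value type (assumption,
// not the real apache_avro definition).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
    Array(Vec<Value>),
    Fixed(usize, Vec<u8>),
}

// Option 1: during validation, accept an Array of Ints as a Fixed of `size`
// bytes, provided the length matches and every element fits into a u8.
fn array_to_fixed(value: &Value, size: usize) -> Option<Value> {
    match value {
        Value::Array(items) if items.len() == size => {
            let mut bytes = Vec::with_capacity(size);
            for item in items {
                match item {
                    // Only in-range integers can be narrowed to a byte.
                    Value::Int(i) if (0..=255).contains(i) => bytes.push(*i as u8),
                    _ => return None,
                }
            }
            Some(Value::Fixed(size, bytes))
        }
        // A value that is already Fixed with the right size passes unchanged.
        Value::Fixed(n, _) if *n == size => Some(value.clone()),
        _ => None,
    }
}

// Option 2: produce a Fixed value directly from a fixed-length byte array,
// so that no Array of Ints is created in the first place.
fn fixed_from_array<const N: usize>(bytes: [u8; N]) -> Value {
    Value::Fixed(N, bytes.to_vec())
}
```

Option 1 keeps serialization unchanged and moves the work into validation; option 2 needs the serializer to know that a byte array is meant as a fixed value, which is the part that is hard without schema information.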





[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866714#comment-17866714
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 172fb85a4125ba5524b5e6a17b97d0257a6dff87 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=172fb85a4 ]

AVRO-3631: Add support for ser_de Value::Fixed

It is based on https://github.com/serde-rs/bytes/pull/28 which is not
yet merged.

Signed-off-by: Martin Tzvetanov Grigorov 




[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866724#comment-17866724
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit c1356719a03fcba578dfa630be4b332d823bc977 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=c1356719a ]

AVRO-3631: Deserialize supports only owned byte arrays

Signed-off-by: Martin Tzvetanov Grigorov 




[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866713#comment-17866713
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 612af8b303a5170dd1f052b484a8e41860cfed7f in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Rik Heijdens
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=612af8b30 ]

AVRO-3631: Add test for serializing fixed fields

This test-case mainly demonstrates the issue reported in AVRO-3631. It
is unclear to me whether we should actually expect the serializer to
serialize to a Value::Fixed right away given that Schema information is
not available at this stage.




[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866723#comment-17866723
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 380ff60f02e80735f9ef44167a2b782f1f4f7560 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=380ff60f0 ]

AVRO-3631: [Rust] Use serde-byte-array crate for Rust byte array to Avro values 
conversion

Signed-off-by: Martin Tzvetanov Grigorov 




[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866726#comment-17866726
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit d04e44a555047cd9f40862e0696ad1547d6b8148 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=d04e44a55 ]

AVRO-3631: [Rust] Minor improvements

Signed-off-by: Martin Tzvetanov Grigorov 




[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866711#comment-17866711
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 7d43e42eee2516b2d3e8c3b396cb3b35520c74d1 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=7d43e42ee ]

AVRO-3631: Add test-case to reproduce

Signed-off-by: Martin Tzvetanov Grigorov 




[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866717#comment-17866717
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 5f7695ffd748e3e0fd53a87b560033661eb31063 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=5f7695ffd ]

AVRO-3631: Fix clippy and Rat issues

Signed-off-by: Martin Tzvetanov Grigorov 




[jira] [Commented] (AVRO-3651) [web] Build and separate release-specific pages.

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866712#comment-17866712
 ] 

ASF subversion and git services commented on AVRO-3651:
---

Commit d75417798a8ae3169625b9ddae225f6d5159907e in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Rik Heijdens
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=d75417798 ]

AVRO-3651: Add test to de.rs to illustrate issue with Fixed fields


> [web] Build and separate release-specific pages.
> 
>
> Key: AVRO-3651
> URL: https://issues.apache.org/jira/browse/AVRO-3651
> Project: Apache Avro
>  Issue Type: Task
>  Components: website
>Reporter: Ryan Skraba
>Priority: Major
>
> One of the major complications with the Avro website is maintaining the pages 
> for old versions.  The website documentation is treated as a release artifact 
> for the release, which makes sense since the pages are contained in the 
> release.
> The Flink community separates the pages that apply to the Flink project from 
> the pages that apply to a specific release.  They are integrated together to 
> navigate easily from release 1.x.0 -> project -> release 1.y.0
> This technique would make it quite a bit easier to deploy the website, and 
> allow us to have links to "Edit this page" that can automatically open a PR 
> on GitHub.
> Project: [https://flink.apache.org/]
> Release: 
> [https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/try-flink/local_installation/]
>  





[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866716#comment-17866716
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit b8784012b8225935c4ee4b234d8afeaa78451f2f in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=b8784012b ]

AVRO-3631: Add serde serialize_with functions

Those should be used for hinting the serialization process how to serialize a 
byte array to Value::(Bytes|Fixed)

Signed-off-by: Martin Tzvetanov Grigorov 




[jira] [Commented] (AVRO-3631) Fix serialization of structs containing Fixed fields

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866715#comment-17866715
 ] 

ASF subversion and git services commented on AVRO-3631:
---

Commit 17d7c60cad43ba0141db5982640483c41cf8731e in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=17d7c60ca ]

AVRO-3631: Use #[serde(with)] attribute to get rid of implementation detail 
ByteArray

Signed-off-by: Martin Tzvetanov Grigorov 




[jira] [Commented] (AVRO-4019) [C++] Correct signedness of validator methods

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866710#comment-17866710
 ] 

ASF subversion and git services commented on AVRO-4019:
---

Commit c460d64f51e34a56b2591c7114b2b302ff6576eb in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Gerrit Birkeland
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=c460d64f5 ]

[AVRO-4019] [C++] Turn on even more compiler warnings (#2966)

* [C++] Turn on -Wuseless-cast and -Wconversion

* Print versions of compiler/tools used in CI

* Fix another compiler warning

Also attempt to make build.sh continue

* Fix the last of the compiler warnings

* Update CodecTests.cc

* Update DataFileTests.cc

* Fix conversion warning on ARM64

* Hopefully make GCC 9.4 happy

* [C++] Address review comments

> [C++] Correct signedness of validator methods
> -
>
> Key: AVRO-4019
> URL: https://issues.apache.org/jira/browse/AVRO-4019
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Reporter: Gerrit Birkeland
>Assignee: Gerrit Birkeland
>Priority: Major
>  Labels: c++, pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Issue for documenting signedness correction made with 
> https://github.com/apache/avro/pull/2966





[jira] [Commented] (AVRO-4004) [Rust] Canonical form transformation does not strip the logicalType

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866709#comment-17866709
 ] 

ASF subversion and git services commented on AVRO-4004:
---

Commit 728b807c43c84f245d8ba6d621b2082b37b65671 in avro's branch 
refs/heads/avro-3631/fix-fixed-serialization from Martin Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=728b807c4 ]

AVRO-4004: [Rust] Ignore logicalType fields when creating the canonical form 
(#2976)

* AVRO-4004: [Rust] Ignore logicalType fields when creating the canonical form

Signed-off-by: Martin Tzvetanov Grigorov 

* AVRO-4004: [Rust] Ignore the namespace for non-named schemas

When creating the canonical parsing form of a Schema ignore the
namespace for any non-named Schemas, i.e. anything but Record, Enum,
Fixed and Ref

Signed-off-by: Martin Tzvetanov Grigorov 

* AVRO-4004 Remove the test for round trip after canonical form

Signed-off-by: Martin Tzvetanov Grigorov 

-

Signed-off-by: Martin Tzvetanov Grigorov 

> [Rust] Canonical form transformation does not strip the logicalType 
> 
>
> Key: AVRO-4004
> URL: https://issues.apache.org/jira/browse/AVRO-4004
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Dominik Mautz
>Assignee: Martin Tzvetanov Grigorov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The Rust implementation of the canonical form transformation does not strip 
> the _logicalType_ as required by the [STRIP] rule 
> ([https://avro.apache.org/docs/1.11.0/spec.html#Transforming+into+Parsing+Canonical+Form]).
>  This results in different fingerprints for the same schema compared to other 
> implementations (at least for Python and Java).
> This can, for instance, become an issue for kafka-delta-ingest 
> ([https://github.com/delta-io/kafka-delta-ingest]).
> Rust
> {code:java}
> [package]
> name = "avro-issue"
> version = "0.2.0"
> edition = "2018"
> [dependencies]
> apache-avro = "0.16.0"
> anyhow = "1.0.86"
> {code}
> {code:java}
> use anyhow::Result;
> use apache_avro::{rabin::Rabin, Schema};
>
> fn main() -> Result<()> {
>     let schema_str = r#"
>     {
>         "type": "record",
>         "name": "test",
>         "fields": [
>             {"name": "a", "type": "long", "default": 42, "doc": "The field a"},
>             {"name": "b", "type": "string", "namespace": "test.a"},
>             {"name": "c", "type": "long", "logicalType": "timestamp-micros"}
>         ]
>     }"#;
>     let schema = Schema::parse_str(schema_str)?;
>     let canonical_form = schema.canonical_form();
>     // Rabin is the CRC-64-AVRO fingerprint algorithm.
>     let fp_rabin = schema.fingerprint::<Rabin>();
>     println!("Canonical form: {}", canonical_form);
>     println!("Rabin fingerprint: {}", fp_rabin);
>     Ok(())
> }
> {code}
> Output:
> {code:java}
> Canonical form: 
> {"name":"test","type":"record","fields":[{"name":"a","type":"long"},{"name":"b","type":"string"},{"name":"c","type":{"type":"long","logicalType":"timestamp-micros"}}]}
> Rabin fingerprint: 28cf0a67d9937bb3
> {code}
> As you can see, the _logicalType_ is still present in the "canonical form."
> Python
> {code:python}
> import avro.schema
>
> schema_str = """
> {
>     "type": "record",
>     "name": "test",
>     "fields": [
>         {"name": "a", "type": "long", "default": 42, "doc": "The field a"},
>         {"name": "b", "type": "string", "namespace": "test.a"},
>         {"name": "c", "type": "long", "logicalType": "timestamp-micros"}
>     ]
> }"""
> schema = avro.schema.parse(schema_str)
> print(f"Canonical form: {schema.canonical_form}")
> print(f"Rabin fingerprint: {schema.fingerprint().hex()}")
> {code}
> Output:
> {code:java}
> Canonical form: 
> {"name":"test","type":"record","fields":[{"name":"a","type":"long"},{"name":"b","type":"string"},{"name":"c","type":"long"}]}
> Rabin fingerprint: 385501e341b00a1c
> {code}
> Java returns the same output as Python.
> In my opinion, changing the line
> [https://github.com/apache/avro/blob/main/lang/rust/avro/src/schema.rs#L2159]
> to
> {code:java}
> //...
> if field_ordering_position(k).is_none() || k == "default" || k == "doc"
>     || k == "aliases" || k == "logicalType" {
> //...
> {code}
> should resolve the issue. However, I am unsure whether this line should 
> include even more attributes (other than those currently stated explicitly).
> Nevertheless, the test in 
> [https://github.com/apache/avro/blob/fdab5db0816e28e3e10c87910c8b6f98c33072dc/lang/rust/avro/src/schema.rs#L3388]
> must also be adapted to reflect the correct transformation of the canonical 
> form and the corresponding fingerprint.
> Rabin: 
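
For context on why the extra `logicalType` attribute matters: the Rabin fingerprint is the CRC-64-AVRO hash from the Avro specification, computed over the bytes of the canonical form, so any difference in that text yields a different fingerprint. The following is a minimal std-only sketch of that algorithm as I understand it from the specification. The function names `fp_table` and `fingerprint64` are illustrative and are not the apache_avro API, and the byte order used when printing may differ between implementations, so the hex strings above are not reproduced here.

```rust
// Fingerprint of the empty input; also serves as the CRC polynomial seed.
const EMPTY: u64 = 0xc15d213aa4d7a795;

// Build the 256-entry lookup table used by the byte-wise CRC update.
fn fp_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    for i in 0..256 {
        let mut fp = i as u64;
        for _ in 0..8 {
            // XOR in EMPTY whenever the lowest bit is set, then shift.
            fp = (fp >> 1) ^ (EMPTY & (fp & 1).wrapping_neg());
        }
        table[i] = fp;
    }
    table
}

// 64-bit Rabin fingerprint (CRC-64-AVRO) of a byte slice.
fn fingerprint64(data: &[u8]) -> u64 {
    let table = fp_table();
    let mut fp = EMPTY;
    for &b in data {
        fp = (fp >> 8) ^ table[((fp ^ b as u64) & 0xff) as usize];
    }
    fp
}
```

Feeding the two canonical forms quoted above into `fingerprint64` gives two different values, which is the mismatch between implementations that the report describes.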

[jira] [Commented] (AVRO-4019) [C++] Correct signedness of validator methods

2024-07-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866254#comment-17866254
 ] 

ASF subversion and git services commented on AVRO-4019:
---

Commit c460d64f51e34a56b2591c7114b2b302ff6576eb in avro's branch 
refs/heads/main from Gerrit Birkeland
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=c460d64f5 ]

[AVRO-4019] [C++] Turn on even more compiler warnings (#2966)

* [C++] Turn on -Wuseless-cast and -Wconversion

* Print versions of compiler/tools used in CI

* Fix another compiler warning

Also attempt to make build.sh continue

* Fix the last of the compiler warnings

* Update CodecTests.cc

* Update DataFileTests.cc

* Fix conversion warning on ARM64

* Hopefully make GCC 9.4 happy

* [C++] Address review comments

> [C++] Correct signedness of validator methods
> -
>
> Key: AVRO-4019
> URL: https://issues.apache.org/jira/browse/AVRO-4019
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Reporter: Gerrit Birkeland
>Priority: Major
>  Labels: c++, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Issue for documenting signedness correction made with 
> https://github.com/apache/avro/pull/2966



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4004) [Rust] Canonical form transformation does not strip the logicalType

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865505#comment-17865505
 ] 

ASF subversion and git services commented on AVRO-4004:
---

Commit 4765ef58dccffb92eadda51cfcb4957ed4ea1a32 in avro's branch 
refs/heads/branch-1.11 from Martin Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=4765ef58d ]

AVRO-4004: [Rust] Ignore logicalType fields when creating the canonical form 
(#2976)

* AVRO-4004: [Rust] Ignore logicalType fields when creating the canonical form

Signed-off-by: Martin Tzvetanov Grigorov 

* AVRO-4004: [Rust] Ignore the namespace for non-named schemas

When creating the canonical parsing form of a Schema ignore the
namespace for any non-named Schemas, i.e. anything but Record, Enum,
Fixed and Ref

Signed-off-by: Martin Tzvetanov Grigorov 

* AVRO-4004 Remove the test for round trip after canonical form

Signed-off-by: Martin Tzvetanov Grigorov 

-

Signed-off-by: Martin Tzvetanov Grigorov 
(cherry picked from commit 728b807c43c84f245d8ba6d621b2082b37b65671)


> [Rust] Canonical form transformation does not strip the logicalType 
> 
>
> Key: AVRO-4004
> URL: https://issues.apache.org/jira/browse/AVRO-4004
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Dominik Mautz
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The Rust implementation of the canonical form transformation does not strip 
> the _logicalType_ attribute, as required by the [STRIP] rule 
> ([https://avro.apache.org/docs/1.11.0/spec.html#Transforming+into+Parsing+Canonical+Form]).
> This results in different fingerprints for the same schema compared to other 
> implementations (at least Python and Java).
> This can, for instance, become an issue for kafka-delta-ingest 
> ([https://github.com/delta-io/kafka-delta-ingest]).
> Rust
> {code:java}
> [package]
> name = "avro-issue"
> version = "0.2.0"
> edition = "2018"
> [dependencies]
> apache-avro = "0.16.0"
> anyhow = "1.0.86"
> {code}
> {code:java}
> use anyhow::Result;
> use apache_avro::{rabin::Rabin, Schema};
>
> fn main() -> Result<()> {
>     // Schema carrying attributes that the canonical form should strip
>     let schema_str = r#"
>     {
>         "type": "record",
>         "name": "test",
>         "fields": [
>             {"name": "a", "type": "long", "default": 42, "doc": "The field a"},
>             {"name": "b", "type": "string", "namespace": "test.a"},
>             {"name": "c", "type": "long", "logicalType": "timestamp-micros"}
>         ]
>     }"#;
>     let schema = Schema::parse_str(schema_str)?;
>     let canonical_form = schema.canonical_form();
>     // Rabin fingerprint of the canonical form
>     let fp_rabin = schema.fingerprint::<Rabin>();
>     println!("Canonical form: {}", canonical_form);
>     println!("Rabin fingerprint: {}", fp_rabin);
>     Ok(())
> }
> {code}
> Output:
> {code:java}
> Canonical form: 
> {"name":"test","type":"record","fields":[{"name":"a","type":"long"},{"name":"b","type":"string"},{"name":"c","type":{"type":"long","logicalType":"timestamp-micros"}}]}
> Rabin fingerprint: 28cf0a67d9937bb3
> {code}
> As you can see, the _logicalType_ is still present in the "canonical form."
> Python
> {code:python}
> import avro.schema
>
> # Same schema as in the Rust example
> schema_str = """
> {
>     "type": "record",
>     "name": "test",
>     "fields": [
>         {"name": "a", "type": "long", "default": 42, "doc": "The field a"},
>         {"name": "b", "type": "string", "namespace": "test.a"},
>         {"name": "c", "type": "long", "logicalType": "timestamp-micros"}
>     ]
> }"""
> schema = avro.schema.parse(schema_str)
> print(f"Canonical form: {schema.canonical_form}")
> print(f"Rabin fingerprint: {schema.fingerprint().hex()}")
> {code}
> Output:
> {code:java}
> Canonical form: 
> {"name":"test","type":"record","fields":[{"name":"a","type":"long"},{"name":"b","type":"string"},{"name":"c","type":"long"}]}
> Rabin fingerprint: 385501e341b00a1c
> {code}
> Java returns the same output as Python.
> IMHO, changing the line
> [https://github.com/apache/avro/blob/main/lang/rust/avro/src/schema.rs#L2159]
> to
> {code:java}
> //...
>  if field_ordering_position(k).is_none() || k == "default" || k == "doc" || k == "aliases" || k == "logicalType" {
> //...
>  {code}
> should resolve the issue. However, I am unsure whether this line should 
> include even more attributes (other than those currently listed explicitly).
> In any case, the test in 
> [https://github.com/apache/avro/blob/fdab5db0816e28e3e10c87910c8b6f98c33072dc/lang/rust/avro/src/schema.rs#L3388]
> must also be adapted to reflect the correct transformation of the canonical 
> form and the corresponding fingerprint.
> Rabin: 385501e341b00a1c
> MD5: 
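
The [STRIP] rule referred to in the description can be sketched in a few lines of Python. This is only an illustration working on the parsed JSON, not the code of apache-avro or of the Python `avro` package: `strip_schema` and `canonical_form` are hypothetical helper names, and the [FULLNAMES], [STRINGS] and [INTEGERS] rules of the specification are deliberately left out.

```python
import json

# Attributes that [STRIP] keeps, in the order that [ORDER] requires.
KEPT = ("name", "type", "fields", "symbols", "items", "values", "size")
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
# Kept attributes whose values are themselves schemas (or lists of them).
NESTED = {"type", "fields", "items", "values"}


def strip_schema(node):
    """Hypothetical helper: apply [STRIP], [ORDER] and [PRIMITIVES] recursively."""
    if isinstance(node, list):  # a union or a list of record fields
        return [strip_schema(item) for item in node]
    if not isinstance(node, dict):  # a type name such as "long"
        return node
    type_name = node.get("type")
    # [PRIMITIVES]: {"type": "long", "logicalType": ...} collapses to "long".
    if "name" not in node and isinstance(type_name, str) and type_name in PRIMITIVES:
        return type_name
    return {
        key: strip_schema(node[key]) if key in NESTED else node[key]
        for key in KEPT
        if key in node
    }


def canonical_form(schema_json):
    """Hypothetical helper: canonical text without insignificant whitespace."""
    return json.dumps(strip_schema(json.loads(schema_json)), separators=(",", ":"))


schema_str = """
{
  "type": "record",
  "name": "test",
  "fields": [
    {"name": "a", "type": "long", "default": 42, "doc": "The field a"},
    {"name": "b", "type": "string", "namespace": "test.a"},
    {"name": "c", "type": "long", "logicalType": "timestamp-micros"}
  ]
}"""
print(canonical_form(schema_str))
```

For the schema from the report this prints the canonical form that Python and Java return above, with `default`, `doc`, `namespace` and `logicalType` all removed. Because the attributes to keep are an explicit allow-list, any attribute not named by the rule is dropped without needing its own special case.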

[jira] [Commented] (AVRO-4004) [Rust] Canonical form transformation does not strip the logicalType

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865504#comment-17865504
 ] 

ASF subversion and git services commented on AVRO-4004:
---

Commit 728b807c43c84f245d8ba6d621b2082b37b65671 in avro's branch 
refs/heads/main from Martin Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=728b807c4 ]

AVRO-4004: [Rust] Ignore logicalType fields when creating the canonical form 
(#2976)

* AVRO-4004: [Rust] Ignore logicalType fields when creating the canonical form

Signed-off-by: Martin Tzvetanov Grigorov 

* AVRO-4004: [Rust] Ignore the namespace for non-named schemas

When creating the canonical parsing form of a Schema ignore the
namespace for any non-named Schemas, i.e. anything but Record, Enum,
Fixed and Ref

Signed-off-by: Martin Tzvetanov Grigorov 

* AVRO-4004 Remove the test for round trip after canonical form

Signed-off-by: Martin Tzvetanov Grigorov 

-

Signed-off-by: Martin Tzvetanov Grigorov 

> [Rust] Canonical form transformation does not strip the logicalType 
> 
>
> Key: AVRO-4004
> URL: https://issues.apache.org/jira/browse/AVRO-4004
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Dominik Mautz
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>

[jira] [Commented] (AVRO-4004) [Rust] Canonical form transformation does not strip the logicalType

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865503#comment-17865503
 ] 

ASF subversion and git services commented on AVRO-4004:
---

Commit 81bb4391b7f3ba08e99104ae816ff7ca53bbf148 in avro's branch 
refs/heads/avro-4004-strip-logical-types from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=81bb4391b ]

AVRO-4004 Remove the test for round trip after canonical form

Signed-off-by: Martin Tzvetanov Grigorov 


> [Rust] Canonical form transformation does not strip the logicalType 
> 
>
> Key: AVRO-4004
> URL: https://issues.apache.org/jira/browse/AVRO-4004
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Dominik Mautz
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The Rust implementation of the canonical form transformation does not strip 
> the _logicalType_ attribute, as required by the [STRIP] rule 
> ([https://avro.apache.org/docs/1.11.0/spec.html#Transforming+into+Parsing+Canonical+Form]).
> This results in different fingerprints for the same schema compared to other 
> implementations (at least Python and Java).
> This can, for instance, become an issue for kafka-delta-ingest 
> ([https://github.com/delta-io/kafka-delta-ingest]).
> Rust
> {code:java}
> [package]
> name = "avro-issue"
> version = "0.2.0"
> edition = "2018"
> [dependencies]
> apache-avro = "0.16.0"
> anyhow = "1.0.86"
> {code}
> {code:java}
> use anyhow::Result;
> use apache_avro::{rabin::Rabin, Schema};
>
> fn main() -> Result<()> {
>     // Schema carrying attributes that the canonical form should strip
>     let schema_str = r#"
>     {
>         "type": "record",
>         "name": "test",
>         "fields": [
>             {"name": "a", "type": "long", "default": 42, "doc": "The field a"},
>             {"name": "b", "type": "string", "namespace": "test.a"},
>             {"name": "c", "type": "long", "logicalType": "timestamp-micros"}
>         ]
>     }"#;
>     let schema = Schema::parse_str(schema_str)?;
>     let canonical_form = schema.canonical_form();
>     // Rabin fingerprint of the canonical form
>     let fp_rabin = schema.fingerprint::<Rabin>();
>     println!("Canonical form: {}", canonical_form);
>     println!("Rabin fingerprint: {}", fp_rabin);
>     Ok(())
> }
> {code}
> Output:
> {code:java}
> Canonical form: 
> {"name":"test","type":"record","fields":[{"name":"a","type":"long"},{"name":"b","type":"string"},{"name":"c","type":{"type":"long","logicalType":"timestamp-micros"}}]}
> Rabin fingerprint: 28cf0a67d9937bb3
> {code}
> As you can see, the _logicalType_ is still present in the "canonical form."
> Python
> {code:python}
> import avro.schema
>
> # Same schema as in the Rust example
> schema_str = """
> {
>     "type": "record",
>     "name": "test",
>     "fields": [
>         {"name": "a", "type": "long", "default": 42, "doc": "The field a"},
>         {"name": "b", "type": "string", "namespace": "test.a"},
>         {"name": "c", "type": "long", "logicalType": "timestamp-micros"}
>     ]
> }"""
> schema = avro.schema.parse(schema_str)
> print(f"Canonical form: {schema.canonical_form}")
> print(f"Rabin fingerprint: {schema.fingerprint().hex()}")
> {code}
> Output:
> {code:java}
> Canonical form: 
> {"name":"test","type":"record","fields":[{"name":"a","type":"long"},{"name":"b","type":"string"},{"name":"c","type":"long"}]}
> Rabin fingerprint: 385501e341b00a1c
> {code}
> Java returns the same output as Python.
> IMHO, changing the line
> [https://github.com/apache/avro/blob/main/lang/rust/avro/src/schema.rs#L2159]
> to
> {code:java}
> //...
>  if field_ordering_position(k).is_none() || k == "default" || k == "doc" || k == "aliases" || k == "logicalType" {
> //...
>  {code}
> should resolve the issue. However, I am unsure whether this line should 
> include even more attributes (other than those currently listed explicitly).
> In any case, the test in 
> [https://github.com/apache/avro/blob/fdab5db0816e28e3e10c87910c8b6f98c33072dc/lang/rust/avro/src/schema.rs#L3388]
> must also be adapted to reflect the correct transformation of the canonical 
> form and the corresponding fingerprint.
> Rabin: 385501e341b00a1c
> MD5: 384f46367ef8c22dbbf44109b82ff7aa
> SHA-256: 8e72f58f2d84a59d6a08e8db5fdc6484dee35babf33179cea72889ae63083f36
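
Why a leftover `logicalType` changes the Rabin fingerprint follows from how the fingerprint is computed: it is a 64-bit CRC over the bytes of the canonical form, so any extra byte yields a different value. The sketch below transcribes the CRC-64-AVRO pseudo-code from the Avro specification into Python; the function names are chosen here for illustration. It returns the fingerprint as an unsigned integer and makes no claim to reproduce the hex strings quoted above, since how a library renders the 64-bit value as bytes is not covered by this sketch.

```python
EMPTY = 0xC15D213AA4D7A795  # fingerprint of the empty byte string, per the spec


def build_table():
    """Precompute the 256-entry lookup table of CRC-64-AVRO."""
    table = []
    for i in range(256):
        fp = i
        for _ in range(8):
            # -(fp & 1) is -1 (all bits set) when the low bit is 1, else 0.
            fp = (fp >> 1) ^ (EMPTY & -(fp & 1))
        table.append(fp)
    return table


FP_TABLE = build_table()


def fingerprint64(data: bytes) -> int:
    """Return the 64-bit Rabin fingerprint of data as an unsigned integer."""
    fp = EMPTY
    for byte in data:
        fp = (fp >> 8) ^ FP_TABLE[(fp ^ byte) & 0xFF]
    return fp


# The two canonical forms quoted in the report.
stripped = (
    b'{"name":"test","type":"record","fields":[{"name":"a","type":"long"},'
    b'{"name":"b","type":"string"},{"name":"c","type":"long"}]}'
)
unstripped = (
    b'{"name":"test","type":"record","fields":[{"name":"a","type":"long"},'
    b'{"name":"b","type":"string"},'
    b'{"name":"c","type":{"type":"long","logicalType":"timestamp-micros"}}]}'
)
print(f"{fingerprint64(stripped):016x}")
print(f"{fingerprint64(unstripped):016x}")
```

The two printed values differ, which is the mismatch a consumer sees when it looks up a schema by a fingerprint computed with a different implementation.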





[jira] [Commented] (AVRO-4004) [Rust] Canonical form transformation does not strip the logicalType

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865497#comment-17865497
 ] 

ASF subversion and git services commented on AVRO-4004:
---

Commit d17feac06e3a73ad4b9afd60fd66120ef6929da1 in avro's branch 
refs/heads/avro-4004-strip-logical-types from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=d17feac06 ]

AVRO-4004 Remove the test for round trip after canonical form

Signed-off-by: Martin Tzvetanov Grigorov 


> [Rust] Canonical form transformation does not strip the logicalType 
> 
>
> Key: AVRO-4004
> URL: https://issues.apache.org/jira/browse/AVRO-4004
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Dominik Mautz
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Commented] (AVRO-4004) [Rust] Canonical form transformation does not strip the logicalType

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865489#comment-17865489
 ] 

ASF subversion and git services commented on AVRO-4004:
---

Commit 9456bc4e9d8de1a831aba34f35dfb05c3dfaab3a in avro's branch 
refs/heads/avro-4004-strip-logical-types from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=9456bc4e9 ]

AVRO-4004: [Rust] Ignore the namespace for non-named schemas

When creating the canonical parsing form of a Schema ignore the
namespace for any non-named Schemas, i.e. anything but Record, Enum,
Fixed and Ref

Signed-off-by: Martin Tzvetanov Grigorov 


> [Rust] Canonical form transformation does not strip the logicalType 
> 
>
> Key: AVRO-4004
> URL: https://issues.apache.org/jira/browse/AVRO-4004
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Dominik Mautz
>Priority: Major
>





[jira] [Commented] (AVRO-4004) [Rust] Canonical form transformation does not strip the logicalType

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865451#comment-17865451
 ] 

ASF subversion and git services commented on AVRO-4004:
---

Commit 283ac88605e87b97bbed2305b5affbaf2989 in avro's branch 
refs/heads/avro-4004-strip-logical-types from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=283ac ]

AVRO-4004: [Rust] Ignore logicalType fields when creating the canonical form

Signed-off-by: Martin Tzvetanov Grigorov 


> [Rust] Canonical form transformation does not strip the logicalType 
> 
>
> Key: AVRO-4004
> URL: https://issues.apache.org/jira/browse/AVRO-4004
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Reporter: Dominik Mautz
>Priority: Major
>





[jira] [Commented] (AVRO-4014) [Rust] Sporadic value-schema mismatch with fixed struct

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865449#comment-17865449
 ] 

ASF subversion and git services commented on AVRO-4014:
---

Commit 7e04c388bb41ca7fe205febb34d36a666489e0a4 in avro's branch 
refs/heads/avro-4004-strip-logical-types from Matt Tanous
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=7e04c388b ]

AVRO-4014: [Rust] Add value and schema to ValidationWithReason error class 
(#3007)



> [Rust] Sporadic value-schema mismatch with fixed struct
> ---
>
> Key: AVRO-4014
> URL: https://issues.apache.org/jira/browse/AVRO-4014
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.11.3
>Reporter: Matthew Tanous
>Assignee: Martin Tzvetanov Grigorov
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We Avro-encode a struct before writing it to Kafka. Under high load (we 
> started seeing error rates of around 2.6% at 500K messages per minute), 
> appending the struct to an Avro writer starts failing with this error: 
> {code:java}
> Value does not match schema: Reason: Unsupported value-schema combination
> {code}
> This is surprising as the same logic is used to build the record in each 
> case, and that record is built using the Avro record type with the same 
> schema:
> {code:java}
> Record::new(){code}
> This is the code that is ultimately raising the error, but because it is not 
> specifying _which_ value does not match _which_ part of the schema, it is 
> extremely difficult to debug.
> {code:java}
> let mut writer = Writer::new(, Vec::new());
> writer
>     .append(record) // This will fail if the message and schema don't match
>     .map_err(|err| Report::msg(err.to_string()))?;
> {code}
> A simple start would be to add logging of the value and the schema that are 
> mismatched to help us debug this issue, as I'm not able to determine if the 
> `apache-avro` library is doing something erroneous or our code is breaking in 
> some unforeseen way.
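
What the reporter asks for, an error that names the mismatching value and the schema it was checked against, can be pictured with a small Python sketch. Everything here is hypothetical and only illustrates the idea: `ValidationError`, `validate_record`, `CHECKS` and `SCHEMA` are invented names, the validator handles just two primitive types, and none of it is the apache-avro API.

```python
class ValidationError(Exception):
    """Carries the offending value and the schema fragment it failed against."""

    def __init__(self, reason, value, schema):
        super().__init__(f"{reason}: value={value!r}, schema={schema!r}")
        self.reason = reason
        self.value = value
        self.schema = schema


# Hypothetical per-type checks; bool is excluded because it subclasses int.
CHECKS = {
    "long": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}

SCHEMA = {
    "type": "record",
    "name": "test",
    "fields": [
        {"name": "a", "type": "long"},
        {"name": "b", "type": "string"},
    ],
}


def validate_record(record, schema):
    """Raise ValidationError for the first field whose value does not fit."""
    for field in schema["fields"]:
        value = record.get(field["name"])
        check = CHECKS.get(field["type"])
        if check is None or not check(value):
            raise ValidationError(
                "Unsupported value-schema combination", value, field
            )


try:
    validate_record({"a": "not a number", "b": "ok"}, SCHEMA)
except ValidationError as err:
    print(err)
```

With the value and the schema fragment in the message, a sporadic failure under load can be traced to a concrete field instead of only to the writer as a whole.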





[jira] [Commented] (AVRO-3687) Rust enum missing default

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865445#comment-17865445
 ] 

ASF subversion and git services commented on AVRO-3687:
---

Commit 3413ac504b74d920a4fbaa1e106529e36ffb512c in avro's branch 
refs/heads/avro-4004-strip-logical-types from John Bell
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=3413ac504 ]

AVRO-3687 [Rust - avro_derive]: Add support for default enum values for rust 
derive macros (#2954)

* Add support for default enum values for rust derive macros

* AVRO-3687: Better error messages when multiple enum variants are marked as 
default

Signed-off-by: Martin Tzvetanov Grigorov 

-

Signed-off-by: Martin Tzvetanov Grigorov 
Co-authored-by: Martin Tzvetanov Grigorov 

> Rust enum missing default
> -
>
> Key: AVRO-3687
> URL: https://issues.apache.org/jira/browse/AVRO-3687
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.11.1
>Reporter: Santiago Fraire Willemoes
>Assignee: Martin Tzvetanov Grigorov
>Priority: Major
>  Labels: enum, pull-request-available, rust
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I cannot seem to find the enum's default attribute as documented [in the 
> spec|https://avro.apache.org/docs/1.11.1/specification/#enums:~:text=for%20names).-,default,-%3A%20A%20default%20value]
> I'm trying to create an avdl parser and this is a blocker for me. I was 
> wondering if there's a reason for this. Otherwise I can submit a PR, please 
> let me know, thanks.
> Code sample:
> {code}
> let schema_str = 
> r#"{"name":"Shapes","type":"enum","symbols":["SQUARE","TRIANGLE","CIRCLE","OVAL"],
>  "default": "SQUARE"}"#;
> let r = Schema::parse_str(schema_str).unwrap();
> let can = r.canonical_form();
> println!("{r:?}");
> println!("{can}");
> {code}
> Observe the enum in its canonical form is missing the default.
> Looking at the Enum's code, we cannot see a default field:
> https://github.com/apache/avro/blob/master/lang/rust/avro/src/schema.rs#L113-L119
> I apologize if this is somehow wrong



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4015) avro-cpp does not work with CMake's FetchContent

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865448#comment-17865448
 ] 

ASF subversion and git services commented on AVRO-4015:
---

Commit 8281e610ab5666bb3ae6e22a2cc2b11af0fb9226 in avro's branch 
refs/heads/avro-4004-strip-logical-types from Viper Bailey
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=8281e610a ]

AVRO-4015: [C++] fixed the c++ build to facilitate using it with FetchContent 
(#3008)

- exposed the include libraries in such a way that they can be accessed
- added the necessary libs to the static target

Co-authored-by: Michael Bailey 

> avro-cpp does not work with CMake's FetchContent
> 
>
> Key: AVRO-4015
> URL: https://issues.apache.org/jira/browse/AVRO-4015
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: build, c++
>Reporter: Michael Bailey
>Assignee: Martin Tzvetanov Grigorov
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently it is not possible to link Avro-cpp as a dependency to a c++ 
> project using FetchContent(). The principal reason is that the include 
> directories are not properly exposed on the avrocpp and avrocpp_s targets. 
> One other problem is that the necessary libraries are not linked to the 
> avrocpp_s target, meaning any attempt to link against that target will result 
> in linker errors.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4010) Avoid resolving schema on every call to read()

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865444#comment-17865444
 ] 

ASF subversion and git services commented on AVRO-4010:
---

Commit f3b6ee2d32ae5200675e345b4d26b151caf3034b in avro's branch 
refs/heads/avro-4004-strip-logical-types from Michael Spector
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=f3b6ee2d3 ]

AVRO-4010: [Rust] Avoid re-resolving schema on every read() (#2995)

Co-authored-by: Michael Spector 

> Avoid resolving schema on every call to read()
> --
>
> Key: AVRO-4010
> URL: https://issues.apache.org/jira/browse/AVRO-4010
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Affects Versions: 1.11.3
>Reporter: Michael Spector
>Assignee: Martin Tzvetanov Grigorov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> `ResolvedSchema::try_from()` is called from within `Reader::read()` on every 
> call. This can easily be avoided by caching the resolved schema alongside the 
> writer schema once it is initialized.
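
The caching idea described above can be illustrated with a minimal Java sketch.
It is not the Rust change from the commit: `CachedResolutionSketch`, its
string-based "schema" and the `resolveCalls` counter are hypothetical stand-ins
that only show the resolve-once, reuse-on-every-read pattern.

```java
import java.util.function.Function;

// Hypothetical stand-in for a reader; this is NOT the apache-avro Reader type.
final class CachedResolutionSketch {
    private final String writerSchema;
    private final Function<String, String> resolver;
    private String resolved;   // cached alongside the writer schema
    private int resolveCalls;  // counts how often the expensive step runs

    CachedResolutionSketch(String writerSchema, Function<String, String> resolver) {
        this.writerSchema = writerSchema;
        this.resolver = resolver;
    }

    // Resolves on first use only; later calls reuse the cached value.
    private String resolvedSchema() {
        if (resolved == null) {
            resolved = resolver.apply(writerSchema);
            resolveCalls++;
        }
        return resolved;
    }

    // Stand-in for read(): it needs the resolved schema on every call.
    String read(String datum) {
        return resolvedSchema() + ":" + datum;
    }

    int resolveCalls() {
        return resolveCalls;
    }

    public static void main(String[] args) {
        CachedResolutionSketch reader =
            new CachedResolutionSketch("writer", s -> "resolved(" + s + ")");
        for (int i = 0; i < 3; i++) {
            System.out.println(reader.read("datum" + i));
        }
        System.out.println("resolve calls: " + reader.resolveCalls());
    }
}
```

In this sketch the resolver runs once no matter how many times `read()` is
called, which is the saving the issue asks for.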



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-3635) [Java] BinaryDecoder trapped into infinite loop while decode crafted data

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865447#comment-17865447
 ] 

ASF subversion and git services commented on AVRO-3635:
---

Commit 9233d64356c782141b8d2c1abd70371d7ad6e0d1 in avro's branch 
refs/heads/avro-4004-strip-logical-types from Oscar Westra van Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=9233d6435 ]

AVRO-3635: Disallow skipping a negative amount of bytes (#2997)

This is what all other implementations of this method do, and fixes
infinite loops due to malicious data.

> [Java] BinaryDecoder trapped into infinite loop while decode crafted data
> -
>
> Key: AVRO-3635
> URL: https://issues.apache.org/jira/browse/AVRO-3635
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.11.0
>Reporter: bismillah
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> stack trace:
>  
> {code:java}
> "DataComputingThread5" #58 prio=5 os_prio=0 tid=0x8ab4b000 
> nid=0x13907 runnable [0x3ce11000]
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.avro.io.BinaryDecoder.doSkipItems(BinaryDecoder.java:454)
>     at org.apache.avro.io.BinaryDecoder.skipArray(BinaryDecoder.java:473)
>     at 
> org.apache.avro.generic.GenericDatumReader.skip(GenericDatumReader.java:576)
>     at 
> org.apache.avro.io.FastReaderBuilder.lambda$initializeRecordReader$0(FastReaderBuilder.java:159)
>     at 
> org.apache.avro.io.FastReaderBuilder$$Lambda$652/470404086.execute(Unknown 
> Source)
>     at 
> org.apache.avro.io.FastReaderBuilder$RecordReader.read(FastReaderBuilder.java:576)
>     at 
> org.apache.avro.io.FastReaderBuilder.lambda$createUnionReader$30(FastReaderBuilder.java:413)
>     at 
> org.apache.avro.io.FastReaderBuilder$$Lambda$679/1790128078.read(Unknown 
> Source)
>     at 
> org.apache.avro.io.FastReaderBuilder.lambda$createFieldSetter$1(FastReaderBuilder.java:182)
> ... {code}
>  
> specific code:
> {code:java}
> private long doSkipItems() throws IOException {
>     long result;
>     for (result = this.readLong(); result < 0L; result = this.readLong()) {
>         long bytecount = this.readLong();
>         this.doSkipBytes(bytecount);
>     }
>     return result;
> }
> protected void doSkipBytes(long length) throws IOException {
>     int remaining = this.limit - this.pos;
>     if (length <= (long) remaining) {
>         this.pos = (int) ((long) this.pos + length);
>     } else {
>         this.limit = this.pos = 0;
>         length -= (long) remaining;
>         this.source.skipSourceBytes(length);
>     }
> } {code}
> If the bytecount is negative, doSkipBytes moves pos backward instead of 
> forward. As a result, the previous data is parsed repeatedly.
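
The guard named in the commit message ("disallow skipping a negative amount of
bytes") can be sketched as follows. This is a minimal illustration, not the
actual patch: `SkipGuardSketch` is a hypothetical stand-in for a buffered
decoder, and the use of `IllegalArgumentException` is an assumption made to
keep the example short.

```java
// Hypothetical stand-in for a buffered decoder; NOT Avro's BinaryDecoder.
final class SkipGuardSketch {
    private final int limit; // number of bytes available in the buffer
    private int pos;         // current read position

    SkipGuardSketch(byte[] buf) {
        this.limit = buf.length;
        this.pos = 0;
    }

    int position() {
        return pos;
    }

    // Rejects negative lengths so that pos can never move backward.
    void skipBytes(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("Negative skip length: " + length);
        }
        int remaining = limit - pos;
        if (length > remaining) {
            throw new IllegalArgumentException("Skip past end of buffer: " + length);
        }
        pos += (int) length;
    }

    public static void main(String[] args) {
        SkipGuardSketch decoder = new SkipGuardSketch(new byte[8]);
        decoder.skipBytes(3);
        try {
            decoder.skipBytes(-2); // crafted data would try to rewind here
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("position: " + decoder.position());
    }
}
```

Because the negative length is rejected before `pos` is touched, the read
position only ever advances, so the same bytes cannot be parsed again.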



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4016) Remove the use of MD5 in org.apache.avro.file.DataFileWriter#generateSync

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865450#comment-17865450
 ] 

ASF subversion and git services commented on AVRO-4016:
---

Commit 25d86840557e7b2e33c78d425131e5c19693e461 in avro's branch 
refs/heads/avro-4004-strip-logical-types from Oscar Westra van Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=25d868405 ]

AVRO-4016: Use SecureRandom for file sync markers (#3016)



> Remove the use of MD5 in org.apache.avro.file.DataFileWriter#generateSync
> -
>
> Key: AVRO-4016
> URL: https://issues.apache.org/jira/browse/AVRO-4016
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.11.3
>Reporter: Oscar Westra van Holthe - Kind
>Assignee: Oscar Westra van Holthe - Kind
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In the chat, someone mentioned using a FIPS environment, which disallows the 
> use of insecure cryptographic hash functions, like MD5.
> The {{DataFileWriter}} class uses an MD5 hash of a random UUID and a 
> timestamp to generate what's essentially 16 random bytes.
> This can more easily be done with {{{}SecureRandom{}}}.
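
A minimal sketch of the approach the description suggests: fill a 16-byte
array directly from `SecureRandom` instead of hashing a UUID and a timestamp.
The class and method names below are hypothetical and are not taken from
`DataFileWriter`.

```java
import java.security.SecureRandom;

// Hypothetical helper; the names are illustrative, not Avro's.
final class SyncMarkerSketch {
    static final int SYNC_SIZE = 16; // the description mentions 16 random bytes
    private static final SecureRandom RNG = new SecureRandom();

    // Returns a fresh marker of SYNC_SIZE random bytes.
    static byte[] generateSync() {
        byte[] sync = new byte[SYNC_SIZE];
        RNG.nextBytes(sync);
        return sync;
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : generateSync()) {
            hex.append(String.format("%02x", b));
        }
        System.out.println("sync marker: " + hex);
    }
}
```

No hash function is involved, so nothing here depends on MD5 being available.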



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4013) PHP 8 Deprecations

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865446#comment-17865446
 ] 

ASF subversion and git services commented on AVRO-4013:
---

Commit 701a8447bae4cf259e1269a793ac83ddd4d45aab in avro's branch 
refs/heads/avro-4004-strip-logical-types from Thiago Romão Barcala
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=701a8447b ]

AVRO-4013: [PHP] PHP 8 deprecations (#3000)



> PHP 8 Deprecations
> --
>
> Key: AVRO-4013
> URL: https://issues.apache.org/jira/browse/AVRO-4013
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: php
>Reporter: Thiago Romão Barcala
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> PHP 8 added some deprecations that will cause errors in future releases:
>  - Properties that are declared dynamically:
>  -- {{AvroDataIOWriter::$sync_marker}}
>  -- {{AvroProtocol::$protocol}}
>  -- {{AvroProtocolMessage::$name}}
>  - Function {{strftime}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4016) Remove the use of MD5 in org.apache.avro.file.DataFileWriter#generateSync

2024-07-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17864939#comment-17864939
 ] 

ASF subversion and git services commented on AVRO-4016:
---

Commit dc520149f092be197455833d4f46f712868a1546 in avro's branch 
refs/heads/branch-1.11 from Oscar Westra van Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=dc520149f ]

AVRO-4016: Use SecureRandom for file sync markers (#3016)

(cherry-picked from 25d86840557e7b2e33c78d425131e5c19693e461)


> Remove the use of MD5 in org.apache.avro.file.DataFileWriter#generateSync
> -
>
> Key: AVRO-4016
> URL: https://issues.apache.org/jira/browse/AVRO-4016
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.11.3
>Reporter: Oscar Westra van Holthe - Kind
>Assignee: Oscar Westra van Holthe - Kind
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In the chat, someone mentioned using a FIPS environment, which disallows the 
> use of insecure cryptographic hash functions, like MD5.
> The {{DataFileWriter}} class uses an MD5 hash of a random UUID and a 
> timestamp to generate what's essentially 16 random bytes.
> This can more easily be done with {{{}SecureRandom{}}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4016) Remove the use of MD5 in org.apache.avro.file.DataFileWriter#generateSync

2024-07-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17864925#comment-17864925
 ] 

ASF subversion and git services commented on AVRO-4016:
---

Commit 25d86840557e7b2e33c78d425131e5c19693e461 in avro's branch 
refs/heads/main from Oscar Westra van Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=25d868405 ]

AVRO-4016: Use SecureRandom for file sync markers (#3016)



> Remove the use of MD5 in org.apache.avro.file.DataFileWriter#generateSync
> -
>
> Key: AVRO-4016
> URL: https://issues.apache.org/jira/browse/AVRO-4016
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.11.3
>Reporter: Oscar Westra van Holthe - Kind
>Assignee: Oscar Westra van Holthe - Kind
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In the chat, someone mentioned using a FIPS environment, which disallows the 
> use of insecure cryptographic hash functions, like MD5.
> The {{DataFileWriter}} class uses an MD5 hash of a random UUID and a 
> timestamp to generate what's essentially 16 random bytes.
> This can more easily be done with {{{}SecureRandom{}}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4014) [Rust] Sporadic value-schema mismatch with fixed struct

2024-07-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17864920#comment-17864920
 ] 

ASF subversion and git services commented on AVRO-4014:
---

Commit e3e5101ebfa76d0a386f3021fdcc10c1159d1153 in avro's branch 
refs/heads/branch-1.11 from Matt Tanous
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=e3e5101eb ]

AVRO-4014: [Rust] Add value and schema to ValidationWithReason error class 
(#3007)

(cherry picked from commit 7e04c388bb41ca7fe205febb34d36a666489e0a4)


> [Rust] Sporadic value-schema mismatch with fixed struct
> ---
>
> Key: AVRO-4014
> URL: https://issues.apache.org/jira/browse/AVRO-4014
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.11.3
>Reporter: Matthew Tanous
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We are trying to Avro encode a structure before writing to Kafka, and when we 
> are at high load writing the struct into an Avro writer (we started seeing 
> around 2.6% error rates at 500K messages per minute) we start seeing this 
> error: 
> {code:java}
> Value does not match schema: Reason: Unsupported value-schema 
> combination{code}
> This is surprising as the same logic is used to build the record in each 
> case, and that record is built using the Avro record type with the same 
> schema:
> {code:java}
> Record::new(&schema){code}
> This is the code that is ultimately raising the error, but because it is not 
> specifying _which_ value does not match _which_ part of the schema, it is 
> extremely difficult to debug.
> {code:java}
> let mut writer = Writer::new(&schema, Vec::new());
> writer
>     // This will fail if the message and schema don't match
>     .append(record)
>     .map_err(|err| Report::msg(err.to_string()))?;{code}
> A simple start would be to add logging of the value and the schema that are 
> mismatched to help us debug this issue, as I'm not able to determine if the 
> `apache-avro` library is doing something erroneous or our code is breaking in 
> some unforeseen way.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4014) [Rust] Sporadic value-schema mismatch with fixed struct

2024-07-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17864753#comment-17864753
 ] 

ASF subversion and git services commented on AVRO-4014:
---

Commit 7e04c388bb41ca7fe205febb34d36a666489e0a4 in avro's branch 
refs/heads/main from Matt Tanous
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=7e04c388b ]

AVRO-4014: [Rust] Add value and schema to ValidationWithReason error class 
(#3007)



> [Rust] Sporadic value-schema mismatch with fixed struct
> ---
>
> Key: AVRO-4014
> URL: https://issues.apache.org/jira/browse/AVRO-4014
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.11.3
>Reporter: Matthew Tanous
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We are trying to Avro encode a structure before writing to Kafka, and when we 
> are at high load writing the struct into an Avro writer (we started seeing 
> around 2.6% error rates at 500K messages per minute) we start seeing this 
> error: 
> {code:java}
> Value does not match schema: Reason: Unsupported value-schema 
> combination{code}
> This is surprising as the same logic is used to build the record in each 
> case, and that record is built using the Avro record type with the same 
> schema:
> {code:java}
> Record::new(&schema){code}
> This is the code that is ultimately raising the error, but because it is not 
> specifying _which_ value does not match _which_ part of the schema, it is 
> extremely difficult to debug.
> {code:java}
> let mut writer = Writer::new(&schema, Vec::new());
> writer
>     // This will fail if the message and schema don't match
>     .append(record)
>     .map_err(|err| Report::msg(err.to_string()))?;{code}
> A simple start would be to add logging of the value and the schema that are 
> mismatched to help us debug this issue, as I'm not able to determine if the 
> `apache-avro` library is doing something erroneous or our code is breaking in 
> some unforeseen way.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4015) avro-cpp does not work with CMake's FetchContent

2024-07-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17864619#comment-17864619
 ] 

ASF subversion and git services commented on AVRO-4015:
---

Commit 8281e610ab5666bb3ae6e22a2cc2b11af0fb9226 in avro's branch 
refs/heads/main from Viper Bailey
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=8281e610a ]

AVRO-4015: [C++] fixed the c++ build to facilitate using it with FetchContent 
(#3008)

- exposed the include libraries in such a way that they can be accessed
- added the necessary libs to the static target

Co-authored-by: Michael Bailey 

> avro-cpp does not work with CMake's FetchContent
> 
>
> Key: AVRO-4015
> URL: https://issues.apache.org/jira/browse/AVRO-4015
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: build, c++
>Reporter: Michael Bailey
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Currently it is not possible to link Avro-cpp as a dependency to a c++ 
> project using FetchContent(). The principal reason is that the include 
> directories are not properly exposed on the avrocpp and avrocpp_s targets. 
> One other problem is that the necessary libraries are not linked to the 
> avrocpp_s target, meaning any attempt to link against that target will result 
> in linker errors.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-3904) [rust] Sometimes when calculating schema compatibility the code panics but maybe it should not

2024-07-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863757#comment-17863757
 ] 

ASF subversion and git services commented on AVRO-3904:
---

Commit 708daa2b42ae6c37343cb8ad987303aaf8476afa in avro's branch 
refs/heads/branch-1.11 from Marcos Schroh
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=708daa2b4 ]

AVRO-3904: [Rust] return a Result when checking schema compatibility so the end 
users will have feedback in case of errors

Co-authored-by: Marcos Schroh 

(cherry picked from commit 1cea6907a24773bdc5d7282fdd90e92b6aef0ab3)


> [rust] Sometimes when calculating schema compatibility the code panics but 
> maybe it should not
> --
>
> Key: AVRO-3904
> URL: https://issues.apache.org/jira/browse/AVRO-3904
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Reporter: Marcos Schroh
>Assignee: Christophe Le Saec
>Priority: Minor
>  Labels: pull-request-available, rust
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> When calculating the *schema compatibility can read* (whether schema A can 
> read an event written using schema B) we expect a *true* or *false* result, 
> but in some cases the code *panics with a message* (the reason the schemas 
> are not compatible), which I think it should not.
> Example:
> {code:js}
> let schema_1 = Schema::parse_str(
>     r#"{
>     "type": "record",
>     "name": "StatisticsMap",
>     "fields": [
>         {"name": "success", "type": {"type": "map", "values": "int"}}
>     ]
> }"#)?;
> let schema_2 = Schema::parse_str(
>     r#"{
>     "type": "record",
>     "name": "StatisticsMap",
>     "fields": [
>         {"name": "success", "type": ["null", {"type": "map", "values": "int"}], "default": null}
>     ]
> }"#)?;
> // true, as expected
> assert!(SchemaCompatibility::can_read(&schema_1, &schema_2));
> // expected result: false, but the call panics instead
> assert!(SchemaCompatibility::can_read(&schema_2, &schema_1));
> // The application panicked (crashed).
> // Message:  internal error: entered unreachable code: writers_schema should
> // have been Schema::Map
> {code}
> PS: If the intention is to give feedback to end users when schemas are not 
> compatible, then the panic makes sense (though a Result might be better), 
> but the feedback should be present every time the result is {*}false{*}, 
> which is not the case.
> I have a PR ready to fix this in case we want to change the current 
> behaviour.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-3635) [Java] BinaryDecoder trapped into infinite loop while decode crafted data

2024-07-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863755#comment-17863755
 ] 

ASF subversion and git services commented on AVRO-3635:
---

Commit 9233d64356c782141b8d2c1abd70371d7ad6e0d1 in avro's branch 
refs/heads/dependabot/maven/lang/java/jetty.version-9.4.55.v20240627 from Oscar 
Westra van Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=9233d6435 ]

AVRO-3635: Disallow skipping a negative amount of bytes (#2997)

This is what all other implementations of this method do, and fixes
infinite loops due to malicious data.

> [Java] BinaryDecoder trapped into infinite loop while decode crafted data
> -
>
> Key: AVRO-3635
> URL: https://issues.apache.org/jira/browse/AVRO-3635
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.11.0
>Reporter: bismillah
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> stack trace:
>  
> {code:java}
> "DataComputingThread5" #58 prio=5 os_prio=0 tid=0x8ab4b000 
> nid=0x13907 runnable [0x3ce11000]
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.avro.io.BinaryDecoder.doSkipItems(BinaryDecoder.java:454)
>     at org.apache.avro.io.BinaryDecoder.skipArray(BinaryDecoder.java:473)
>     at 
> org.apache.avro.generic.GenericDatumReader.skip(GenericDatumReader.java:576)
>     at 
> org.apache.avro.io.FastReaderBuilder.lambda$initializeRecordReader$0(FastReaderBuilder.java:159)
>     at 
> org.apache.avro.io.FastReaderBuilder$$Lambda$652/470404086.execute(Unknown 
> Source)
>     at 
> org.apache.avro.io.FastReaderBuilder$RecordReader.read(FastReaderBuilder.java:576)
>     at 
> org.apache.avro.io.FastReaderBuilder.lambda$createUnionReader$30(FastReaderBuilder.java:413)
>     at 
> org.apache.avro.io.FastReaderBuilder$$Lambda$679/1790128078.read(Unknown 
> Source)
>     at 
> org.apache.avro.io.FastReaderBuilder.lambda$createFieldSetter$1(FastReaderBuilder.java:182)
> ... {code}
>  
> specific code:
> {code:java}
> private long doSkipItems() throws IOException {
>     long result;
>     for (result = this.readLong(); result < 0L; result = this.readLong()) {
>         long bytecount = this.readLong();
>         this.doSkipBytes(bytecount);
>     }
>     return result;
> }
> protected void doSkipBytes(long length) throws IOException {
>     int remaining = this.limit - this.pos;
>     if (length <= (long) remaining) {
>         this.pos = (int) ((long) this.pos + length);
>     } else {
>         this.limit = this.pos = 0;
>         length -= (long) remaining;
>         this.source.skipSourceBytes(length);
>     }
> } {code}
> If the bytecount is negative, doSkipBytes moves pos backward instead of 
> forward. As a result, the previous data is parsed repeatedly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-3635) [Java] BinaryDecoder trapped into infinite loop while decode crafted data

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863620#comment-17863620
 ] 

ASF subversion and git services commented on AVRO-3635:
---

Commit 9233d64356c782141b8d2c1abd70371d7ad6e0d1 in avro's branch 
refs/heads/main from Oscar Westra van Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=9233d6435 ]

AVRO-3635: Disallow skipping a negative amount of bytes (#2997)

This is what all other implementations of this method do, and fixes
infinite loops due to malicious data.

> [Java] BinaryDecoder trapped into infinite loop while decode crafted data
> -
>
> Key: AVRO-3635
> URL: https://issues.apache.org/jira/browse/AVRO-3635
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.11.0
>Reporter: bismillah
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> stack trace:
>  
> {code:java}
> "DataComputingThread5" #58 prio=5 os_prio=0 tid=0x8ab4b000 
> nid=0x13907 runnable [0x3ce11000]
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.avro.io.BinaryDecoder.doSkipItems(BinaryDecoder.java:454)
>     at org.apache.avro.io.BinaryDecoder.skipArray(BinaryDecoder.java:473)
>     at 
> org.apache.avro.generic.GenericDatumReader.skip(GenericDatumReader.java:576)
>     at 
> org.apache.avro.io.FastReaderBuilder.lambda$initializeRecordReader$0(FastReaderBuilder.java:159)
>     at 
> org.apache.avro.io.FastReaderBuilder$$Lambda$652/470404086.execute(Unknown 
> Source)
>     at 
> org.apache.avro.io.FastReaderBuilder$RecordReader.read(FastReaderBuilder.java:576)
>     at 
> org.apache.avro.io.FastReaderBuilder.lambda$createUnionReader$30(FastReaderBuilder.java:413)
>     at 
> org.apache.avro.io.FastReaderBuilder$$Lambda$679/1790128078.read(Unknown 
> Source)
>     at 
> org.apache.avro.io.FastReaderBuilder.lambda$createFieldSetter$1(FastReaderBuilder.java:182)
> ... {code}
>  
> specific code:
> {code:java}
> private long doSkipItems() throws IOException {
>     long result;
>     for (result = this.readLong(); result < 0L; result = this.readLong()) {
>         long bytecount = this.readLong();
>         this.doSkipBytes(bytecount);
>     }
>     return result;
> }
> protected void doSkipBytes(long length) throws IOException {
>     int remaining = this.limit - this.pos;
>     if (length <= (long) remaining) {
>         this.pos = (int) ((long) this.pos + length);
>     } else {
>         this.limit = this.pos = 0;
>         length -= (long) remaining;
>         this.source.skipSourceBytes(length);
>     }
> } {code}
> If the bytecount is negative, doSkipBytes moves pos backward instead of 
> forward. As a result, the previous data is parsed repeatedly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-4006) [Java] DataFileReader does not correctly identify last sync marker when reading/skipping blocks

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863611#comment-17863611
 ] 

ASF subversion and git services commented on AVRO-4006:
---

Commit 2490231cf5352cad1df02682c99a4cd11242b98c in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from Oscar Westra van Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=2490231cf ]

AVRO-4006: Fix block finish while reading data files (#2969)



> [Java] DataFileReader does not correctly identify last sync marker when 
> reading/skipping blocks
> ---
>
> Key: AVRO-4006
> URL: https://issues.apache.org/jira/browse/AVRO-4006
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.11.3
>Reporter: Oscar Westra van Holthe - Kind
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The following code demonstrates the problem:
> {code:java}
> import org.apache.avro.Schema;
> import org.apache.avro.SchemaBuilder;
> import org.apache.avro.file.DataFileReader;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.file.SeekableFileInput;
> import org.apache.avro.generic.GenericData;
> import org.apache.avro.generic.GenericDatumReader;
> import org.apache.avro.generic.GenericDatumWriter;
> import org.apache.avro.generic.IndexedRecord;
> import org.apache.avro.io.DatumReader;
> import java.io.File;
> import java.io.IOException;
> public class AvroTest {
>     public static void main(String[] args) throws IOException {
>         File avroFile = new File("test.avro");
>         GenericData model = GenericData.get();
>         Schema simple = SchemaBuilder.record("TestRecord")
>             .fields().requiredString("text").endRecord();
>         Schema.Field textField = simple.getField("text");
>         try (DataFileWriter<Object> writer =
>                  new DataFileWriter<>(new GenericDatumWriter<>(null, model))
>                      .create(simple, avroFile)) {
>             for (int i = 1; i <= 1000; i++) {
>                 Object record = model.newRecord(null, simple);
>                 model.setField(record, textField.name(), textField.pos(),
>                     "i = " + i);
>                 writer.append(record);
>                 if (i % 100 == 0) {
>                     long syncPos = writer.sync();
>                     System.out.printf(
>                         "Synced %d records; file position %d%n", i, syncPos);
>                 }
>             }
>         }
>         IndexedRecord result;
>         DatumReader<IndexedRecord> datumReader =
>             new GenericDatumReader<>(simple, simple, model);
>         try (SeekableFileInput sfi = new SeekableFileInput(avroFile);
>              MyDataFileReader<IndexedRecord> reader =
>                  new MyDataFileReader<>(sfi, datumReader)) {
>             // Find the start of the last block reading the entire file,
>             // WITHOUT decoding any records.
>             // Note that this does decompress the data, but that's so fast
>             // these days that it hardly affects reading speed.
>             long lastSyncPos = reader.previousSync();
>             while (reader.hasNext()) {
>                 lastSyncPos = reader.previousSync();
>                 System.out.printf("Sync marker at %d%n", lastSyncPos);
>                 // Mark the block as read, so hasNext() will read the next
>                 // block
>                 reader.nextBlock();
>             }
>             System.out.printf("Sync marker at %d%n", reader.previousSync());
>             reader.seek(lastSyncPos);
>             IndexedRecord lastRecord1 = null;
>             int decoded = 0;
>             while (reader.hasNext()) {
>                 lastRecord1 = reader.next(lastRecord1);
>                 decoded++;
>             }
>             System.out.printf("Decoded %d records%n", decoded);
>             result = lastRecord1;
>         }
>         Object lastRecord = result;
>         System.out.printf("Last record: %s%n", lastRecord);
>     }
>     private static class MyDataFileReader<D> extends DataFileReader<D> {
>         public MyDataFileReader(SeekableFileInput sfi,
>                 DatumReader<D> datumReader) throws IOException {
>             super(sfi, datumReader);
>         }
>         @Override
>         public void blockFinished() throws IOException {
>             super.blockFinished();
>         }
>     }
> }
> {code}
> The output:
> {noformat}
> Synced 100 records; file position 828
> Synced 200 records; file position 1648
> Synced 300 records; file position 2468
> Synced 400 records; file position 3288
> Synced 500 records; file position 4108
> Synced 600 records; file position 4928
> Synced 700 records; file position 5748
> Synced 800 records; file position 6568
> Synced 900 records; file position 7388
> Synced 1000 records; 

[jira] [Commented] (AVRO-3687) Rust enum missing default

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863616#comment-17863616
 ] 

ASF subversion and git services commented on AVRO-3687:
---

Commit 3413ac504b74d920a4fbaa1e106529e36ffb512c in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from John Bell
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=3413ac504 ]

AVRO-3687 [Rust - avro_derive]: Add support for default enum values for rust 
derive macros (#2954)

* Add support for default enum values for rust derive macros

* AVRO-3687: Better error messages when multiple enum variants are marked as 
default

Signed-off-by: Martin Tzvetanov Grigorov 

-

Signed-off-by: Martin Tzvetanov Grigorov 
Co-authored-by: Martin Tzvetanov Grigorov 

> Rust enum missing default
> -
>
> Key: AVRO-3687
> URL: https://issues.apache.org/jira/browse/AVRO-3687
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.11.1
>Reporter: Santiago Fraire Willemoes
>Assignee: Martin Tzvetanov Grigorov
>Priority: Major
>  Labels: enum, pull-request-available, rust
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I cannot seem to find the enum's default attribute as documented [in the 
> spec|https://avro.apache.org/docs/1.11.1/specification/#enums:~:text=for%20names).-,default,-%3A%20A%20default%20value]
> I'm trying to create an avdl parser and this is a blocker for me. I was 
> wondering if there's a reason for this. Otherwise I can submit a PR, please 
> let me know, thanks.
> Code sample:
> {code}
> let schema_str = 
> r#"{"name":"Shapes","type":"enum","symbols":["SQUARE","TRIANGLE","CIRCLE","OVAL"],
>  "default": "SQUARE"}"#;
> let r = Schema::parse_str(schema_str).unwrap();
> let can = r.canonical_form();
> println!("{r:?}");
> println!("{can}");
> {code}
> Observe the enum in its canonical form is missing the default.
> Looking at the Enum's code, we cannot see a default field:
> https://github.com/apache/avro/blob/master/lang/rust/avro/src/schema.rs#L113-L119
> I apologize if this is somehow wrong



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-1517) Unicode strings are accepted as bytes and fixed type by perl API

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863612#comment-17863612
 ] 

ASF subversion and git services commented on AVRO-1517:
---

Commit 677e9829bae30cc76527c6f5702f8c2384be61c5 in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=677e9829b ]

AVRO-1517: [Perl] Encode UTF-8 strings as bytes (#2979)

From John Karp's original description of [the issue]:

> By default in Perl, a string is a sequence of bytes, values 0-255.
> However, if a Unicode character is included that cannot be represented
> with a single byte, the string gets 'upgraded' to a non-byte-based
> Unicode string allowing ordinals outside that range. When string
> operations are done with byte and non-byte Unicode strings, the result
> is always non-byte, with the byte string first 'upgraded'. Upgrading
> consists of utf8 encoding and setting a utf8 flag on the string. ('utf8'
> is a variant of UTF-8 used by Perl)
>
> The Perl Avro API is accepting these Unicode strings as-is for the
> 'bytes' type. This is a problem because
>
>   1. values >255 are not valid as bytes, and any encoding is the caller's job
>
>   2. As Avro assembles the serialized data, Perl 'upgrades' all the data,
>      having the effect of utf8 encoding our serialized binary data.
>
> The correct behavior is for the Avro Perl API to attempt to downgrade
> the string, and if this fails because it contained values >255 then to
> raise an error. (The behavior of 'string' won't change, it will still
> take Unicode strings as expected.)

This change, based on the one submitted for that ticket, adds these
behaviours and tests to exercise them.

[the issue]: https://issues.apache.org/jira/browse/AVRO-1517

> Unicode strings are accepted as bytes and fixed type by perl API
> 
>
> Key: AVRO-1517
> URL: https://issues.apache.org/jira/browse/AVRO-1517
> Project: Apache Avro
>  Issue Type: Bug
>  Components: perl
>Reporter: John Karp
>Assignee: José Joaquín Atria
>Priority: Major
> Fix For: 1.12.0
>
> Attachments: AVRO-1517.patch
>
>
> By default in perl, a string is a sequence of bytes, values 0-255. However, 
> if a Unicode character is included that cannot be represented with a single 
> byte, the string gets 'upgraded' to a non-byte-based Unicode string allowing 
> ordinals outside that range. When string operations are done with byte and 
> non-byte Unicode strings, the result is always non-byte, with the byte string 
> first 'upgraded'. Upgrading consists of utf8 encoding and setting a utf8 flag 
> on the string. ('utf8' is a variant of UTF-8 used by perl)
> The perl Avro API is accepting these Unicode strings as-is for the 'bytes' 
> type. This is a problem because 1) values >255 are not valid as bytes, and 
> any encoding is the caller's job. 2) As Avro assembles the serialized data, perl 
> 'upgrades' all the data, having the effect of utf8 encoding our serialized 
> binary data.
> The correct behavior is for the Avro perl API to attempt to downgrade the 
> string, and if this fails because it contained values >255 then to raise an 
> error. (The behavior of 'string' won't change, it will still take Unicode 
> strings as expected.)
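
The "downgrade or raise an error" rule described above can be sketched outside Perl as well. The following Java snippet is only an illustration of the rule, not the Perl patch: the class and method names are hypothetical, and a Java `String` stands in for a Perl string whose characters may have ordinals above 255.

```java
// Hypothetical illustration; not part of the Avro Perl or Java APIs.
final class BytesDowngradeSketch {

    // Returns one byte per character, or throws if any character is above 255,
    // mirroring the "downgrade, else raise an error" behavior described above.
    static byte[] downgrade(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0xFF) {
                throw new IllegalArgumentException(
                    "character at index " + i + " has a value > 255 and is not a valid byte");
            }
            out[i] = (byte) c;
        }
        return out;
    }

    public static void main(String[] args) {
        // All four characters are <= 255, so this yields four bytes.
        System.out.println(downgrade("caf\u00e9").length);
        try {
            downgrade("\u20ac"); // EURO SIGN, ordinal 8364, cannot be a single byte
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting the value, rather than silently encoding it, keeps the choice of encoding with the caller, which is the point made in item 1 of the description.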





[jira] [Commented] (AVRO-3992) [C++] Encoding a record with 0 fields in a vector throws

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863609#comment-17863609
 ] 

ASF subversion and git services commented on AVRO-3992:
---

Commit 49587555fa79214bdee8929ee9cdf1c4d3e183f6 in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from Gerrit Birkeland
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=49587555f ]

AVRO-3992 [C++] Fix compiler warnings in code generated by schema with empty 
record (#2927)

* [C++] Fix compiler warnings in code generated by schema with empty record

* [C++] Generate union names for record array-unions

Added to make writing a test for empty unions easier.

* [C++] Fix validatingEncoder for records

> [C++] Encoding a record with 0 fields in a vector throws
> 
>
> Key: AVRO-3992
> URL: https://issues.apache.org/jira/browse/AVRO-3992
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.11.3
>Reporter: Gerrit Birkeland
>Assignee: Gerrit Birkeland
>Priority: Major
> Fix For: 1.12.0
>
>
> I have an Avro schema resembling the following:
> {code:java}
> {
>   "type": "record",
>   "name": "StackCalculator",
>   "fields": [
>     {
>       "name": "stack",
>       "type": {
>         "type": "array",
>         "items": [
>           "int",
>           {
>             "type": "record",
>             "name": "Dup",
>             "fields": []
>           },
>           {
>             "type": "record",
>             "name": "Add",
>             "fields": []
>           }
>         ]
>       }
>     }
>   ]
> }
> {code}
> If I create one of these records with the stack:
> {code:java}
> uer::StackCalculator calc;
> uer::StackCalculator::stack_item_t item;
> item.set_int(3);
> calc.stack.push_back(item);
> item.set_Dup(uer::Dup());
> calc.stack.push_back(item);
> item.set_Add(uer::Add());
> calc.stack.push_back(item);
> {code}
> and try to encode this
> {code:java}
> ValidSchema s;
> ifstream ifs("jsonschemas/union_empty_record");
> compileJsonSchema(ifs, s);
> unique_ptr<OutputStream> os = memoryOutputStream();
> EncoderPtr e = validatingEncoder(s, jsonPrettyEncoder());
> e->init(*os);
> avro::encode(*e, calc);
> {code}
> Avro throws {{startItem at not an item boundary}}. If the records without 
> fields are given a dummy field, this works.
> Fix available at [https://github.com/apache/avro/pull/2927] - bot didn't pick 
> it up since the PR was first





[jira] [Commented] (AVRO-3748) issue with DataFileSeekableInput.SeekableInputStream.skip

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863613#comment-17863613
 ] 

ASF subversion and git services commented on AVRO-3748:
---

Commit 9443fa9b84d4ebf89f0a6dfd7341283609650d98 in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from Oscar Westra van Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=9443fa9b8 ]

AVRO-3748: [Java] Fix SeekableInput.skip (#2984)

* AVRO-3748: Fix SeekableInput.skip

Two of the implementations of SeekableInput.skip had a bug: skip was
implemented as seek (i.e. using an absolute input position instead of a
relative one). This fixes that.

* AVRO-3748: Avoid reset+skip confusion

> issue with DataFileSeekableInput.SeekableInputStream.skip
> -
>
> Key: AVRO-3748
> URL: https://issues.apache.org/jira/browse/AVRO-3748
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.11.1
>Reporter: Steven Aerts
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We found a longstanding bug in the implementation of 
> {{DataFileSeekableInput.SeekableInputStream.skip}}.
> This skip function is not hit that often. It can, for example, be hit when the 
> FastReader is enabled and it tries to skip a significant amount of data.
> The implementation of this function is, however, faulty and can result in data 
> corruption or a {{java.io.EOFException}}: instead of skipping the given number of 
> bytes, it seeks to the wrong place in the file.
>  
> We have a pull request ready to fix and test this issue.
>  
>  
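
To make the skip-versus-seek distinction from the commit message concrete, here is a minimal sketch. It uses a hypothetical in-memory input rather than Avro's actual classes; `SkipSketch`, `buggySkip` and the other names are assumptions made for illustration only.

```java
// Hypothetical in-memory input; the names are illustrative, not Avro's classes.
final class SkipSketch {
    private final byte[] data;
    private long pos;

    SkipSketch(byte[] data) {
        this.data = data;
    }

    // Absolute positioning: move to byte offset p.
    void seek(long p) {
        if (p < 0 || p > data.length) {
            throw new IllegalArgumentException("position out of range: " + p);
        }
        pos = p;
    }

    long tell() {
        return pos;
    }

    // The bug pattern: the byte count is treated as an absolute position.
    long buggySkip(long n) {
        seek(n);
        return n;
    }

    // Relative skip: advance from the current position, clamped to the end of the data.
    long skip(long n) {
        long step = Math.max(0, Math.min(n, data.length - pos));
        pos += step;
        return step;
    }

    public static void main(String[] args) {
        SkipSketch in = new SkipSketch(new byte[100]);
        in.seek(40);
        in.buggySkip(10);
        System.out.println(in.tell()); // 10: jumped backwards instead of advancing
        in.seek(40);
        in.skip(10);
        System.out.println(in.tell()); // 50
    }
}
```

With the buggy variant a reader positioned at offset 40 ends up at offset 10, which is consistent with the corruption and `EOFException` symptoms reported in the issue.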





[jira] [Commented] (AVRO-1521) Inconsistent behavior of Perl API with 'boolean' type

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863614#comment-17863614
 ] 

ASF subversion and git services commented on AVRO-1521:
---

Commit 82d864fd3751e77ecd255b6b28914926d72916f9 in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=82d864fd3 ]

AVRO-1521 [Perl] Fix boolean encoding errors (#2986)

This change fixes a long-standing issue with the binary encoding
of boolean values. In particular, while several "smart" values
were accepted as valid boolean values by Avro::Schema (e.g. "true"
and "no"), Avro::BinaryEncoder encoded them as true or false depending
on their truth value for Perl. This resulted in both of those examples
being encoded as true, because for Perl any non-empty string other
than "0" is true.

This change makes it so that those values are accepted and properly
handled, and handles other values that represent boolean values
like JSON::PP::Boolean references and native Perl booleans (those
that would be returned by e.g. builtin::true).

This also includes a small but possibly breaking bugfix for the
detection of valid boolean values in Avro::Schema, which was using
a non-anchored regular expression to filter values, meaning that
e.g. any value that had an "n" anywhere would be considered valid.
This was most likely an unintentional error, so while breaking, it
feels like we have to fix it.

> Inconsistent behavior of Perl API with 'boolean' type
> -
>
> Key: AVRO-1521
> URL: https://issues.apache.org/jira/browse/AVRO-1521
> Project: Apache Avro
>  Issue Type: Bug
>  Components: perl
>Reporter: John Karp
>Assignee: José Joaquín Atria
>Priority: Major
> Fix For: 1.12.0
>
>
> The perl boolean serialization code in BinaryEncoder.pm encodes anything 
> false to perl, such as 0, '0', '', () and undef, as false, and anything true 
> to perl, which is literally everything else, as true.
> Inconsistent with the above serialization, the code used in Schema.pm to 
> determine which union branch to use, is checking for boolean-ness with:
> {noformat}
> m{yes|no|y|n|t|f|true|false}i
> {noformat}
> meaning only those particular strings are considered booleans.
> So all those values, including 'no' 'n' 'f' and 'false', still get serialized 
> to true.
> We could just standardize on one of the two and use it consistently. But 
> neither works that well in unions, because unless you put the boolean type 
> last in the union definition, a wide variety of data will be downcast to 
> boolean type.
> Perl has no built-in or standardized boolean type, so there's no solution 
> like we have in the other language Avro APIs. But we could do as the perl 
> JSON module does, and define objects for true and false.
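
The effect of the non-anchored pattern quoted above is easy to reproduce in any regex engine. The sketch below uses Java's `java.util.regex` purely for illustration (the Perl fix itself is not shown here); `looseMatch`, `strictMatch` and `parse` are hypothetical helper names.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Illustration only; the helper names are hypothetical.
final class BooleanRegexSketch {
    private static final Pattern WORDS =
        Pattern.compile("yes|no|y|n|t|f|true|false", Pattern.CASE_INSENSITIVE);

    // Unanchored: true if any substring matches, so "banana" passes because it contains "n".
    static boolean looseMatch(String v) {
        return WORDS.matcher(v).find();
    }

    // Anchored: the whole value must be one of the listed words.
    static boolean strictMatch(String v) {
        return WORDS.matcher(v).matches();
    }

    // Maps an accepted word to the boolean it names, instead of relying on string truthiness.
    static boolean parse(String v) {
        if (!strictMatch(v)) {
            throw new IllegalArgumentException("not a boolean: " + v);
        }
        switch (v.toLowerCase(Locale.ROOT)) {
            case "yes":
            case "y":
            case "t":
            case "true":
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looseMatch("banana"));  // true
        System.out.println(strictMatch("banana")); // false
        System.out.println(parse("no"));           // false
    }
}
```

`matches()` requires the entire input to match, which plays the role of explicit `^...$` anchors around the alternation.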





[jira] [Commented] (AVRO-4013) PHP 8 Deprecations

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863617#comment-17863617
 ] 

ASF subversion and git services commented on AVRO-4013:
---

Commit 701a8447bae4cf259e1269a793ac83ddd4d45aab in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from Thiago Romão Barcala
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=701a8447b ]

AVRO-4013: [PHP] PHP 8 deprecations (#3000)



> PHP 8 Deprecations
> --
>
> Key: AVRO-4013
> URL: https://issues.apache.org/jira/browse/AVRO-4013
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: php
>Reporter: Thiago Romão Barcala
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> PHP 8 added some deprecations that will cause errors in future releases:
>  - Properties that are declared dynamically:
>  -- {{AvroDataIOWriter::$sync_marker}}
>  -- {{AvroProtocol::$protocol}}
>  -- {{AvroProtocolMessage::$name}}
>  - Function {{strftime}}





[jira] [Commented] (AVRO-1463) Undefined values cause warnings when unions with null serialized

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863610#comment-17863610
 ] 

ASF subversion and git services commented on AVRO-1463:
---

Commit 695695478f497b347defec30fb58c9bf2c7a134d in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=695695478 ]

AVRO-1463 [Perl] Quietly validate undefined values (#2975)



> Undefined values cause warnings when unions with null serialized
> 
>
> Key: AVRO-1463
> URL: https://issues.apache.org/jira/browse/AVRO-1463
> Project: Apache Avro
>  Issue Type: Bug
>  Components: perl
>Reporter: John Karp
>Assignee: José Joaquín Atria
>Priority: Minor
> Fix For: 1.12.0
>
> Attachments: AVRO-1463.patch
>
>
> This code produces warnings:
> {noformat}
> $enc = '';
> $schema = Avro::Schema->parse(q(["long","null"]));
> Avro::BinaryEncoder->encode(
>     schema => $schema,
>     data => undef,
>     emit_cb => sub { $enc .= ${ $_[0] } },
> );
> {noformat}
> {noformat}
> Use of uninitialized value $data in pack at 
> /home/johnkarp/git/avro/lang/perl/blib/lib/Avro/Schema.pm line 285.
> Use of uninitialized value $data in string eq at 
> /home/johnkarp/git/avro/lang/perl/blib/lib/Avro/Schema.pm line 287.
> {noformat}





[jira] [Commented] (AVRO-4010) Avoid resolving schema on every call to read()

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863615#comment-17863615
 ] 

ASF subversion and git services commented on AVRO-4010:
---

Commit f3b6ee2d32ae5200675e345b4d26b151caf3034b in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from Michael Spector
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=f3b6ee2d3 ]

AVRO-4010: [Rust] Avoid re-resolving schema on every read() (#2995)

Co-authored-by: Michael Spector 

> Avoid resolving schema on every call to read()
> --
>
> Key: AVRO-4010
> URL: https://issues.apache.org/jira/browse/AVRO-4010
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Affects Versions: 1.11.3
>Reporter: Michael Spector
>Assignee: Martin Tzvetanov Grigorov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> `ResolvedSchema::try_from()` is called from within `Reader::read()` on every 
> call. This can easily be avoided if the resolved schema is cached along with 
> the writer schema once it is initialized.
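
The caching idea can be sketched independently of the Rust crate. The Java snippet below is an illustration under assumptions: `CachingReaderSketch` is a hypothetical class, a `Supplier<String>` stands in for the expensive schema resolution, and the result is computed once at construction and reused by every `read`.

```java
import java.util.function.Supplier;

// Hypothetical reader; a Supplier stands in for the expensive schema resolution.
final class CachingReaderSketch {
    private final String resolved; // resolved once, at initialization

    CachingReaderSketch(Supplier<String> resolver) {
        this.resolved = resolver.get();
    }

    // Every read reuses the cached result instead of resolving again.
    String read(String datum) {
        return resolved + ":" + datum;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        CachingReaderSketch reader = new CachingReaderSketch(() -> {
            calls[0]++;
            return "resolved";
        });
        reader.read("a");
        reader.read("b");
        reader.read("c");
        System.out.println(calls[0]); // 1: resolution ran once for three reads
    }
}
```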





[jira] [Commented] (AVRO-4013) PHP 8 Deprecations

2024-07-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863604#comment-17863604
 ] 

ASF subversion and git services commented on AVRO-4013:
---

Commit 701a8447bae4cf259e1269a793ac83ddd4d45aab in avro's branch 
refs/heads/main from Thiago Romão Barcala
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=701a8447b ]

AVRO-4013: [PHP] PHP 8 deprecations (#3000)



> PHP 8 Deprecations
> --
>
> Key: AVRO-4013
> URL: https://issues.apache.org/jira/browse/AVRO-4013
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: php
>Reporter: Thiago Romão Barcala
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> PHP 8 added some deprecations that will cause errors in future releases:
>  - Properties that are declared dynamically:
>  -- {{AvroDataIOWriter::$sync_marker}}
>  -- {{AvroProtocol::$protocol}}
>  -- {{AvroProtocolMessage::$name}}
>  - Function {{strftime}}





[jira] [Commented] (AVRO-3687) Rust enum missing default

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863036#comment-17863036
 ] 

ASF subversion and git services commented on AVRO-3687:
---

Commit 3413ac504b74d920a4fbaa1e106529e36ffb512c in avro's branch 
refs/heads/main from John Bell
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=3413ac504 ]

AVRO-3687 [Rust - avro_derive]: Add support for default enum values for rust 
derive macros (#2954)

* Add support for default enum values for rust derive macros

* AVRO-3687: Better error messages when multiple enum variants are marked as 
default

Signed-off-by: Martin Tzvetanov Grigorov 

-

Signed-off-by: Martin Tzvetanov Grigorov 
Co-authored-by: Martin Tzvetanov Grigorov 

> Rust enum missing default
> -
>
> Key: AVRO-3687
> URL: https://issues.apache.org/jira/browse/AVRO-3687
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.11.1
>Reporter: Santiago Fraire Willemoes
>Priority: Major
>  Labels: enum, pull-request-available, rust
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I cannot seem to find the enum's default attribute as documented [in the 
> spec|https://avro.apache.org/docs/1.11.1/specification/#enums:~:text=for%20names).-,default,-%3A%20A%20default%20value]
> I'm trying to create an avdl parser and this is a blocker for me. I was 
> wondering if there's a reason for this. Otherwise I can submit a PR, please 
> let me know, thanks.
> Code sample:
> {code}
> let schema_str = 
> r#"{"name":"Shapes","type":"enum","symbols":["SQUARE","TRIANGLE","CIRCLE","OVAL"],
>  "default": "SQUARE"}"#;
> let r = Schema::parse_str(schema_str).unwrap();
> let can = r.canonical_form();
> println!("{r:?}");
> println!("{can}");
> {code}
> Observe the enum in its canonical form is missing the default.
> Looking at the Enum's code, we cannot see a default field:
> https://github.com/apache/avro/blob/master/lang/rust/avro/src/schema.rs#L113-L119
> I apologize if this is somehow wrong





[jira] [Commented] (AVRO-3687) Rust enum missing default

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863037#comment-17863037
 ] 

ASF subversion and git services commented on AVRO-3687:
---

Commit 823ec2a0b5ea963f557dfbd44456edc124f98d84 in avro's branch 
refs/heads/branch-1.11 from John Bell
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=823ec2a0b ]

AVRO-3687 [Rust - avro_derive]: Add support for default enum values for rust 
derive macros (#2954)

* Add support for default enum values for rust derive macros

* AVRO-3687: Better error messages when multiple enum variants are marked as 
default

Signed-off-by: Martin Tzvetanov Grigorov 

-

Signed-off-by: Martin Tzvetanov Grigorov 
Co-authored-by: Martin Tzvetanov Grigorov 
(cherry picked from commit 3413ac504b74d920a4fbaa1e106529e36ffb512c)


> Rust enum missing default
> -
>
> Key: AVRO-3687
> URL: https://issues.apache.org/jira/browse/AVRO-3687
> Project: Apache Avro
>  Issue Type: Bug
>  Components: rust
>Affects Versions: 1.11.1
>Reporter: Santiago Fraire Willemoes
>Priority: Major
>  Labels: enum, pull-request-available, rust
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I cannot seem to find the enum's default attribute as documented [in the 
> spec|https://avro.apache.org/docs/1.11.1/specification/#enums:~:text=for%20names).-,default,-%3A%20A%20default%20value]
> I'm trying to create an avdl parser and this is a blocker for me. I was 
> wondering if there's a reason for this. Otherwise I can submit a PR, please 
> let me know, thanks.
> Code sample:
> {code}
> let schema_str = 
> r#"{"name":"Shapes","type":"enum","symbols":["SQUARE","TRIANGLE","CIRCLE","OVAL"],
>  "default": "SQUARE"}"#;
> let r = Schema::parse_str(schema_str).unwrap();
> let can = r.canonical_form();
> println!("{r:?}");
> println!("{can}");
> {code}
> Observe the enum in its canonical form is missing the default.
> Looking at the Enum's code, we cannot see a default field:
> https://github.com/apache/avro/blob/master/lang/rust/avro/src/schema.rs#L113-L119
> I apologize if this is somehow wrong





[jira] [Commented] (AVRO-1517) Unicode strings are accepted as bytes and fixed type by perl API

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863030#comment-17863030
 ] 

ASF subversion and git services commented on AVRO-1517:
---

Commit 677e9829bae30cc76527c6f5702f8c2384be61c5 in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=677e9829b ]

AVRO-1517: [Perl] Encode UTF-8 strings as bytes (#2979)

From John Karp's original description of [the issue]:

> By default in Perl, a string is a sequence of bytes, values 0-255.
> However, if a Unicode character is included that cannot be represented
> with a single byte, the string gets 'upgraded' to a non-byte-based
> Unicode string allowing ordinals outside that range. When string
> operations are done with byte and non-byte Unicode strings, the result
> is always non-byte, with the byte string first 'upgraded'. Upgrading
> consists of utf8 encoding and setting a utf8 flag on the string. ('utf8'
> is a variant of UTF-8 used by Perl)
>
> The Perl Avro API is accepting these Unicode strings as-is for the
> 'bytes' type. This is a problem because
>
>   1. values >255 are not valid as bytes, and any encoding is the caller's job
>
>   2. As Avro assembles the serialized data, Perl 'upgrades' all the data,
>      having the effect of utf8 encoding our serialized binary data.
>
> The correct behavior is for the Avro Perl API to attempt to downgrade
> the string, and if this fails because it contained values >255 then to
> raise an error. (The behavior of 'string' won't change, it will still
> take Unicode strings as expected.)

This change, based on the one submitted for that ticket, adds these
behaviours and tests to exercise them.

[the issue]: https://issues.apache.org/jira/browse/AVRO-1517

> Unicode strings are accepted as bytes and fixed type by perl API
> 
>
> Key: AVRO-1517
> URL: https://issues.apache.org/jira/browse/AVRO-1517
> Project: Apache Avro
>  Issue Type: Bug
>  Components: perl
>Reporter: John Karp
>Assignee: José Joaquín Atria
>Priority: Major
> Fix For: 1.12.0
>
> Attachments: AVRO-1517.patch
>
>
> By default in perl, a string is a sequence of bytes, values 0-255. However, 
> if a Unicode character is included that cannot be represented with a single 
> byte, the string gets 'upgraded' to a non-byte-based Unicode string allowing 
> ordinals outside that range. When string operations are done with byte and 
> non-byte Unicode strings, the result is always non-byte, with the byte string 
> first 'upgraded'. Upgrading consists of utf8 encoding and setting a utf8 flag 
> on the string. ('utf8' is a variant of UTF-8 used by perl)
> The perl Avro API is accepting these Unicode strings as-is for the 'bytes' 
> type. This is a problem because 1) values >255 are not valid as bytes, and 
> any encoding is the caller's job. 2) As Avro assembles the serialized data, perl 
> 'upgrades' all the data, having the effect of utf8 encoding our serialized 
> binary data.
> The correct behavior is for the Avro perl API to attempt to downgrade the 
> string, and if this fails because it contained values >255 then to raise an 
> error. (The behavior of 'string' won't change, it will still take Unicode 
> strings as expected.)





[jira] [Commented] (AVRO-4010) Avoid resolving schema on every call to read()

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863033#comment-17863033
 ] 

ASF subversion and git services commented on AVRO-4010:
---

Commit f3b6ee2d32ae5200675e345b4d26b151caf3034b in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from Michael Spector
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=f3b6ee2d3 ]

AVRO-4010: [Rust] Avoid re-resolving schema on every read() (#2995)

Co-authored-by: Michael Spector 

> Avoid resolving schema on every call to read()
> --
>
> Key: AVRO-4010
> URL: https://issues.apache.org/jira/browse/AVRO-4010
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Affects Versions: 1.11.3
>Reporter: Michael Spector
>Assignee: Martin Tzvetanov Grigorov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> `ResolvedSchema::try_from()` is called from within `Reader::read()` on every 
> call. This can easily be avoided if the resolved schema is cached along with 
> the writer schema once it is initialized.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-3748) issue with DataFileSeekableInput.SeekableInputStream.skip

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863031#comment-17863031
 ] 

ASF subversion and git services commented on AVRO-3748:
---

Commit 9443fa9b84d4ebf89f0a6dfd7341283609650d98 in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from Oscar Westra van 
Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=9443fa9b8 ]

AVRO-3748: [Java] Fix SeekableInput.skip (#2984)

* AVRO-3748: Fix SeekableInput.skip

Two of the implementations of SeekableInput.skip had a bug: skip was
implemented as seek (i.e. using an absolute input position instead of a
relative one). This fixes that.

* AVRO-3748: Avoid reset+skip confusion

> issue with DataFileSeekableInput.SeekableInputStream.skip
> -
>
> Key: AVRO-3748
> URL: https://issues.apache.org/jira/browse/AVRO-3748
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.11.1
>Reporter: Steven Aerts
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We found a longstanding bug in the implementation of 
> {{DataFileSeekableInput.SeekableInputStream.skip}}.
> This skip function is not hit that often. It can, for example, be hit when the 
> FastReader is enabled and it tries to skip a significant amount of data.
> The implementation of this function is, however, faulty and can result in data 
> corruption or a {{java.io.EOFException}}: instead of skipping the given number of 
> bytes, it seeks to the wrong place in the file.
>  
> We have a pull request ready to fix and test this issue.
>  
>  





[jira] [Commented] (AVRO-1523) Perl API: int/long type minimum value checks are off by one

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863026#comment-17863026
 ] 

ASF subversion and git services commented on AVRO-1523:
---

Commit 6863074f20a7c6e4a34780877206723a5d3c4e24 in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=6863074f2 ]

AVRO-1523 [Perl] Fix valid range for int and long (#2974)



> Perl API: int/long type minimum value checks are off by one
> ---
>
> Key: AVRO-1523
> URL: https://issues.apache.org/jira/browse/AVRO-1523
> Project: Apache Avro
>  Issue Type: Bug
>  Components: perl
>Reporter: John Karp
>Assignee: José Joaquín Atria
>Priority: Minor
> Fix For: 1.12.0
>
> Attachments: AVRO-1523.patch
>
>
> -2,147,483,648 is rejected as an int, and -9,223,372,036,854,775,808 is 
> rejected as a long when passed to the binary encoder, but they are valid 
> signed 32-bit and 64-bit numbers respectively.
> The problem is that the range check is made against the absolute value of the 
> input, but in two's complement arithmetic types the minimum and maximum 
> values have different absolute values.
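
The asymmetry described above can be checked directly. The following Java sketch is an illustration rather than the Perl fix: the method names are hypothetical, and `BigInteger` is used so that the comparison itself cannot overflow.

```java
import java.math.BigInteger;

// Illustration only; the method names are hypothetical.
final class IntRangeSketch {
    private static final BigInteger INT_MIN = BigInteger.valueOf(Integer.MIN_VALUE);
    private static final BigInteger INT_MAX = BigInteger.valueOf(Integer.MAX_VALUE);
    private static final BigInteger LONG_MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);

    // The bug pattern: comparing the absolute value against the maximum rejects the minimum.
    static boolean buggyFitsInt(BigInteger v) {
        return v.abs().compareTo(INT_MAX) <= 0;
    }

    // Two's complement range check: separate lower and upper bounds.
    static boolean fitsInt(BigInteger v) {
        return v.compareTo(INT_MIN) >= 0 && v.compareTo(INT_MAX) <= 0;
    }

    static boolean fitsLong(BigInteger v) {
        return v.compareTo(LONG_MIN) >= 0 && v.compareTo(LONG_MAX) <= 0;
    }

    public static void main(String[] args) {
        BigInteger minInt = new BigInteger("-2147483648");
        System.out.println(buggyFitsInt(minInt)); // false: wrongly rejected
        System.out.println(fitsInt(minInt));      // true
        System.out.println(fitsLong(new BigInteger("-9223372036854775808"))); // true
    }
}
```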





[jira] [Commented] (AVRO-1830) Avro-Perl DataFileReader chokes when avro.codec is absent

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863024#comment-17863024
 ] 

ASF subversion and git services commented on AVRO-1830:
---

Commit e62c8ee22132f345aa56f7782811b1e001512b11 in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=e62c8ee22 ]

AVRO-1830 [Perl] Support containers without codec (#2965)



> Avro-Perl DataFileReader chokes when avro.codec is absent
> -
>
> Key: AVRO-1830
> URL: https://issues.apache.org/jira/browse/AVRO-1830
> Project: Apache Avro
>  Issue Type: Bug
>  Components: perl
>Affects Versions: 1.8.0
>Reporter: SK Liew
>Assignee: Martin Tzvetanov Grigorov
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
> Attachments: Avro-1830.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When a container does not specify its "avro.codec", it should be assumed to 
> be "null". An exception is thrown when I try to read such a container using 
> Avro::DataFileReader. The error happens at Avro/DataFileReader.pm line 101.
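
The defaulting rule stated above (an absent "avro.codec" is treated as "null") amounts to a lookup with a fallback. A minimal sketch, assuming the container metadata is available as a plain string map (real container metadata values are byte strings); `CodecDefaultSketch` and `codecOf` are hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only; a plain string map stands in for the container's metadata.
final class CodecDefaultSketch {

    // A missing "avro.codec" entry is treated as the "null" (no compression) codec.
    static String codecOf(Map<String, String> metadata) {
        return metadata.getOrDefault("avro.codec", "null");
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        System.out.println(codecOf(meta)); // null (the codec name, as a string)
        meta.put("avro.codec", "deflate");
        System.out.println(codecOf(meta)); // deflate
    }
}
```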





[jira] [Commented] (AVRO-3985) Restrict trusted packages in ReflectData and SpecificData

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863025#comment-17863025
 ] 

ASF subversion and git services commented on AVRO-3985:
---

Commit f6b3bd7e50e6e09fedddb98c61558c022ba31285 in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from JB Onofré
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=f6b3bd7e5 ]

AVRO-3985: Add trusted packages support in SpecificData (#2934)

* AVRO-3985: Add trusted packages support in SpecificData

* Apply suggestions from code review

Co-authored-by: Martin Grigorov 

* Move to SecurityException

* Remove redundant import

-

Co-authored-by: Fokko Driesprong 
Co-authored-by: Martin Grigorov 

> Restrict trusted packages in ReflectData and SpecificData
> -
>
> Key: AVRO-3985
> URL: https://issues.apache.org/jira/browse/AVRO-3985
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Jean-Baptiste Onofré
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Right now, there's no check on allowed packages in {{ReflectData}} and 
> {{SpecificData}}.
> That could be problematic for marshalling/unmarshalling, as a malicious 
> payload can exploit the host system.
> I propose to introduce a {{org.apache.avro.TRUSTED_PACKAGES}} system property:
> {code:java}
> -Dorg.apache.avro.TRUSTED_PACKAGES=my.package,my.other.package,...{code}
> In case we want to shortcut the mechanism, we would be able to allow all 
> packages to be trusted using {{*}} wildcard:
> {code:java}
> -Dorg.apache.avro.TRUSTED_PACKAGES=*{code}
> By default, I would recommend a limited set of trusted packages: 
> {{java.lang,javax.security,java.util,org.apache.avro}}.
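
A rough sketch of how such a check could work is shown below. The class `TrustedPackages` and its methods are hypothetical, and the matching rule (exact package or sub-package prefix) is an assumption; the actual Avro implementation may differ. Only the property format, the `*` wildcard and the use of `SecurityException` come from the text above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and the matching rule
// (exact package or sub-package prefix) is an assumption.
final class TrustedPackages {
    private final boolean trustAll;
    private final List<String> packages = new ArrayList<>();

    // propertyValue is a comma-separated package list, or "*" for all.
    TrustedPackages(String propertyValue) {
        boolean all = false;
        for (String entry : propertyValue.split(",")) {
            String pkg = entry.trim();
            if (pkg.equals("*")) {
                all = true;
            } else if (!pkg.isEmpty()) {
                packages.add(pkg);
            }
        }
        this.trustAll = all;
    }

    // Throws SecurityException when the class is outside every trusted package.
    void check(String className) {
        if (trustAll) {
            return;
        }
        int lastDot = className.lastIndexOf('.');
        String pkg = lastDot < 0 ? "" : className.substring(0, lastDot);
        for (String trusted : packages) {
            if (pkg.equals(trusted) || pkg.startsWith(trusted + ".")) {
                return;
            }
        }
        throw new SecurityException("Class " + className + " is not in a trusted package");
    }
}
```

It could be wired to the proposed property with something like `new TrustedPackages(System.getProperty("org.apache.avro.TRUSTED_PACKAGES", "java.lang,javax.security,java.util,org.apache.avro"))`.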





[jira] [Commented] (AVRO-4007) [Rust] Faster is_nullable for UnionSchema

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863023#comment-17863023
 ] 

ASF subversion and git services commented on AVRO-4007:
---

Commit 4eda118a42f930bdc6f463621e2a6450098cbfe7 in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from John Emhoff
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=4eda118a4 ]

AVRO-4007: [rust] Faster `is_nullable` for UnionSchema (#2961)

* Faster `is_nullable` for UnionSchema

I'm writing several gigabytes of Avro and noticed that it seems
oddly slow. I ran a profile and noticed that about 25% of my total
run time was being spent in `UnionSchema::is_nullable`.

It looks like what's happening is that the test `x == Schema::Null`
is slow because the equality test involves a schema canonicalization.

I've updated the match to match against Schema::Null instead and see
a significant performance increase.

* Fix formatting

* Apply clippy suggestion

Signed-off-by: Martin Tzvetanov Grigorov 

-

Signed-off-by: Martin Tzvetanov Grigorov 
Co-authored-by: Martin Grigorov 
Co-authored-by: Martin Tzvetanov Grigorov 

> [Rust] Faster is_nullable for UnionSchema
> -
>
> Key: AVRO-4007
> URL: https://issues.apache.org/jira/browse/AVRO-4007
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Reporter: Martin Tzvetanov Grigorov
>Assignee: Martin Tzvetanov Grigorov
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> https://github.com/apache/avro/pull/2961
> {code}
> Writing large amounts of avro data in rust is slow because (in my case) ~40% 
> of total run time is spent in the function UnionSchema::is_nullable. The 
> issue is that the x == Schema::Null invokes schema canonicalization which is
> apparently somewhat slow. I've modified the method to use match instead and 
> see a considerable performance improvement.
> {code}
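
The effect described above can be illustrated with a toy model. This is a Java analogue, not the Rust code from the pull request: `ToySchema` is a hypothetical class whose `equals` is deliberately routed through a canonical string form, so that the cost of an equality test versus a plain variant check becomes visible through a counter.

```java
import java.util.List;
import java.util.Locale;

// Toy model, not the Avro API: equality goes through a canonical string
// form so that its cost can be observed via a counter.
final class ToySchema {
    enum Kind { NULL, INT, STRING }

    static int canonicalizations = 0;

    final Kind kind;

    ToySchema(Kind kind) {
        this.kind = kind;
    }

    String canonicalForm() {
        canonicalizations++;
        return "\"" + kind.name().toLowerCase(Locale.ROOT) + "\"";
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ToySchema
            && canonicalForm().equals(((ToySchema) other).canonicalForm());
    }

    @Override
    public int hashCode() {
        return kind.hashCode();
    }

    // Slow variant: one full equality test (two canonicalizations) per branch.
    static boolean isNullableSlow(List<ToySchema> branches) {
        ToySchema nullSchema = new ToySchema(Kind.NULL);
        for (ToySchema branch : branches) {
            if (branch.equals(nullSchema)) {
                return true;
            }
        }
        return false;
    }

    // Fast variant: only inspects the variant tag, no canonicalization.
    static boolean isNullableFast(List<ToySchema> branches) {
        for (ToySchema branch : branches) {
            if (branch.kind == Kind.NULL) {
                return true;
            }
        }
        return false;
    }
}
```

Both variants return the same answer; the difference is that the tag check does no per-branch canonicalization work, which matters when the method is called once per written value.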





[jira] [Commented] (AVRO-4006) [Java] DataFileReader does not correctly identify last sync marker when reading/skipping blocks

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863029#comment-17863029
 ] 

ASF subversion and git services commented on AVRO-4006:
---

Commit 2490231cf5352cad1df02682c99a4cd11242b98c in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from Oscar Westra van 
Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=2490231cf ]

AVRO-4006: Fix block finish while reading data files (#2969)



> [Java] DataFileReader does not correctly identify last sync marker when 
> reading/skipping blocks
> ---
>
> Key: AVRO-4006
> URL: https://issues.apache.org/jira/browse/AVRO-4006
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.11.3
>Reporter: Oscar Westra van Holthe - Kind
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The following code demonstrates the problem:
> {code:java}
> import org.apache.avro.Schema;
> import org.apache.avro.SchemaBuilder;
> import org.apache.avro.file.DataFileReader;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.file.SeekableFileInput;
> import org.apache.avro.generic.GenericData;
> import org.apache.avro.generic.GenericDatumReader;
> import org.apache.avro.generic.GenericDatumWriter;
> import org.apache.avro.generic.IndexedRecord;
> import org.apache.avro.io.DatumReader;
> import java.io.File;
> import java.io.IOException;
> public class AvroTest {
>     public static void main(String[] args) throws IOException {
>         File avroFile = new File("test.avro");
>         GenericData model = GenericData.get();
>         Schema simple = SchemaBuilder.record("TestRecord").fields().requiredString("text").endRecord();
>         Schema.Field textField = simple.getField("text");
>         // Write 1000 records, forcing a sync marker after every 100.
>         try (DataFileWriter<Object> writer = new DataFileWriter<>(
>                 new GenericDatumWriter<>(null, model)).create(simple, avroFile)) {
>             for (int i = 1; i <= 1000; i++) {
>                 Object record = model.newRecord(null, simple);
>                 model.setField(record, textField.name(), textField.pos(), "i = " + i);
>                 writer.append(record);
>                 if (i % 100 == 0) {
>                     long syncPos = writer.sync();
>                     System.out.printf("Synced %d records; file position %d%n", i, syncPos);
>                 }
>             }
>         }
>         IndexedRecord result;
>         DatumReader<IndexedRecord> datumReader = new GenericDatumReader<>(simple, simple, model);
>         try (SeekableFileInput sfi = new SeekableFileInput(avroFile);
>              MyDataFileReader<IndexedRecord> reader = new MyDataFileReader<>(sfi, datumReader)) {
>             // Find the start of the last block reading the entire file, WITHOUT decoding any records.
>             // Note that this does decompress the data, but that's so fast these days that it hardly affects reading speed.
>             long lastSyncPos = reader.previousSync();
>             while (reader.hasNext()) {
>                 lastSyncPos = reader.previousSync();
>                 System.out.printf("Sync marker at %d%n", lastSyncPos);
>                 // Mark the block as read, so hasNext() will read the next block
>                 reader.nextBlock();
>             }
>             System.out.printf("Sync marker at %d%n", reader.previousSync());
>             reader.seek(lastSyncPos);
>             IndexedRecord lastRecord1 = null;
>             int decoded = 0;
>             while (reader.hasNext()) {
>                 lastRecord1 = reader.next(lastRecord1);
>                 decoded++;
>             }
>             System.out.printf("Decoded %d records%n", decoded);
>             result = lastRecord1;
>         }
>         Object lastRecord = result;
>         System.out.printf("Last record: %s%n", lastRecord);
>     }
>     // Subclass that exposes blockFinished() as a public method.
>     private static class MyDataFileReader<D> extends DataFileReader<D> {
>         public MyDataFileReader(SeekableFileInput sfi, DatumReader<D> datumReader) throws IOException {
>             super(sfi, datumReader);
>         }
>         @Override
>         public void blockFinished() throws IOException {
>             super.blockFinished();
>         }
>     }
> }
> {code}
> The output:
> {noformat}
> Synced 100 records; file position 828
> Synced 200 records; file position 1648
> Synced 300 records; file position 2468
> Synced 400 records; file position 3288
> Synced 500 records; file position 4108
> Synced 600 records; file position 4928
> Synced 700 records; file position 5748
> Synced 800 records; file position 6568
> Synced 900 records; file position 7388
> Synced 1000 records; file position 8209
> 

[jira] [Commented] (AVRO-3992) [C++] Encoding a record with 0 fields in a vector throws

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863027#comment-17863027
 ] 

ASF subversion and git services commented on AVRO-3992:
---

Commit 49587555fa79214bdee8929ee9cdf1c4d3e183f6 in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from Gerrit Birkeland
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=49587555f ]

AVRO-3992 [C++] Fix compiler warnings in code generated by schema with empty 
record (#2927)

* [C++] Fix compiler warnings in code generated by schema with empty record

* [C++] Generate union names for record array-unions

Added to make writing a test for empty unions easier.

* [C++] Fix validatingEncoder for records

> [C++] Encoding a record with 0 fields in a vector throws
> 
>
> Key: AVRO-3992
> URL: https://issues.apache.org/jira/browse/AVRO-3992
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.11.3
>Reporter: Gerrit Birkeland
>Assignee: Gerrit Birkeland
>Priority: Major
> Fix For: 1.12.0
>
>
> I have an Avro schema resembling the following:
> {code:java}
> {
>   "type": "record",
>   "name": "StackCalculator",
>   "fields": [
>     {
>       "name": "stack",
>       "type": {
>         "type": "array",
>         "items": [
>           "int",
>           {
>             "type": "record",
>             "name": "Dup",
>             "fields": []
>           },
>           {
>             "type": "record",
>             "name": "Add",
>             "fields": []
>           }
>         ]
>       }
>     }
>   ]
> }
> {code}
> If I create one of these records with the stack:
> {code:java}
> uer::StackCalculator calc;
> uer::StackCalculator::stack_item_t item;
> item.set_int(3);
> calc.stack.push_back(item);
> item.set_Dup(uer::Dup());
> calc.stack.push_back(item);
> item.set_Add(uer::Add());
> calc.stack.push_back(item);
> {code}
> and try to encode this
> {code:java}
> ValidSchema s;
> ifstream ifs("jsonschemas/union_empty_record");
> compileJsonSchema(ifs, s);
> unique_ptr<OutputStream> os = memoryOutputStream();
> EncoderPtr e = validatingEncoder(s, jsonPrettyEncoder());
> e->init(*os);
> avro::encode(*e, calc);
> {code}
> Avro throws {{startItem at not an item boundary}}. If the records without 
> fields are given a dummy field, this works.
> Fix available at [https://github.com/apache/avro/pull/2927] - bot didn't pick 
> it up since the PR was first





[jira] [Commented] (AVRO-1463) Undefined values cause warnings when unions with null serialized

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863028#comment-17863028
 ] 

ASF subversion and git services commented on AVRO-1463:
---

Commit 695695478f497b347defec30fb58c9bf2c7a134d in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=695695478 ]

AVRO-1463 [Perl] Quietly validate undefined values (#2975)



> Undefined values cause warnings when unions with null serialized
> 
>
> Key: AVRO-1463
> URL: https://issues.apache.org/jira/browse/AVRO-1463
> Project: Apache Avro
>  Issue Type: Bug
>  Components: perl
>Reporter: John Karp
>Assignee: José Joaquín Atria
>Priority: Minor
> Fix For: 1.12.0
>
> Attachments: AVRO-1463.patch
>
>
> This code produces warnings:
> {noformat}
> $enc = '';
> $schema = Avro::Schema->parse(q(["long","null"]));
> Avro::BinaryEncoder->encode(
> schema => $schema,
> data => undef,
> emit_cb => sub { $enc .= ${ $_[0] } },
> );
> {noformat}
> {noformat}
> Use of uninitialized value $data in pack at 
> /home/johnkarp/git/avro/lang/perl/blib/lib/Avro/Schema.pm line 285.
> Use of uninitialized value $data in string eq at 
> /home/johnkarp/git/avro/lang/perl/blib/lib/Avro/Schema.pm line 287.
> {noformat}





[jira] [Commented] (AVRO-1521) Inconsistent behavior of Perl API with 'boolean' type

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863032#comment-17863032
 ] 

ASF subversion and git services commented on AVRO-1521:
---

Commit 82d864fd3751e77ecd255b6b28914926d72916f9 in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=82d864fd3 ]

AVRO-1521 [Perl] Fix boolean encoding errors (#2986)

This change fixes a long-standing issue with the binary encoding
of boolean values. In particular, that while several "smart" values
were accepted as valid boolean values by Avro::Schema (eg. "true"
and "no"), Avro::BinaryEncoder encoded them as true or false depending
on their truth value for Perl. This resulted in both of those examples
being encoded as true, because for Perl any non-empty string is true.

This change makes it so that those values are accepted and properly
handled, and handles other values that represent boolean values
like JSON::PP::Boolean references and native Perl booleans (those
that would be returned by eg. builtin::true).

This also includes a small but possibly breaking bugfix for the
detection of valid boolean values in Avro::Schema, which was using
a non-anchored regular expression to filter values, meaning that
eg. any value that had an "n" anywhere would be considered valid.
This was most likely an involuntary error, so while breaking, it
feels like we have to fix it.

> Inconsistent behavior of Perl API with 'boolean' type
> -
>
> Key: AVRO-1521
> URL: https://issues.apache.org/jira/browse/AVRO-1521
> Project: Apache Avro
>  Issue Type: Bug
>  Components: perl
>Reporter: John Karp
>Assignee: José Joaquín Atria
>Priority: Major
> Fix For: 1.12.0
>
>
> The perl boolean serialization code in BinaryEncoder.pm encodes anything 
> false to perl, such as 0, '0', '', () and undef, as false, and anything true 
> to perl, which is literally everything else, as true.
> Inconsistent with the above serialization, the code used in Schema.pm to 
> determine which union branch to use, is checking for boolean-ness with:
> {noformat}
> m{yes|no|y|n|t|f|true|false}i
> {noformat}
> meaning only those particular strings are considered booleans.
> So all those values, including 'no' 'n' 'f' and 'false', still get serialized 
> to true.
> We could just standardize on one of the two and use it consistently. But 
> neither works that well in unions, because unless you put the boolean type 
> last in the union definition, a wide variety of data will be downcast to 
> boolean type.
> Perl has no built-in or standardized boolean type, so there's no solution 
> like we have in the other language Avro APIs. But we could do as the perl 
> JSON module does, and define objects for true and false.





[jira] [Commented] (AVRO-1514) Clean up perl API dependencies

2024-07-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863022#comment-17863022
 ] 

ASF subversion and git services commented on AVRO-1514:
---

Commit 6f7be100753445640bc3b41f6d36f3c366c335b3 in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=6f7be1007 ]

AVRO-1514: Clean up Perl dependencies (#2962)

AVRO-1514: 

* Sort Perl dependencies in Github action

This minimises the chance of duplicates sneaking by.

* Drop system dependencies in Perl Github action

* Manually set Perl repository metadata

* Drop dependency on IO::String

* Add missing dependency on Test::Pod

* Mark Test::More as a test dependency

* Add Module::Install as a configure requires

* Be explicit about JSON::MaybeXS dependency

JSON::MaybeXS comes installed in the Perl container we get from
the perl setup Github action, but it's probably a good idea to
be explicit about it.

> Clean up perl API dependencies
> --
>
> Key: AVRO-1514
> URL: https://issues.apache.org/jira/browse/AVRO-1514
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: perl
>Reporter: John Karp
>Assignee: Martin Tzvetanov Grigorov
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
> Attachments: AVRO-1514-0.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If we assume a non-ancient perl (>=5.8.1), we can clean up the dependencies:
> (build) Module::Install: bundle it
> (build) Module::Install::ReadmeFromPod: keep
> (build) Module::Install::Repository: remove, hardcode repository value 
> instead of autodetecting
> (build) Test::More 0.88: keep, but note requisite version built in starting 
> at 5.10.1
> (test) Test::Exception: keep
> (test) Test::Pod: declare (missing in Makefile.PL)
> (test/run) Math::BigInt: don't declare, now built-in
> (run) JSON::XS: replace with JSON to not tie to a backend
> (run) parent: keep, but note built-in starting at 5.10.1
> (run) Compress::Zlib: keep, but note built-in starting at 5.9.3
> (run) IO::String: replace with perl 5.8 functionality
> (run) Encode: don't declare, now built-in
> (run) Regexp::Common: keep
> (run) Object::Tiny: keep
> (run) Try::Tiny: keep





[jira] [Commented] (AVRO-4010) Avoid resolving schema on every call to read()

2024-07-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17862805#comment-17862805
 ] 

ASF subversion and git services commented on AVRO-4010:
---

Commit f3b6ee2d32ae5200675e345b4d26b151caf3034b in avro's branch 
refs/heads/main from Michael Spector
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=f3b6ee2d3 ]

AVRO-4010: [Rust] Avoid re-resolving schema on every read() (#2995)

Co-authored-by: Michael Spector 

> Avoid resolving schema on every call to read()
> --
>
> Key: AVRO-4010
> URL: https://issues.apache.org/jira/browse/AVRO-4010
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Affects Versions: 1.11.3
>Reporter: Michael Spector
>Assignee: Martin Tzvetanov Grigorov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> `ResolvedSchema::try_from()` is called from within `Reader::read()`, which 
> can be easily avoided if resolved schema is cached along with writer schema 
> once initialized.





[jira] [Commented] (AVRO-4010) Avoid resolving schema on every call to read()

2024-07-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17862806#comment-17862806
 ] 

ASF subversion and git services commented on AVRO-4010:
---

Commit 2976d395c8e2485c3e71f34a22964b9b55496356 in avro's branch 
refs/heads/branch-1.11 from Michael Spector
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=2976d395c ]

AVRO-4010: [Rust] Avoid re-resolving schema on every read() (#2995)

Co-authored-by: Michael Spector 
(cherry picked from commit f3b6ee2d32ae5200675e345b4d26b151caf3034b)


> Avoid resolving schema on every call to read()
> --
>
> Key: AVRO-4010
> URL: https://issues.apache.org/jira/browse/AVRO-4010
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Affects Versions: 1.11.3
>Reporter: Michael Spector
>Assignee: Martin Tzvetanov Grigorov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.4
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> `ResolvedSchema::try_from()` is called from within `Reader::read()`, which 
> can be easily avoided if resolved schema is cached along with writer schema 
> once initialized.





[jira] [Commented] (AVRO-1521) Inconsistent behavior of Perl API with 'boolean' type

2024-06-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860790#comment-17860790
 ] 

ASF subversion and git services commented on AVRO-1521:
---

Commit 82d864fd3751e77ecd255b6b28914926d72916f9 in avro's branch 
refs/heads/main from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=82d864fd3 ]

AVRO-1521 [Perl] Fix boolean encoding errors (#2986)

This change fixes a long-standing issue with the binary encoding
of boolean values. In particular, that while several "smart" values
were accepted as valid boolean values by Avro::Schema (eg. "true"
and "no"), Avro::BinaryEncoder encoded them as true or false depending
on their truth value for Perl. This resulted in both of those examples
being encoded as true, because for Perl any non-empty string is true.

This change makes it so that those values are accepted and properly
handled, and handles other values that represent boolean values
like JSON::PP::Boolean references and native Perl booleans (those
that would be returned by eg. builtin::true).

This also includes a small but possibly breaking bugfix for the
detection of valid boolean values in Avro::Schema, which was using
a non-anchored regular expression to filter values, meaning that
eg. any value that had an "n" anywhere would be considered valid.
This was most likely an involuntary error, so while breaking, it
feels like we have to fix it.

> Inconsistent behavior of Perl API with 'boolean' type
> -
>
> Key: AVRO-1521
> URL: https://issues.apache.org/jira/browse/AVRO-1521
> Project: Apache Avro
>  Issue Type: Bug
>  Components: perl
>Reporter: John Karp
>Assignee: John Karp
>Priority: Major
>
> The perl boolean serialization code in BinaryEncoder.pm encodes anything 
> false to perl, such as 0, '0', '', () and undef, as false, and anything true 
> to perl, which is literally everything else, as true.
> Inconsistent with the above serialization, the code used in Schema.pm to 
> determine which union branch to use, is checking for boolean-ness with:
> {noformat}
> m{yes|no|y|n|t|f|true|false}i
> {noformat}
> meaning only those particular strings are considered booleans.
> So all those values, including 'no' 'n' 'f' and 'false', still get serialized 
> to true.
> We could just standardize on one of the two and use it consistently. But 
> neither works that well in unions, because unless you put the boolean type 
> last in the union definition, a wide variety of data will be downcast to 
> boolean type.
> Perl has no built-in or standardized boolean type, so there's no solution 
> like we have in the other language Avro APIs. But we could do as the perl 
> JSON module does, and define objects for true and false.





[jira] [Commented] (AVRO-3748) issue with DataFileSeekableInput.SeekableInputStream.skip

2024-06-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860716#comment-17860716
 ] 

ASF subversion and git services commented on AVRO-3748:
---

Commit 9443fa9b84d4ebf89f0a6dfd7341283609650d98 in avro's branch 
refs/heads/main from Oscar Westra van Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=9443fa9b8 ]

AVRO-3748: [Java] Fix SeekableInput.skip (#2984)

* AVRO-3748: Fix SeekableInput.skip

Two of the implementations of SeekableInput.skip had a bug: skip was
implemented as seek (i.e. using an absolute input position instead of a
relative one). This fixes that.

* AVRO-3748: Avoid reset+skip confusion

> issue with DataFileSeekableInput.SeekableInputStream.skip
> -
>
> Key: AVRO-3748
> URL: https://issues.apache.org/jira/browse/AVRO-3748
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.11.1
>Reporter: Steven Aerts
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We found a longstanding bug in the implementation of 
> {{DataFileSeekableInput.SeekableInputStream.skip}}.
> This skip function is not hit that often. It can, for example, be hit when the 
> FastReader is enabled and it tries to skip a significant amount of data.
> The implementation of this function is, however, faulty and can result in data 
> corruption or 
> {{java.io.EOFException}}, as instead of skipping the number of bytes, it 
> will seek to a wrong place in the file.
>  
> We have a pull request ready to fix and test this issue.
>  
>  
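
The seek-versus-skip confusion described above can be sketched with a small in-memory input. `ArraySeekableInput` is a hypothetical class written for this illustration, not one of the Avro `SeekableInput` implementations, and `skipAsSeek` only mimics the kind of bug reported.

```java
// Hypothetical in-memory input, for illustration only; it is not one of
// the Avro SeekableInput implementations.
final class ArraySeekableInput {
    private final byte[] data;
    private long position = 0;

    ArraySeekableInput(byte[] data) {
        this.data = data;
    }

    long tell() {
        return position;
    }

    // Absolute positioning: jump to the given offset from the start.
    void seek(long target) {
        if (target < 0 || target > data.length) {
            throw new IllegalArgumentException("Position out of range: " + target);
        }
        position = target;
    }

    // Buggy variant: treats the byte count as an absolute position.
    long skipAsSeek(long n) {
        seek(n);
        return n;
    }

    // Relative positioning: advance by up to n bytes from the current position.
    long skip(long n) {
        if (n <= 0) {
            return 0;
        }
        long skipped = Math.min(n, data.length - position);
        seek(position + skipped);
        return skipped;
    }
}
```

Starting at offset 40, skipping 10 bytes should land on offset 50; the buggy variant lands on offset 10 instead, which is how later reads end up in the wrong place.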





[jira] [Commented] (AVRO-4007) [Rust] Faster is_nullable for UnionSchema

2024-06-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859604#comment-17859604
 ] 

ASF subversion and git services commented on AVRO-4007:
---

Commit 34bf111368cebf6ae5a772b4223c77392dde5614 in avro's branch 
refs/heads/branch-1.11 from John Emhoff
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=34bf11136 ]

AVRO-4007: [rust] Faster `is_nullable` for UnionSchema (#2961)

* Faster `is_nullable` for UnionSchema

I'm writing several gigabytes of Avro and noticed that it seems
oddly slow. I ran a profile and noticed that about 25% of my total
run time was being spent in `UnionSchema::is_nullable`.

It looks like what's happening is that the test `x == Schema::Null`
is slow because the equality test involves a schema canonicalization.

I've updated the match to match against Schema::Null instead and see
a significant performance increase.

* Fix formatting

* Apply clippy suggestion

Signed-off-by: Martin Tzvetanov Grigorov 

-

Signed-off-by: Martin Tzvetanov Grigorov 
Co-authored-by: Martin Grigorov 
Co-authored-by: Martin Tzvetanov Grigorov 
(cherry picked from commit 4eda118a42f930bdc6f463621e2a6450098cbfe7)


> [Rust] Faster is_nullable for UnionSchema
> -
>
> Key: AVRO-4007
> URL: https://issues.apache.org/jira/browse/AVRO-4007
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Reporter: Martin Tzvetanov Grigorov
>Assignee: Martin Tzvetanov Grigorov
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/avro/pull/2961
> {code}
> Writing large amounts of avro data in rust is slow because (in my case) ~40% 
> of total run time is spent in the function UnionSchema::is_nullable. The 
> issue is that the x == Schema::Null invokes schema canonicalization which is
> apparently somewhat slow. I've modified the method to use match instead and 
> see a considerable performance improvement.
> {code}





[jira] [Commented] (AVRO-4007) [Rust] Faster is_nullable for UnionSchema

2024-06-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859603#comment-17859603
 ] 

ASF subversion and git services commented on AVRO-4007:
---

Commit 4eda118a42f930bdc6f463621e2a6450098cbfe7 in avro's branch 
refs/heads/main from John Emhoff
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=4eda118a4 ]

AVRO-4007: [rust] Faster `is_nullable` for UnionSchema (#2961)

* Faster `is_nullable` for UnionSchema

I'm writing several gigabytes of Avro and noticed that it seems
oddly slow. I ran a profile and noticed that about 25% of my total
run time was being spent in `UnionSchema::is_nullable`.

It looks like what's happening is that the test `x == Schema::Null`
is slow because the equality test involves a schema canonicalization.

I've updated the match to match against Schema::Null instead and see
a significant performance increase.

* Fix formatting

* Apply clippy suggestion

Signed-off-by: Martin Tzvetanov Grigorov 

-

Signed-off-by: Martin Tzvetanov Grigorov 
Co-authored-by: Martin Grigorov 
Co-authored-by: Martin Tzvetanov Grigorov 

> [Rust] Faster is_nullable for UnionSchema
> -
>
> Key: AVRO-4007
> URL: https://issues.apache.org/jira/browse/AVRO-4007
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: rust
>Reporter: Martin Tzvetanov Grigorov
>Assignee: Martin Tzvetanov Grigorov
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://github.com/apache/avro/pull/2961
> {code}
> Writing large amounts of avro data in rust is slow because (in my case) ~40% 
> of total run time is spent in the function UnionSchema::is_nullable. The 
> issue is that the x == Schema::Null invokes schema canonicalization which is
> apparently somewhat slow. I've modified the method to use match instead and 
> see a considerable performance improvement.
> {code}





[jira] [Commented] (AVRO-1514) Clean up perl API dependencies

2024-06-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859601#comment-17859601
 ] 

ASF subversion and git services commented on AVRO-1514:
---

Commit 6f7be100753445640bc3b41f6d36f3c366c335b3 in avro's branch 
refs/heads/main from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=6f7be1007 ]

AVRO-1514: Clean up Perl dependencies (#2962)

AVRO-1514: 

* Sort Perl dependencies in Github action

This minimises the chance of duplicates sneaking by.

* Drop system dependencies in Perl Github action

* Manually set Perl repository metadata

* Drop dependency on IO::String

* Add missing dependency on Test::Pod

* Mark Test::More as a test dependency

* Add Module::Install as a configure requires

* Be explicit about JSON::MaybeXS dependency

JSON::MaybeXS comes installed in the Perl container we get from
the perl setup Github action, but it's probably a good idea to
be explicit about it.

> Clean up perl API dependencies
> --
>
> Key: AVRO-1514
> URL: https://issues.apache.org/jira/browse/AVRO-1514
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: perl
>Reporter: John Karp
>Assignee: John Karp
>Priority: Minor
>  Labels: pull-request-available
> Attachments: AVRO-1514-0.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If we assume a non-ancient perl (>=5.8.1), we can clean up the dependencies:
> (build) Module::Install: bundle it
> (build) Module::Install::ReadmeFromPod: keep
> (build) Module::Install::Repository: remove, hardcode repository value 
> instead of autodetecting
> (build) Test::More 0.88: keep, but note requisite version built in starting 
> at 5.10.1
> (test) Test::Exception: keep
> (test) Test::Pod: declare (missing in Makefile.PL)
> (test/run) Math::BigInt: don't declare, now built-in
> (run) JSON::XS: replace with JSON to not tie to a backend
> (run) parent: keep, but note built-in starting at 5.10.1
> (run) Compress::Zlib: keep, but note built-in starting at 5.9.3
> (run) IO::String: replace with perl 5.8 functionality
> (run) Encode: don't declare, now built-in
> (run) Regexp::Common: keep
> (run) Object::Tiny: keep
> (run) Try::Tiny: keep





[jira] [Commented] (AVRO-3990) [C++] avrogencpp generates invalid code for union with a reserved word

2024-06-14 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855008#comment-17855008
 ] 

ASF subversion and git services commented on AVRO-3990:
---

Commit 6aeb7b7aa10ef721f3a16158900205ecab7174b1 in avro's branch 
refs/heads/main from Gerrit Birkeland
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=6aeb7b7aa ]

AVRO-3990 [C++] Fix invalid code generation for union with reserved name (#2930)



> [C++] avrogencpp generates invalid code for union with a reserved word
> --
>
> Key: AVRO-3990
> URL: https://issues.apache.org/jira/browse/AVRO-3990
> Project: Apache Avro
>  Issue Type: Bug
>Reporter: Gerrit Birkeland
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When avrogencpp is run with this schema, it generates C++ code with compiler 
> errors due to inconsistently passing names through the {{decorate}} wrapper.
> {code:java}
> {
>   "type": "record",
>   "name": "Record",
>   "fields": [
>     {
>       "name": "void",
>       "type": [
>         "int",
>         "double"
>       ]
>     }
>   ]
> }
> {code}
> This generates the following. Note that the typedef uses {{void_t}} (one 
> underscore) while references to the typedef use {{void__t}} (two underscores):
> {code:java}
> struct Record {
>     typedef _cpp_reserved_words_union_typedef_Union__0__ void_t;
>     void__t void_;
>     Record() :
>         void_(void__t())
>         { }
> };
> {code}
> Note: This problem was injected after the latest release, so can only be 
> reproduced with a build off of the current repo.
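
The mismatch described above can be illustrated with a minimal sketch. This is not avrogencpp's code: `CPP_RESERVED_WORDS`, `decorate`, and both helper functions are hypothetical stand-ins, and the only assumption is that the decorating step appends an underscore to names that collide with a C++ reserved word, as the generated `void_` member suggests.

```python
from typing import Tuple

# Hypothetical subset of reserved words; the real generator's list is not shown in the issue.
CPP_RESERVED_WORDS = {"void", "int", "double", "class", "union"}


def decorate(name: str) -> str:
    """Append an underscore when the name collides with a C++ reserved word."""
    return name + "_" if name in CPP_RESERVED_WORDS else name


def typedef_names_inconsistent(field: str) -> Tuple[str, str]:
    """Return (declared, referenced) names when only one site calls decorate()."""
    declared = field + "_t"              # declaration site skips decorate()
    referenced = decorate(field) + "_t"  # reference site applies decorate()
    return declared, referenced


def typedef_names_consistent(field: str) -> Tuple[str, str]:
    """Return (declared, referenced) names when both sites call decorate()."""
    name = decorate(field) + "_t"
    return name, name
```

With a field named `void`, the inconsistent variant yields two different identifiers, which is the kind of mismatch that fails to compile. Which spelling the actual fix settled on is not stated in the issue.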





[jira] [Commented] (AVRO-3995) [C++] Update build system to disallow compiling with unsupported language versions

2024-06-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17854680#comment-17854680
 ] 

ASF subversion and git services commented on AVRO-3995:
---

Commit 1a348b2e841b5406663114503c12c354c0811b93 in avro's branch 
refs/heads/main from Gerrit Birkeland
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=1a348b2e8 ]

AVRO-3995 [C++] Requires C++17 to compile Avro (#2949)



> [C++] Update build system to disallow compiling with unsupported language 
> versions
> --
>
> Key: AVRO-3995
> URL: https://issues.apache.org/jira/browse/AVRO-3995
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.11.3
>Reporter: Gerrit Birkeland
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In August of 2023 a commit was merged to Avro which effectively forced all 
> users of the project to compile with C++17 at a minimum, but this wasn't 
> enforced by the build system anywhere, so currently attempting to compile 
> Avro with a C++11 or C++14 compiler will result in build errors rather 
> than an obvious message telling the user what is wrong.
> Avro should provide a user-friendly error message when attempting to compile 
> the library with an unsupported C++ standard.





[jira] [Commented] (AVRO-3999) Avoid warnings in Perl test suite

2024-06-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17854435#comment-17854435
 ] 

ASF subversion and git services commented on AVRO-3999:
---

Commit 072b51fb548c35fee192917ef1d3cdbd944ef53b in avro's branch 
refs/heads/main from José Joaquín Atria
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=072b51fb5 ]

AVRO-3999 - Avoid warnings in Perl test suite (#2953)

* Add error message when schemas do not match in Perl

When the schemas did not match in a call to Avro::BinaryDecoder::decode,
an Avro::Schema::Error::Mismatch error was thrown without a body. This
was generating a warning when trying to stringify an undefined value,
and resulted in the empty string being used as the error message, which
was not very informative.

This change adds a message which should solve both issues.

* Do not exit sub via next in Perl tests

This silences a loud warning in xt/schema.t

> Avoid warnings in Perl test suite
> -
>
> Key: AVRO-3999
> URL: https://issues.apache.org/jira/browse/AVRO-3999
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: perl
>Reporter: José Joaquín Atria
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The test suite generated several warnings which could easily be avoided. 
> Specifically, an undefined value is stringified in t/03_bin_decode.t, and 
> a subroutine is exited via next in xt/schema.t. See the output below for 
> illustration:
>  
> {code:java}
> $ ./build.sh test
> include /home/user/avro/lang/perl/inc/Module/Install.pm
> include inc/Module/Install/Metadata.pm
> include inc/Module/Install/Base.pm
> include inc/Module/Install/ReadmeFromPod.pm
> readme_from lib/Avro.pm to txt
> include inc/Module/Install/Repository.pm
> Cannot determine repository URL
> include inc/Module/Install/MakeMaker.pm
> include inc/Module/Install/Makefile.pm
> Generating a Unix-style Makefile
> Writing Makefile for Avro
> Writing MYMETA.yml and MYMETA.json
> Writing META.yml
> sed -e s/++MODULE_VERSION++/1.12.0-SNAPSHOT/  >blib/lib/Avro/BinaryEncoder.pm
> sed -e s/++MODULE_VERSION++/1.12.0-SNAPSHOT/  >blib/lib/Avro/DataFileWriter.pm
> sed -e s/++MODULE_VERSION++/1.12.0-SNAPSHOT/  >blib/lib/Avro/BinaryDecoder.pm
> sed -e s/++MODULE_VERSION++/1.12.0-SNAPSHOT/  >blib/lib/Avro/DataFileReader.pm
> sed -e s/++MODULE_VERSION++/1.12.0-SNAPSHOT/  >blib/lib/Avro/DataFile.pm
> sed -e s/++MODULE_VERSION++/1.12.0-SNAPSHOT/  >blib/lib/Avro/Schema.pm
> sed -e s/++MODULE_VERSION++/1.12.0-SNAPSHOT/  >blib/lib/Avro/Protocol/Message.pm
> sed -e s/++MODULE_VERSION++/1.12.0-SNAPSHOT/  >blib/lib/Avro/Protocol.pm
> sed -e s/++MODULE_VERSION++/1.12.0-SNAPSHOT/ blib/lib/Avro.pm
> PERL_DL_NONLAZY=1 "/home/user/.perl/perls/perl-5.36.0/bin/perl" 
> "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef 
> *Test::Harness::Switches; test_harness(0, 'inc', 'blib/lib', 'blib/arch')" 
> t/*.t xt/*.t
> t/00_compile.t . ok
> t/01_names.t ... ok
> t/01_schema.t .. ok
> t/02_bin_encode.t .. ok
> t/03_bin_decode.t .. 1/? Use of uninitialized value in concatenation (.) or 
> string at /home/user/.perl/perls/perl-5.36.0/lib/site_perl/5.36.0/Error.pm 
> line 288.
> t/03_bin_decode.t .. ok
> t/04_datafile.t  ok
> t/05_protocol.t  ok
> xt/interop.t ... ok
> xt/pod.t ... ok
> xt/schema.t  1/? Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via next at xt/schema.t line 26.
> Exiting subroutine via 

[jira] [Commented] (AVRO-3983) Allow setting a custom encoder in DataFileWriter

2024-06-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17854066#comment-17854066
 ] 

ASF subversion and git services commented on AVRO-3983:
---

Commit 359a6c7bd43a2db44e5bc5720ecff9bc8755e034 in avro's branch 
refs/heads/branch-1.11 from Fokko Driesprong
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=359a6c7bd ]

Backport: Support BlockingDirectBinaryEncoder (#2899)

* AVRO-3983: Allow setting a custom encoder in DataFileWriter (#2874)

* AVRO-3871: Add blocking direct binary encoder (#2521)

* Java: Add blocking direct binary encoder

* Optimize

* Comments and more tests

* Comments and more tests

* Fix rat check

* AVRO-3871: Support nested lists/maps in BlockingDirectBinaryEncoder (#2732)

* Support nested lists/maps

* Add some tests

> Allow setting a custom encoder in DataFileWriter
> 
>
> Key: AVRO-3983
> URL: https://issues.apache.org/jira/browse/AVRO-3983
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.11.3
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Commented] (AVRO-3871) Add BlockingDirectBinaryEncoder

2024-06-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17854067#comment-17854067
 ] 

ASF subversion and git services commented on AVRO-3871:
---

Commit 359a6c7bd43a2db44e5bc5720ecff9bc8755e034 in avro's branch 
refs/heads/branch-1.11 from Fokko Driesprong
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=359a6c7bd ]

Backport: Support BlockingDirectBinaryEncoder (#2899)

* AVRO-3983: Allow setting a custom encoder in DataFileWriter (#2874)

* AVRO-3871: Add blocking direct binary encoder (#2521)

* Java: Add blocking direct binary encoder

* Optimize

* Comments and more tests

* Comments and more tests

* Fix rat check

* AVRO-3871: Support nested lists/maps in BlockingDirectBinaryEncoder (#2732)

* Support nested lists/maps

* Add some tests

> Add BlockingDirectBinaryEncoder
> ---
>
> Key: AVRO-3871
> URL: https://issues.apache.org/jira/browse/AVRO-3871
> Project: Apache Avro
>  Issue Type: Improvement
>Affects Versions: 1.11.2
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Commented] (AVRO-3993) Writing an AVRO enum field with an invalid value generates unhelpful NPE

2024-06-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853928#comment-17853928
 ] 

ASF subversion and git services commented on AVRO-3993:
---

Commit 25651dc2909a18e62d8219d2a730bf02040d576f in avro's branch 
refs/heads/main from Gray
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=25651dc29 ]

AVRO-3993: [java] Add better exception msgs when writing invalid enum symbol 
(#2945)

* Added better exception messages when writing invalid enum symbol.

* Remove unnecessary import.

* revert local pom changes that weren't supposed to go

* improved the comment

* fixed some characters per the validation

-

Co-authored-by: Gray Watson 

> Writing an AVRO enum field with an invalid value generates unhelpful NPE
> 
>
> Key: AVRO-3993
> URL: https://issues.apache.org/jira/browse/AVRO-3993
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Reporter: Gray
>Priority: Minor
>  Labels: pull-request-available
> Attachments: avro-stacktrace.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{When an enum field is written with a symbol that is not one of the valid 
> enum symbols configured in the schema, the code generates a NPE which turns 
> into a null value message which is not extremely helpful. It would be useful 
> if the message mentioned the invalid symbol and the available symbols from 
> the schema.}}
> Here's a sample of the generated stack trace.  More in the attachment:
> {{java.lang.NullPointerException: null value for (non-nullable) enum1 at 
> record1.field1}}
> {{     ...}}
> {{Caused by: java.lang.NullPointerException}}
> {{     at org.apache.avro.Schema$EnumSchema.getEnumOrdinal(Schema.java:1118)}}
> {{...}}
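
The improvement requested above can be sketched as follows. This is a hedged illustration only, not Avro's Java implementation: `InvalidEnumSymbolError`, `enum_ordinal`, and the message wording are hypothetical names chosen for the example.

```python
class InvalidEnumSymbolError(ValueError):
    """Raised when a value is not one of the enum schema's symbols (hypothetical type)."""


def enum_ordinal(symbols, symbol, enum_name="enum1"):
    """Return the ordinal of `symbol`, or fail with a message listing the valid symbols."""
    ordinals = {s: i for i, s in enumerate(symbols)}
    try:
        return ordinals[symbol]
    except KeyError:
        # Name both the offending symbol and the allowed ones instead of
        # letting a missing lookup surface as an opaque null error.
        raise InvalidEnumSymbolError(
            f"Invalid symbol {symbol!r} for enum {enum_name}; "
            f"expected one of: {', '.join(symbols)}"
        ) from None
```

The point of the design is that the lookup failure is caught at the place where both the symbol and the schema are known, so the message can mention both.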





[jira] [Commented] (AVRO-3731) Integrate software donation of gradle-avro-plugin

2024-06-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853574#comment-17853574
 ] 

ASF subversion and git services commented on AVRO-3731:
---

Commit b41d6072d37aacad21df988b98a6e1665c90e658 in avro's branch 
refs/heads/avro-3731-gradle-avro-plugin from RanbirK
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=b41d6072d ]

AVRO-3731: [gradle-avro-plugin] Integrates software donation of gradle avro 
plugin (#2946)

* Add initial changes to workflow files to start fixing builds

* Debug os compatibility test

* Debug os compatibility test

* Debug os compatibility test

* Refactoring

-

Co-authored-by: Ola Hungerford 
Co-authored-by: Ranbir Kumar 

> Integrate software donation of gradle-avro-plugin
> -
>
> Key: AVRO-3731
> URL: https://issues.apache.org/jira/browse/AVRO-3731
> Project: Apache Avro
>  Issue Type: New Feature
>Affects Versions: 1.12.0
>Reporter: Ryan Skraba
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The Apache project has 
> [voted|https://lists.apache.org/thread/hz8fomzcwt2yhyz6l4ntzp3h0vpqd8xg] to 
> accept the software donation of the 
> [gradle-avro-plugin|https://github.com/davidmc24/gradle-avro-plugin/discussions/208]
>  graciously donated by the maintainer (David M. Carr).  Thanks!
> This umbrella Jira is tracking the steps we need to take to integrate the 
> code into the project.
> We already have the Software Grant Agreement to cover the incoming code, 
> which is already licensed under the Apache 2.0 license with source code 
> headers including {{Copyright © 2013-2019 Commerce Technologies, LLC.}}
> Some useful resources for the process are provided by the incubator:
> * https://incubator.apache.org/guides/transitioning_asf.html
>  





[jira] [Commented] (AVRO-3987) Concurrency improvement for ReflectData FieldAccessors

2024-05-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851145#comment-17851145
 ] 

ASF subversion and git services commented on AVRO-3987:
---

Commit e932c9453be7b36e8874fe92edb3710beef4e47c in avro's branch 
refs/heads/main from Ashley Taylor
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=e932c9453 ]

AVRO-3987 replace synchronized with immutable replacement approach (#2900)



> Concurrency improvement for ReflectData FieldAccessors
> --
>
> Key: AVRO-3987
> URL: https://issues.apache.org/jira/browse/AVRO-3987
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Ashley Taylor
>Assignee: Ashley Taylor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Currently, within ReflectData subclass ClassAccessorData, there is a 
> WeakHashMap called bySchema.
> This contains the FieldAccessor[] needed to build the Object. 
> To prevent concurrency issues, the getAccessorsFor method is currently 
> synchronized. This method is called per record read/write. Using an 
> immutable-replacement approach on each change, and switching to a volatile 
> object, results in a significant performance improvement.
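
The pattern described above (replace an immutable snapshot instead of locking on every read) can be sketched roughly as below. This is an illustration under stated assumptions, not the ReflectData code: `AccessorCache` and `build_accessors` are hypothetical, a plain dict stands in for the WeakHashMap, and Python has no `volatile`, so a plain attribute assignment stands in for the volatile write.

```python
class AccessorCache:
    """Copy-on-write cache: readers never lock; writers swap in a new dict."""

    def __init__(self, build_accessors):
        self._build = build_accessors  # hypothetical factory: schema -> accessors
        self._by_schema = {}           # treated as immutable once published

    def get_accessors_for(self, schema):
        snapshot = self._by_schema     # read the current snapshot once
        accessors = snapshot.get(schema)
        if accessors is None:
            accessors = self._build(schema)
            updated = dict(snapshot)   # copy; never mutate the published dict
            updated[schema] = accessors
            self._by_schema = updated  # publish by replacing the reference
        return accessors
```

The trade-off in this sketch is that two racing writers may each build accessors for the same schema and one update may be lost, which costs a rebuild but never exposes a half-updated map to readers.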





[jira] [Commented] (AVRO-3965) Default values for fixed can be longer than size

2024-05-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851104#comment-17851104
 ] 

ASF subversion and git services commented on AVRO-3965:
---

Commit 0661bfe71836be253626e1c85d7c6c12a48fe667 in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.hadoop-hadoop-client-3.4.0 
from Martin Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=0661bfe71 ]

AVRO-3965: [Rust] Default values for fixed can be longer than size (#2907)

Return an error if the default value's length is not the same as the
specified size of a Fixed schema

Signed-off-by: Martin Tzvetanov Grigorov 

> Default values for fixed can be longer than size
> 
>
> Key: AVRO-3965
> URL: https://issues.apache.org/jira/browse/AVRO-3965
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java, rust
>Reporter: Roman Mitasov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A value longer than the specified size can be put into the "default" property 
> of a "fixed" field.
> Behaviour in this case is not documented.
> This should either be forbidden or documented. Even adding a UB warning in 
> the documentation would be better than the current situation.
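
The check named in the commit message (reject a default whose length differs from the Fixed schema's size) can be sketched as below. The function name and error text are hypothetical, not the Rust SDK's API. The sketch relies on the Avro specification's rule that JSON defaults for bytes and fixed are strings whose code points 0-255 map to byte values, which is what the Latin-1 encoding does here.

```python
def validate_fixed_default(size: int, default: str) -> bytes:
    """Decode a JSON default for a fixed schema and check its length against `size`."""
    # Code points 0-255 map one-to-one to byte values; anything higher
    # raises UnicodeEncodeError, which is also a malformed default.
    raw = default.encode("latin-1")
    if len(raw) != size:
        raise ValueError(
            f"Fixed schema of size {size} has a default of length {len(raw)}"
        )
    return raw
```

Rejecting the schema at parse time keeps a mismatched default from surfacing only later, when the default is actually used.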





[jira] [Commented] (AVRO-2924) SpecificCompiler add 'LocalDateTime' logical type

2024-05-20 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847807#comment-17847807
 ] 

ASF subversion and git services commented on AVRO-2924:
---

Commit cd9a7a0682469605e0b78eea71380753be90e382 in avro's branch 
refs/heads/avro-3731-gradle-avro-plugin from Samael
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=cd9a7a068 ]

AVRO-3732: [gradle-plugin] import gradle plugin from gradle-avro-plugin (#2310)

AVRO-3732: 

* Try again to avoid gradle welcome banner in CI

* ci: try turning on github actions compatibility tests

* fix ci syntax

* ci: fix yaml indentation

* ci: comment out fail fast

* ci: fix matrix structure

* CI: try matrix with allowed failures

* CI: give up on conditional failures allowed for now; exclude java 13

* CI: info output for builds

* Try to resolve the test failures on windows regarding default encoding 
handling

* Fix encoding support on windows, for real this time

* CI: add unsupported-java-versions job

* CI: run the unsupported java versions job on all the OS versions

After all, they'll all fail almost immediately anyway

* README: Update badge to use github actions rather than travis

* add support for generating optional getters

* README: fix CI badge syntax

* README: fix CI badge image

* add doc for optional getter field generation

* Update changelog to note the recent merged pull request

* CI: disable the gradle daemon to try to eliminate the sporadic clean failure 
on windows

* Remove security policy; not a CommerceHub OSS project any more

* Update various files for commercehub-oss -> davidmc24 github move

* Working version of custom conversions against modern gradle; still need to 
adjust for earlier versions

* Don't use MapProperty yet; it wasn't introduced until Gradle 5.1

* Don't use ListProperty

It changed incompatibly between Gradle 4.4 and Gradle 4.5

* Don't use Class.newInstance(), as it was deprecated in Java 11

* Update issue templates

* Update bug_report.md to add a checklist

* Update feature_request.md to include a checklist

* CI: don't bother doing maintenance builds on old OS versions; only use latest

* CI: update another place with os versions that I missed

* version: 0.18.0

* version: 0.18.1-SNAPSHOT

* update to Avro 1.9.2 since https://issues.apache.org/jira/browse/AVRO-2548 
has been fixed there

* Add support for Gradle 6.0-6.2, drop support for gradle <5.1 (#101)

* Update changelog for #104

* Add support for Java 13

* Add support for testing multiple kotlin versions

* Update plugin's own build to address some deprecation warnings of APIs being 
removed in Gradle 7

* BuildCacheSupportFunctionalSpec no longer needs an @IgnoreIf, as we only 
support versions where the Build Cache is supported.

* Remove license plugin

It was resulting in deprecation warnings about Gradle 7, a new version wasn't 
available, and I don't think it was providing real value.

* Lots of test updates

* Remove taskInfoAbsent, as it isn't needed any more with the versions of 
Gradle we support
* Remove isMultipleClassDirectoriesUsed, as all versions of Gradle we support 
now use it
* Leverage GradleRunner's withPluginClasspath feature when able to use plugin 
DSL to apply the test plugin
* Add addDefaultRepository utility method
* Add applyPlugin override that takes a version
* Rename addDependency to addImplementationDependency; add addRuntimeDependency 
and addDependency that takes a configuration argument
* Use stripMargin more consistently

* Add tests for Kotlin DSL usage (#61)

* Handle a test that appears to fail on Windows due to weird file locking 
behaviors

* Update to note a Kotlin-Java version incompatibility

* Update to gradle 6.2.2

* Official Gradle Wrapper Validation Action

See: https://github.com/gradle/wrapper-validation-action
Added as a dedicated Workflow

* Support Task Configuration Avoidance (#97)

https://docs.gradle.org/current/userguide/task_configuration_avoidance.html

Thanks to [dcabasson](https://github.com/dcabasson) for the collaboration

* Update test result directory names

* Work around a bug showing in Gradle 5.1

It appears that in Gradle 5.1, TaskContainer's `withType` overwrites the 
results of `matching`, causing java compilation tasks to be returned.
This results in a circular task dependency.
Changing the order to filter by type first fixes it.

* See if we can get Java 14 support working with a Gradle 6.3 nightly build

* Update codenarc support so it works in Java 14+; update compatibility notes

* version: 0.19.0

* version: 0.19.1-SNAPSHOT

* Create FUNDING.yml

* Update bug_report.md

* Fix schema dependency resolution when types are referenced with a `{ "type": 
NAME }` block rather than just `NAME` (#107)

* Eliminate `NullPointerException` handling in schema dependency resolution, as 
it no longer appears to be needed.

* version: 0.19.1

* version: 0.19.2-SNAPSHOT

* Add 

[jira] [Commented] (AVRO-3732) Code drop existing sources, updating headers, RAT, NOTICES, etc.

2024-05-20 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847805#comment-17847805
 ] 

ASF subversion and git services commented on AVRO-3732:
---

Commit cd9a7a0682469605e0b78eea71380753be90e382 in avro's branch 
refs/heads/avro-3731-gradle-avro-plugin from Samael
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=cd9a7a068 ]

AVRO-3732: [gradle-plugin] import gradle plugin from gradle-avro-plugin (#2310)

AVRO-3732: 

* Try again to avoid gradle welcome banner in CI

* ci: try turning on github actions compatibility tests

* fix ci syntax

* ci: fix yaml indentation

* ci: comment out fail fast

* ci: fix matrix structure

* CI: try matrix with allowed failures

* CI: give up on conditional failures allowed for now; exclude java 13

* CI: info output for builds

* Try to resolve the test failures on windows regarding default encoding 
handling

* Fix encoding support on windows, for real this time

* CI: add unsupported-java-versions job

* CI: run the unsupported java versions job on all the OS versions

After all, they'll all fail almost immediately anyway

* README: Update badge to use github actions rather than travis

* add support for generating optional getters

* README: fix CI badge syntax

* README: fix CI badge image

* add doc for optional getter field generation

* Update changelog to note the recent merged pull request

* CI: disable the gradle daemon to try to eliminate the sporadic clean failure 
on windows

* Remove security policy; not a CommerceHub OSS project any more

* Update various files for commercehub-oss -> davidmc24 github move

* Working version of custom conversions against modern gradle; still need to 
adjust for earlier versions

* Don't use MapProperty yet; it wasn't introduced until Gradle 5.1

* Don't use ListProperty

It changed incompatibly between Gradle 4.4 and Gradle 4.5

* Don't use Class.newInstance(), as it was deprecated in Java 11

* Update issue templates

* Update bug_report.md to add a checklist

* Update feature_request.md to include a checklist

* CI: don't bother doing maintenance builds on old OS versions; only use latest

* CI: update another place with os versions that I missed

* version: 0.18.0

* version: 0.18.1-SNAPSHOT

* update to Avro 1.9.2 since https://issues.apache.org/jira/browse/AVRO-2548 
has been fixed there

* Add support for Gradle 6.0-6.2, drop support for gradle <5.1 (#101)

* Update changelog for #104

* Add support for Java 13

* Add support for testing multiple kotlin versions

* Update plugin's own build to address some deprecation warnings of APIs being 
removed in Gradle 7

* BuildCacheSupportFunctionalSpec no longer needs an @IgnoreIf, as we only 
support versions where the Build Cache is supported.

* Remove license plugin

It was resulting in deprecation warnings about Gradle 7, a new version wasn't 
available, and I don't think it was providing real value.

* Lots of test updates

* Remove taskInfoAbsent, as it isn't needed any more with the versions of 
Gradle we support
* Remove isMultipleClassDirectoriesUsed, as all versions of Gradle we support 
now use it
* Leverage GradleRunner's withPluginClasspath feature when able to use plugin 
DSL to apply the test plugin
* Add addDefaultRepository utility method
* Add applyPlugin override that takes a version
* Rename addDependency to addImplementationDependency; add addRuntimeDependency 
and addDependency that takes a configuration argument
* Use stripMargin more consistently

* Add tests for Kotlin DSL usage (#61)

* Handle a test that appears to fail on Windows due to weird file locking 
behaviors

* Update to note a Kotlin-Java version incompatibility

* Update to gradle 6.2.2

* Official Gradle Wrapper Validation Action

See: https://github.com/gradle/wrapper-validation-action
Added as a dedicated Workflow

* Support Task Configuration Avoidance (#97)

https://docs.gradle.org/current/userguide/task_configuration_avoidance.html

Thanks to [dcabasson](https://github.com/dcabasson) for the collaboration

* Update test result directory names

* Work around a bug showing in Gradle 5.1

It appears that in Gradle 5.1, TaskContainer's `withType` overwrites the 
results of `matching`, causing java compilation tasks to be returned.
This results in a circular task dependency.
Changing the order to filter by type first fixes it.

* See if we can get Java 14 support working with a Gradle 6.3 nightly build

* Update codenarc support so it works in Java 14+; update compatibility notes

* version: 0.19.0

* version: 0.19.1-SNAPSHOT

* Create FUNDING.yml

* Update bug_report.md

* Fix schema dependency resolution when types are referenced with a `{ "type": 
NAME }` block rather than just `NAME` (#107)

* Eliminate `NullPointerException` handling in schema dependency resolution, as 
it no longer appears to be needed.

* version: 0.19.1

* version: 0.19.2-SNAPSHOT

* Add 

[jira] [Commented] (AVRO-2548) StringType of "String" causes logicalType converters to be ignored for field

2024-05-20 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847806#comment-17847806
 ] 

ASF subversion and git services commented on AVRO-2548:
---

Commit cd9a7a0682469605e0b78eea71380753be90e382 in avro's branch 
refs/heads/avro-3731-gradle-avro-plugin from Samael
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=cd9a7a068 ]

AVRO-3732: [gradle-plugin] import gradle plugin from gradle-avro-plugin (#2310)

AVRO-3732: 

* Try again to avoid gradle welcome banner in CI

* ci: try turning on github actions compatibility tests

* fix ci syntax

* ci: fix yaml indentation

* ci: comment out fail fast

* ci: fix matrix structure

* CI: try matrix with allowed failures

* CI: give up on conditional failures allowed for now; exclude java 13

* CI: info output for builds

* Try to resolve the test failures on windows regarding default encoding 
handling

* Fix encoding support on windows, for real this time

* CI: add unsupported-java-versions job

* CI: run the unsupported java versions job on all the OS versions

After all, they'll all fail almost immediately anyway

* README: Update badge to use github actions rather than travis

* add support for generating optional getters

* README: fix CI badge syntax

* README: fix CI badge image

* add doc for optional getter field generation

* Update changelog to note the recent merged pull request

* CI: disable the gradle daemon to try to eliminate the sporadic clean failure 
on windows

* Remove security policy; not a CommerceHub OSS project any more

* Update various files for commercehub-oss -> davidmc24 github move

* Working version of custom conversions against modern gradle; still need to 
adjust for earlier versions

* Don't use MapProperty yet; it wasn't introduced until Gradle 5.1

* Don't use ListProperty

It changed incompatibly between Gradle 4.4 and Gradle 4.5

* Don't use Class.newInstance(), as it was deprecated in Java 11

* Update issue templates

* Update bug_report.md to add a checklist

* Update feature_request.md to include a checklist

* CI: don't bother doing maintenance builds on old OS versions; only use latest

* CI: update another place with os versions that I missed

* version: 0.18.0

* version: 0.18.1-SNAPSHOT

* update to Avro 1.9.2 since https://issues.apache.org/jira/browse/AVRO-2548 
has been fixed there

* Add support for Gradle 6.0-6.2, drop support for gradle <5.1 (#101)

* Update changelog for #104

* Add support for Java 13

* Add support for testing multiple kotlin versions

* Update plugin's own build to address some deprecation warnings of APIs being 
removed in Gradle 7

* BuildCacheSupportFunctionalSpec no longer needs an @IgnoreIf, as we only 
support versions where the Build Cache is supported.

* Remove license plugin

It was resulting in deprecation warnings about Gradle 7, a new version wasn't 
available, and I don't think it was providing real value.

* Lots of test updates

* Remove taskInfoAbsent, as it isn't needed any more with the versions of 
Gradle we support
* Remove isMultipleClassDirectoriesUsed, as all versions of Gradle we support 
now use it
* Leverage GradleRunner's withPluginClasspath feature when able to use plugin 
DSL to apply the test plugin
* Add addDefaultRepository utility method
* Add applyPlugin override that takes a version
* Rename addDependency to addImplementationDependency; add addRuntimeDependency 
and addDependency that takes a configuration argument
* Use stripMargin more consistently

* Add tests for Kotlin DSL usage (#61)

* Handle a test that appears to fail on Windows due to weird file locking 
behaviors

* Update to note a Kotlin-Java version incompatibility

* Update to gradle 6.2.2

* Official Gradle Wrapper Validation Action

See: https://github.com/gradle/wrapper-validation-action
Added as a dedicated Workflow

* Support Task Configuration Avoidance (#97)

https://docs.gradle.org/current/userguide/task_configuration_avoidance.html

Thanks to [dcabasson](https://github.com/dcabasson) for the collaboration

* Update test result directory names

* Work around a bug showing in Gradle 5.1

It appears that in Gradle 5.1, TaskContainer's `withType` overwrites the 
results of `matching`, causing java compilation tasks to be returned.
This results in a circular task dependency.
Changing the order to filter by type first fixes it.

* See if we can get Java 14 support working with a Gradle 6.3 nightly build

* Update codenarc support so it works in Java 14+; update compatibility notes

* version: 0.19.0

* version: 0.19.1-SNAPSHOT

* Create FUNDING.yml

* Update bug_report.md

* Fix schema dependency resolution when types are referenced with a `{ "type": 
NAME }` block rather than just `NAME` (#107)

* Eliminate `NullPointerException` handling in schema dependency resolution, as 
it no longer appears to be needed.

* version: 0.19.1

* version: 0.19.2-SNAPSHOT

* Add 

[jira] [Commented] (AVRO-3965) Default values for fixed can be longer than size

2024-05-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846528#comment-17846528
 ] 

ASF subversion and git services commented on AVRO-3965:
---

Commit 82fa91cdeb8178568d19924d7dc25e9ab42ae601 in avro's branch 
refs/heads/branch-1.11 from Martin Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=82fa91cde ]

AVRO-3965: [Rust] Default values for fixed can be longer than size (#2907)

Return an error if the default value's length is not the same as the
specified size of a Fixed schema

Signed-off-by: Martin Tzvetanov Grigorov 
(cherry picked from commit 0661bfe71836be253626e1c85d7c6c12a48fe667)


> Default values for fixed can be longer than size
> 
>
> Key: AVRO-3965
> URL: https://issues.apache.org/jira/browse/AVRO-3965
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Reporter: Roman Mitasov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A value longer than the specified size can be put into "default" property of 
> a "fixed" field.
> Behaviour in this case is not documented.
> This should either be forbidden or documented. Even adding a UB warning in 
> the documentation would be better than the current situation.
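
The check described in the commit message can be sketched as follows. This is a minimal Java illustration, not the Rust change itself: the class `FixedDefaultCheck` and the method `validateFixedDefault` are hypothetical names, and the sketch assumes the default arrives as a JSON string in which each character stands for one byte, as the Avro specification describes for bytes and fixed defaults.

```java
import java.nio.charset.StandardCharsets;

public class FixedDefaultCheck {

    /**
     * Hypothetical helper: returns the default's bytes, or throws if their
     * count differs from the declared size of the fixed schema.
     */
    public static byte[] validateFixedDefault(String schemaName, int size, String defaultValue) {
        // Code points 0-255 map to single bytes, so ISO-8859-1 gives one byte per character.
        byte[] bytes = defaultValue.getBytes(StandardCharsets.ISO_8859_1);
        if (bytes.length != size) {
            throw new IllegalArgumentException("Fixed schema '" + schemaName + "' has size " + size
                    + " but its default value has length " + bytes.length);
        }
        return bytes;
    }

    public static void main(String[] args) {
        // Matching length: accepted.
        System.out.println(validateFixedDefault("md5", 4, "abcd").length);
        // Longer than the declared size: rejected instead of silently accepted.
        try {
            validateFixedDefault("md5", 4, "abcdef");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Rejecting the schema at parse time, rather than truncating or padding the default, keeps the behaviour explicit, which is what the issue asks for.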



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (AVRO-3965) Default values for fixed can be longer than size

2024-05-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846526#comment-17846526
 ] 

ASF subversion and git services commented on AVRO-3965:
---

Commit 0661bfe71836be253626e1c85d7c6c12a48fe667 in avro's branch 
refs/heads/main from Martin Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=0661bfe71 ]

AVRO-3965: [Rust] Default values for fixed can be longer than size (#2907)

Return an error if the default value's length is not the same as the
specified size of a Fixed schema

Signed-off-by: Martin Tzvetanov Grigorov 

> Default values for fixed can be longer than size
> 
>
> Key: AVRO-3965
> URL: https://issues.apache.org/jira/browse/AVRO-3965
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Reporter: Roman Mitasov
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A value longer than the specified size can be put into "default" property of 
> a "fixed" field.
> Behaviour in this case is not documented.
> This should either be forbidden or documented. Even adding a UB warning in 
> the documentation would be better than the current situation.





[jira] [Commented] (AVRO-3965) Default values for fixed can be longer than size

2024-05-14 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846279#comment-17846279
 ] 

ASF subversion and git services commented on AVRO-3965:
---

Commit 6c05e0d9e1c7715773023c57775b747f0c718951 in avro's branch 
refs/heads/avro-3965-default-len-bigger-than-size from Martin Tzvetanov Grigorov
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=6c05e0d9e ]

AVRO-3965: [Rust] Default values for fixed can be longer than size

Return an error if the default value's length is not the same as the
specified size of a Fixed schema

Signed-off-by: Martin Tzvetanov Grigorov 


> Default values for fixed can be longer than size
> 
>
> Key: AVRO-3965
> URL: https://issues.apache.org/jira/browse/AVRO-3965
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Reporter: Roman Mitasov
>Priority: Major
>
> A value longer than the specified size can be put into "default" property of 
> a "fixed" field.
> Behaviour in this case is not documented.
> This should either be forbidden or documented. Even adding a UB warning in 
> the documentation would be better than the current situation.





[jira] [Commented] (AVRO-3677) Introduce Named Schema Formatters

2024-05-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843691#comment-17843691
 ] 

ASF subversion and git services commented on AVRO-3677:
---

Commit 362aef8a07bc17969601a4ff2cbf60ef7488d13c in avro's branch 
refs/heads/main from Oscar Westra van Holthe - Kind
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=362aef8a0 ]

AVRO-3677: Add SchemaFormatter (#2885)

* AVRO-3677: Introduce Named Schema Formatters

Adds a SchemaFormatter interface and factory method to format schemas to
different formats by name. The initial implementation supports JSON
(both inline and pretty printed), the parsing canonical form, and the IDL
format.

> Introduce Named Schema Formatters
> -
>
> Key: AVRO-3677
> URL: https://issues.apache.org/jira/browse/AVRO-3677
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: java
>Affects Versions: 1.11.1
>Reporter: Oscar Westra van Holthe - Kind
>Assignee: Oscar Westra van Holthe - Kind
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Similar to AVRO-3666, which introduces multiple schema parsers, I propose to 
> introduce multiple, named, schema formatters.
> Names can be of the form {{[/]}}, where the variant part 
> is optional.
> Initially, the list would be:
>  * json -> alias for json/pretty
>  * json/pretty -> pretty-printed JSON; replaces 
> {{Schema.toString(true)}}
>  * json/inline -> single-line JSON; replaces {{Schema.toString(false)}}
>  * canonical -> Parsing Canonical Form (as per spec)
> Then, after merging AVRO-3404, we can also add:
> * idl -> to write schemata in IDL format, as requested in AVRO-1757
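
The naming scheme in the list above can be sketched as a small lookup. This is an illustration only and not the `SchemaFormatter` API added by the commit: the class `FormatterNames` and the method `resolve` are hypothetical, and the sketch assumes a name is a format followed by an optional `/variant` suffix, as the listed names suggest.

```java
public class FormatterNames {

    /**
     * Hypothetical resolver: maps a requested name to the full name from the
     * proposed list, applying the "json -> json/pretty" alias.
     */
    public static String resolve(String name) {
        // Split at the first '/', if any; the variant is optional.
        int slash = name.indexOf('/');
        String format = slash < 0 ? name : name.substring(0, slash);
        String variant = slash < 0 ? null : name.substring(slash + 1);

        switch (format) {
            case "json":
                if (variant == null || variant.equals("pretty")) {
                    return "json/pretty"; // bare "json" is an alias
                }
                if (variant.equals("inline")) {
                    return "json/inline";
                }
                break;
            case "canonical":
            case "idl":
                if (variant == null) {
                    return format; // no variants are listed for these formats
                }
                break;
            default:
                break;
        }
        throw new IllegalArgumentException("Unknown schema format: " + name);
    }

    public static void main(String[] args) {
        System.out.println(resolve("json"));
        System.out.println(resolve("json/inline"));
        System.out.println(resolve("canonical"));
    }
}
```

Resolving aliases to a full name first means every later lookup only has to deal with one spelling per formatter.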





[jira] [Commented] (AVRO-3871) Add BlockingDirectBinaryEncoder

2024-05-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843587#comment-17843587
 ] 

ASF subversion and git services commented on AVRO-3871:
---

Commit 9f9023cd03d65b6c4793d592037776a88206423c in avro's branch 
refs/heads/main from Fokko Driesprong
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=9f9023cd0 ]

AVRO-3871: Support nested lists/maps in BlockingDirectBinaryEncoder (#2732)

* Support nested lists/maps

* Add some tests

> Add BlockingDirectBinaryEncoder
> ---
>
> Key: AVRO-3871
> URL: https://issues.apache.org/jira/browse/AVRO-3871
> Project: Apache Avro
>  Issue Type: Improvement
>Affects Versions: 1.11.2
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (AVRO-3983) Allow setting a custom encoder in DataFileWriter

2024-05-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843586#comment-17843586
 ] 

ASF subversion and git services commented on AVRO-3983:
---

Commit e962bc47d49758a583665846b122b1c3c73a9e2c in avro's branch 
refs/heads/main from Fokko Driesprong
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=e962bc47d ]

AVRO-3983: Allow setting a custom encoder in DataFileWriter (#2874)



> Allow setting a custom encoder in DataFileWriter
> 
>
> Key: AVRO-3983
> URL: https://issues.apache.org/jira/browse/AVRO-3983
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.11.3
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Commented] (AVRO-3982) Use String.isEmpty() instead

2024-04-30 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842377#comment-17842377
 ] 

ASF subversion and git services commented on AVRO-3982:
---

Commit abf9b88052b29c541e354b56fe8c2b54c8b4dff8 in avro's branch 
refs/heads/main from Fokko Driesprong
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=abf9b8805 ]

AVRO-3982: Use `String.isEmpty()` instead (#2873)



> Use String.isEmpty() instead
> 
>
> Key: AVRO-3982
> URL: https://issues.apache.org/jira/browse/AVRO-3982
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.11.3
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Commented] (AVRO-3978) Build with Java 11 minimum

2024-04-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842080#comment-17842080
 ] 

ASF subversion and git services commented on AVRO-3978:
---

Commit 589b8936563c7e18e5b25906143c7b5d52c9e0b8 in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache.maven.plugins-maven-gpg-plugin-3.2.4
 from JB Onofré
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=589b89365 ]

AVRO-3978: Upgrade main to build with Java 11 minimum (#2855)



> Build with Java 11 minimum
> --
>
> Key: AVRO-3978
> URL: https://issues.apache.org/jira/browse/AVRO-3978
> Project: Apache Avro
>  Issue Type: Task
>  Components: java
>Affects Versions: 1.12.0
>Reporter: Jean-Baptiste Onofré
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> As discussed on the [dev mailing 
> list|https://lists.apache.org/thread/2v4l9tdzch2qgo20mtkvw6gftd0lpf79], Avro 
> main (i.e. 1.12.x now) should be updated to build with Java 11 minimum.
> NB: Avro 1.11.x is still Java 8 compliant. 





[jira] [Commented] (AVRO-3978) Build with Java 11 minimum

2024-04-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842082#comment-17842082
 ] 

ASF subversion and git services commented on AVRO-3978:
---

Commit 589b8936563c7e18e5b25906143c7b5d52c9e0b8 in avro's branch 
refs/heads/dependabot/maven/lang/java/org.apache-apache-32 from JB Onofré
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=589b89365 ]

AVRO-3978: Upgrade main to build with Java 11 minimum (#2855)



> Build with Java 11 minimum
> --
>
> Key: AVRO-3978
> URL: https://issues.apache.org/jira/browse/AVRO-3978
> Project: Apache Avro
>  Issue Type: Task
>  Components: java
>Affects Versions: 1.12.0
>Reporter: Jean-Baptiste Onofré
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> As discussed on the [dev mailing 
> list|https://lists.apache.org/thread/2v4l9tdzch2qgo20mtkvw6gftd0lpf79], Avro 
> main (i.e. 1.12.x now) should be updated to build with Java 11 minimum.
> NB: Avro 1.11.x is still Java 8 compliant. 




