This is an automated email from the ASF dual-hosted git repository.

kriskras99 pushed a commit to branch feat/full_enum_support
in repository https://gitbox.apache.org/repos/asf/avro-rs.git
commit de8ccedcf90824f1609b668004a85f51d9a906ae
Author: Kriskras99 <[email protected]>
AuthorDate: Fri Mar 13 11:25:58 2026 +0100

    doc: Document the mapping between the Serde and Avro data models
---
 avro/src/documentation/mod.rs                      |   1 +
 avro/src/documentation/serde_data_model_to_avro.rs | 116 +++++++++++++++++++++
 2 files changed, 117 insertions(+)

diff --git a/avro/src/documentation/mod.rs b/avro/src/documentation/mod.rs
index 7771012..dcd0848 100644
--- a/avro/src/documentation/mod.rs
+++ b/avro/src/documentation/mod.rs
@@ -22,3 +22,4 @@
 pub mod dynamic;
 pub mod primer;
+pub mod serde_data_model_to_avro;
diff --git a/avro/src/documentation/serde_data_model_to_avro.rs b/avro/src/documentation/serde_data_model_to_avro.rs
new file mode 100644
index 0000000..b623577
--- /dev/null
+++ b/avro/src/documentation/serde_data_model_to_avro.rs
@@ -0,0 +1,116 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! # Mapping the Serde data model to the Avro data model
+//!
+//! When manually mapping Rust types to an Avro schema, or the reverse, it is important to understand
+//! how the different data models are mapped. This should only be done in very specific circumstances;
+//! most users should use the [`AvroSchema`] derive.
+//!
+//! Only the schemas generated by the [`AvroSchema`] derive and the mapping as defined here are
+//! supported. Any other behavior might change in a minor version.
+//!
+//! The following list is based on [the data model defined by Serde](https://serde.rs/data-model.html):
+//! - **14 primitive types**
+//!     - `bool` => [`Schema::Boolean`]
+//!     - `i8`, `i16`, `i32`, `u8`, `u16` => [`Schema::Int`]
+//!     - `i64`, `u32` => [`Schema::Long`]
+//!     - `u64` => [`Schema::Fixed`]`(name: "u64", size: 8)`
+//!         - This is not a `Schema::Long` as that is a signed number of maximum 64 bits.
+//!     - `i128` => [`Schema::Fixed`]`(name: "i128", size: 16)`
+//!     - `u128` => [`Schema::Fixed`]`(name: "u128", size: 16)`
+//!     - `f32` => [`Schema::Float`]
+//!     - `f64` => [`Schema::Double`]
+//!     - `char` => [`Schema::String`]
+//!         - Only one character is allowed; the deserializer returns an error for strings with more than one character.
+//! - **string** => [`Schema::String`]
+//! - **byte array** => [`Schema::Bytes`] or [`Schema::Fixed`]
+//! - **option** => [`Schema::Union([Schema::Null, _])`](crate::schema::Schema::Union)
+//! - **unit** => [`Schema::Null`]
+//! - **unit struct** => [`Schema::Record`] with the unqualified name equal to the name of the struct and zero fields
+//! - **unit variant** => See [Enums](#enums)
+//! - **newtype struct** => [`Schema::Record`] with the unqualified name equal to the name of the struct and one field
+//! - **newtype variant** => See [Enums](#enums)
+//! - **seq** => [`Schema::Array`]
+//! - **tuple**
+//!     - For tuples with one element, the schema will be the schema of the only element
+//!     - For tuples with more than one element, the schema will be a [`Schema::Record`] with as many fields as there are elements.
+//!       The schema must have the attribute `org.apache.avro.rust.tuple` with the value set to `true`.
+//!     - **Note:** Serde (de)serializes arrays (`[T; N]`) as tuples. To (de)serialize an array as a
+//!       [`Schema::Array`] use [`apache_avro::serde::array`].
+//! - **tuple_struct** => [`Schema::Record`] with the unqualified name equal to the name of the struct and as many fields as there are elements
+//!     - **Note:** Tuple structs with 0 or 1 elements will also be (de)serialized as a [`Schema::Record`]. This
+//!       is different from normal tuples.
+//! - **tuple_variant** => See [Enums](#enums)
+//! - **map** => [`Schema::Map`]
+//!     - **Note:** the key type of the map will be (de)serialized as a [`Schema::String`]
+//! - **struct** => [`Schema::Record`]
+//! - **struct_variant** => See [Enums](#enums)
+//!
+//! ## Enums
+//!
+//! ### Externally tagged
+//! This is the default enum representation for Serde. It can be mapped in three ways to the Avro data model.
+//! For all three options it is important that the enum index matches the Avro index.
+//! - As a [`Schema::Enum`], this is only possible for enums with only unit variants.
+//! - As a [`Schema::Union`] with a [`Schema::Record`] for every variant:
+//!     - **unit_variant** => [`Schema::Record`] named as the variant and with no fields.
+//!     - **newtype_variant** => [`Schema::Record`] named as the variant and with one field.
+//!       The schema must have the attribute `org.apache.avro.rust.union_of_records` with the value set to `true`.
+//!     - **tuple_variant** => [`Schema::Record`] named as the variant and with as many fields as there are elements.
+//!     - **struct_variant** => [`Schema::Record`] named as the variant and with a field for every field of the struct variant.
+//! - As a [`Schema::Union`] without the wrapper [`Schema::Record`], all schemas must be unique:
+//!     - **unit_variant** => [`Schema::Null`].
+//!     - **newtype_variant** => The schema of the inner type.
+//!     - **tuple_variant** => [`Schema::Record`] named as the variant and with as many fields as there are elements.
+//!     - **struct_variant** => [`Schema::Record`] named as the variant and with a field for every field of the struct variant.
+//!
+//! ### Internally tagged
+//! This enum representation is used by Serde if the attribute `#[serde(tag = "...")]` is used.
+//! It maps to a [`Schema::Record`]. There must be at least one field that is named as the value of the
+//! `tag` attribute. If a field is not used by all variants it must have a `default` set.
+//!
+//! ### Adjacently tagged
+//! This enum representation is used by Serde if the attributes `#[serde(tag = "...", content = "...")]` are used.
+//! It maps to a [`Schema::Record`] with two fields. One field must be named as the value of the `tag`
+//! attribute and use the [`Schema::Enum`] schema. The other field must be named as the value of the
+//! `content` attribute and use the [`Schema::Union`] schema.
+//!
+//! ### Untagged
+//! This enum representation is used by Serde if the attribute `#[serde(untagged)]` is used. It maps
+//! to a [`Schema::Union`] with the following schemas:
+//! - **unit_variant** => [`Schema::Null`].
+//! - **newtype_variant** => The schema of the inner type.
+//! - **tuple_variant** => [`Schema::Record`] named as the variant and with as many fields as there are elements.
+//! - **struct_variant** => [`Schema::Record`] named as the variant and with a field for every field of the struct variant.
+//!
+//! [`AvroSchema`]: crate::AvroSchema
+//! [`Schema::Array`]: crate::schema::Schema::Array
+//! [`Schema::Boolean`]: crate::schema::Schema::Boolean
+//! [`Schema::Bytes`]: crate::schema::Schema::Bytes
+//! [`Schema::Double`]: crate::schema::Schema::Double
+//! [`Schema::Enum`]: crate::schema::Schema::Enum
+//! [`Schema::Fixed`]: crate::schema::Schema::Fixed
+//! [`Schema::Float`]: crate::schema::Schema::Float
+//! [`Schema::Int`]: crate::schema::Schema::Int
+//! [`Schema::Long`]: crate::schema::Schema::Long
+//! [`Schema::Map`]: crate::schema::Schema::Map
+//! [`Schema::Null`]: crate::schema::Schema::Null
+//! [`Schema::Record`]: crate::schema::Schema::Record
+//! [`Schema::String`]: crate::schema::Schema::String
+//! [`Schema::Union`]: crate::schema::Schema::Union
+//! [`apache_avro::serde::array`]: crate::serde::array
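
Illustration (not part of the commit): the primitive mapping documented in the patch can be sketched in plain Rust without depending on the crate. `AvroKind`, `avro_kind_for_primitive`, and `char_from_avro_string` are hypothetical names invented for this sketch; they are not the `apache_avro` API. The patch only states that strings with more than one character are rejected for `char`; rejecting the empty string as well is an assumption made here.

```rust
/// Hypothetical stand-in for the schema variants named in the patch,
/// reduced to what the primitive mapping needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AvroKind {
    Boolean,
    Int,
    Long,
    Float,
    Double,
    String,
    /// A named fixed-size schema: (name, size in bytes).
    Fixed(&'static str, usize),
}

/// Looks up the documented schema kind for a Serde primitive, keyed by
/// its Rust type name. Returns `None` for anything that is not one of
/// the 14 primitive types.
pub fn avro_kind_for_primitive(rust_type: &str) -> Option<AvroKind> {
    Some(match rust_type {
        "bool" => AvroKind::Boolean,
        "i8" | "i16" | "i32" | "u8" | "u16" => AvroKind::Int,
        "i64" | "u32" => AvroKind::Long,
        // u64 does not fit a signed 64-bit long, hence a fixed of 8 bytes.
        "u64" => AvroKind::Fixed("u64", 8),
        "i128" => AvroKind::Fixed("i128", 16),
        "u128" => AvroKind::Fixed("u128", 16),
        "f32" => AvroKind::Float,
        "f64" => AvroKind::Double,
        "char" => AvroKind::String,
        _ => return None,
    })
}

/// Sketch of the `char` rule: a string is accepted only when it holds
/// exactly one character. Rejecting the empty string is an assumption.
pub fn char_from_avro_string(s: &str) -> Result<char, String> {
    let mut chars = s.chars();
    match (chars.next(), chars.next()) {
        (Some(c), None) => Ok(c),
        _ => Err(format!("expected exactly one character, got {:?}", s)),
    }
}
```

Calling `avro_kind_for_primitive("u32")` yields `Some(AvroKind::Long)`, since every `u32` value fits in a signed 64-bit long, while `u64` needs the 8-byte fixed schema because `u64::MAX` exceeds `i64::MAX`.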

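
Illustration (not part of the commit): the constraints on the externally tagged mappings described in the patch can be expressed as two small checks. `VariantKind`, `fits_avro_enum`, and `may_fit_unwrapped_union` are hypothetical names for this sketch. The second check is an inference from the documented rules (every unit variant maps to `Schema::Null` and all union schemas must be unique); it is necessary but not sufficient, because newtype variants with identical inner schemas are not detected.

```rust
/// Shape of an enum variant, using the Serde data model terms.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VariantKind {
    Unit,
    Newtype,
    Tuple,
    Struct,
}

/// A `Schema::Enum` mapping is only possible when every variant is a
/// unit variant.
pub fn fits_avro_enum(variants: &[VariantKind]) -> bool {
    variants.iter().all(|v| *v == VariantKind::Unit)
}

/// In the union mapping without wrapper records, each unit variant
/// becomes `Schema::Null`, so two or more unit variants would already
/// break the uniqueness requirement. Necessary check only.
pub fn may_fit_unwrapped_union(variants: &[VariantKind]) -> bool {
    variants.iter().filter(|v| **v == VariantKind::Unit).count() <= 1
}
```

An enum with only unit variants passes `fits_avro_enum`; an enum with two unit variants fails `may_fit_unwrapped_union` and would need the enum mapping or the union-of-records mapping instead.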