[
https://issues.apache.org/jira/browse/AVRO-3479?focusedWorklogId=754480&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-754480
]
ASF GitHub Bot logged work on AVRO-3479:
----------------------------------------
Author: ASF GitHub Bot
Created on: 08/Apr/22 09:28
Start Date: 08/Apr/22 09:28
Worklog Time Spent: 10m
Work Description: martin-g commented on code in PR #1631:
URL: https://github.com/apache/avro/pull/1631#discussion_r845824178
##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type,
TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) ->
proc_macro::TokenStream {
+ let mut input = parse_macro_input!(input as DeriveInput);
+ derive_avro_schema(&mut input)
+ .unwrap_or_else(to_compile_errors)
+ .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream,
Vec<syn::Error>> {
+ let namespace = get_namespace_from_attributes(&input.attrs)?;
+ let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+ .into_iter()
+ .flatten()
+ .collect::<Vec<String>>()
+ .join(".");
+ let schema_def = match &input.data {
+ syn::Data::Struct(s) => {
+ get_data_struct_schema_def(&full_schema_name, s,
input.ident.span())?
+ }
+ syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e,
input.ident.span())?,
+ _ => {
+ return Err(vec![Error::new(
+ input.ident.span(),
+ "AvroSchema derive only works for structs and simple enums ",
+ )])
+ }
+ };
+
+ let ty = &input.ident;
+ let (impl_generics, ty_generics, where_clause) =
input.generics.split_for_impl();
+ Ok(quote! {
+ impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for
#ty #ty_generics #where_clause {
+ fn get_schema_with_resolved(resolved_schemas: &mut
HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) ->
apache_avro::schema::Schema {
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse schema name {}", #full_schema_name)[..]);
+ if resolved_schemas.contains_key(&name) {
+ resolved_schemas.get(&name).unwrap().clone()
+ }else {
+ resolved_schemas.insert(name.clone(), Schema::Ref{name:
name.clone()});
+ #schema_def
+ }
+ }
+ }
+ })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) ->
Result<Option<String>, Vec<Error>> {
+ let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote!
{namespace}).unwrap();
Review Comment:
Let's namespace the Avro-specific attributes under `#[avro(...)]`,
e.g. `#[avro(namespace = "com.example")]`.
This is what Serde does too: https://serde.rs/attributes.html
##########
lang/rust/avro/src/schema.rs:
##########
@@ -1498,6 +1499,137 @@ fn field_ordering_position(field: &str) ->
Option<usize> {
.map(|pos| pos + 1)
}
+pub fn record_schema_for_fields(
+ name: Name,
+ aliases: Aliases,
+ doc: Documentation,
+ fields: Vec<RecordField>,
+) -> Schema {
+ let lookup: HashMap<String, usize> = fields
+ .iter()
+ .map(|field| (field.name.to_owned(), field.position))
+ .collect();
+ Schema::Record {
+ name,
+ aliases,
+ doc,
+ fields,
+ lookup,
+ }
+}
+
+pub trait AvroSchema {
+ fn get_schema() -> Schema;
+}
+
+/// TODO Help me name this. The idea here that any previously parsed or
constructed schema with a name is registered in resolved schemas and passed
recursively to avoid infinite recursion
+pub trait AvroSchemaWithResolved {
+ fn get_schema_with_resolved(resolved_schemas: &mut Names) -> Schema;
+}
+
+impl<T> AvroSchema for T
+where
+ T: AvroSchemaWithResolved,
+{
+ fn get_schema() -> Schema {
+ T::get_schema_with_resolved(&mut HashMap::default())
+ }
+}
+
+macro_rules! impl_schema(
+ ($type:ty, $variant_constructor:expr) => (
+ impl AvroSchemaWithResolved for $type {
+ fn get_schema_with_resolved(_: &mut HashMap<Name, Schema>) ->
Schema {
Review Comment:
```suggestion
fn get_schema_with_resolved(_: &mut Names) -> Schema {
```
Can we use `Names` here?
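For context, a minimal sketch of why the alias is a drop-in replacement and of how the resolved-schemas map stops the recursion described in the TODO above. `Name`, `Schema`, `Node` and `node_schema` are simplified, hypothetical stand-ins here, not the real `apache_avro` definitions; only the `Names` alias and the trait signature mirror the diff.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for apache_avro's `Name` and `Schema`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Name(String);

#[derive(Clone, Debug, PartialEq)]
enum Schema {
    Long,
    Ref { name: Name },
    Record { name: Name, fields: Vec<(String, Schema)> },
}

// The alias and the spelled-out map type are the same type.
type Names = HashMap<Name, Schema>;

trait AvroSchemaWithResolved {
    fn get_schema_with_resolved(resolved_schemas: &mut Names) -> Schema;
}

impl AvroSchemaWithResolved for i64 {
    fn get_schema_with_resolved(_: &mut Names) -> Schema {
        Schema::Long
    }
}

// Marker for a self-referential record (think of a linked-list node).
struct Node;

impl AvroSchemaWithResolved for Node {
    fn get_schema_with_resolved(resolved_schemas: &mut Names) -> Schema {
        let name = Name("Node".to_string());
        // Already seen: return the registered schema instead of recursing again.
        if let Some(existing) = resolved_schemas.get(&name) {
            return existing.clone();
        }
        // Register a reference *before* descending into the fields.
        resolved_schemas.insert(name.clone(), Schema::Ref { name: name.clone() });
        Schema::Record {
            name,
            fields: vec![
                ("value".to_string(), i64::get_schema_with_resolved(resolved_schemas)),
                ("next".to_string(), Node::get_schema_with_resolved(resolved_schemas)),
            ],
        }
    }
}

fn node_schema() -> Schema {
    // A plain `HashMap<Name, Schema>` is accepted where `&mut Names` is expected.
    let mut resolved: HashMap<Name, Schema> = HashMap::new();
    Node::get_schema_with_resolved(&mut resolved)
}
```

In this sketch the inner `next` field resolves to `Schema::Ref` rather than to a second, nested record, which is what keeps the call from recursing forever.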
##########
lang/rust/avro_derive/Cargo.toml:
##########
@@ -0,0 +1,33 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+[package]
+name = "avro_derive"
+version = "0.1.0"
+edition = "2021"
Review Comment:
Rust 1.51.0 (our MSRV, the minimum supported Rust version) reports edition
2021 as unstable. Do we need 2021 for some specific feature, or can we use
2018, as the main module does?
##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type,
TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) ->
proc_macro::TokenStream {
+ let mut input = parse_macro_input!(input as DeriveInput);
+ derive_avro_schema(&mut input)
+ .unwrap_or_else(to_compile_errors)
+ .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream,
Vec<syn::Error>> {
+ let namespace = get_namespace_from_attributes(&input.attrs)?;
+ let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+ .into_iter()
+ .flatten()
+ .collect::<Vec<String>>()
+ .join(".");
+ let schema_def = match &input.data {
+ syn::Data::Struct(s) => {
+ get_data_struct_schema_def(&full_schema_name, s,
input.ident.span())?
+ }
+ syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e,
input.ident.span())?,
+ _ => {
+ return Err(vec![Error::new(
+ input.ident.span(),
+ "AvroSchema derive only works for structs and simple enums ",
+ )])
+ }
+ };
+
+ let ty = &input.ident;
+ let (impl_generics, ty_generics, where_clause) =
input.generics.split_for_impl();
+ Ok(quote! {
+ impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for
#ty #ty_generics #where_clause {
+ fn get_schema_with_resolved(resolved_schemas: &mut
HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) ->
apache_avro::schema::Schema {
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse schema name {}", #full_schema_name)[..]);
+ if resolved_schemas.contains_key(&name) {
+ resolved_schemas.get(&name).unwrap().clone()
+ }else {
+ resolved_schemas.insert(name.clone(), Schema::Ref{name:
name.clone()});
+ #schema_def
+ }
+ }
+ }
+ })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) ->
Result<Option<String>, Vec<Error>> {
+ let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote!
{namespace}).unwrap();
+ const NAMESPACE_PARSING_ERROR_CONSTANST: &str =
+ "Namespace attribute must be in form #[namespace =
\"com.testing.namespace\"]";
+ // parse out namespace if present. Requires strict syntax
+ for attr in attrs {
+ if namespace_attr_path_constant == attr.path {
+ let mut input_tokens = attr.tokens.clone().into_iter();
+ if let (
+ Some(TokenTree::Punct(punct)),
+ Some(TokenTree::Literal(namespace_literal)),
+ None,
+ ) = (
+ input_tokens.next(),
+ input_tokens.next(),
+ input_tokens.next(),
+ ) {
+ if punct.as_char() == '=' {
+ if let Lit::Str(lit_str) = Lit::new(namespace_literal) {
+ return Ok(Some(lit_str.value()));
+ }
+ }
+ }
+ return Err(vec![Error::new_spanned(
+ &attr.tokens,
+ NAMESPACE_PARSING_ERROR_CONSTANST,
+ )]);
+ }
+ }
+ Ok(None)
+}
+
+fn get_data_struct_schema_def(
+ full_schema_name: &str,
+ s: &syn::DataStruct,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ let mut record_field_exprs = vec![];
+ match s.fields {
+ syn::Fields::Named(ref a) => {
+ for (position, field) in a.named.iter().enumerate() {
+ let name = field.ident.as_ref().unwrap().to_string(); // we
know everything has a name
+ let schema_expr = type_to_schema_expr(&field.ty)?;
+ let position = position;
+ record_field_exprs.push(quote! {
+ apache_avro::schema::RecordField {
+ name: #name.to_string(),
+ doc: Option::None,
+ default: Option::None,
+ schema: #schema_expr,
+ order:
apache_avro::schema::RecordFieldOrder::Ignore,
+ position: #position,
+ }
+ });
+ }
+ }
+ syn::Fields::Unnamed(_) => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for tuple structs",
+ )])
+ }
+ syn::Fields::Unit => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for unit structs",
+ )])
+ }
+ }
+ Ok(quote! {
+ let schema_fields = vec![#(#record_field_exprs),*];
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
struct name for schema {}", #full_schema_name)[..]);
+ apache_avro::schema::record_schema_for_fields(name, None, None,
schema_fields)
+ })
+}
+
+fn get_data_enum_schema_def(
+ full_schema_name: &str,
+ e: &syn::DataEnum,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ if e.variants.iter().all(|v| syn::Fields::Unit == v.fields) {
+ let symbols: Vec<String> = e
+ .variants
+ .iter()
+ .map(|varient| varient.ident.to_string())
+ .collect();
+ Ok(quote! {
+ apache_avro::schema::Schema::Enum {
+ name:
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse enum name for schema {}", #full_schema_name)[..]),
+ aliases: None,
+ doc: None,
+ symbols: vec![#(#symbols.to_owned()),*]
+ }
+ })
+ } else {
+ Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for enums with non unit structs",
+ )])
+ }
+}
+
+/// Takes in the Tokens of a type and returns the tokens of an expression with
return type `Schema`
+fn type_to_schema_expr(ty: &Type) -> Result<TokenStream, Vec<Error>> {
+ if let Type::Path(p) = ty {
+ let type_string = p.path.segments.last().unwrap().ident.to_string();
+
+ let schema = match &type_string[..] {
+ "bool" => quote! {Schema::Boolean},
+ "i8" | "i16" | "i32" | "u8" | "u16" => quote!
{apache_avro::schema::Schema::Int},
+ "i64" => quote! {apache_avro::schema::Schema::Long},
Review Comment:
I see we treat `u8` and `u16` as `Schema::Int`. Why isn't `u32` in the list
for `Schema::Long`, then?
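The width argument behind this question can be sketched with the standard library alone. Avro's `int` is a 32-bit signed integer and `long` a 64-bit signed one, so every `u32` fits in a `long`, while a `u64` may not. `AvroInt`, `avro_type_for`, `u32_to_long` and `u64_to_long` are hypothetical names used only for illustration; the mapping shows the change being asked about, not what the PR currently does.

```rust
use std::convert::TryFrom;

/// Hypothetical summary of which Avro integer type can hold every value
/// of a given Rust integer type.
#[derive(Debug, PartialEq)]
enum AvroInt {
    Int,
    Long,
    Unsupported,
}

fn avro_type_for(rust_type: &str) -> AvroInt {
    match rust_type {
        // All of these fit in a 32-bit signed integer.
        "i8" | "i16" | "i32" | "u8" | "u16" => AvroInt::Int,
        // u32::MAX (4_294_967_295) is far below i64::MAX, so u32 is lossless as a long.
        "i64" | "u32" => AvroInt::Long,
        // u64 can exceed i64::MAX, so no Avro integer type holds every value.
        _ => AvroInt::Unsupported,
    }
}

/// Infallible: `From<u32>` is implemented for `i64`.
fn u32_to_long(v: u32) -> i64 {
    i64::from(v)
}

/// Fallible: values above `i64::MAX` do not fit.
fn u64_to_long(v: u64) -> Option<i64> {
    i64::try_from(v).ok()
}
```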
##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type,
TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) ->
proc_macro::TokenStream {
+ let mut input = parse_macro_input!(input as DeriveInput);
+ derive_avro_schema(&mut input)
+ .unwrap_or_else(to_compile_errors)
+ .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream,
Vec<syn::Error>> {
+ let namespace = get_namespace_from_attributes(&input.attrs)?;
+ let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+ .into_iter()
+ .flatten()
+ .collect::<Vec<String>>()
+ .join(".");
+ let schema_def = match &input.data {
+ syn::Data::Struct(s) => {
+ get_data_struct_schema_def(&full_schema_name, s,
input.ident.span())?
+ }
+ syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e,
input.ident.span())?,
+ _ => {
+ return Err(vec![Error::new(
+ input.ident.span(),
+ "AvroSchema derive only works for structs and simple enums ",
+ )])
+ }
+ };
+
+ let ty = &input.ident;
+ let (impl_generics, ty_generics, where_clause) =
input.generics.split_for_impl();
+ Ok(quote! {
+ impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for
#ty #ty_generics #where_clause {
+ fn get_schema_with_resolved(resolved_schemas: &mut
HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) ->
apache_avro::schema::Schema {
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse schema name {}", #full_schema_name)[..]);
+ if resolved_schemas.contains_key(&name) {
+ resolved_schemas.get(&name).unwrap().clone()
+ }else {
+ resolved_schemas.insert(name.clone(), Schema::Ref{name:
name.clone()});
+ #schema_def
+ }
+ }
+ }
+ })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) ->
Result<Option<String>, Vec<Error>> {
+ let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote!
{namespace}).unwrap();
+ const NAMESPACE_PARSING_ERROR_CONSTANST: &str =
+ "Namespace attribute must be in form #[namespace =
\"com.testing.namespace\"]";
+ // parse out namespace if present. Requires strict syntax
+ for attr in attrs {
+ if namespace_attr_path_constant == attr.path {
+ let mut input_tokens = attr.tokens.clone().into_iter();
+ if let (
+ Some(TokenTree::Punct(punct)),
+ Some(TokenTree::Literal(namespace_literal)),
+ None,
+ ) = (
+ input_tokens.next(),
+ input_tokens.next(),
+ input_tokens.next(),
+ ) {
+ if punct.as_char() == '=' {
+ if let Lit::Str(lit_str) = Lit::new(namespace_literal) {
+ return Ok(Some(lit_str.value()));
+ }
+ }
+ }
+ return Err(vec![Error::new_spanned(
+ &attr.tokens,
+ NAMESPACE_PARSING_ERROR_CONSTANST,
+ )]);
+ }
+ }
+ Ok(None)
+}
+
+fn get_data_struct_schema_def(
+ full_schema_name: &str,
+ s: &syn::DataStruct,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ let mut record_field_exprs = vec![];
+ match s.fields {
+ syn::Fields::Named(ref a) => {
+ for (position, field) in a.named.iter().enumerate() {
+ let name = field.ident.as_ref().unwrap().to_string(); // we
know everything has a name
+ let schema_expr = type_to_schema_expr(&field.ty)?;
+ let position = position;
+ record_field_exprs.push(quote! {
+ apache_avro::schema::RecordField {
+ name: #name.to_string(),
+ doc: Option::None,
+ default: Option::None,
+ schema: #schema_expr,
+ order:
apache_avro::schema::RecordFieldOrder::Ignore,
+ position: #position,
+ }
+ });
+ }
+ }
+ syn::Fields::Unnamed(_) => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for tuple structs",
+ )])
+ }
+ syn::Fields::Unit => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for unit structs",
+ )])
+ }
+ }
+ Ok(quote! {
+ let schema_fields = vec![#(#record_field_exprs),*];
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
struct name for schema {}", #full_schema_name)[..]);
+ apache_avro::schema::record_schema_for_fields(name, None, None,
schema_fields)
+ })
+}
+
+fn get_data_enum_schema_def(
+ full_schema_name: &str,
+ e: &syn::DataEnum,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ if e.variants.iter().all(|v| syn::Fields::Unit == v.fields) {
+ let symbols: Vec<String> = e
+ .variants
+ .iter()
+ .map(|varient| varient.ident.to_string())
+ .collect();
+ Ok(quote! {
+ apache_avro::schema::Schema::Enum {
+ name:
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse enum name for schema {}", #full_schema_name)[..]),
+ aliases: None,
+ doc: None,
+ symbols: vec![#(#symbols.to_owned()),*]
+ }
+ })
+ } else {
+ Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for enums with non unit structs",
+ )])
+ }
+}
+
+/// Takes in the Tokens of a type and returns the tokens of an expression with
return type `Schema`
+fn type_to_schema_expr(ty: &Type) -> Result<TokenStream, Vec<Error>> {
+ if let Type::Path(p) = ty {
+ let type_string = p.path.segments.last().unwrap().ident.to_string();
+
+ let schema = match &type_string[..] {
+ "bool" => quote! {Schema::Boolean},
+ "i8" | "i16" | "i32" | "u8" | "u16" => quote!
{apache_avro::schema::Schema::Int},
+ "i64" => quote! {apache_avro::schema::Schema::Long},
+ "f32" => quote! {apache_avro::schema::Schema::Float},
+ "f64" => quote! {apache_avro::schema::Schema::Double},
+ "String" | "str" => quote! {apache_avro::schema::Schema::String},
+ "char" => {
+ return Err(vec![Error::new_spanned(
+ ty,
+ "AvroSchema: Cannot guarentee sucessful deserialization of
this type",
+ )])
+ }
+ "u32" | "u64" => {
+ return Err(vec![Error::new_spanned(
+ ty,
+ "Cannot guarentee sucessful serialization of this type due to
overflow concerns",
+ )])
+ } //Can't guarentee serialization type
+ _ => {
+ // Fails when the type does not implement
AvroSchemaWithResolved directly or covered by blanket implementation
+ // TODO check and error report with something like
https://docs.rs/quote/1.0.15/quote/macro.quote_spanned.html#example
+ type_path_schema_expr(p)
+ }
+ };
+ Ok(schema)
+ } else if let Type::Array(ta) = ty {
+ let inner_schema_expr = type_to_schema_expr(&ta.elem)?;
+ Ok(quote!
{apache_avro::schema::Schema::Array(Box::new(#inner_schema_expr))})
+ } else if let Type::Reference(tr) = ty {
+ type_to_schema_expr(&tr.elem)
+ } else {
+ Err(vec![Error::new_spanned(
+ ty,
+ format!("Unable to generate schema for type: {:?}", ty),
+ )])
+ }
+}
+
+/// Generates the schema def expression for fully qualified type paths using
the associated function
+/// - `A -> <A as AvroSchemaWithResolved>::get_schema_with_resolved()`
+/// - `A<T> -> <A<T> as AvroSchemaWithResolved>::get_schema_with_resolved()`
+fn type_path_schema_expr(p: &TypePath) -> TokenStream {
+ quote! {<#p as
apache_avro::schema::AvroSchemaWithResolved>::get_schema_with_resolved(resolved_schemas)}
+}
+
+/// Stolen from serde
+fn to_compile_errors(errors: Vec<syn::Error>) -> proc_macro2::TokenStream {
+ let compile_errors = errors.iter().map(syn::Error::to_compile_error);
+ quote!(#(#compile_errors)*)
+}
+
+#[cfg(test)]
+mod tests {
+ // Note this useful idiom: importing names from outer (for mod tests)
scope.
+ use super::*;
+ #[test]
+ fn basic_case() {
+ let test_struct = quote! {
+ struct A {
+ a: i32,
+ b: String
+ }
+ };
+
+ match syn::parse2::<DeriveInput>(test_struct) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_ok())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to.
Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn tuple_struct_unsupported() {
+ let test_tuple_struct = quote! {
+ struct B (i32, String);
+ };
+
+ match syn::parse2::<DeriveInput>(test_tuple_struct) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_err())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to.
Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn unit_struct_unsupported() {
+ let test_tuple_struct = quote! {
+ struct AbsoluteUnit;
+ };
+
+ match syn::parse2::<DeriveInput>(test_tuple_struct) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_err())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to.
Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn optional_type_generating() {
+ let stuct_with_optional = quote! {
+ struct Test4 {
+ a : Option<i32>
+ }
+ };
+ match syn::parse2::<DeriveInput>(stuct_with_optional) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_ok())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to.
Error: {:?}",
Review Comment:
```suggestion
"Failed to parse as derive input when it should be able to.
Error: {:?}",
```
##########
lang/rust/avro_derive/tests/derive.rs:
##########
@@ -0,0 +1,409 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use apache_avro::schema::{AvroSchema, AvroSchemaWithResolved};
+use apache_avro::{from_value, Reader, Schema, Writer};
+use avro_derive::*;
+use serde::de::DeserializeOwned;
+use serde::ser::Serialize;
+use std::collections::HashMap;
+
+#[macro_use]
+extern crate serde;
+
+#[cfg(test)]
+mod test_derive {
+ use std::{
+ borrow::{Borrow, Cow},
+ sync::Mutex,
+ };
+
+ use super::*;
+
+ /// Takes in a type that implements the right combination of traits and
runs it through a Serde Cycle and asserts the result is the same
+ fn freeze_dry_assert<T>(obj: T)
+ where
+ T: std::fmt::Debug + Serialize + DeserializeOwned + AvroSchema + Clone
+ PartialEq,
+ {
+ let encoded = freeze(obj.clone());
Review Comment:
`freeze` and `dry` could be named `serialize` and `deserialize`.
Is there a reason to avoid the latter names?
##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type,
TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) ->
proc_macro::TokenStream {
+ let mut input = parse_macro_input!(input as DeriveInput);
+ derive_avro_schema(&mut input)
+ .unwrap_or_else(to_compile_errors)
+ .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream,
Vec<syn::Error>> {
+ let namespace = get_namespace_from_attributes(&input.attrs)?;
+ let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+ .into_iter()
+ .flatten()
+ .collect::<Vec<String>>()
+ .join(".");
+ let schema_def = match &input.data {
+ syn::Data::Struct(s) => {
+ get_data_struct_schema_def(&full_schema_name, s,
input.ident.span())?
+ }
+ syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e,
input.ident.span())?,
+ _ => {
+ return Err(vec![Error::new(
+ input.ident.span(),
+ "AvroSchema derive only works for structs and simple enums ",
+ )])
+ }
+ };
+
+ let ty = &input.ident;
+ let (impl_generics, ty_generics, where_clause) =
input.generics.split_for_impl();
+ Ok(quote! {
+ impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for
#ty #ty_generics #where_clause {
+ fn get_schema_with_resolved(resolved_schemas: &mut
HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) ->
apache_avro::schema::Schema {
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse schema name {}", #full_schema_name)[..]);
+ if resolved_schemas.contains_key(&name) {
+ resolved_schemas.get(&name).unwrap().clone()
+ }else {
+ resolved_schemas.insert(name.clone(), Schema::Ref{name:
name.clone()});
+ #schema_def
+ }
+ }
+ }
+ })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) ->
Result<Option<String>, Vec<Error>> {
+ let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote!
{namespace}).unwrap();
+ const NAMESPACE_PARSING_ERROR_CONSTANST: &str =
+ "Namespace attribute must be in form #[namespace =
\"com.testing.namespace\"]";
+ // parse out namespace if present. Requires strict syntax
+ for attr in attrs {
+ if namespace_attr_path_constant == attr.path {
+ let mut input_tokens = attr.tokens.clone().into_iter();
+ if let (
+ Some(TokenTree::Punct(punct)),
+ Some(TokenTree::Literal(namespace_literal)),
+ None,
+ ) = (
+ input_tokens.next(),
+ input_tokens.next(),
+ input_tokens.next(),
+ ) {
+ if punct.as_char() == '=' {
+ if let Lit::Str(lit_str) = Lit::new(namespace_literal) {
+ return Ok(Some(lit_str.value()));
+ }
+ }
+ }
+ return Err(vec![Error::new_spanned(
+ &attr.tokens,
+ NAMESPACE_PARSING_ERROR_CONSTANST,
+ )]);
+ }
+ }
+ Ok(None)
+}
+
+fn get_data_struct_schema_def(
+ full_schema_name: &str,
+ s: &syn::DataStruct,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ let mut record_field_exprs = vec![];
+ match s.fields {
+ syn::Fields::Named(ref a) => {
+ for (position, field) in a.named.iter().enumerate() {
+ let name = field.ident.as_ref().unwrap().to_string(); // we
know everything has a name
+ let schema_expr = type_to_schema_expr(&field.ty)?;
+ let position = position;
+ record_field_exprs.push(quote! {
+ apache_avro::schema::RecordField {
+ name: #name.to_string(),
+ doc: Option::None,
+ default: Option::None,
Review Comment:
Let's create sub-task tickets for them.
##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type,
TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) ->
proc_macro::TokenStream {
+ let mut input = parse_macro_input!(input as DeriveInput);
+ derive_avro_schema(&mut input)
+ .unwrap_or_else(to_compile_errors)
+ .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream,
Vec<syn::Error>> {
+ let namespace = get_namespace_from_attributes(&input.attrs)?;
+ let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+ .into_iter()
+ .flatten()
+ .collect::<Vec<String>>()
+ .join(".");
+ let schema_def = match &input.data {
+ syn::Data::Struct(s) => {
+ get_data_struct_schema_def(&full_schema_name, s,
input.ident.span())?
+ }
+ syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e,
input.ident.span())?,
+ _ => {
+ return Err(vec![Error::new(
+ input.ident.span(),
+ "AvroSchema derive only works for structs and simple enums ",
+ )])
+ }
+ };
+
+ let ty = &input.ident;
+ let (impl_generics, ty_generics, where_clause) =
input.generics.split_for_impl();
+ Ok(quote! {
+ impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for
#ty #ty_generics #where_clause {
+ fn get_schema_with_resolved(resolved_schemas: &mut
HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) ->
apache_avro::schema::Schema {
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse schema name {}", #full_schema_name)[..]);
+ if resolved_schemas.contains_key(&name) {
+ resolved_schemas.get(&name).unwrap().clone()
+ }else {
+ resolved_schemas.insert(name.clone(), Schema::Ref{name:
name.clone()});
+ #schema_def
+ }
+ }
+ }
+ })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) ->
Result<Option<String>, Vec<Error>> {
+ let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote!
{namespace}).unwrap();
+ const NAMESPACE_PARSING_ERROR_CONSTANST: &str =
+ "Namespace attribute must be in form #[namespace =
\"com.testing.namespace\"]";
+ // parse out namespace if present. Requires strict syntax
+ for attr in attrs {
+ if namespace_attr_path_constant == attr.path {
+ let mut input_tokens = attr.tokens.clone().into_iter();
+ if let (
+ Some(TokenTree::Punct(punct)),
+ Some(TokenTree::Literal(namespace_literal)),
+ None,
+ ) = (
+ input_tokens.next(),
+ input_tokens.next(),
+ input_tokens.next(),
+ ) {
+ if punct.as_char() == '=' {
+ if let Lit::Str(lit_str) = Lit::new(namespace_literal) {
+ return Ok(Some(lit_str.value()));
+ }
+ }
+ }
+ return Err(vec![Error::new_spanned(
+ &attr.tokens,
+ NAMESPACE_PARSING_ERROR_CONSTANST,
+ )]);
+ }
+ }
+ Ok(None)
+}
+
+fn get_data_struct_schema_def(
+ full_schema_name: &str,
+ s: &syn::DataStruct,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ let mut record_field_exprs = vec![];
+ match s.fields {
+ syn::Fields::Named(ref a) => {
+ for (position, field) in a.named.iter().enumerate() {
+ let name = field.ident.as_ref().unwrap().to_string(); // we
know everything has a name
+ let schema_expr = type_to_schema_expr(&field.ty)?;
+ let position = position;
+ record_field_exprs.push(quote! {
+ apache_avro::schema::RecordField {
+ name: #name.to_string(),
+ doc: Option::None,
+ default: Option::None,
+ schema: #schema_expr,
+ order:
apache_avro::schema::RecordFieldOrder::Ignore,
+ position: #position,
+ }
+ });
+ }
+ }
+ syn::Fields::Unnamed(_) => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for tuple structs",
+ )])
+ }
+ syn::Fields::Unit => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for unit structs",
+ )])
+ }
+ }
+ Ok(quote! {
+ let schema_fields = vec![#(#record_field_exprs),*];
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
struct name for schema {}", #full_schema_name)[..]);
Review Comment:
`Unable to struct name for schema` does not sound correct to me (a
non-native English speaker!).
Maybe something like `Invalid schema name: {}`?
##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type,
TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) ->
proc_macro::TokenStream {
+ let mut input = parse_macro_input!(input as DeriveInput);
+ derive_avro_schema(&mut input)
+ .unwrap_or_else(to_compile_errors)
+ .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream,
Vec<syn::Error>> {
+ let namespace = get_namespace_from_attributes(&input.attrs)?;
+ let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+ .into_iter()
+ .flatten()
+ .collect::<Vec<String>>()
+ .join(".");
+ let schema_def = match &input.data {
+ syn::Data::Struct(s) => {
+ get_data_struct_schema_def(&full_schema_name, s,
input.ident.span())?
+ }
+ syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e,
input.ident.span())?,
+ _ => {
+ return Err(vec![Error::new(
+ input.ident.span(),
+ "AvroSchema derive only works for structs and simple enums ",
+ )])
+ }
+ };
+
+ let ty = &input.ident;
+ let (impl_generics, ty_generics, where_clause) =
input.generics.split_for_impl();
+ Ok(quote! {
+ impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for
#ty #ty_generics #where_clause {
+ fn get_schema_with_resolved(resolved_schemas: &mut
HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) ->
apache_avro::schema::Schema {
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse schema name {}", #full_schema_name)[..]);
+ if resolved_schemas.contains_key(&name) {
+ resolved_schemas.get(&name).unwrap().clone()
+ }else {
+ resolved_schemas.insert(name.clone(), Schema::Ref{name:
name.clone()});
+ #schema_def
+ }
+ }
+ }
+ })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) ->
Result<Option<String>, Vec<Error>> {
+ let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote!
{namespace}).unwrap();
+ const NAMESPACE_PARSING_ERROR_CONSTANST: &str =
+ "Namespace attribute must be in form #[namespace =
\"com.testing.namespace\"]";
+ // parse out namespace if present. Requires strict syntax
+ for attr in attrs {
+ if namespace_attr_path_constant == attr.path {
+ let mut input_tokens = attr.tokens.clone().into_iter();
+ if let (
+ Some(TokenTree::Punct(punct)),
+ Some(TokenTree::Literal(namespace_literal)),
+ None,
+ ) = (
+ input_tokens.next(),
+ input_tokens.next(),
+ input_tokens.next(),
+ ) {
+ if punct.as_char() == '=' {
+ if let Lit::Str(lit_str) = Lit::new(namespace_literal) {
+ return Ok(Some(lit_str.value()));
+ }
+ }
+ }
+ return Err(vec![Error::new_spanned(
+ &attr.tokens,
+ NAMESPACE_PARSING_ERROR_CONSTANST,
+ )]);
+ }
+ }
+ Ok(None)
+}
+
+fn get_data_struct_schema_def(
+ full_schema_name: &str,
+ s: &syn::DataStruct,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ let mut record_field_exprs = vec![];
+ match s.fields {
+ syn::Fields::Named(ref a) => {
+ for (position, field) in a.named.iter().enumerate() {
+ let name = field.ident.as_ref().unwrap().to_string(); // we
know everything has a name
+ let schema_expr = type_to_schema_expr(&field.ty)?;
+ let position = position;
+ record_field_exprs.push(quote! {
+ apache_avro::schema::RecordField {
+ name: #name.to_string(),
+ doc: Option::None,
+ default: Option::None,
Review Comment:
Later we can add support for providing the `default` from an attribute,
e.g. `#[avro(default = ...)]`, and the `doc` from the Rustdoc comment.
##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type,
TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) ->
proc_macro::TokenStream {
+ let mut input = parse_macro_input!(input as DeriveInput);
+ derive_avro_schema(&mut input)
+ .unwrap_or_else(to_compile_errors)
+ .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream,
Vec<syn::Error>> {
+ let namespace = get_namespace_from_attributes(&input.attrs)?;
+ let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+ .into_iter()
+ .flatten()
+ .collect::<Vec<String>>()
+ .join(".");
+ let schema_def = match &input.data {
+ syn::Data::Struct(s) => {
+ get_data_struct_schema_def(&full_schema_name, s,
input.ident.span())?
+ }
+ syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e,
input.ident.span())?,
+ _ => {
+ return Err(vec![Error::new(
+ input.ident.span(),
+ "AvroSchema derive only works for structs and simple enums ",
+ )])
+ }
+ };
+
+ let ty = &input.ident;
+ let (impl_generics, ty_generics, where_clause) =
input.generics.split_for_impl();
+ Ok(quote! {
+ impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for
#ty #ty_generics #where_clause {
+ fn get_schema_with_resolved(resolved_schemas: &mut
HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) ->
apache_avro::schema::Schema {
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse schema name {}", #full_schema_name)[..]);
+ if resolved_schemas.contains_key(&name) {
+ resolved_schemas.get(&name).unwrap().clone()
+ }else {
+ resolved_schemas.insert(name.clone(), Schema::Ref{name:
name.clone()});
+ #schema_def
+ }
+ }
+ }
+ })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) ->
Result<Option<String>, Vec<Error>> {
+ let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote!
{namespace}).unwrap();
+ const NAMESPACE_PARSING_ERROR_CONSTANST: &str =
+ "Namespace attribute must be in form #[namespace =
\"com.testing.namespace\"]";
+ // parse out namespace if present. Requires strict syntax
+ for attr in attrs {
+ if namespace_attr_path_constant == attr.path {
+ let mut input_tokens = attr.tokens.clone().into_iter();
+ if let (
+ Some(TokenTree::Punct(punct)),
+ Some(TokenTree::Literal(namespace_literal)),
+ None,
+ ) = (
+ input_tokens.next(),
+ input_tokens.next(),
+ input_tokens.next(),
+ ) {
+ if punct.as_char() == '=' {
+ if let Lit::Str(lit_str) = Lit::new(namespace_literal) {
+ return Ok(Some(lit_str.value()));
+ }
+ }
+ }
+ return Err(vec![Error::new_spanned(
+ &attr.tokens,
+ NAMESPACE_PARSING_ERROR_CONSTANST,
+ )]);
+ }
+ }
+ Ok(None)
+}
+
+fn get_data_struct_schema_def(
+ full_schema_name: &str,
+ s: &syn::DataStruct,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ let mut record_field_exprs = vec![];
+ match s.fields {
+ syn::Fields::Named(ref a) => {
+ for (position, field) in a.named.iter().enumerate() {
+ let name = field.ident.as_ref().unwrap().to_string(); // we
know everything has a name
+ let schema_expr = type_to_schema_expr(&field.ty)?;
+ let position = position;
+ record_field_exprs.push(quote! {
+ apache_avro::schema::RecordField {
+ name: #name.to_string(),
+ doc: Option::None,
+ default: Option::None,
+ schema: #schema_expr,
+ order:
apache_avro::schema::RecordFieldOrder::Ignore,
+ position: #position,
+ }
+ });
+ }
+ }
+ syn::Fields::Unnamed(_) => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for tuple structs",
+ )])
+ }
+ syn::Fields::Unit => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for unit structs",
+ )])
+ }
+ }
+ Ok(quote! {
+ let schema_fields = vec![#(#record_field_exprs),*];
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
struct name for schema {}", #full_schema_name)[..]);
+ apache_avro::schema::record_schema_for_fields(name, None, None,
schema_fields)
+ })
+}
+
+fn get_data_enum_schema_def(
+ full_schema_name: &str,
+ e: &syn::DataEnum,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ if e.variants.iter().all(|v| syn::Fields::Unit == v.fields) {
+ let symbols: Vec<String> = e
+ .variants
+ .iter()
+ .map(|varient| varient.ident.to_string())
+ .collect();
+ Ok(quote! {
+ apache_avro::schema::Schema::Enum {
+ name:
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse enum name for schema {}", #full_schema_name)[..]),
+ aliases: None,
+ doc: None,
+ symbols: vec![#(#symbols.to_owned()),*]
+ }
+ })
+ } else {
+ Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for enums with non unit structs",
+ )])
+ }
+}
+
+/// Takes in the Tokens of a type and returns the tokens of an expression with
return type `Schema`
+fn type_to_schema_expr(ty: &Type) -> Result<TokenStream, Vec<Error>> {
+ if let Type::Path(p) = ty {
+ let type_string = p.path.segments.last().unwrap().ident.to_string();
+
+ let schema = match &type_string[..] {
+ "bool" => quote! {Schema::Boolean},
+ "i8" | "i16" | "i32" | "u8" | "u16" => quote!
{apache_avro::schema::Schema::Int},
+ "i64" => quote! {apache_avro::schema::Schema::Long},
+ "f32" => quote! {apache_avro::schema::Schema::Float},
+ "f64" => quote! {apache_avro::schema::Schema::Double},
+ "String" | "str" => quote! {apache_avro::schema::Schema::String},
+ "char" => {
+ return Err(vec![Error::new_spanned(
+ ty,
+ "AvroSchema: Cannot guarentee sucessful deserialization of
this type",
+ )])
+ }
+ "u32" | "u64" => {
+ return Err(vec![Error::new_spanned(
+ ty,
+ "Cannot guarentee sucessful serialization of this type due to
overflow concerns",
+ )])
+ } //Can't guarentee serialization type
+ _ => {
+ // Fails when the type does not implement
AvroSchemaWithResolved directly or covered by blanket implementation
+ // TODO check and error report with something like
https://docs.rs/quote/1.0.15/quote/macro.quote_spanned.html#example
+ type_path_schema_expr(p)
+ }
+ };
+ Ok(schema)
+ } else if let Type::Array(ta) = ty {
+ let inner_schema_expr = type_to_schema_expr(&ta.elem)?;
+ Ok(quote!
{apache_avro::schema::Schema::Array(Box::new(#inner_schema_expr))})
+ } else if let Type::Reference(tr) = ty {
+ type_to_schema_expr(&tr.elem)
+ } else {
+ Err(vec![Error::new_spanned(
+ ty,
+ format!("Unable to generate schema for type: {:?}", ty),
+ )])
+ }
+}
+
+/// Generates the schema def expression for fully qualified type paths using
the associated function
+/// - `A -> <A as AvroSchemaWithResolved>::get_schema_with_resolved()`
+/// - `A<T> -> <A<T> as AvroSchemaWithResolved>::get_schema_with_resolved()`
+fn type_path_schema_expr(p: &TypePath) -> TokenStream {
+ quote! {<#p as
apache_avro::schema::AvroSchemaWithResolved>::get_schema_with_resolved(resolved_schemas)}
+}
+
+/// Stolen from serde
+fn to_compile_errors(errors: Vec<syn::Error>) -> proc_macro2::TokenStream {
+ let compile_errors = errors.iter().map(syn::Error::to_compile_error);
+ quote!(#(#compile_errors)*)
+}
+
+#[cfg(test)]
+mod tests {
+ // Note this useful idiom: importing names from outer (for mod tests)
scope.
+ use super::*;
+ #[test]
+ fn basic_case() {
+ let test_struct = quote! {
+ struct A {
+ a: i32,
+ b: String
+ }
+ };
+
+ match syn::parse2::<DeriveInput>(test_struct) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_ok())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to.
Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn tuple_struct_unsupported() {
+ let test_tuple_struct = quote! {
+ struct B (i32, String);
+ };
+
+ match syn::parse2::<DeriveInput>(test_tuple_struct) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_err())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to.
Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn unit_struct_unsupported() {
+ let test_tuple_struct = quote! {
+ struct AbsoluteUnit;
+ };
+
+ match syn::parse2::<DeriveInput>(test_tuple_struct) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_err())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to.
Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn optional_type_generating() {
+ let stuct_with_optional = quote! {
Review Comment:
```suggestion
let struct_with_optional = quote! {
```
##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type,
TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) ->
proc_macro::TokenStream {
+ let mut input = parse_macro_input!(input as DeriveInput);
+ derive_avro_schema(&mut input)
+ .unwrap_or_else(to_compile_errors)
+ .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream,
Vec<syn::Error>> {
+ let namespace = get_namespace_from_attributes(&input.attrs)?;
+ let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+ .into_iter()
+ .flatten()
+ .collect::<Vec<String>>()
+ .join(".");
+ let schema_def = match &input.data {
+ syn::Data::Struct(s) => {
+ get_data_struct_schema_def(&full_schema_name, s,
input.ident.span())?
+ }
+ syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e,
input.ident.span())?,
+ _ => {
+ return Err(vec![Error::new(
+ input.ident.span(),
+ "AvroSchema derive only works for structs and simple enums ",
+ )])
+ }
+ };
+
+ let ty = &input.ident;
+ let (impl_generics, ty_generics, where_clause) =
input.generics.split_for_impl();
+ Ok(quote! {
+ impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for
#ty #ty_generics #where_clause {
+ fn get_schema_with_resolved(resolved_schemas: &mut
HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) ->
apache_avro::schema::Schema {
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse schema name {}", #full_schema_name)[..]);
+ if resolved_schemas.contains_key(&name) {
+ resolved_schemas.get(&name).unwrap().clone()
+ }else {
+ resolved_schemas.insert(name.clone(), Schema::Ref{name:
name.clone()});
+ #schema_def
+ }
+ }
+ }
+ })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) ->
Result<Option<String>, Vec<Error>> {
+ let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote!
{namespace}).unwrap();
+ const NAMESPACE_PARSING_ERROR_CONSTANST: &str =
+ "Namespace attribute must be in form #[namespace =
\"com.testing.namespace\"]";
+ // parse out namespace if present. Requires strict syntax
+ for attr in attrs {
+ if namespace_attr_path_constant == attr.path {
+ let mut input_tokens = attr.tokens.clone().into_iter();
+ if let (
+ Some(TokenTree::Punct(punct)),
+ Some(TokenTree::Literal(namespace_literal)),
+ None,
+ ) = (
+ input_tokens.next(),
+ input_tokens.next(),
+ input_tokens.next(),
+ ) {
+ if punct.as_char() == '=' {
+ if let Lit::Str(lit_str) = Lit::new(namespace_literal) {
+ return Ok(Some(lit_str.value()));
+ }
+ }
+ }
+ return Err(vec![Error::new_spanned(
+ &attr.tokens,
+ NAMESPACE_PARSING_ERROR_CONSTANST,
+ )]);
+ }
+ }
+ Ok(None)
+}
+
+fn get_data_struct_schema_def(
+ full_schema_name: &str,
+ s: &syn::DataStruct,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ let mut record_field_exprs = vec![];
+ match s.fields {
+ syn::Fields::Named(ref a) => {
+ for (position, field) in a.named.iter().enumerate() {
+ let name = field.ident.as_ref().unwrap().to_string(); // we
know everything has a name
+ let schema_expr = type_to_schema_expr(&field.ty)?;
+ let position = position;
+ record_field_exprs.push(quote! {
+ apache_avro::schema::RecordField {
+ name: #name.to_string(),
+ doc: Option::None,
+ default: Option::None,
+ schema: #schema_expr,
+ order:
apache_avro::schema::RecordFieldOrder::Ignore,
+ position: #position,
+ }
+ });
+ }
+ }
+ syn::Fields::Unnamed(_) => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for tuple structs",
+ )])
+ }
+ syn::Fields::Unit => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for unit structs",
+ )])
+ }
+ }
+ Ok(quote! {
+ let schema_fields = vec![#(#record_field_exprs),*];
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
struct name for schema {}", #full_schema_name)[..]);
+ apache_avro::schema::record_schema_for_fields(name, None, None,
schema_fields)
+ })
+}
+
+fn get_data_enum_schema_def(
+ full_schema_name: &str,
+ e: &syn::DataEnum,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ if e.variants.iter().all(|v| syn::Fields::Unit == v.fields) {
+ let symbols: Vec<String> = e
+ .variants
+ .iter()
+ .map(|varient| varient.ident.to_string())
+ .collect();
+ Ok(quote! {
+ apache_avro::schema::Schema::Enum {
+ name:
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse enum name for schema {}", #full_schema_name)[..]),
+ aliases: None,
+ doc: None,
+ symbols: vec![#(#symbols.to_owned()),*]
+ }
+ })
+ } else {
+ Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for enums with non unit structs",
+ )])
+ }
+}
+
+/// Takes in the Tokens of a type and returns the tokens of an expression with
return type `Schema`
+fn type_to_schema_expr(ty: &Type) -> Result<TokenStream, Vec<Error>> {
+ if let Type::Path(p) = ty {
+ let type_string = p.path.segments.last().unwrap().ident.to_string();
+
+ let schema = match &type_string[..] {
+ "bool" => quote! {Schema::Boolean},
+ "i8" | "i16" | "i32" | "u8" | "u16" => quote!
{apache_avro::schema::Schema::Int},
+ "i64" => quote! {apache_avro::schema::Schema::Long},
+ "f32" => quote! {apache_avro::schema::Schema::Float},
+ "f64" => quote! {apache_avro::schema::Schema::Double},
+ "String" | "str" => quote! {apache_avro::schema::Schema::String},
+ "char" => {
+ return Err(vec![Error::new_spanned(
Review Comment:
Then why do we do this?
https://github.com/apache/avro/pull/1631/files#diff-b702ef60e00210dede234ba524d4ce1d4ff14ef199191ec46a7f4ba8a6bd211eR1557
##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type,
TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) ->
proc_macro::TokenStream {
+ let mut input = parse_macro_input!(input as DeriveInput);
+ derive_avro_schema(&mut input)
+ .unwrap_or_else(to_compile_errors)
+ .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream,
Vec<syn::Error>> {
+ let namespace = get_namespace_from_attributes(&input.attrs)?;
+ let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+ .into_iter()
+ .flatten()
+ .collect::<Vec<String>>()
+ .join(".");
+ let schema_def = match &input.data {
+ syn::Data::Struct(s) => {
+ get_data_struct_schema_def(&full_schema_name, s,
input.ident.span())?
+ }
+ syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e,
input.ident.span())?,
+ _ => {
+ return Err(vec![Error::new(
+ input.ident.span(),
+ "AvroSchema derive only works for structs and simple enums ",
+ )])
+ }
+ };
+
+ let ty = &input.ident;
+ let (impl_generics, ty_generics, where_clause) =
input.generics.split_for_impl();
+ Ok(quote! {
+ impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for
#ty #ty_generics #where_clause {
+ fn get_schema_with_resolved(resolved_schemas: &mut
HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) ->
apache_avro::schema::Schema {
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse schema name {}", #full_schema_name)[..]);
+ if resolved_schemas.contains_key(&name) {
+ resolved_schemas.get(&name).unwrap().clone()
+ }else {
+ resolved_schemas.insert(name.clone(), Schema::Ref{name:
name.clone()});
+ #schema_def
+ }
+ }
+ }
+ })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) ->
Result<Option<String>, Vec<Error>> {
+ let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote!
{namespace}).unwrap();
+ const NAMESPACE_PARSING_ERROR_CONSTANST: &str =
+ "Namespace attribute must be in form #[namespace =
\"com.testing.namespace\"]";
+ // parse out namespace if present. Requires strict syntax
+ for attr in attrs {
+ if namespace_attr_path_constant == attr.path {
+ let mut input_tokens = attr.tokens.clone().into_iter();
+ if let (
+ Some(TokenTree::Punct(punct)),
+ Some(TokenTree::Literal(namespace_literal)),
+ None,
+ ) = (
+ input_tokens.next(),
+ input_tokens.next(),
+ input_tokens.next(),
+ ) {
+ if punct.as_char() == '=' {
+ if let Lit::Str(lit_str) = Lit::new(namespace_literal) {
+ return Ok(Some(lit_str.value()));
+ }
+ }
+ }
+ return Err(vec![Error::new_spanned(
+ &attr.tokens,
+ NAMESPACE_PARSING_ERROR_CONSTANST,
+ )]);
+ }
+ }
+ Ok(None)
+}
+
+fn get_data_struct_schema_def(
+ full_schema_name: &str,
+ s: &syn::DataStruct,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ let mut record_field_exprs = vec![];
+ match s.fields {
+ syn::Fields::Named(ref a) => {
+ for (position, field) in a.named.iter().enumerate() {
+ let name = field.ident.as_ref().unwrap().to_string(); // we
know everything has a name
+ let schema_expr = type_to_schema_expr(&field.ty)?;
+ let position = position;
+ record_field_exprs.push(quote! {
+ apache_avro::schema::RecordField {
+ name: #name.to_string(),
+ doc: Option::None,
+ default: Option::None,
+ schema: #schema_expr,
+ order:
apache_avro::schema::RecordFieldOrder::Ignore,
+ position: #position,
+ }
+ });
+ }
+ }
+ syn::Fields::Unnamed(_) => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for tuple structs",
+ )])
+ }
+ syn::Fields::Unit => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for unit structs",
+ )])
+ }
+ }
+ Ok(quote! {
+ let schema_fields = vec![#(#record_field_exprs),*];
+ let name =
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
struct name for schema {}", #full_schema_name)[..]);
+ apache_avro::schema::record_schema_for_fields(name, None, None,
schema_fields)
+ })
+}
+
+fn get_data_enum_schema_def(
+ full_schema_name: &str,
+ e: &syn::DataEnum,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ if e.variants.iter().all(|v| syn::Fields::Unit == v.fields) {
+ let symbols: Vec<String> = e
+ .variants
+ .iter()
+ .map(|varient| varient.ident.to_string())
+ .collect();
+ Ok(quote! {
+ apache_avro::schema::Schema::Enum {
+ name:
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to
parse enum name for schema {}", #full_schema_name)[..]),
+ aliases: None,
+ doc: None,
+ symbols: vec![#(#symbols.to_owned()),*]
+ }
+ })
+ } else {
+ Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for enums with non unit structs",
+ )])
+ }
+}
+
+/// Takes in the Tokens of a type and returns the tokens of an expression with
return type `Schema`
+fn type_to_schema_expr(ty: &Type) -> Result<TokenStream, Vec<Error>> {
+ if let Type::Path(p) = ty {
+ let type_string = p.path.segments.last().unwrap().ident.to_string();
+
+ let schema = match &type_string[..] {
+ "bool" => quote! {Schema::Boolean},
+ "i8" | "i16" | "i32" | "u8" | "u16" => quote!
{apache_avro::schema::Schema::Int},
+ "i64" => quote! {apache_avro::schema::Schema::Long},
+ "f32" => quote! {apache_avro::schema::Schema::Float},
+ "f64" => quote! {apache_avro::schema::Schema::Double},
+ "String" | "str" => quote! {apache_avro::schema::Schema::String},
+ "char" => {
+ return Err(vec![Error::new_spanned(
+ ty,
+ "AvroSchema: Cannot guarentee sucessful deserialization of
this type",
+ )])
+ }
+ "u32" | "u64" => {
+ return Err(vec![Error::new_spanned(
+ ty,
+ "Cannot guarentee sucessful serialization of this type due to
overflow concerns",
+ )])
+ } //Can't guarentee serialization type
+ _ => {
+ // Fails when the type does not implement
AvroSchemaWithResolved directly or covered by blanket implementation
+ // TODO check and error report with something like
https://docs.rs/quote/1.0.15/quote/macro.quote_spanned.html#example
+ type_path_schema_expr(p)
+ }
+ };
+ Ok(schema)
+ } else if let Type::Array(ta) = ty {
+ let inner_schema_expr = type_to_schema_expr(&ta.elem)?;
+ Ok(quote!
{apache_avro::schema::Schema::Array(Box::new(#inner_schema_expr))})
+ } else if let Type::Reference(tr) = ty {
+ type_to_schema_expr(&tr.elem)
+ } else {
+ Err(vec![Error::new_spanned(
+ ty,
+ format!("Unable to generate schema for type: {:?}", ty),
+ )])
+ }
+}
+
+/// Generates the schema def expression for fully qualified type paths using
the associated function
+/// - `A -> <A as AvroSchemaWithResolved>::get_schema_with_resolved()`
+/// - `A<T> -> <A<T> as AvroSchemaWithResolved>::get_schema_with_resolved()`
+fn type_path_schema_expr(p: &TypePath) -> TokenStream {
+ quote! {<#p as
apache_avro::schema::AvroSchemaWithResolved>::get_schema_with_resolved(resolved_schemas)}
+}
+
+/// Stolen from serde
+fn to_compile_errors(errors: Vec<syn::Error>) -> proc_macro2::TokenStream {
+ let compile_errors = errors.iter().map(syn::Error::to_compile_error);
+ quote!(#(#compile_errors)*)
+}
+
+#[cfg(test)]
+mod tests {
+ // Note this useful idiom: importing names from outer (for mod tests)
scope.
Review Comment:
This comment is not really needed.
##########
lang/rust/avro_derive/tests/derive.rs:
##########
@@ -0,0 +1,409 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use apache_avro::schema::{AvroSchema, AvroSchemaWithResolved};
+use apache_avro::{from_value, Reader, Schema, Writer};
+use avro_derive::*;
+use serde::de::DeserializeOwned;
+use serde::ser::Serialize;
+use std::collections::HashMap;
+
+#[macro_use]
+extern crate serde;
+
+#[cfg(test)]
+mod test_derive {
+ use std::{
+ borrow::{Borrow, Cow},
+ sync::Mutex,
+ };
+
+ use super::*;
+
+ /// Takes in a type that implements the right combination of traits and
runs it through a Serde Cycle and asserts the result is the same
+ fn freeze_dry_assert<T>(obj: T)
+ where
+ T: std::fmt::Debug + Serialize + DeserializeOwned + AvroSchema + Clone
+ PartialEq,
+ {
+ let encoded = freeze(obj.clone());
+ let dried: T = dry(encoded);
+ assert_eq!(obj, dried);
+ }
+
+ fn freeze_dry<T>(obj: T) -> T
+ where
+ T: Serialize + DeserializeOwned + AvroSchema,
+ {
+ dry(freeze(obj))
+ }
+
+ // serialize
+ fn freeze<T>(obj: T) -> Vec<u8>
+ where
+ T: Serialize + AvroSchema,
+ {
+ let schema = T::get_schema();
+ let mut writer = Writer::new(&schema, Vec::new());
+ if let Err(e) = writer.append_ser(obj) {
+ panic!("{}", e.to_string());
Review Comment:
```suggestion
panic!("{:?}", e);
```
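A minimal sketch of why the suggestion swaps `{}` for `{:?}`: `{}` goes through `Display` (so the extra `e.to_string()` is redundant), while `{:?}` goes through `Debug` and usually keeps the variant name and fields. `DemoError`, `display_message` and `debug_message` are hypothetical names used only for illustration; this is not the `apache_avro` error type.

```rust
use std::fmt;

// Hypothetical error type, used only to contrast the two format specifiers.
#[derive(Debug)]
enum DemoError {
    SchemaMismatch { expected: String, found: String },
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // A typical Display impl gives a short, user-facing message.
            DemoError::SchemaMismatch { .. } => write!(f, "schema mismatch"),
        }
    }
}

// What `panic!("{}", e.to_string())` would print: the Display text.
fn display_message(e: &DemoError) -> String {
    format!("{}", e)
}

// What `panic!("{:?}", e)` would print: the derived Debug representation,
// including the variant name and its fields.
fn debug_message(e: &DemoError) -> String {
    format!("{:?}", e)
}
```

In a test helper the `Debug` form tends to be more useful, since a failure then shows the full error structure rather than a summary.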
##########
lang/rust/avro_derive/tests/derive.rs:
##########
@@ -0,0 +1,409 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use apache_avro::schema::{AvroSchema, AvroSchemaWithResolved};
+use apache_avro::{from_value, Reader, Schema, Writer};
+use avro_derive::*;
+use serde::de::DeserializeOwned;
+use serde::ser::Serialize;
+use std::collections::HashMap;
+
+#[macro_use]
+extern crate serde;
+
+#[cfg(test)]
+mod test_derive {
+ use std::{
+ borrow::{Borrow, Cow},
+ sync::Mutex,
+ };
+
+ use super::*;
+
+ /// Takes in a type that implements the right combination of traits and runs it through a Serde Cycle and asserts the result is the same
+ fn freeze_dry_assert<T>(obj: T)
+ where
+ T: std::fmt::Debug + Serialize + DeserializeOwned + AvroSchema + Clone + PartialEq,
+ {
+ let encoded = freeze(obj.clone());
+ let dried: T = dry(encoded);
Review Comment:
Currently `freeze_dry(obj)` is used just once below. It could have been used
here as well.
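A rough sketch of what reusing `freeze_dry` inside `freeze_dry_assert` could look like. To keep it self-contained, the Serde and Avro bounds are replaced by a hypothetical `RoundTrip` trait with a toy byte encoding; only the shape of the two helpers is meant to carry over to the PR.

```rust
use std::fmt::Debug;

// Hypothetical stand-in for the `Serialize + DeserializeOwned + AvroSchema`
// bounds used in the PR; it is not part of apache_avro.
trait RoundTrip: Sized {
    fn to_bytes(&self) -> Vec<u8>;
    fn from_bytes(bytes: &[u8]) -> Self;
}

impl RoundTrip for u32 {
    fn to_bytes(&self) -> Vec<u8> {
        self.to_le_bytes().to_vec()
    }

    fn from_bytes(bytes: &[u8]) -> Self {
        let mut buf = [0u8; 4];
        buf.copy_from_slice(&bytes[..4]);
        u32::from_le_bytes(buf)
    }
}

// serialize
fn freeze<T: RoundTrip>(obj: T) -> Vec<u8> {
    obj.to_bytes()
}

// deserialize
fn dry<T: RoundTrip>(encoded: Vec<u8>) -> T {
    assert!(!encoded.is_empty());
    T::from_bytes(&encoded)
}

fn freeze_dry<T: RoundTrip>(obj: T) -> T {
    dry(freeze(obj))
}

// The assert helper delegates to `freeze_dry` instead of repeating
// the freeze/dry pair inline.
fn freeze_dry_assert<T: RoundTrip + Debug + Clone + PartialEq>(obj: T) {
    let dried = freeze_dry(obj.clone());
    assert_eq!(obj, dried);
}
```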
##########
lang/rust/avro_derive/tests/derive.rs:
##########
@@ -0,0 +1,409 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use apache_avro::schema::{AvroSchema, AvroSchemaWithResolved};
+use apache_avro::{from_value, Reader, Schema, Writer};
+use avro_derive::*;
+use serde::de::DeserializeOwned;
+use serde::ser::Serialize;
+use std::collections::HashMap;
+
+#[macro_use]
+extern crate serde;
+
+#[cfg(test)]
+mod test_derive {
+ use std::{
+ borrow::{Borrow, Cow},
+ sync::Mutex,
+ };
+
+ use super::*;
+
+ /// Takes in a type that implements the right combination of traits and runs it through a Serde Cycle and asserts the result is the same
+ fn freeze_dry_assert<T>(obj: T)
+ where
+ T: std::fmt::Debug + Serialize + DeserializeOwned + AvroSchema + Clone + PartialEq,
+ {
+ let encoded = freeze(obj.clone());
+ let dried: T = dry(encoded);
+ assert_eq!(obj, dried);
+ }
+
+ fn freeze_dry<T>(obj: T) -> T
+ where
+ T: Serialize + DeserializeOwned + AvroSchema,
+ {
+ dry(freeze(obj))
+ }
+
+ // serialize
+ fn freeze<T>(obj: T) -> Vec<u8>
+ where
+ T: Serialize + AvroSchema,
+ {
+ let schema = T::get_schema();
+ let mut writer = Writer::new(&schema, Vec::new());
+ if let Err(e) = writer.append_ser(obj) {
+ panic!("{}", e.to_string());
+ }
+ writer.into_inner().unwrap()
+ }
+
+ // deserialize
+ fn dry<T>(encoded: Vec<u8>) -> T
+ where
+ T: DeserializeOwned + AvroSchema,
+ {
+ assert!(!encoded.is_empty());
+ let schema = T::get_schema();
+ let reader = Reader::with_schema(&schema, &encoded[..]).unwrap();
+ for res in reader {
+ match res {
+ Ok(value) => {
+ return from_value::<T>(&value).unwrap();
+ }
+ Err(e) => panic!("{}", e.to_string()),
Review Comment:
```suggestion
Err(e) => panic!("{:?}", e),
```
##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type, TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
+ let mut input = parse_macro_input!(input as DeriveInput);
+ derive_avro_schema(&mut input)
+ .unwrap_or_else(to_compile_errors)
+ .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream, Vec<syn::Error>> {
+ let namespace = get_namespace_from_attributes(&input.attrs)?;
+ let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+ .into_iter()
+ .flatten()
+ .collect::<Vec<String>>()
+ .join(".");
+ let schema_def = match &input.data {
+ syn::Data::Struct(s) => {
+ get_data_struct_schema_def(&full_schema_name, s, input.ident.span())?
+ }
+ syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e, input.ident.span())?,
+ _ => {
+ return Err(vec![Error::new(
+ input.ident.span(),
+ "AvroSchema derive only works for structs and simple enums ",
+ )])
+ }
+ };
+
+ let ty = &input.ident;
+ let (impl_generics, ty_generics, where_clause) = input.generics.split_for_impl();
+ Ok(quote! {
+ impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for #ty #ty_generics #where_clause {
+ fn get_schema_with_resolved(resolved_schemas: &mut HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) -> apache_avro::schema::Schema {
+ let name = apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to parse schema name {}", #full_schema_name)[..]);
+ if resolved_schemas.contains_key(&name) {
+ resolved_schemas.get(&name).unwrap().clone()
+ }else {
+ resolved_schemas.insert(name.clone(), Schema::Ref{name: name.clone()});
+ #schema_def
+ }
+ }
+ }
+ })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) -> Result<Option<String>, Vec<Error>> {
+ let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote! {namespace}).unwrap();
+ const NAMESPACE_PARSING_ERROR_CONSTANST: &str =
+ "Namespace attribute must be in form #[namespace = \"com.testing.namespace\"]";
+ // parse out namespace if present. Requires strict syntax
+ for attr in attrs {
+ if namespace_attr_path_constant == attr.path {
+ let mut input_tokens = attr.tokens.clone().into_iter();
+ if let (
+ Some(TokenTree::Punct(punct)),
+ Some(TokenTree::Literal(namespace_literal)),
+ None,
+ ) = (
+ input_tokens.next(),
+ input_tokens.next(),
+ input_tokens.next(),
+ ) {
+ if punct.as_char() == '=' {
+ if let Lit::Str(lit_str) = Lit::new(namespace_literal) {
+ return Ok(Some(lit_str.value()));
+ }
+ }
+ }
+ return Err(vec![Error::new_spanned(
+ &attr.tokens,
+ NAMESPACE_PARSING_ERROR_CONSTANST,
+ )]);
+ }
+ }
+ Ok(None)
+}
+
+fn get_data_struct_schema_def(
+ full_schema_name: &str,
+ s: &syn::DataStruct,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ let mut record_field_exprs = vec![];
+ match s.fields {
+ syn::Fields::Named(ref a) => {
+ for (position, field) in a.named.iter().enumerate() {
+ let name = field.ident.as_ref().unwrap().to_string(); // we know everything has a name
+ let schema_expr = type_to_schema_expr(&field.ty)?;
+ let position = position;
+ record_field_exprs.push(quote! {
+ apache_avro::schema::RecordField {
+ name: #name.to_string(),
+ doc: Option::None,
+ default: Option::None,
+ schema: #schema_expr,
+ order: apache_avro::schema::RecordFieldOrder::Ignore,
+ position: #position,
+ }
+ });
+ }
+ }
+ syn::Fields::Unnamed(_) => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for tuple structs",
+ )])
+ }
+ syn::Fields::Unit => {
+ return Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for unit structs",
+ )])
+ }
+ }
+ Ok(quote! {
+ let schema_fields = vec![#(#record_field_exprs),*];
+ let name = apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to struct name for schema {}", #full_schema_name)[..]);
+ apache_avro::schema::record_schema_for_fields(name, None, None, schema_fields)
+ })
+}
+
+fn get_data_enum_schema_def(
+ full_schema_name: &str,
+ e: &syn::DataEnum,
+ error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+ if e.variants.iter().all(|v| syn::Fields::Unit == v.fields) {
+ let symbols: Vec<String> = e
+ .variants
+ .iter()
+ .map(|varient| varient.ident.to_string())
+ .collect();
+ Ok(quote! {
+ apache_avro::schema::Schema::Enum {
+ name: apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to parse enum name for schema {}", #full_schema_name)[..]),
+ aliases: None,
+ doc: None,
+ symbols: vec![#(#symbols.to_owned()),*]
+ }
+ })
+ } else {
+ Err(vec![Error::new(
+ error_span,
+ "AvroSchema derive does not work for enums with non unit structs",
+ )])
+ }
+}
+
+/// Takes in the Tokens of a type and returns the tokens of an expression with return type `Schema`
+fn type_to_schema_expr(ty: &Type) -> Result<TokenStream, Vec<Error>> {
+ if let Type::Path(p) = ty {
+ let type_string = p.path.segments.last().unwrap().ident.to_string();
+
+ let schema = match &type_string[..] {
+ "bool" => quote! {Schema::Boolean},
+ "i8" | "i16" | "i32" | "u8" | "u16" => quote! {apache_avro::schema::Schema::Int},
+ "i64" => quote! {apache_avro::schema::Schema::Long},
+ "f32" => quote! {apache_avro::schema::Schema::Float},
+ "f64" => quote! {apache_avro::schema::Schema::Double},
+ "String" | "str" => quote! {apache_avro::schema::Schema::String},
+ "char" => {
+ return Err(vec![Error::new_spanned(
+ ty,
+ "AvroSchema: Cannot guarentee sucessful deserialization of this type",
+ )])
+ }
+ "u32" | "u64" => {
+ return Err(vec![Error::new_spanned(
+ ty,
+ "Cannot guarentee sucessful serialization of this type due to overflow concerns",
+ )])
+ } //Can't guarentee serialization type
+ _ => {
+ // Fails when the type does not implement AvroSchemaWithResolved directly or covered by blanket implementation
+ // TODO check and error report with something like https://docs.rs/quote/1.0.15/quote/macro.quote_spanned.html#example
+ type_path_schema_expr(p)
+ }
+ };
+ Ok(schema)
+ } else if let Type::Array(ta) = ty {
+ let inner_schema_expr = type_to_schema_expr(&ta.elem)?;
+ Ok(quote! {apache_avro::schema::Schema::Array(Box::new(#inner_schema_expr))})
+ } else if let Type::Reference(tr) = ty {
+ type_to_schema_expr(&tr.elem)
+ } else {
+ Err(vec![Error::new_spanned(
+ ty,
+ format!("Unable to generate schema for type: {:?}", ty),
+ )])
+ }
+}
+
+/// Generates the schema def expression for fully qualified type paths using the associated function
+/// - `A -> <A as AvroSchemaWithResolved>::get_schema_with_resolved()`
+/// - `A<T> -> <A<T> as AvroSchemaWithResolved>::get_schema_with_resolved()`
+fn type_path_schema_expr(p: &TypePath) -> TokenStream {
+ quote! {<#p as apache_avro::schema::AvroSchemaWithResolved>::get_schema_with_resolved(resolved_schemas)}
+}
+
+/// Stolen from serde
+fn to_compile_errors(errors: Vec<syn::Error>) -> proc_macro2::TokenStream {
+ let compile_errors = errors.iter().map(syn::Error::to_compile_error);
+ quote!(#(#compile_errors)*)
+}
+
+#[cfg(test)]
+mod tests {
+ // Note this useful idiom: importing names from outer (for mod tests) scope.
+ use super::*;
+ #[test]
+ fn basic_case() {
+ let test_struct = quote! {
+ struct A {
+ a: i32,
+ b: String
+ }
+ };
+
+ match syn::parse2::<DeriveInput>(test_struct) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_ok())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to. Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn tuple_struct_unsupported() {
+ let test_tuple_struct = quote! {
+ struct B (i32, String);
+ };
+
+ match syn::parse2::<DeriveInput>(test_tuple_struct) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_err())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to. Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn unit_struct_unsupported() {
+ let test_tuple_struct = quote! {
+ struct AbsoluteUnit;
+ };
+
+ match syn::parse2::<DeriveInput>(test_tuple_struct) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_err())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to. Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn optional_type_generating() {
+ let stuct_with_optional = quote! {
+ struct Test4 {
+ a : Option<i32>
+ }
+ };
+ match syn::parse2::<DeriveInput>(stuct_with_optional) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_ok())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to. Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn test_basic_enum() {
+ let basic_enum = quote! {
+ enum Basic {
+ A,
+ B,
+ C,
+ D
+ }
+ };
+ match syn::parse2::<DeriveInput>(basic_enum) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_ok())
+ }
+ Err(error) => panic!(
+ "Failied to parse as derive input when it should be able to. Error: {:?}",
+ error
+ ),
+ };
+ }
+
+ #[test]
+ fn test_namespace() {
+ let test_struct = quote! {
+ #[namespace = "namespace.testing"]
+ struct A {
+ a: i32,
+ b: String
+ }
+ };
+
+ match syn::parse2::<DeriveInput>(test_struct) {
+ Ok(mut input) => {
+ assert!(derive_avro_schema(&mut input).is_ok())
Review Comment:
Can we assert that the namespace is used in the derived schema (TokenStream)?
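A std-only sketch of the property such an assertion would check. The helper `full_schema_name` is a hypothetical name; its body copies the name-joining expression from `derive_avro_schema` so the expected value can be tested in isolation. In the actual test one could presumably render the returned `TokenStream` with `to_string()` and check that it contains `namespace.testing.A`; that exact form is an assumption, not something the PR shows.

```rust
// Hypothetical helper mirroring the expression in `derive_avro_schema`:
// the optional namespace and the type identifier are joined with ".".
fn full_schema_name(namespace: Option<String>, ident: &str) -> String {
    vec![namespace, Some(ident.to_string())]
        .into_iter()
        .flatten() // drops the namespace when it is `None`
        .collect::<Vec<String>>()
        .join(".")
}
```

With `#[namespace = "namespace.testing"]` on `struct A`, the derived schema name should therefore be `namespace.testing.A`, and plain `A` when the attribute is absent.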
Issue Time Tracking
-------------------
Worklog Id: (was: 754480)
Remaining Estimate: 0h
Time Spent: 10m
> [rust] Derive Avro Schema macro
> -------------------------------
>
> Key: AVRO-3479
> URL: https://issues.apache.org/jira/browse/AVRO-3479
> Project: Apache Avro
> Issue Type: Improvement
> Reporter: Jack Klamer
> Assignee: Jack Klamer
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The tracking issue for the Avro derive feature of the Rust SDK.
> Proposal (copied from email):
> Have another Rust crate that is importable as a feature on the main crate (in
> the same manner as serde derive), that will provide a derive proc_macro that
> implements a simple trait that returns the schema for the implementing type.
> Right now, schemas must be parsed from strings (or read from files first),
> and closely coordinated with the associated struct. This makes sense for
> workflows that need to associate the same type across languages. For programs
> that are all within Rust, there are usability advantages of the proc_macro.