[ 
https://issues.apache.org/jira/browse/AVRO-3479?focusedWorklogId=754897&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-754897
 ]

ASF GitHub Bot logged work on AVRO-3479:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Apr/22 05:30
            Start Date: 09/Apr/22 05:30
    Worklog Time Spent: 10m 
      Work Description: jklamer commented on code in PR #1631:
URL: https://github.com/apache/avro/pull/1631#discussion_r846569639


##########
lang/rust/avro_derive/src/lib.rs:
##########
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use proc_macro2::{Span, TokenStream, TokenTree};
+use quote::quote;
+
+use syn::{parse_macro_input, Attribute, DeriveInput, Error, Lit, Path, Type, 
TypePath};
+
+#[proc_macro_derive(AvroSchema, attributes(namespace))]
+// Templated from Serde
+pub fn proc_macro_derive_avro_schema(input: proc_macro::TokenStream) -> 
proc_macro::TokenStream {
+    let mut input = parse_macro_input!(input as DeriveInput);
+    derive_avro_schema(&mut input)
+        .unwrap_or_else(to_compile_errors)
+        .into()
+}
+
+fn derive_avro_schema(input: &mut DeriveInput) -> Result<TokenStream, 
Vec<syn::Error>> {
+    let namespace = get_namespace_from_attributes(&input.attrs)?;
+    let full_schema_name = vec![namespace, Some(input.ident.to_string())]
+        .into_iter()
+        .flatten()
+        .collect::<Vec<String>>()
+        .join(".");
+    let schema_def = match &input.data {
+        syn::Data::Struct(s) => {
+            get_data_struct_schema_def(&full_schema_name, s, 
input.ident.span())?
+        }
+        syn::Data::Enum(e) => get_data_enum_schema_def(&full_schema_name, e, 
input.ident.span())?,
+        _ => {
+            return Err(vec![Error::new(
+                input.ident.span(),
+                "AvroSchema derive only works for structs and simple enums ",
+            )])
+        }
+    };
+
+    let ty = &input.ident;
+    let (impl_generics, ty_generics, where_clause) = 
input.generics.split_for_impl();
+    Ok(quote! {
+        impl #impl_generics apache_avro::schema::AvroSchemaWithResolved for 
#ty #ty_generics #where_clause {
+            fn get_schema_with_resolved(resolved_schemas: &mut 
HashMap<apache_avro::schema::Name, apache_avro::schema::Schema>) -> 
apache_avro::schema::Schema {
+                let name =  
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to 
parse schema name {}", #full_schema_name)[..]);
+                if resolved_schemas.contains_key(&name) {
+                    resolved_schemas.get(&name).unwrap().clone()
+                }else {
+                    resolved_schemas.insert(name.clone(), Schema::Ref{name: 
name.clone()});
+                    #schema_def
+                }
+            }
+        }
+    })
+}
+
+fn get_namespace_from_attributes(attrs: &[Attribute]) -> 
Result<Option<String>, Vec<Error>> {
+    let namespace_attr_path_constant: Path = syn::parse2::<Path>(quote! 
{namespace}).unwrap();
+    const NAMESPACE_PARSING_ERROR_CONSTANST: &str =
+        "Namespace attribute must be in form #[namespace = 
\"com.testing.namespace\"]";
+    // parse out namespace if present. Requires strict syntax
+    for attr in attrs {
+        if namespace_attr_path_constant == attr.path {
+            let mut input_tokens = attr.tokens.clone().into_iter();
+            if let (
+                Some(TokenTree::Punct(punct)),
+                Some(TokenTree::Literal(namespace_literal)),
+                None,
+            ) = (
+                input_tokens.next(),
+                input_tokens.next(),
+                input_tokens.next(),
+            ) {
+                if punct.as_char() == '=' {
+                    if let Lit::Str(lit_str) = Lit::new(namespace_literal) {
+                        return Ok(Some(lit_str.value()));
+                    }
+                }
+            }
+            return Err(vec![Error::new_spanned(
+                &attr.tokens,
+                NAMESPACE_PARSING_ERROR_CONSTANST,
+            )]);
+        }
+    }
+    Ok(None)
+}
+
+fn get_data_struct_schema_def(
+    full_schema_name: &str,
+    s: &syn::DataStruct,
+    error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+    let mut record_field_exprs = vec![];
+    match s.fields {
+        syn::Fields::Named(ref a) => {
+            for (position, field) in a.named.iter().enumerate() {
+                let name = field.ident.as_ref().unwrap().to_string(); // we 
know everything has a name
+                let schema_expr = type_to_schema_expr(&field.ty)?;
+                let position = position;
+                record_field_exprs.push(quote! {
+                    apache_avro::schema::RecordField {
+                            name: #name.to_string(),
+                            doc: Option::None,
+                            default: Option::None,
+                            schema: #schema_expr,
+                            order: 
apache_avro::schema::RecordFieldOrder::Ignore,
+                            position: #position,
+                        }
+                });
+            }
+        }
+        syn::Fields::Unnamed(_) => {
+            return Err(vec![Error::new(
+                error_span,
+                "AvroSchema derive does not work for tuple structs",
+            )])
+        }
+        syn::Fields::Unit => {
+            return Err(vec![Error::new(
+                error_span,
+                "AvroSchema derive does not work for unit structs",
+            )])
+        }
+    }
+    Ok(quote! {
+        let schema_fields = vec![#(#record_field_exprs),*];
+        let name = 
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to 
parse struct name for schema {}", #full_schema_name)[..]);
+        apache_avro::schema::record_schema_for_fields(name, None, None, 
schema_fields)
+    })
+}
+
+fn get_data_enum_schema_def(
+    full_schema_name: &str,
+    e: &syn::DataEnum,
+    error_span: Span,
+) -> Result<TokenStream, Vec<Error>> {
+    if e.variants.iter().all(|v| syn::Fields::Unit == v.fields) {
+        let symbols: Vec<String> = e
+            .variants
+            .iter()
+            .map(|varient| varient.ident.to_string())
+            .collect();
+        Ok(quote! {
+            apache_avro::schema::Schema::Enum {
+                name: 
apache_avro::schema::Name::new(#full_schema_name).expect(&format!("Unable to 
parse enum name for schema {}", #full_schema_name)[..]),
+                aliases: None,
+                doc: None,
+                symbols: vec![#(#symbols.to_owned()),*]
+            }
+        })
+    } else {
+        Err(vec![Error::new(
+            error_span,
+            "AvroSchema derive does not work for enums with non-unit variants",
+        )])
+    }
+}
+
+/// Takes in the Tokens of a type and returns the tokens of an expression with 
return type `Schema`
+fn type_to_schema_expr(ty: &Type) -> Result<TokenStream, Vec<Error>> {
+    if let Type::Path(p) = ty {
+        let type_string = p.path.segments.last().unwrap().ident.to_string();
+
+        let schema = match &type_string[..] {
+            "bool" => quote! {Schema::Boolean},
+            "i8" | "i16" | "i32" | "u8" | "u16" => quote! 
{apache_avro::schema::Schema::Int},
+            "i64" => quote! {apache_avro::schema::Schema::Long},

Review Comment:
   The serde implementation we currently have for `u32` is:
   ```
   fn serialize_u32(self, v: u32) -> Result<Self::Ok, Self::Error> {
       if v <= i32::MAX as u32 {
           self.serialize_i32(v as i32)
       } else {
           self.serialize_i64(i64::from(v))
       }
   }
   ```
   Because the serializer picks `int` or `long` depending on the value, the 
schema would have to be value-dependent, so I can't derive a single schema for 
`u32` that is guaranteed to work. The exception would be always emitting a union 
of `["int", "long"]`, but that felt like unexpected behavior. Having the user 
fall back to a manual schema definition felt better. There are several ways we 
could handle this; what do you think?
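
   To make the value dependence concrete, here is a minimal standalone sketch 
of the same branch. It is not code from the PR or from `apache_avro`: the 
`AvroPrimitive` enum and the `avro_primitive_for_u32` function are hypothetical 
names used only for illustration, and the sketch assumes the serializer behaves 
exactly as quoted above.

```rust
/// Which Avro primitive a `u32` value ends up encoded as.
/// Hypothetical type for illustration; not part of `apache_avro`.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum AvroPrimitive {
    Int,
    Long,
}

/// Mirrors the branch in the quoted `serialize_u32`: values that fit in an
/// `i32` are written as Avro `int`, everything larger as Avro `long`.
/// Hypothetical helper for illustration only.
pub fn avro_primitive_for_u32(v: u32) -> AvroPrimitive {
    if v <= i32::MAX as u32 {
        AvroPrimitive::Int
    } else {
        AvroPrimitive::Long
    }
}
```

   Two values of the same Rust field type can therefore need different Avro 
types, so no single non-union schema chosen at compile time covers every `u32`.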





Issue Time Tracking
-------------------

    Worklog Id:     (was: 754897)
    Time Spent: 1h 20m  (was: 1h 10m)

> [rust] Derive Avro Schema macro
> -------------------------------
>
>                 Key: AVRO-3479
>                 URL: https://issues.apache.org/jira/browse/AVRO-3479
>             Project: Apache Avro
>          Issue Type: Improvement
>            Reporter: Jack Klamer
>            Assignee: Jack Klamer
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The tracking Issue for the Avro Derive Feature of the rust SDK. 
> Proposal (copied from email):
> Have another rust crate that is importable as a feature on the main crate (in 
> the same manner as serde derive), that will provide a derive proc_macro that 
> implements a simple trait that returns the schema for the implementing type. 
> Right now, schemas must be parsed from strings ( or read from files first), 
> and closely coordinated with the associated struct. This makes sense for 
> workflows that need to associate the same type across languages. For programs 
> that are all within Rust, there are usability advantages of the proc_macro. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
