[ https://issues.apache.org/jira/browse/AVRO-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571895#comment-17571895 ]
Christophe Le Saec commented on AVRO-3532:
------------------------------------------

For C (as in Java), the null character should stay forbidden.
I tested accented characters, and they do not work as expected because of the [is_avro_id|https://github.com/apache/avro/blob/master/lang/c/src/schema.c#L49] function.
I wonder if we could use an external library like [ICU|https://unicode-org.github.io/icu/], with a function like [u_isUAlphabetic()|https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/uchar_8h.html#a063b8b8c01c1c8246682dd81dd46da00].
If yes, how do we add such a dependency in C? (What is the Maven equivalent ;) ?)

> Align naming rules on code
> --------------------------
>
>                 Key: AVRO-3532
>                 URL: https://issues.apache.org/jira/browse/AVRO-3532
>             Project: Apache Avro
>          Issue Type: Wish
>            Reporter: Christophe Le Saec
>            Priority: Major
>
> The description of the [naming rule in the documentation|https://avro.apache.org/docs/current/spec.html#names] is
> {noformat}
> - start with [A-Za-z_]
> - subsequently contain only [A-Za-z0-9_]
> {noformat}
> But the [Java code|https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/Schema.java#L1578] uses the Character.isLetter method:
> {code:java}
> char first = name.charAt(0);
> if (!(Character.isLetter(first) || first == '_'))
>   throw new SchemaParseException("Illegal initial character: " + name);
> for (int i = 1; i < length; i++) {
>   char c = name.charAt(i);
>   if (!(Character.isLetterOrDigit(c) || c == '_'))
>     throw new SchemaParseException("Illegal character in: " + name);
> }
> return name;
> {code}
> This method accepts accented characters (éùàçË ...) and also Chinese characters (我) ...
> So the aim of this ticket is to see whether we can update the documentation, and whether the other implementations (Rust, C# ...) are compatible with this behaviour as well.
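
To make the gap between the documented rule and the quoted Java check concrete, here is a minimal sketch that runs both on a few names. The class NameRules and its two helper methods are hypothetical names for illustration and are not part of Avro; the regular expression is only a transcription of the two documented bullet points, and returning false for an empty name is an assumption of this sketch.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of Avro.
final class NameRules {
    // The rule as documented: start with [A-Za-z_], then only [A-Za-z0-9_].
    private static final Pattern SPEC_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static boolean isSpecName(String name) {
        return SPEC_NAME.matcher(name).matches();
    }

    // Same character tests as the quoted Schema.java snippet, but returning
    // a boolean instead of throwing SchemaParseException.
    static boolean isJavaCheckName(String name) {
        if (name.isEmpty()) {
            return false; // assumption: an empty name is treated as invalid here
        }
        char first = name.charAt(0);
        if (!(Character.isLetter(first) || first == '_')) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!(Character.isLetterOrDigit(c) || c == '_')) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "\u00e9t\u00e9" is an accented word, "\u6211" is a Chinese character.
        String[] names = {"user_id", "\u00e9t\u00e9", "\u6211", "9lives"};
        for (String name : names) {
            System.out.println(name + " spec=" + isSpecName(name)
                    + " java=" + isJavaCheckName(name));
        }
    }
}
```

With this sketch, the accented and Chinese names pass the Character.isLetter based check but fail the documented pattern, while a plain ASCII name passes both and a name starting with a digit fails both.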