Thanks for sharing the URL https://langsec.org/occupy/, it is a very
interesting site.

*internal aspects like unicode normalisation* => Yes, this may be the main
argument for keeping the specification as it is and fixing the validation
code. Does anyone know how the Java compiler validates names (*for methods,
variables ...*), given that it accepts Unicode names? Could that be a
solution for Avro?
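
For reference, as far as I know the Java language defines identifiers through
Character.isJavaIdentifierStart and Character.isJavaIdentifierPart. Below is a
minimal sketch that compares that rule with the name rule from the Avro
specification ([A-Za-z_][A-Za-z0-9_]*), and shows why normalisation matters.
The class NameChecks and its method names are hypothetical, made up for this
mail; this is not how javac or the Avro SDK is actually implemented.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Avro or the JDK.
final class NameChecks {
    // Name rule as written in the Avro specification.
    private static final Pattern AVRO_SPEC_NAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static boolean isAvroSpecName(String name) {
        return AVRO_SPEC_NAME.matcher(name).matches();
    }

    // Per-code-point check based on the JDK's identifier predicates.
    // Assumption: reserved keywords are not excluded in this sketch.
    static boolean isJavaIdentifier(String name) {
        if (name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < name.length(); ) {
            int cp = name.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    // Normalise to NFC so that visually identical names compare equal.
    static String nfc(String name) {
        return Normalizer.normalize(name, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String composed = "pr\u00e9nom";    // accented e as a single code point
        String decomposed = "pre\u0301nom"; // plain e followed by a combining accent

        System.out.println(isAvroSpecName(composed));         // false
        System.out.println(isJavaIdentifier(composed));       // true
        System.out.println(isJavaIdentifier("first name"));   // false
        System.out.println(composed.equals(decomposed));      // false
        System.out.println(nfc(decomposed).equals(composed)); // true
    }
}
```

The last two lines are the normalisation point: both strings look the same on
screen, but they only compare equal after normalisation, so an SDK that
accepts Unicode names would have to decide whether and where to normalise.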

Wouldn't these two arguments also apply to properties (names and values,
when they are Strings) held in the JsonProperties class, which is the parent
of the Schema and Field classes?

So why leave one unchecked (*property names and values*) and apply
restrictive checks to the other (*field names*)?

Best regards,
Christophe.

On Wed, 23 Nov 2022 at 20:20, Ryan Skraba <[email protected]> wrote:

> Hello!  I have a specific opinion about the "Robustness Principle",
> especially in this case!
>
> "Accepting liberally, generating strictly" (the paraphrasing of
> Postel's idea) has its place, and might be a good principle for
> binary encoding and decoding.  It's not so great for "accepting and
> generating schemas".  In this case, it's led directly to this debate:
> accepting "invalid" names has become a de facto standard for a
> _certain_ category of users, who are now blocked from interoperating
> with other language SDKs (and potentially future versions of their own
> SDK, if "fixed").
>
> > If we can use a "non rigorous validation" and
> > it can run without bugs, why switch to a rigorous validation mode that would
> > follow current specification and not change the specification to "accept
> > schemas as liberally as possible" (meaning, while it doesn't generate bugs).
>
> Here's where I think the logic is faulty: even if we don't count
> interoperability failures as a bug, I'm not convinced that using names
> outside the specification runs without bugs!  There are several things to
> think about: internal aspects like unicode normalisation, internal
> features like schema evolution (which might actually be OK), but
> especially external ones like downstream projects and tools.
>
> As it is, if you follow the specification, Java and Python are
> interoperable and there's a certain guarantee that existing libraries
> and projects can count on.
>
> The configuration approach is one that would allow upstream projects
> to continue working with out-of-spec names, while alerting them that
> these could cause interoperability problems outside of their current
> cases!  One thing for certain, the specification should allow invalid
> names for "aliases" so that users can migrate away from these issues.
>
> A slightly related resource: https://langsec.org/occupy/
>
> All my best, Ryan
>
>
> On Tue, Nov 15, 2022 at 4:41 PM Christophe Le Saëc <[email protected]>
> wrote:
> >
> > AVRO-2659 <https://issues.apache.org/jira/browse/AVRO-2659> indeed shows
> > that names should not contain spaces (a problem when generating Java code)
> > nor dots (a problem when separating names in a path).
> >
> > AVRO-1022 <https://issues.apache.org/jira/browse/AVRO-1022>: the last
> > comment reinforces the rules for dots and contains a nice principle: "accept
> > schemas as liberally as possible"
> >
> >
> > *allowing two language SDKs to implement the spec differently will make
> > users unhappy about cross-platform, cross-language compatibility.*
> > -> Indeed, that's the case with the current version, where Java and C#
> > accept accents while C and Rust strictly follow the spec.
> >
> > Other possibilities:
> > - *putting human-readable or internationalised names in other metadata
> > properties*: Yes, this can already be done on record fields, for example,
> > as Field is a JsonProperties class (and we already use it in some cases; it
> > helps).
> > - *using configuration / environment / system properties to turn rigorous
> > spec validation on and off*: If we can use a "non rigorous validation" and
> > it can run without bugs, why switch to a rigorous validation mode that
> > would follow the current specification, rather than change the
> > specification to "accept schemas as liberally as possible" (meaning, as
> > long as it doesn't generate bugs)?
> >
> >
> >
> > *My preference would be to *tighten* the SDKs to match the existing Avro
> > spec, and provide language-specific ways to easily disable validating names
> > if desired*
> > Personally, I like the idea of a mandatory name check that you can't
> > deactivate, to ensure names won't generate bugs (mainly for Java code
> > generation and to be able to separate name and namespace), but the check in
> > the specification should be limited to banning names that would generate a
> > bug (and not include a rule that seems to have no real reason, unless it is
> > explained in the doc).
> >
> > Best regards,
> > Christophe.
> >
> >
> >
> > On Thu, 10 Nov 2022 at 19:15, Ryan Skraba <[email protected]> wrote:
> >
> > > Hello!  Here's a couple of related JIRA from the past that we can use
> > > to inform our discussion:
> > >
> > > * AVRO-2659 demonstrates a pretty disastrous schema name that the Java
> > > SDK accepts.
> > > * AVRO-1022 (10 years ago!) has a somewhat tepid discussion about
> > > accepting UTF-8 that didn't quite get enough follow-up to make it into
> > > the spec!
> > >
> > > We've been in the current (unsatisfactory) state for so long because:
> > >
> > > * making a change to an SDK changing its behaviour (even to "fix it")
> > > will make users unhappy about backwards/forwards version
> > > compatibility, and
> > > * allowing two language SDKs to implement the spec differently will
> > > make users unhappy about cross-platform, cross-language compatibility.
> > >
> > > In my opinion, with modern streaming and event processing, we have to
> > > take the latter into account!
> > >
> > > There were a couple of other options than the two you propose in the
> > > original discussion thread (such as putting human-readable or
> > > internationalised names in other metadata properties, or using
> > > configuration / environment / system properties to turn rigorous spec
> > > validation on and off).  Have you given them any consideration for
> > > your use case?
> > >
> > > My preference would be to *tighten* the SDKs to match the existing
> > > Avro spec, and provide language-specific ways to easily disable
> > > validating names if desired.  There's some precedent for this in the
> > > Schema.Parser#validate method.
> > >
> > > There's a bit more going on here that's worth doing right for the future!
> > >
> > > All my best, Ryan
> > >
> > >
> > >
> > > On Thu, Nov 10, 2022 at 4:53 PM Oscar Westra van Holthe - Kind
> > > <[email protected]> wrote:
> > > >
> > > > On Thu, 10 Nov 2022 at 08:55, Christophe Le Saëc <[email protected]>
> > > > wrote:
> > > >
> > > > > So, discussion is to choose between
> > > > >
> > > > >    1. "change the documentation" (and adapt modules as proposed in
> > > > >    this PR for Rust <https://github.com/apache/avro/pull/1787> and
> > > > >    this other one for C <https://github.com/apache/avro/pull/1798>)
> > > > >    2. change the code (in Java and C# at least) to conform to the
> > > > >    documentation.
> > > > >
> > > >
> > > > For compatibility, I like option 1. If we're to change naming rules,
> > > > I'd vote for logging warnings before tightening the rules.
> > > >
> > > > Kind regards,
> > > > Oscar
> > > >
> > > > --
> > > >
> > > > ✉️ Oscar Westra van Holthe - Kind <[email protected]>
> > >
>
