>
> Thanks for the extensive response! I think a lot of what you are saying
> is very spot on!
>

I hope it's useful. If and when your paper is published, I'd love to give
it a read.


> I'm working on a paper surveying ~13 binary schema-less and
> schema-driven serialization formats (including Avro) that can handle any
> data structure that JSON can represent. Therefore, I was particularly
> interested on why you wanted to convert JSON Schema to Avro IDL.
>

So, to be precise: GraphQL's IDL (interface definition language) isn't JSON
Schema per se. The responses are usually represented as JSON, but the
schema language itself is something different.

For us the use cases are very different, even if Avro's and GraphQL's IDLs
could _almost_ be losslessly interchanged at some level: they both have a
decent type system, and they both allow the definition of RPC services.
GraphQL is a great candidate for public APIs (it has "directives", and
really nice annotation and documentation generation tools), and Avro is
ideal for our internal APIs. An Avro payload for us runs ~12-30 bytes, where
JSON would be at least 2-3x the size (we send a lot of very small, very
similar messages, so JSON reserializing the keys every time would kill us).
So Avro gives us something nothing JSON-oriented can. We also use Avro as
our archival format, using Object Container Files
(https://avro.apache.org/docs/current/spec.html#Object+Container+Files),
which I believe is also sort of unique.
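
To illustrate the size difference, here is a minimal sketch in Python that
hand-rolls Avro's binary encoding for one small record. The event, its field
names and its values are made up for illustration (it is not one of our real
messages), and in practice you would use an Avro library rather than this.

```python
import json
import struct


def encode_long(n: int) -> bytes:
    """Avro int/long: zigzag-encode, then emit 7 bits per byte, low bits first."""
    z = (n << 1) ^ (n >> 63)  # zigzag mapping for the 64-bit range
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set: more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Avro string: byte length as a long, followed by the UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_double(x: float) -> bytes:
    """Avro double: 8 bytes, IEEE 754, little-endian."""
    return struct.pack("<d", x)


# Hypothetical event; an Avro record is just its field values in schema
# order, with no field names in the payload.
event = {"sensor_id": 42, "status": "ok", "reading": 21.5, "ts": 1611675498}

avro_bytes = (
    encode_long(event["sensor_id"])
    + encode_string(event["status"])
    + encode_double(event["reading"])
    + encode_long(event["ts"])
)
json_bytes = json.dumps(event, separators=(",", ":")).encode("utf-8")

print(len(avro_bytes), "bytes as Avro,", len(json_bytes), "bytes as JSON")
```

The saving comes from the schema carrying the field names and types, so only
the values go on the wire; JSON repeats every key in every message.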


> Is JSON Schema the ubiquitous "contract" language that you are using in
> your company, so you want to keep it as the source of truth while also
> being able to work with Avro?
>

It's just public vs. private (or internal) APIs. We are deliberate about
storing those IDL files in separate repositories, and about training teams
to get into the habit of planning and co-designing changes to these
ubiquitous contracts before they need to do implementation work, since
changing the contract affects everyone (be that some "near realtime" RPC
service in the hot path of customer requests on the web API, or offline
processing by our BI teams, who run reports based on the archived data from
the data warehouse).

The company just went through explosive growth, so, whilst we
adopted/inherited Avro as part of adopting JVM/Akka stuff for some parts of
the infra, the pivot to nominate these IDLs as the point at which teams
have to synchronize and coordinate is still something we are building out.


> On Tue, Jan 26, 2021 at 04:38:18PM +0100, Lee Hambley wrote:
> > I would say that in general, being around the industry for 15 or so years
> > now, that there has been a definite uptake in these binary protocols.
> >
> > If I had to speculate, I'd say that, a few niches aside, ASN.1 and
> > similar protocols never *really* took off outside telecoms, which is
> > regrettable because they are really fantastic protocols (they are used
> > extensively in certificates, DER/PEM are in the ASN.1 family of things,
> > SSL certs are all ASN.1 encoded, usually, etc.)
> >
> > These days seems like everyone has some "big data" pipe, and having
> > Hadoop/Spark/etc has become the must-have thing in most SMEs, so you
> > inherit some of these things by "accident".
> >
> > I personally come from the event-sourcing, CQRS, domain-driven-design
> > circles; here, having a ubiquitous language "contract", preferably a
> > bullet-proof one with good change management tooling, is something that
> > you explicitly go looking for. In that sphere you come across msgpack,
> > capnproto, protobufs, thrift, etc., which all offer insane performance
> > and very compact payloads, but Avro is unique in offering something like
> > a schema registry and concrete guarantees about coordinating rolling
> > deploys between producers and consumers (note: I _think_ protobufs got
> > something like a schema registry now, but I never used it)
> >
> > Another increasingly good option for this in the "SDL" (schema definition
> > language) spec space is GraphQL which isn't a _binary_ packing format,
> > but does offer a standalone schema definition language for defining
> > service contracts. Whilst Avro does account for RPC protocols
> > <https://avro.apache.org/docs/current/spec.html#Protocol+Declaration>, I
> > haven't really seen that used so much in the wild, but maybe that's just
> > my "bubble" speaking. GraphQL doesn't *really* have the schema migration
> > tools that Avro has, but at least when dealing with GraphQL payloads, most
> > language implementations give you the underlying syntax tree for the
> > payload, so it's a bit easier to see what clients are requesting and what
> > fields need various levels of scrutiny before being changed.
> >
> > Anyway, probably none of this is really interesting to your paper, but
> > I never miss a good opportunity to share unsolicited opinions :D
> >
> > Lee Hambley
> > http://lee.hambley.name/
> > +49 (0) 170 298 5667
> >
> >
> > On Tue, 26 Jan 2021 at 16:27, Juan Cruz Viotti <j...@jviotti.com> wrote:
> >
> > > > I don't mean to make light of your question, just to point out that I
> > > > don't think many companies are proudly announcing to the world that
> > > > they use Avro... why would they?
> > >
> > > Indeed, I totally agree. I'm writing a research paper involving Apache
> > > Avro and just wanted to enrich the historical sections a bit with some
> > > industry usage information!
> > >
> > > On Mon, Jan 25, 2021 at 10:40:31PM +0100, Lee Hambley wrote:
> > > > I work for two companies using Avro (contractor, I won't name them)
> > > > but I don't know what good it serves anyone knowing that we use them.
> > > > Would you ask the same question about JSON, or XML, or whether we use
> > > > nginx or apache?
> > > >
> > > > Avro is one of about 5 components in the distributed messaging
> > > > architectures, and besides that it is very nicely designed (I believe
> > > > the schema versioning and rigorously documented canonical forms are
> > > > an almost unique point of attraction)
> > > >
> > > > I don't mean to make light of your question, just to point out that I
> > > > don't think many companies are proudly announcing to the world that
> > > > they use Avro... why would they?
> > > >
> > > > Lee Hambley
> > > > http://lee.hambley.name/
> > > > +49 (0) 170 298 5667
> > > >
> > > >
> > > > On Mon, 25 Jan 2021 at 22:30, M. Manna <manme...@gmail.com> wrote:
> > > >
> > > > >
> > > > > I believe Confluent and Imply are the two companies I know of.
> > > > >
> > > > >
> > > > > On Mon, 25 Jan 2021 at 20:28, Juan Cruz Viotti <j...@jviotti.com> wrote:
> > > > >
> > > > >> Hey there!
> > > > >>
> > > > > >> Do you know where I can find a list of relatively well-known
> > > > > >> companies that make use of Apache Avro? I'm trying to collect a
> > > > > >> small list for research purposes and my search is not yielding
> > > > > >> many results apart from Facebook.
> > > > >>
> > > > >> Thanks in advance,
> > > > >>
> > > > >> --
> > > > >> Juan Cruz Viotti
> > > > >> Software Engineer
> > > > >> https://www.jviotti.com
> > > > >>
> > > > >
> > >
> > > --
> > > Juan Cruz Viotti
> > > Software Engineer
> > > https://www.jviotti.com
> > >
>
> --
> Juan Cruz Viotti
> Software Engineer
> https://www.jviotti.com
>
