Re: Manageable avro schema evolution in Java

2022-06-27 Thread Juan Cruz Viotti
Hi Niels,

Thanks a lot for sharing. Very interesting talk! Adding thumbs up :)

One comment around JSON Schema: in the talk you mention that JSON Schema
is still in beta given it is a draft.

While JSON Schema is a "draft" from the point of view of IETF, it is
already considered production-ready and the industry standard for
describing JSON documents. We hope to start publishing the JSON Schema
standard outside of IETF at some point to work around this common
perception problem!

We are starting to document use cases of JSON Schema in production on
YouTube:
https://www.youtube.com/playlist?list=PLHVhS4Tj1YZOrrvl7_a9LaBAtst7BWH8a.

-- 
Juan Cruz Viotti
Technical Lead @ Postman.com
https://www.jviotti.com


Papers discussing Apache Avro

2022-01-13 Thread Juan Cruz Viotti
Hey there!

As part of my MSc dissertation at the University of Oxford, I wrote and
published two papers: one covering the characteristics of various binary
serialization formats, including Apache Avro, and one performing a
space-efficiency benchmark.

Sharing them here in case anybody finds them interesting! The first
paper explains how Apache Avro works, including an annotated hexadecimal
example, and the second compares Apache Avro to various other popular
serialization formats.

- A Survey of JSON-compatible Binary Serialization Specifications:
  https://arxiv.org/abs/2201.02089
- A Benchmark of JSON-compatible Binary Serialization Specifications:
  https://arxiv.org/abs/2201.03051

The benchmark study found Apache Avro to be one of the most
space-efficient formats considered.

All the best!

-- 
Juan Cruz Viotti
Technical Lead @ Postman.com
https://www.jviotti.com


Re: Companies using Apache Avro

2021-01-26 Thread Juan Cruz Viotti
Hi Lee,

Thanks for the extensive response! I think a lot of what you are saying
is spot on!

I'm working on a paper surveying ~13 binary schema-less and
schema-driven serialization formats (including Avro) that can handle any
data structure that JSON can represent. Therefore, I was particularly
interested in why you wanted to convert JSON Schema to Avro IDL.

Is JSON Schema the ubiquitous "contract" language that you are using in
your company, so you want to keep it as the source of truth while also
being able to work with Avro?

On Tue, Jan 26, 2021 at 04:38:18PM +0100, Lee Hambley wrote:
> I would say that in general, being around the industry for 15 or so years
> now, that there has been a definite uptake in these binary protocols.
> 
> If I had to speculate, I'd say that outside a few niches, the ASN.1 and
> similar protocols never *really* took off outside telecoms, which is
> regrettable because they are really fantastic protocols (they are used
> extensively in certificates, DER/PEM are in the ASN.1 family of things, SSL
> certs are all ASN.1 encoded, usually, etc.)
> 
> These days it seems like everyone has some "big data" pipe, and having
> Hadoop/Spark/etc has become the must-have thing in most SMEs, so you
> inherit some of these things by "accident".
> 
> I personally come from the event-sourcing, CQRS, domain-driven-design
> circles, here having a ubiquitous language "contract", preferably a
> bullet-proof one with good change management tooling is something that you
> explicitly go looking for. In that sphere you come across msgpack,
> capnproto, protobufs, thrift, etc which all offer insane performance, very
> compact payloads, but Avro is unique in offering something like a schema
> registry and concrete guarantees about coordinating rolling deploys
> between producers and consumers (note: I _think_ protobufs got something
> like a schema registry now, but I never used it)
> 
> Another increasingly good option for this in the "SDL" (schema definition
> language) spec space is GraphQL which isn't a _binary_ packing format, but
> does offer a standalone schema definition language for defining service
> contracts. Whilst Avro does account for RPC protocols
> <https://avro.apache.org/docs/current/spec.html#Protocol+Declaration>, I
> haven't really seen that used so much in the wild, but maybe that's just my
> "bubble" speaking. GraphQL doesn't *really* have the schema migration tools
> that Avro has, but at least when dealing with GraphQL payloads, most
> language implementations give you the underlying syntax tree for the
> payload, so it's a bit easier to see what clients are requesting and what
> fields need various levels of scrutiny before being changed.
> 
> Anyway, probably none of this is really interesting to your paper, but I
> never miss a good opportunity to share unsolicited opinions :D
> 
> Lee Hambley
> http://lee.hambley.name/
> +49 (0) 170 298 5667
> 
> 
> On Tue, 26 Jan 2021 at 16:27, Juan Cruz Viotti wrote:
> 
> > > I don't mean to make light of your question, just to point out that I
> > > don't think many companies are proudly announcing to the world that
> > > they use Avro... why would they?
> >
> > Indeed, I totally agree. I'm writing a research paper involving Apache
> > Avro and just wanted to enrich the historical sections a bit with some
> > industry usage information!
> >
> > On Mon, Jan 25, 2021 at 10:40:31PM +0100, Lee Hambley wrote:
> > > I work for two companies using Avro (contractor, I won't name them) but I
> > > don't know what good it serves anyone knowing that we use them. Would you
> > > ask the same question about JSON, or XML, or whether we use nginx or
> > > apache?
> > >
> > > Avro is one of about 5 components in the distributed messaging
> > > architectures, and aside from that it is very nicely designed (I believe the
> > > schema versioning and rigorously documented canonical forms are an almost
> > > unique point of attraction)
> > >
> > > I don't mean to make light of your question, just to point out that I don't
> > > think many companies are proudly announcing to the world that they use
> > > Avro... why would they?
> > >
> > > Lee Hambley
> > > http://lee.hambley.name/
> > > +49 (0) 170 298 5667
> > >
> > >
> > > On Mon, 25 Jan 2021 at 22:30, M. Manna wrote:
> > >
> > > >
> > > > I believe Confluent and Imply are the two companies I know of.
> > > >
> > > >
> > > > On Mon, 25 Jan 2021 at 20:28, Juan Cruz Viotti  w

Re: Companies using Apache Avro

2021-01-26 Thread Juan Cruz Viotti
> I don't mean to make light of your question, just to point out that I
> don't think many companies are proudly announcing to the world that
> they use Avro... why would they?

Indeed, I totally agree. I'm writing a research paper involving Apache
Avro and just wanted to enrich the historical sections a bit with some
industry usage information!

On Mon, Jan 25, 2021 at 10:40:31PM +0100, Lee Hambley wrote:
> I work for two companies using Avro (contractor, I won't name them) but I
> don't know what good it serves anyone knowing that we use them. Would you
> ask the same question about JSON, or XML, or whether we use nginx or
> apache?
> 
> Avro is one of about 5 components in the distributed messaging
> architectures, and aside from that it is very nicely designed (I believe the
> schema versioning and rigorously documented canonical forms are an almost
> unique point of attraction)
> 
> I don't mean to make light of your question, just to point out that I don't
> think many companies are proudly announcing to the world that they use
> Avro... why would they?
> 
> Lee Hambley
> http://lee.hambley.name/
> +49 (0) 170 298 5667
> 
> 
> On Mon, 25 Jan 2021 at 22:30, M. Manna wrote:
> 
> >
> > I believe Confluent and Imply are the two companies I know of.
> >
> >
> > On Mon, 25 Jan 2021 at 20:28, Juan Cruz Viotti wrote:
> >
> >> Hey there!
> >>
> >> Do you know where I can find a list of relatively well-known companies
> >> that make use of Apache Avro? I'm trying to collect a small list for
> >> research purposes and my search is not yielding many results apart from
> >> Facebook.
> >>
> >> Thanks in advance,
> >>
> >> --
> >> Juan Cruz Viotti
> >> Software Engineer
> >> https://www.jviotti.com
> >>
> >

-- 
Juan Cruz Viotti
Software Engineer
https://www.jviotti.com


Companies using Apache Avro

2021-01-25 Thread Juan Cruz Viotti
Hey there!

Do you know where I can find a list of relatively well-known companies
that make use of Apache Avro? I'm trying to collect a small list for
research purposes and my search is not yielding many results apart from
Facebook.

Thanks in advance,

-- 
Juan Cruz Viotti
Software Engineer
https://www.jviotti.com