Spark can be used with tools like Great Expectations to implement data contracts. I am not sure, though, whether Spark alone can enforce them. I was reading a blog on data mesh and how to glue it together with data contracts; that is where I came across the mention of Spark with Great Expectations.
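For context, Great Expectations works by declaring "expectations" against a dataset (including a Spark DataFrame) and then validating the data against them. The stdlib-only sketch below only illustrates that declare-then-validate pattern; the function names here are hypothetical and are not the library's actual API:

```python
# Hypothetical sketch of the expectation/validation pattern that tools like
# Great Expectations apply to Spark DataFrames. Rows are plain dicts here;
# in Spark the checks would be pushed down as Column expressions instead.

def expect_values_between(column, low, high):
    """Build an expectation: every value in `column` lies in [low, high]."""
    def check(rows):
        bad = [r for r in rows if not (low <= r[column] <= high)]
        return {"success": not bad, "unexpected": bad}
    return check

def validate(rows, expectations):
    """Run all expectations; the data contract passes only if all succeed."""
    results = [e(rows) for e in expectations]
    return all(r["success"] for r in results), results

rows = [{"score": 42}, {"score": 101}]
passed, results = validate(rows, [expect_values_between("score", 0, 100)])
# passed is False: the row with score=101 violates the declared contract.
```

A pipeline built this way can reject or quarantine a batch before it is written, which is one concrete way the "cultural" contract becomes an enforced one.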
HTH
-Deepak

On Tue, 13 Jun 2023 at 12:48 AM, Elliot West <tea...@gmail.com> wrote:

> Hi Phillip,
>
> While not as fine-grained as your example, there do exist schema systems,
> such as that in Avro, that can evaluate compatible and incompatible
> changes to a schema from the perspective of the reader, the writer, or
> both. This provides some degree of enforcement and a means to communicate
> a contract. Interestingly, I believe this approach has been applied to
> both JSON Schema and Protobuf as part of the Confluent Schema Registry.
>
> Elliot.
>
> On Mon, 12 Jun 2023 at 12:43, Phillip Henry <londonjava...@gmail.com>
> wrote:
>
>> Hi, folks.
>>
>> There currently seems to be a buzz around "data contracts". From what I
>> can tell, these mainly advocate a cultural solution. But could big data
>> tools instead be used to enforce these contracts?
>>
>> My questions really are: are there any plans to implement data
>> constraints in Spark (e.g., an integer must be between 0 and 100; the
>> date in column X must be before that in column Y)? And if not, is there
>> an appetite for them?
>>
>> Maybe we could associate constraints with schema metadata that are then
>> enforced in the implementation of a FileFormatDataWriter?
>>
>> Just throwing it out there and wondering what other people think. It's
>> an area that interests me, as it seems that over half my problems at the
>> day job are caused by dodgy data.
>>
>> Regards,
>>
>> Phillip
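Elliot's point about Avro-style reader/writer compatibility can be sketched in a few lines. This is a deliberately minimal illustration of one backward-compatibility rule (a reader may only add fields that carry defaults); real Avro schema resolution also checks types, unions, aliases, and more:

```python
# Minimal sketch of Avro-style backward-compatibility checking, as used by
# schema registries to gate contract-breaking changes. Schemas are plain
# dicts with a "fields" list, mirroring the shape of an Avro record schema.

def is_backward_compatible(reader_schema, writer_schema):
    """True if data written with writer_schema can be read with reader_schema."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    for field in reader_schema["fields"]:
        if field["name"] not in writer_fields and "default" not in field:
            # The reader demands a field the writer never wrote, and there
            # is no default to fall back on: an incompatible change.
            return False
    return True

writer = {"fields": [{"name": "id", "type": "long"}]}
# Adding an optional field with a default is a compatible evolution...
ok_reader = {"fields": [{"name": "id", "type": "long"},
                        {"name": "score", "type": "int", "default": 0}]}
# ...while adding a required field without a default is not.
bad_reader = {"fields": [{"name": "id", "type": "long"},
                         {"name": "score", "type": "int"}]}
```

Running such a check at publish time (as the Confluent Schema Registry does) enforces the structural half of a data contract, while value-level constraints like Phillip's 0-to-100 range would still need a validation layer on top.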