Re: Data Contracts

2023-07-16 Thread Phillip Henry
Regards, Phillip > On Mon, Jun 19, 2023 at 9:37 AM Deepak Sharma wrote: >> It can be as simple as adding a function to the Spark session builder specifically on the read which can take the YAML file (defini

Re: Data Contracts

2023-06-19 Thread Deepak Sharma
Mon, Jun 19, 2023 at 9:37 AM Deepak Sharma wrote: >> It can be as simple as adding a function to the Spark session builder, specifically on the read, which can take the YAML file (the definition, if data contracts are to be in YAML) and apply it to the data frame.

Re: Data Contracts

2023-06-19 Thread Phillip Henry
wrote: > It can be as simple as adding a function to the Spark session builder, specifically on the read, which can take the YAML file (the definition, if data contracts are to be in YAML) and apply it to the data frame. > It can ignore the rows not matching the data contracts defined

Re: Data Contracts

2023-06-19 Thread Deepak Sharma
It can be as simple as adding a function to the Spark session builder, specifically on the read, which can take the YAML file (the definition, if data contracts are to be in YAML) and apply it to the data frame. It can ignore the rows not matching the data contracts defined in the YAML. Thanks, Deepak
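To make the suggestion above concrete, here is a minimal sketch of the "apply the contract on read and ignore non-matching rows" idea. It is plain Python rather than Spark, so it only illustrates the filtering logic. The contract layout, the rule keys (`type`, `required`, `min`) and the helper names `row_matches` and `apply_contract` are assumptions made up for illustration; they are not part of any Spark API, and the contract is shown as the dict a YAML loader might produce.

```python
# Hypothetical contract, written as the dict a YAML loader might return.
# Column names and rule keys are illustrative assumptions only.
CONTRACT = {
    "columns": {
        "id": {"type": int, "required": True},
        "amount": {"type": float, "required": True, "min": 0.0},
        "note": {"type": str, "required": False},
    }
}


def row_matches(row, contract):
    """Return True if a row (a dict) satisfies every column rule."""
    for name, rule in contract["columns"].items():
        value = row.get(name)
        if value is None:
            # A missing value only fails the contract for required columns.
            if rule.get("required", False):
                return False
            continue
        if not isinstance(value, rule["type"]):
            return False
        if "min" in rule and value < rule["min"]:
            return False
    return True


def apply_contract(rows, contract):
    """Keep the rows that match the contract and ignore the rest."""
    return [row for row in rows if row_matches(row, contract)]


rows = [
    {"id": 1, "amount": 10.5, "note": "ok"},
    {"id": 2, "amount": -3.0},   # amount is below the minimum
    {"id": "3", "amount": 4.0},  # id has the wrong type
    {"id": 4, "amount": 7.0},    # optional note is missing, which is allowed
]
valid = apply_contract(rows, CONTRACT)
```

In Spark itself the same rules would presumably be turned into a schema plus a filter predicate on the data frame, but how that would hook into the session builder or reader is left open here, as it is in the message.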

Re: Data Contracts

2023-06-19 Thread Phillip Henry
implement data contracts within their ecosystem. Unfortunately, I think it's closed source and Python only. Regards, Phillip On Sat, Jun 17, 2023 at 11:06 AM Mich Talebzadeh wrote: > It would be interesting if we think about creating a contract validation library written in JSON format. T

Re: Data Contracts

2023-06-17 Thread Mich Talebzadeh
y, like schema validation, some rules validations. Spark could also generate an embryo of data contracts… —jgp > On Jun 13, 2023, at 07:25, Mich Talebzadeh wrote: > From my limited understanding of data contracts, there are two factors that are deemed necessary. > procedure matter > technical matter > I

Re: Data Contracts

2023-06-14 Thread Jean-Georges Perrin
validations. Spark could also generate an embryo of data contracts… —jgp > On Jun 13, 2023, at 07:25, Mich Talebzadeh wrote: > From my limited understanding of data contracts, there are two factors that are deemed necessary. > procedure matter > technical matter > I

Re: Data Contracts

2023-06-13 Thread Mich Talebzadeh
From my limited understanding of data contracts, there are two factors that are deemed necessary. 1. procedure matter 2. technical matter I mean this is nothing new. Some tools like Cloud Data Fusion can assist when the procedures are validated. Simply "The process of integrating multi

Re: Data Contracts

2023-06-13 Thread Phillip Henry
collaborate on changing the contract and making sure that the change has gotten enough attention before pushing it to production. Hope this helps! > Kind regards, > Fokko > On Tue, 13 Jun 2023 at 04:31, Deepak Sharma wrote: >> Spark can be used with tools lik

Re: Data Contracts

2023-06-13 Thread Fokko Driesprong
before pushing it to production. Hope this helps! Kind regards, Fokko On Tue, 13 Jun 2023 at 04:31, Deepak Sharma wrote: > Spark can be used with tools like Great Expectations as well to implement the data contracts. > I am not sure, though, if Spark alone can do the data contracts. > I

Re: Data Contracts

2023-06-12 Thread Deepak Sharma
Spark can be used with tools like Great Expectations as well to implement the data contracts. I am not sure, though, if Spark alone can do the data contracts. I was reading a blog on data mesh and how to glue it together with data contracts; that’s where I came across this Spark and great

Re: Data Contracts

2023-06-12 Thread Elliot West
to communicate a contract. Interestingly, I believe this approach has been applied to both JSON Schema and protobuf as part of the Confluent Schema Registry. Elliot. On Mon, 12 Jun 2023 at 12:43, Phillip Henry wrote: > Hi, folks. > There currently seems to be a buzz around "data co

Re: Data Contracts

2023-06-12 Thread Ryan Blue
Hey Phillip, You're right that we can improve tooling to help with data contracts, but I think that a contract still needs to be an agreement between people. Constraints help by ensuring that a data producer adheres to the contract and by giving feedback as soon as possible when assumptions

Data Contracts

2023-06-12 Thread Phillip Henry
Hi, folks. There currently seems to be a buzz around "data contracts". From what I can tell, these mainly advocate a cultural solution. But instead, could big data tools be used to enforce these contracts? My questions really are: are there any plans to implement data constraints in