I would really appreciate that. I'm probably going to just write a planner rule for now which checks the query output against my table schema and fails analysis if they aren't compatible. This approach is how I got metadata columns in, so I believe it would work for writing as well.
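Roughly what I have in mind, as a sketch only: an injected check rule that walks the analyzed plan and compares an AppendData node's query output against the table's columns. The class name, the mandatory column set, and the choice of injectCheckRule are placeholders on my part, not a finished implementation (and a real rule would surface a proper AnalysisException rather than a generic SparkException):

import org.apache.spark.SparkException
import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}
import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan}

// Hypothetical extensions class, registered via spark.sql.extensions.
class PartialWriteCheckExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    extensions.injectCheckRule(buildCheck)
  }

  private def buildCheck(session: SparkSession): LogicalPlan => Unit = { plan =>
    val mandatoryColumns = Set("id")  // hypothetical: e.g. C* partition key columns

    plan.foreach {
      case append: AppendData =>
        val tableColumns = append.table.output.map(_.name).toSet
        val writtenColumns = append.query.output.map(_.name).toSet

        // Fail analysis if the query writes columns the table does not have...
        val unknown = writtenColumns -- tableColumns
        if (unknown.nonEmpty) {
          throw new SparkException(
            s"Unknown columns in write: ${unknown.mkString(", ")}")
        }
        // ...or omits columns the sink mandates; any other column may be left out.
        val missing = mandatoryColumns -- writtenColumns
        if (missing.nonEmpty) {
          throw new SparkException(
            s"Write is missing mandated columns: ${missing.mkString(", ")}")
        }
      case _ => // leave other plans untouched
    }
  }
}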
On Wed, May 13, 2020 at 5:13 PM Ryan Blue <rb...@netflix.com> wrote:

> I agree with adding a table capability for this. This is something that we
> support in our Spark branch so that users can evolve tables without
> breaking existing ETL jobs -- when you add an optional column, it shouldn't
> fail the existing pipeline writing data to a table. I can contribute the
> changes to validation if people are interested.
>
> On Wed, May 13, 2020 at 2:57 PM Russell Spitzer <russell.spit...@gmail.com>
> wrote:
>
>> In DSV1 this was pretty easy to do because the burden of verification
>> for writes had to be in the datasource; the new setup makes partial writes
>> difficult.
>>
>> resolveOutputColumns checks the table schema against the write plan's
>> output and will fail any requests which don't contain every column
>> specified in the table schema.
>> I would like it if instead we either made this check optional for a
>> datasource, perhaps an "allow partial writes" trait for the table, or just
>> allowed analysis to fail on "withInputDataSchema", where an implementer
>> could throw exceptions on underspecified writes.
>>
>> The use case here is that C* (and many other sinks) have mandated columns
>> that must be present during an insert as well as those
>> which are not required.
>>
>> Please let me know if I've misread this.
>>
>> Thanks for your time again,
>> Russ
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
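P.S. For anyone following along, a rough sketch of what the capability-based opt-out discussed above could look like on the table side. I'm using TableCapability.ACCEPT_ANY_SCHEMA as a stand-in for the "allow partial writes" trait, since as far as I can tell it already signals that the analyzer should skip strict output column resolution; the CassandraTable class and its mandated column set are made-up names for illustration:

import java.util
import org.apache.spark.sql.connector.catalog.{SupportsWrite, TableCapability}
import org.apache.spark.sql.connector.write.{LogicalWriteInfo, WriteBuilder}
import org.apache.spark.sql.types.StructType

// Hypothetical DSv2 table that accepts partial writes and enforces its own rules.
class CassandraTable(tableSchema: StructType) extends SupportsWrite {
  override def name(): String = "cassandra_table"  // hypothetical
  override def schema(): StructType = tableSchema

  // Declaring ACCEPT_ANY_SCHEMA lets the source receive a partial column list and
  // validate it itself (e.g. mandated primary-key columns) instead of the analyzer.
  override def capabilities(): util.Set[TableCapability] =
    util.EnumSet.of(TableCapability.BATCH_WRITE, TableCapability.ACCEPT_ANY_SCHEMA)

  override def newWriteBuilder(info: LogicalWriteInfo): WriteBuilder = {
    val required = Set("id")  // hypothetical mandated columns
    val written = info.schema().fieldNames.toSet
    require(required.subsetOf(written),
      s"Missing mandated columns: ${(required -- written).mkString(", ")}")
    ???  // build the actual writer here
  }
}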