Yes, lineage that is actually replayable is what is needed for the validation
process, so that we can address questions like how a system arrived at state S
at time T. I guess a good analogy is event sourcing.
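
To make the event sourcing analogy concrete, here is a minimal sketch in plain Python (no Spark involved). The `Event` type, the `replay` function and the sample log are hypothetical names made up for illustration: state is never stored directly, only an ordered log of changes, and the state at time T is rebuilt by re-applying every event up to T.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Event:
    timestamp: int  # logical time at which the change was applied
    key: str        # which value the event touches
    delta: int      # change applied to that value


def replay(events: Iterable[Event], until: int) -> Dict[str, int]:
    """Rebuild the state as of time `until` by re-applying events in order."""
    state: Dict[str, int] = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.timestamp > until:
            break  # everything after `until` is ignored
        state[event.key] = state.get(event.key, 0) + event.delta
    return state


# Hypothetical append-only log; the values are invented for the example.
log: List[Event] = [
    Event(1, "orders", 10),
    Event(2, "orders", 5),
    Event(3, "refunds", 2),
    Event(4, "orders", -3),
]

state_at_2 = replay(log, until=2)  # {'orders': 15}
state_at_4 = replay(log, until=4)  # {'orders': 12, 'refunds': 2}
```

A replayable lineage would play the role of `log` here: given the same inputs and the same recorded transformations, anyone can recompute state S at time T and compare it with what was shown.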
On Thu, Apr 6, 2017 at 10:30 PM, Jörn Franke wrote:
> I do think
I do think this is the right way; you will have to test with test data,
verifying that the calculation produces the expected output.
Even if the logical plan is correct, your calculation might not be. E.g., there
can be bugs in Spark, in the UI or (which is very often the case) the client
Hi,
I think that every client wants a validation process, but showing lineage
is an approach that they are not asking for, and it may not be the right way to
prove it.
Regards,
Gourav
On Tue, Apr 4, 2017 at 4:19 AM, kant kodali wrote:
> Hi All,
>
> I am wondering if there is a way
How about storing logical plans (or toDebugString output, in the case of RDDs) to an
external file on the driver?
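
A rough sketch of what that could look like on the driver side, in plain Python. The helper name `persist_lineage`, the JSON layout and the file naming are assumptions made for illustration, not anything Spark provides. The helper only takes text, so it runs without Spark; in PySpark the text could come from `rdd.toDebugString()` (which returns bytes there, so decode it first). For DataFrames, `df.explain(True)` prints the plans rather than returning them, so capturing the text needs some other route, for example the internal `df._jdf.queryExecution().toString()`, which is not a public API and may change between versions.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


def persist_lineage(plan_text: str, out_dir: str, job_name: str) -> Path:
    """Write one plan/lineage dump, with a little metadata, as a JSON file."""
    directory = Path(out_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # A content hash lets you check later that the stored plan was not altered.
    digest = hashlib.sha256(plan_text.encode("utf-8")).hexdigest()
    record = {
        "job": job_name,
        "captured_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "sha256": digest,
        "plan": plan_text,
    }
    path = directory / f"{job_name}-{digest[:12]}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


# Usage with placeholder text; on a real driver this string would be the
# decoded toDebugString output or the captured logical plan.
demo_dir = tempfile.mkdtemp()
saved = persist_lineage("placeholder lineage text", demo_dir, "dashboard_job")
```

Note that this only records what Spark planned to do. It does not by itself prove that the numbers on the dashboard are right, and the stored text is a description, not something that can be replayed.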
On Tue, Apr 4, 2017 at 1:19 PM, kant kodali wrote:
> Hi All,
>
> I am wondering if there is a way to persist the lineages generated by Spark
> underneath? Some of our
Hi All,
I am wondering if there is a way to persist the lineages generated by Spark
underneath? Some of our clients want us to prove that the result of the
computation that we are showing on a dashboard is correct, and for that, if
we can show the lineage of transformations that are executed to get to