Re: Lineage between Datasets

2017-04-12 Thread Chang Chen
by
> calling explain(true) and look at the analyzed plan.
>
>
> On Wed, Apr 12, 2017 at 3:03 AM Chang Chen wrote:
>
>> Hi All
>>
>> I believe that there is no lineage between datasets. Consider this case:
>>
>> val people = spark.read.parquet("

Re: Lineage between Datasets

2017-04-12 Thread Reynold Xin
> there is no lineage between datasets. Consider this case:
>
> val people = spark.read.parquet("...").as[Person]
>
> val ageGreatThan30 = people.filter("age > 30")
>
> Since the second DS can push down the condition, they are obviously
> different

Lineage between Datasets

2017-04-12 Thread Chang Chen
Hi All

I believe that there is no lineage between datasets. Consider this case:

val people = spark.read.parquet("...").as[Person]

val ageGreatThan30 = people.filter("age > 30")

Since the second DS can push down the condition, they are obviously different logical plans
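
For anyone who wants to try the explain(true) suggestion from the replies, here is a minimal sketch. It is not code from the thread. The Person fields, the local SparkSession settings, and the in-memory Seq standing in for the elided parquet path are all assumptions made so the example runs on its own. The idea is to compare the analyzed plan of the derived dataset with that of its parent; the filter would be expected to appear as a node on top of the parent's plan, while the optimized and physical plans may look different once pushdown is applied.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record type; the thread does not show how Person is defined.
case class Person(name: String, age: Long)

object LineageSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("lineage-sketch")
      .getOrCreate()
    import spark.implicits._

    // In-memory stand-in for spark.read.parquet("...").as[Person]
    val people = Seq(Person("a", 25L), Person("b", 35L)).toDS()
    val ageGreatThan30 = people.filter("age > 30")

    // Prints the parsed, analyzed, and optimized logical plans
    // followed by the physical plan.
    ageGreatThan30.explain(true)

    // The analyzed plans can also be inspected directly and compared.
    println(people.queryExecution.analyzed)
    println(ageGreatThan30.queryExecution.analyzed)

    spark.stop()
  }
}
```

With a real parquet source the optimized and physical plans are the place to look for the pushed-down condition; the analyzed plan is the one the reply points to for seeing how the second dataset is built from the first.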