On Fri, Aug 4, 2017 at 4:36 PM, Jean Georges Perrin wrote:
Thanks Daniel,
I like your answer for #1. It makes sense.
However, I don't get why you say that there are always pending
transformations... After you call an action, you should be "clean" from pending
transformations, no?
> On Aug 3, 2017, at 5:53 AM, Daniel Darabos
On Wed, Aug 2, 2017 at 2:16 PM, Jean Georges Perrin wrote:
> Hi Sparkians,
>
> I understand the lazy evaluation mechanism with transformations and
> actions. My question is simpler: 1) are show() and/or printSchema()
> actions? I would assume so...
>
show() is an action (it prints
Hey Jörn,
By "pending" I meant something more like a flag, e.g. myDf.hasCatalystWorkToDo()
or myDf.isPendingActions(). Or maybe access to the DAG?
I just did this:
    // datediff is org.apache.spark.sql.functions.datediff (static import)
    ordersDf = ordersDf.withColumn(
        "time_to_ship",
        datediff(ordersDf.col("ship_date"), ordersDf.col("order_date")));
I assume printSchema() would not trigger an evaluation. show() might trigger a
partial evaluation (not all data is shown, only a certain number of rows by
default).
Keep in mind that even a count might not trigger evaluation of all rows
(especially in the future) due to optimizer updates.
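To make the "partial evaluation" point concrete, here is a plain-Java analogy.
It does not use Spark at all: java.util.stream only stands in for a Dataset, and
the class and method names (LazyPipelineDemo, buildPipeline, firstRows) are made
up for illustration. It is a sketch of the general idea of lazy pipelines, not a
description of what Catalyst does internally.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical demo class: a lazy java.util.stream pipeline used as an
// analogy for transformations vs. actions. No Spark classes are involved.
final class LazyPipelineDemo {

    // "Transformation": describes the work over 1..1000 but runs nothing yet.
    // The counter records how many elements the map step actually touched.
    static Stream<Integer> buildPipeline(AtomicInteger counter) {
        return IntStream.rangeClosed(1, 1000)
                .boxed()
                .map(i -> {
                    counter.incrementAndGet();
                    return i * 2;
                });
    }

    // Rough analogue of show(n): a terminal operation that only pulls
    // the first n results through the pipeline.
    static List<Integer> firstRows(Stream<Integer> pipeline, int n) {
        return pipeline.limit(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();

        Stream<Integer> pipeline = buildPipeline(counter);
        System.out.println("evaluated after building: " + counter.get());

        List<Integer> rows = firstRows(pipeline, 5);
        System.out.println("rows: " + rows);
        System.out.println("evaluated after taking 5: " + counter.get());
    }
}
```

The counter stays at 0 after the pipeline is built and reaches only 5 (not
1000) once the first five results are pulled, which is the same flavour of
behaviour as show() displaying a limited number of rows.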
Hi Sparkians,
I understand the lazy evaluation mechanism with transformations and actions. My
question is simpler: 1) are show() and/or printSchema() actions? I would assume
so...
And an optional question: 2) is there a way to know if there are transformations
"pending"?
Thanks!
jg