Hey Jörn,

By "pending" I meant something like a flag, e.g. a hypothetical 
myDf.hasCatalystWorkToDo() or myDf.isPendingActions(). Or maybe some access 
to the DAG?

I just ran this:
    // needs: import static org.apache.spark.sql.functions.datediff;
    ordersDf = ordersDf.withColumn(
        "time_to_ship",
        datediff(ordersDf.col("ship_date"), ordersDf.col("order_date")));

    ordersDf.printSchema();
    ordersDf.show();

and both the schema and the data are correct, so I was wondering what 
triggered Catalyst...
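
To make the lazy part concrete, here is a minimal sketch in plain Java. It is only an analogy built on java.util.stream, not Spark code: the class LazyEvalSketch and the methods transform() and showLike() are made-up names, and the counter stands in for "work actually done". It assumes that defining a transformation does nothing by itself and that a row-limited terminal call evaluates only the rows it needs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyEvalSketch {
    /** Counts how many rows the "transformation" has actually processed. */
    static final AtomicInteger evaluated = new AtomicInteger();

    /** Only describes the work (hypothetical stand-in for withColumn()). */
    static Stream<Integer> transform(List<Integer> rows) {
        return rows.stream().map(r -> {
            evaluated.incrementAndGet(); // runs only once a terminal operation pulls rows
            return r * 2;
        });
    }

    /** Terminal operation capped at n rows (hypothetical stand-in for show(n)). */
    static List<Integer> showLike(Stream<Integer> pipeline, int n) {
        return pipeline.limit(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(1, 2, 3, 4, 5, 6);

        Stream<Integer> pipeline = transform(rows);
        System.out.println("evaluated after transform: " + evaluated.get());

        List<Integer> firstTwo = showLike(pipeline, 2);
        System.out.println("rows: " + firstTwo + ", evaluated: " + evaluated.get());
    }
}
```

In this sketch the counter stays at zero until showLike() runs, and then only as many rows are processed as the limit asks for. Whether Spark behaves the same way for a given call is exactly what I am trying to find out.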

jg



> On Aug 2, 2017, at 8:29 AM, Jörn Franke <jornfra...@gmail.com> wrote:
> 
> I assume printSchema() would not trigger an evaluation. show() might 
> partially trigger an evaluation (not all data is shown, only a certain 
> number of rows by default).
> Keep in mind that even a count might not trigger evaluation of all rows 
> (especially in the future) due to updates to the optimizer.
> 
> What do you mean by pending? You can see the status of the job in the UI.
> 
>> On 2. Aug 2017, at 14:16, Jean Georges Perrin <j...@jgp.net> wrote:
>> 
>> Hi Sparkians,
>> 
>> I understand the lazy evaluation mechanism with transformations and actions. 
>> My question is simpler: 1) are show() and/or printSchema() actions? I would 
>> assume so...
>> 
>> and optional question: 2) is there a way to know if there are 
>> transformations "pending"?
>> 
>> Thanks!
>> 
>> jg
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>> 
> 
