Hey Jörn,

By "pending" I meant something like a flag, e.g. myDf.hasCatalystWorkToDo() or myDf.isPendingActions(). Or maybe access to the DAG?
I just did this:

    ordersDf = ordersDf.withColumn(
        "time_to_ship",
        datediff(ordersDf.col("ship_date"), ordersDf.col("order_date")));
    ordersDf.printSchema();
    ordersDf.show();

and the schema and data are correct, so I was wondering what triggered Catalyst...

jg

> On Aug 2, 2017, at 8:29 AM, Jörn Franke <jornfra...@gmail.com> wrote:
>
> I assume printSchema would not trigger an evaluation. Show might partially
> trigger an evaluation (not all data is shown, only a certain number of rows
> by default).
> Keep in mind that even a count might not trigger evaluation of all rows
> (especially in the future) due to updates to the optimizer.
>
> What do you mean by pending? You can see the status of the job in the UI.
>
>> On 2. Aug 2017, at 14:16, Jean Georges Perrin <j...@jgp.net> wrote:
>>
>> Hi Sparkians,
>>
>> I understand the lazy evaluation mechanism with transformations and actions.
>> My question is simpler: 1) are show() and/or printSchema() actions? I would
>> assume so...
>>
>> and optional question: 2) is there a way to know if there are
>> transformations "pending"?
>>
>> Thanks!
>>
>> jg
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
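
To make the point in the quoted reply concrete (a schema can be answered without evaluating anything, while show() only evaluates the rows it prints), here is a toy model in plain Java. It is a sketch, not Spark code and not how Catalyst is implemented: the LazyDataset class, its fields and the rowsComputed counter are all invented for illustration, and rows are simplified to int arrays.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

// Toy model of lazy evaluation. Hypothetical names throughout; NOT Spark's API.
public class LazyDataset {
    static int rowsComputed = 0; // counts rows that were actually evaluated

    private final List<int[]> source;                 // raw input rows
    private final List<String> schema;                // column names, known up front
    private final List<ToIntFunction<int[]>> plan;    // recorded transformations

    LazyDataset(List<int[]> source, List<String> schema,
                List<ToIntFunction<int[]>> plan) {
        this.source = source;
        this.schema = schema;
        this.plan = plan;
    }

    // "Transformation": records the step and extends the schema, computes no rows.
    LazyDataset withColumn(String name, ToIntFunction<int[]> f) {
        List<String> newSchema = new ArrayList<>(schema);
        newSchema.add(name);
        List<ToIntFunction<int[]>> newPlan = new ArrayList<>(plan);
        newPlan.add(f);
        return new LazyDataset(source, newSchema, newPlan);
    }

    // Metadata only: answered from the recorded plan, no rows are touched.
    List<String> schema() {
        return schema;
    }

    // "Action": runs the recorded plan, but only for the first n rows.
    List<int[]> show(int n) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < Math.min(n, source.size()); i++) {
            int[] row = source.get(i);
            for (ToIntFunction<int[]> step : plan) {
                int[] wider = Arrays.copyOf(row, row.length + 1);
                wider[row.length] = step.applyAsInt(row); // append the new column
                row = wider;
            }
            rowsComputed++;
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        // Columns: order_day, ship_day (made-up sample data).
        List<int[]> orders = Arrays.asList(
            new int[]{1, 4}, new int[]{2, 9}, new int[]{5, 6});
        LazyDataset df = new LazyDataset(
                orders, Arrays.asList("order_day", "ship_day"), new ArrayList<>())
            .withColumn("time_to_ship", r -> r[1] - r[0]);

        // The schema is available although no row has been computed yet.
        System.out.println(df.schema() + " rows computed: " + rowsComputed);
        // Showing 2 of the 3 rows evaluates exactly 2 rows.
        System.out.println(Arrays.toString(df.show(2).get(0))
            + " rows computed: " + rowsComputed);
    }
}
```

In this model there is no separate "pending" state to query: the dataset is only ever a recorded plan, and each action replays it. In Spark itself, the closest thing to inspecting that plan is Dataset.explain(true), which prints the logical and physical plans without running a job.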