Hi all,
   Just wanted to thank you all for the Dataset API - most of the time we see
only bugs on these lists ;o).

   - For some context: this weekend I was updating the SQL chapters of
   my book - they had all the ugliness of SchemaRDD, registerTempTable,
   take(10).foreach(println) and
   take(30).foreach(e => println("%15s | %9.2f |".format(e(0), e(1)))) ;o)
   - I remember Hossein Falaki chiding me about the ugly println statements!
      - It took me a little while to grok Dataset, SparkSession,
      spark.read.option("header","true").option("inferSchema","true").csv(...)
      et al. (a short sketch of the new style follows after this list).
         - I am a big R fan and know the language pretty well - so the
         constructs are familiar.
      - Once I got it (I am sure there are still more mysteries to uncover
      ...) it was just beautiful - well done, folks !!!
   - One sees the contrast a lot better while teaching or writing books,
   because one has to think through the old, the new, and the transitional arc.
      - I even remember the good old days when we were discussing whether
      Spark would get data frames like R's, at one of Paco's sessions!
      - And now, it looks very decent for data wrangling.
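
To make the contrast concrete, here is a minimal sketch of the new reading
style - the data/cars.csv path and its model/price columns are made up for
illustration, not taken from the book:

   import org.apache.spark.sql.SparkSession

   val spark = SparkSession.builder().appName("dataset-sketch").getOrCreate()

   // One fluent read: header handling and schema inference are just options,
   // no hand-rolled parsing needed.
   val cars = spark.read
     .option("header", "true")
     .option("inferSchema", "true")
     .csv("data/cars.csv")

   // show() pretty-prints rows, replacing the old
   // take(30).foreach(e => println("%15s | %9.2f |".format(e(0), e(1))))
   cars.select("model", "price").show(30)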

Cheers & keep up the good work
<k/>
P.S: My next chapter is on MLlib - I need to convert it to the ml package.
Should be interesting ... I am a glutton for punishment - of the Spark kind,
of course!
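
For what it's worth, roughly the shape I expect the converted examples to
take - just a sketch with made-up column names and toy data, not the book's
actual code:

   import org.apache.spark.ml.Pipeline
   import org.apache.spark.ml.feature.VectorAssembler
   import org.apache.spark.ml.regression.LinearRegression
   import spark.implicits._

   // Tiny placeholder DataFrame; the old mllib API would have wanted an
   // RDD[LabeledPoint] instead.
   val training = Seq((1.0, 2.0, 5.0), (2.0, 4.0, 9.0)).toDF("x1", "x2", "label")

   val assembler = new VectorAssembler()
     .setInputCols(Array("x1", "x2"))   // hypothetical feature columns
     .setOutputCol("features")

   val lr = new LinearRegression().setLabelCol("label").setFeaturesCol("features")

   // spark.ml chains stages into a Pipeline and fits directly on the DataFrame.
   val model = new Pipeline().setStages(Array(assembler, lr)).fit(training)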
