(Possibly unrelated FYI): if you're using only Scala or Java with
Spark, I would recommend using Datasets instead of DataFrames. They
provide the same functionality, yet offer more type safety.
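
To illustrate, here is a minimal sketch, assuming Spark 2.x with the spark-sql module on the classpath. The `Event` case class, the `DatasetSketch` object, and the `largeEvents` method are hypothetical names made up for this example and are not from the original job. It shows the same filter written against the typed Dataset API, where the compiler checks the field name, and against the untyped column API, where a misspelled column only fails at runtime.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical record type, used only for illustration.
case class Event(userId: Long, country: String, amount: Double)

object DatasetSketch {

  // Typed API: `_.amount` is checked by the compiler, so a typo in the
  // field name is a compile error rather than a runtime failure.
  def largeEvents(events: Dataset[Event]): Dataset[Event] =
    events.filter(_.amount > 20.0)

  // Untyped API: the column name is a string, so a typo is only
  // detected when Spark analyzes the query at runtime.
  def largeEventsUntyped(events: DataFrame): DataFrame =
    events.filter(events("amount") > 20.0)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dataset-sketch")
      .master("local[*]") // assumption: local mode, for demonstration only
      .getOrCreate()
    import spark.implicits._ // brings the encoders needed by toDS() into scope

    val events: Dataset[Event] = Seq(
      Event(1L, "IN", 10.0),
      Event(2L, "US", 25.5)
    ).toDS()

    largeEvents(events).show()
    largeEventsUntyped(events.toDF()).show()

    spark.stop()
  }
}
```

Keeping each transformation in a small method that takes and returns a Dataset also lets the compiler check how the steps fit together, which may help with organizing a long driver.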
On Thu, Sep 8, 2016 at 11:05 AM, Lee Becker wrote:
>
> On Thu, Sep
On Thu, Sep 8, 2016 at 11:35 AM, Ashish Tadose wrote:
> I wish to organize these DataFrame operations by grouping them into
> Scala object methods.
> Something like the following:
>
>
>
>> object Driver {
>>   def main(args: Array[String]) {
>>     val df =
Hi Team,
I have a Spark job with a large number of DataFrame operations.
The job reads various lookup data from external tables such as MySQL and
also runs many DataFrame operations on large Parquet data on HDFS.
The job works fine on the cluster; however, the job driver code looks
clumsy because of the large number