Thanks for the in-depth explanation.
These methods would require us to architect our server around Spark, but it
is actually designed to be independent of the ML implementation. SparkML is
an important algo source, to be sure, but so are TensorFlow and non-Spark
Python libs, among others. So Spark
Please keep in mind I'm fairly new to Spark.
I have some Spark code where I load two text files as datasets and, after some
map and filter operations to bring the columns into a specific shape, I join
the datasets.
The join takes place on a common column (of type string).
Is there any way to avoid