SQL, in addition to its simplicity, also provides a standard way to run
analysis across multiple databases. That is something users would like to
have for machine learning as well.
The flexibility of Spark's API is definitely helpful, but a simple and standard
way for new users is desired when it comes to
My $0.02 -- this isn't worthwhile.
Yes, there are ML-in-SQL tools; MADlib is one example. I think
these are holdovers from the days when someone's only interface to a data
warehouse was SQL, and so there had to be SQL-language support for invoking
ML jobs.
There was no programmatic
BTW, I can contribute if there is already an effort going on somewhere.
On Fri, Aug 31, 2018 at 3:35 PM Hemant Bhanawat wrote:
> We allow our users to interact with the Spark cluster using SQL queries only.
> That's easy for them. MLlib does not have SQL extensions and we cannot
> expose it to our
We allow our users to interact with the Spark cluster using SQL queries only.
That's easy for them. MLlib does not have SQL extensions, so we cannot
expose it to our users.
SQL extensions could further accelerate MLlib's adoption. See
https://cloud.google.com/bigquery/docs/bigqueryml-intro.
Hemant
What are you interested in accomplishing?
The spark.ml package has provided a machine learning API based on
DataFrames for quite some time. If you are interested in mixing query
processing and machine learning, this is certainly the best place to start.
See here:
Is there a plan to support SQL extensions for MLlib? Or is there an effort
already underway?
Any information is appreciated.
Thanks in advance.
Hemant