Re: mllib + SQL

2018-09-01 Thread Hemant Bhanawat
SQL, in addition to being simple, also provides a standard way of analysis across multiple databases. That aspect is something that users would like with machine learning as well. The flexibility of Spark's API is definitely helpful, but a simple and standard way for new users is desired when it comes to

Re: mllib + SQL

2018-08-31 Thread Sean Owen
My $0.02 -- this isn't worthwhile. Yes, there are ML-in-SQL tools; I'm thinking of MADlib, for example. I think these are holdovers from the days when someone's only interface to a data warehouse was SQL, and so there had to be SQL-language support for invoking ML jobs. There was no programmatic

Re: mllib + SQL

2018-08-31 Thread Hemant Bhanawat
BTW, I can contribute if there is already an effort going on somewhere.
On Fri, Aug 31, 2018 at 3:35 PM Hemant Bhanawat wrote:
> We allow our users to interact with the Spark cluster using SQL queries only.
> That's easy for them. MLlib does not have SQL extensions and we cannot
> expose it to our

Re: mllib + SQL

2018-08-31 Thread Hemant Bhanawat
We allow our users to interact with the Spark cluster using SQL queries only. That's easy for them. MLlib does not have SQL extensions and we cannot expose it to our users. SQL extensions can further accelerate MLlib's adoption. See https://cloud.google.com/bigquery/docs/bigqueryml-intro. Hemant

Re: mllib + SQL

2018-08-30 Thread William Benton
What are you interested in accomplishing? The spark.ml package has provided a machine learning API based on DataFrames for quite some time. If you are interested in mixing query processing and machine learning, this is certainly the best place to start. See here:

mllib + SQL

2018-08-30 Thread Hemant Bhanawat
Is there a plan to support SQL extensions for mllib? Or is there an effort already underway? Any information is appreciated. Thanks in advance. Hemant