Re: StreamingLogisticRegressionWithSGD : Multiclass Classification : Options

2018-01-19 Thread Sundeep Kumar Mehta
Thanks a lot Patrick, this was helpful. Regards, Sundeep. On Sat, Jan 20, 2018 at 1:35 AM, Patrick McCarthy wrote: > Rather than use a fancy purpose-built class, I was thinking that you could rather generate a series of label vectors, vector A is 1 when class a is

Re: StreamingLogisticRegressionWithSGD : Multiclass Classification : Options

2018-01-19 Thread Patrick McCarthy
Rather than use a fancy purpose-built class, I was thinking that you could generate a series of label vectors: vector A is 1 when class a is positive and 0 when any other is, vector B is 1 when class b is positive and 0 when any other is, etc. I don't know anything about streaming in
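The label-vector idea described above can be sketched in a few lines. This is a minimal, library-independent illustration, not code from the thread: the function name `one_vs_rest_labels` and the sample labels are hypothetical, and how each binary vector is then fed to a streaming binary classifier is left open, since the message does not say.

```python
def one_vs_rest_labels(labels, classes=None):
    """Build one binary label vector per class.

    For class c, the vector holds 1.0 where the example's label is c
    and 0.0 where it is any other class.
    """
    if classes is None:
        # Assumption: the class set is taken from the observed labels.
        classes = sorted(set(labels))
    return {c: [1.0 if y == c else 0.0 for y in labels] for c in classes}


# Hypothetical multiclass labels for four examples.
labels = ["a", "b", "c", "a"]
binary = one_vs_rest_labels(labels)

for cls, vector in binary.items():
    print(cls, vector)
```

Each vector could then serve as the target for a separate binary model, one model per class.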

Is there a way to write a Streaming Dataframe/Dataset to Cassandra with auto mapping?

2018-01-19 Thread kant kodali
Hi All, I was wondering if there is a way to write a Streaming Dataframe/Dataset to Cassandra with auto mapping? By auto mapping I mean mapping the Dataset/Dataframe schema to the Cassandra table schema. I can, for example, get Dataframe.dtypes() and then map Spark SQL types to CQL types but I was
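The manual approach mentioned in the question, mapping Spark SQL types to CQL types, might look roughly like the sketch below. It assumes the `(column, type)` pairs use PySpark-style simple type names such as `bigint` and `string`; the mapping table covers only common primitive types, and the helper `create_table_cql`, the keyspace `ks`, and the table `users` are hypothetical names for illustration. It only builds a CQL statement as a string and does not talk to Cassandra.

```python
# Assumed mapping from Spark SQL simple type names to CQL types.
# Only common primitives are covered; complex types are left out.
SPARK_TO_CQL = {
    "string": "text",
    "int": "int",
    "bigint": "bigint",
    "smallint": "smallint",
    "tinyint": "tinyint",
    "float": "float",
    "double": "double",
    "boolean": "boolean",
    "timestamp": "timestamp",
    "date": "date",
    "binary": "blob",
}


def create_table_cql(keyspace, table, dtypes, primary_key):
    """Build a CREATE TABLE statement from (column, spark_type) pairs."""
    names = [name for name, _ in dtypes]
    if primary_key not in names:
        raise ValueError(f"primary key {primary_key!r} is not a column")
    columns = []
    for name, spark_type in dtypes:
        if spark_type not in SPARK_TO_CQL:
            raise ValueError(f"no CQL mapping for Spark type {spark_type!r}")
        columns.append(f"{name} {SPARK_TO_CQL[spark_type]}")
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} "
        f"({', '.join(columns)}, PRIMARY KEY ({primary_key}))"
    )


# Hypothetical schema, shaped like the list that PySpark's df.dtypes returns.
dtypes = [("id", "bigint"), ("name", "string"), ("score", "double")]
statement = create_table_cql("ks", "users", dtypes, primary_key="id")
print(statement)
```

Raising on unmapped types, rather than guessing, keeps a schema mismatch visible before any write is attempted.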

Spark MLLib vs. SciKitLearn

2018-01-19 Thread Aakash Basu
Hi all, I am totally new to ML APIs. I am trying to get the *ROC_Curve* for Model Evaluation on both *ScikitLearn* and *PySpark MLLib*. I do not find any API for ROC_Curve calculation for BinaryClassification in SparkMLLib. The code below has a wrapper function which creates the respective
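For comparing results across the two libraries, it can help to have a library-independent reference. The sketch below computes ROC points and the trapezoidal area under the curve from raw labels and scores in plain Python. It is not the poster's wrapper and uses no Spark or scikit-learn API; the function names `roc_points` and `auc` and the sample data are illustrative assumptions.

```python
def roc_points(labels, scores):
    """Return ROC points (fpr, tpr), sweeping the threshold from high to low.

    labels are 0/1 and scores are the classifier's positive-class scores.
    Examples with tied scores are grouped into a single threshold step.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example that shares this score.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical labels and scores for four examples.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(labels, scores)
print(points)
print(auc(points))
```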

Re: [ML] Allow CrossValidation ParamGrid on SVMWithSGD

2018-01-19 Thread Nick Pentreath
SVMWithSGD sits in the older "mllib" package and is not compatible directly with the DataFrame API. I suppose one could write an ML-API wrapper around it. However, there is LinearSVC in Spark 2.2.x: http://spark.apache.org/docs/latest/ml-classification-regression.html#linear-support-vector-machine

[ML] Allow CrossValidation ParamGrid on SVMWithSGD

2018-01-19 Thread Tomasz Dudek
Hello, is there any way to use CrossValidation's ParamGrid with SVMWithSGD? Usually, when e.g. using RandomForest, you can specify a lot of parameters to automate the param grid search (when used with CrossValidation): val algorithm = new RandomForestClassifier() val paramGrid = { new