[ML] Allow CrossValidation ParamGrid on SVMWithSGD

2018-01-19 Thread Tomasz Dudek
Hello, is there any way to use CrossValidation's ParamGrid with SVMWithSGD? Usually, when using e.g. RandomForest, you can specify a lot of parameters to automatise the param grid search (when used with CrossValidation): val algorithm = new RandomForestClassifier() val paramGrid = { new

Reverse MinMaxScaler in SparkML

2018-01-08 Thread Tomasz Dudek
Hello, since a similar question on StackOverflow remains unanswered ( https://stackoverflow.com/questions/46092114/is-there-no-inverse-transform-method-for-a-scaler-like-minmaxscaler-in-spark ) and perhaps there is a solution that I am not aware of, I'll ask: After training MinMaxScaler (or
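
For reference, a minimal sketch of the arithmetic behind such a reversal, assuming the documented MinMaxScaler mapping rescaled = (x - E_min) / (E_max - E_min) * (max - min) + min. This is plain Scala, not a Spark API: the object and method names are hypothetical, and in a real pipeline the per-feature eMin and eMax would presumably be read from the fitted model (MinMaxScalerModel exposes originalMin and originalMax).

```scala
// Hypothetical helper, not part of Spark: per-feature min-max scaling and its inverse.
object MinMaxInverseSketch {
  // Forward mapping for one feature value; eMin/eMax are the feature's observed bounds,
  // min/max are the target range (defaults 0.0 and 1.0, as in MinMaxScaler).
  def scale(x: Double, eMin: Double, eMax: Double,
            min: Double = 0.0, max: Double = 1.0): Double =
    if (eMax == eMin) 0.5 * (max + min) // constant feature: midpoint of the target range
    else (x - eMin) / (eMax - eMin) * (max - min) + min

  // Inverse mapping: solves the formula above for x.
  def unscale(scaled: Double, eMin: Double, eMax: Double,
              min: Double = 0.0, max: Double = 1.0): Double =
    if (eMax == eMin) eMin // constant feature: the only original value is eMin
    else (scaled - min) / (max - min) * (eMax - eMin) + eMin

  def main(args: Array[String]): Unit = {
    val original = 15.0
    val scaled = scale(original, eMin = 10.0, eMax = 20.0)
    println(s"scaled = $scaled, restored = ${unscale(scaled, eMin = 10.0, eMax = 20.0)}")
  }
}
```

Applied to a DataFrame, the same arithmetic would have to run per vector element (for example inside a UDF), which is left out here.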

Re: Row Encoder For DataSet

2017-12-10 Thread Tomasz Dudek
Hello Sandeep, you can pass a Row to a UDAF. Just provide a proper inputSchema to your UDAF. Check out this example: https://docs.databricks.com/spark/latest/spark-sql/udaf-scala.html Yours, Tomasz 2017-12-10 11:55 GMT+01:00 Sandip Mehta : > Thanks Georg. I have looked

Re: Question on using pseudo columns in spark jdbc options

2017-12-07 Thread Tomasz Dudek
Hey Ravion, yes, you can of course specify a column other than the primary key. Be aware, though, that if the key range is not spread evenly (for example, in your code, if there's a "gap" in the primary keys and no row has an id between 0 and 17220), some of the executors may not assist in loading data
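
To illustrate the skew described above, here is a simplified sketch of equal-width range partitioning over a numeric column. It is plain Scala and only approximates how a JDBC source might split a column range, not Spark's actual implementation. The upper bound of 20000, the 4 partitions, and the generated ids are hypothetical; only the gap below 17220 comes from the message.

```scala
// Simplified sketch, not Spark's implementation: equal-width ranges over a key column.
object PartitionSkewSketch {
  // Split [lower, upper) into numPartitions half-open ranges of (nearly) equal width.
  def ranges(lower: Long, upper: Long, numPartitions: Int): Seq[(Long, Long)] = {
    val stride = (upper - lower) / numPartitions
    (0 until numPartitions).map { i =>
      val start = lower + i * stride
      // The last range absorbs any remainder so the whole interval is covered.
      val end = if (i == numPartitions - 1) upper else start + stride
      (start, end)
    }
  }

  // Count how many ids fall into each range.
  def rowsPerPartition(ids: Seq[Long], parts: Seq[(Long, Long)]): Seq[Int] =
    parts.map { case (start, end) => ids.count(id => id >= start && id < end) }

  def main(args: Array[String]): Unit = {
    // Hypothetical data: no row has an id between 0 and 17220.
    val ids = 17220L until 20000L
    val parts = ranges(lower = 0L, upper = 20000L, numPartitions = 4)
    parts.zip(rowsPerPartition(ids, parts)).foreach { case ((start, end), n) =>
      println(s"[$start, $end): $n rows")
    }
  }
}
```

With these made-up numbers, only the last range contains any rows, so the tasks reading the first three ranges would have nothing to load.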