Re: How to use StringIndexer for multiple input /output columns in Spark Java

2018-05-16 Thread Bryan Cutler
Yes, the workaround is to create multiple StringIndexers as you described. OneHotEncoderEstimator is only available from Spark 2.3.0, so you will have to use the plain OneHotEncoder. On Tue, May 15, 2018, 8:40 AM Mina Aslani wrote: > Hi, > > So, what is the workaround? Should I create
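The workaround confirmed above (one StringIndexer per column, all set as stages of a single Pipeline) might look roughly like the following in Java. This is only a minimal sketch, not code from the thread: the class name `MultiColumnIndexing`, the method name `buildIndexingPipeline`, and the `_idx` output-column suffix are hypothetical choices made for illustration.

```java
import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineStage;
import org.apache.spark.ml.feature.StringIndexer;

// Hypothetical helper class; the name is an assumption for this sketch.
public class MultiColumnIndexing {

    // Builds one StringIndexer per input column and chains them in a Pipeline.
    // The "_idx" output suffix is an arbitrary naming choice, not a Spark convention.
    public static Pipeline buildIndexingPipeline(String[] inputCols) {
        PipelineStage[] stages = new PipelineStage[inputCols.length];
        for (int i = 0; i < inputCols.length; i++) {
            stages[i] = new StringIndexer()
                .setInputCol(inputCols[i])
                .setOutputCol(inputCols[i] + "_idx");
        }
        // Each indexer is a separate stage; fitting the pipeline fits them in order.
        return new Pipeline().setStages(stages);
    }
}
```

Given a DataFrame `df` containing the listed columns, calling `buildIndexingPipeline(cols).fit(df).transform(df)` should add one indexed column per input column. On Spark 2.2.1, one OneHotEncoder per indexed column could presumably be appended as further stages in the same way.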

Re: How to use StringIndexer for multiple input /output columns in Spark Java

2018-05-15 Thread Mina Aslani
Hi, So, what is the workaround? Should I create multiple indexers (one for each column), and then create a pipeline and set its stages to include all the StringIndexers? I am using 2.2.1 as I cannot move to 2.3.0. It looks like OneHotEncoderEstimator is broken; please see my email sent today with subject:

Re: How to use StringIndexer for multiple input /output columns in Spark Java

2018-05-15 Thread Nick Pentreath
Multi-column support for StringIndexer didn’t make it into Spark 2.3.0. The PR is still in progress, I think; it should be available in 2.4.0. On Mon, 14 May 2018 at 22:32, Mina Aslani wrote: > Please take a look at the api doc: >

Re: How to use StringIndexer for multiple input /output columns in Spark Java

2018-05-14 Thread Mina Aslani
Please take a look at the api doc: https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/ml/feature/StringIndexer.html On Mon, May 14, 2018 at 4:30 PM, Mina Aslani wrote: > Hi, > > There is no SetInputCols/SetOutputCols for StringIndexer in Spark java. > How

How to use StringIndexer for multiple input /output columns in Spark Java

2018-05-14 Thread Mina Aslani
Hi, There is no SetInputCols/SetOutputCols for StringIndexer in Spark Java. How can multiple input/output columns be specified then? Regards, Mina