Amen
On Nov 13, 2016, at 7:55 PM, janardhan shetty wrote:
These JIRAs are still unresolved:
https://issues.apache.org/jira/browse/SPARK-11215
Also there is https://issues.apache.org/jira/browse/SPARK-8418
On Wed, Aug 17, 2016 at 11:15 AM, Nisha Muktewar wrote:
The OneHotEncoder does *not* accept multiple columns.
You can use Michal's suggestion where he uses Pipeline to set the stages
and then executes them.
The other option is to write a function that performs one hot encoding on a
column and returns a dataframe with the encoded column and then call
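A minimal sketch of that second option, assuming the columns hold strings and therefore need a StringIndexer before the OneHotEncoder. The helper names oneHotEncodeColumn and oneHotEncodeColumns, and the "Index" and "Vec" column suffixes, are hypothetical and not from this thread:

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
import org.apache.spark.sql.DataFrame

// Hypothetical helper: index one string column, one-hot encode the index,
// and return the input DataFrame with "<column>Index" and "<column>Vec" appended.
def oneHotEncodeColumn(df: DataFrame, column: String): DataFrame = {
  val indexer = new StringIndexer()
    .setInputCol(column)
    .setOutputCol(column + "Index")
  val encoder = new OneHotEncoder()
    .setInputCol(column + "Index")
    .setOutputCol(column + "Vec")
  val stages: Array[PipelineStage] = Array(indexer, encoder)
  new Pipeline().setStages(stages).fit(df).transform(df)
}

// Thread the DataFrame through the helper once per column.
def oneHotEncodeColumns(df: DataFrame, columns: Seq[String]): DataFrame =
  columns.foldLeft(df)(oneHotEncodeColumn)
```

Calling oneHotEncodeColumns(df1, Seq("category", "newone")) would then add one encoded vector column per input column. Each call fits its own small pipeline, which is simple but does a separate pass over the data for every column.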
I had already tried it this way:
scala> val featureCols = Array("category","newone")
featureCols: Array[String] = Array(category, newone)
scala> val indexer = new StringIndexer().setInputCol(featureCols).setOutputCol("categoryIndex").fit(df1)
<console>:29: error: type mismatch;
found : Array[String]
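The mismatch comes from setInputCol, which takes a single String rather than an Array[String]. One way around it, sketched here under the assumption that each column should get its own index column, is to build one StringIndexer per column. The function name indexAll and the "Index" suffix are made up for illustration:

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.feature.StringIndexer
import org.apache.spark.sql.DataFrame

// Hypothetical helper: one StringIndexer per column, because setInputCol
// accepts a single column name.
def indexAll(df: DataFrame, featureCols: Array[String]): DataFrame = {
  val indexers: Array[PipelineStage] = featureCols.map { c =>
    new StringIndexer().setInputCol(c).setOutputCol(c + "Index")
  }
  // Fit all indexers as pipeline stages and append the index columns.
  new Pipeline().setStages(indexers).fit(df).transform(df)
}
```

With the session above, indexAll(df1, featureCols) would be expected to return df1 plus categoryIndex and newoneIndex columns.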
I don't think it does. From the documentation:
https://spark.apache.org/docs/2.0.0-preview/ml-features.html#onehotencoder,
I see that it still accepts one column at a time.
On Wed, Aug 17, 2016 at 10:18 AM, janardhan shetty wrote:
You can just map over your columns and create a pipeline:
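import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.feature.OneHotEncoder

val columns = Array("colA", "colB", "colC")
// One OneHotEncoder stage per input column.
val transformers: Array[PipelineStage] = columns.map {
  x => new OneHotEncoder().setInputCol(x).setOutputCol(x + "Encoded")
}
val pipeline = new Pipeline()
  .setStages(transformers)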
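The snippet above only builds the pipeline. To actually run it, the pipeline has to be fitted and applied, and OneHotEncoder expects numeric category indices as input, so string columns need indexing first. A rough sketch under those assumptions follows; the wrapper name encodeAll and the "Index" suffix are illustrative, not part of the original suggestion:

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
import org.apache.spark.sql.DataFrame

// Hypothetical wrapper around the map-over-columns idea.
def encodeAll(df: DataFrame, columns: Array[String]): DataFrame = {
  // For each column emit two stages: string -> index, then index -> one-hot vector.
  val stages: Array[PipelineStage] = columns.flatMap { c =>
    Seq[PipelineStage](
      new StringIndexer().setInputCol(c).setOutputCol(c + "Index"),
      new OneHotEncoder().setInputCol(c + "Index").setOutputCol(c + "Encoded")
    )
  }
  // fit() learns the label-to-index mappings; transform() appends the new columns.
  new Pipeline().setStages(stages).fit(df).transform(df)
}
```

Keeping every stage in a single Pipeline means the fitted PipelineModel can be reused on new data with the same label-to-index mappings.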
Spark 2.0:
One-hot encoding currently accepts a single input column. Is there a way to
include multiple columns?