Hi all,
I have been toying around with this well-known RandomForestExample code:

val forest = RandomForest.trainClassifier(
  trainData, 7, Map(10 -> 4, 11 -> 40), 20,
  "auto", "entropy", 30, 300)

This comes from this link (
and also Sean Owen's presentation


And now I want to migrate it to use the ML library.
The problem I have is that the MLlib example has categorical features, and
I cannot find a way to use categorical features with ML.
Apparently I should use VectorIndexer, but VectorIndexer assumes only one
column for features.
I am at the moment using VectorAssembler instead, but I cannot find a way
to achieve the same thing with it.
I have checked the Spark samples, but all I can see is RandomForestClassifier
using VectorIndexer for a single feature.

Could anyone assist?
This is my current code. What do I need to add to take categorical
features into account?

    val labelIndexer = new StringIndexer()

    val features = new VectorAssembler()
      .setInputCols(Array(
        "Col1", "Col2", "Col3", "Col4", "Col5",
        "Col6", "Col7", "Col8", "Col9", "Col10"))

    val labelConverter = new IndexToString()

    val rf = new RandomForestClassifier()

    println("Kicking off pipeline..")

    val pipeline = new Pipeline()
      .setStages(Array(labelIndexer, features, rf, labelConverter))
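
One direction that might be worth trying is to chain VectorIndexer after VectorAssembler, since VectorIndexer operates on the single assembled vector column rather than on the raw columns. The following is only a rough sketch, not something verified against this dataset: the column names "rawFeatures", "indexedFeatures" and "indexedLabel" are placeholders I made up, and the tree settings are simply copied from the MLlib call above (20 trees, entropy, depth 30, 300 bins).

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.RandomForestClassifier
import org.apache.spark.ml.feature.{VectorAssembler, VectorIndexer}

// Assemble the raw columns into one vector column.
// "rawFeatures" is a placeholder name, not taken from the original code.
val assembler = new VectorAssembler()
  .setInputCols(Array(
    "Col1", "Col2", "Col3", "Col4", "Col5",
    "Col6", "Col7", "Col8", "Col9", "Col10"))
  .setOutputCol("rawFeatures")

// VectorIndexer inspects the assembled vector. Every feature with at most
// maxCategories distinct values is marked as categorical; the rest stay
// continuous. 40 mirrors the largest arity in the MLlib categorical map.
val indexer = new VectorIndexer()
  .setInputCol("rawFeatures")
  .setOutputCol("indexedFeatures")
  .setMaxCategories(40)

// Random forest reading the indexed features.
// "indexedLabel" is assumed to be produced by an earlier StringIndexer stage.
val rf = new RandomForestClassifier()
  .setLabelCol("indexedLabel")
  .setFeaturesCol("indexedFeatures")
  .setNumTrees(20)
  .setImpurity("entropy")
  .setMaxDepth(30)
  .setMaxBins(300)

val pipeline = new Pipeline()
  .setStages(Array(assembler, indexer, rf))
```

Note that maxCategories is applied to every feature in the vector, so a numeric column that happens to have 40 or fewer distinct values would also be treated as categorical. maxBins also has to be at least as large as the biggest category count.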

thanks in advance and regards
