Hi,
I have a CSV file like:
uid    mid     features          label
123    5231    [0, 1, 3, ...]    True

Both the "features" and "label" columns are used as input to GBTClassifier.

However, when I read the file:
Dataset<Row> samples = sparkSession.read().csv(file);
the type of the column returned by samples.select("features") is String.

My question is:
How can I map samples.select("features") to a Vector or another appropriate type,
so that I can use it for training like this:
        GBTClassifier gbdt = new GBTClassifier()
                .setLabelCol("label")
                .setFeaturesCol("features")
                .setMaxIter(2)
                .setMaxDepth(7);
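
To make the question concrete, here is a minimal sketch of the parsing step I have in mind, in plain Java without Spark. FeatureParser and parseFeatures are made-up names, and I am assuming each value in the column looks like "[0, 1, 3]". As far as I understand, the resulting double[] could be passed to Vectors.dense(...) from org.apache.spark.ml.linalg, but I do not know how to apply this to every row of the Dataset.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of Spark.
final class FeatureParser {
    private FeatureParser() {}

    // Turns a string such as "[0, 1, 3]" into a double[].
    // Assumes comma-separated numeric values, optionally wrapped in brackets.
    static double[] parseFeatures(String raw) {
        String body = raw.trim();
        // Strip the surrounding brackets if both are present.
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        // An empty list yields an empty array.
        if (body.trim().isEmpty()) {
            return new double[0];
        }
        return Arrays.stream(body.split(","))
                .map(String::trim)
                .mapToDouble(Double::parseDouble)
                .toArray();
    }
}
```

Calling FeatureParser.parseFeatures("[0, 1, 3]") gives an array with the three values 0.0, 1.0 and 3.0.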

Thanks.
