Thank you, I saw this before, but it is just binary classification, so
how can I extend it to multiclass classification?
Do I simply add different labels?
e.g.:
new LabeledDocument(0L, "a b c d e spark", 1.0),
new LabeledDocument(1L, "b d", 0.0),
new LabeledDocument(2L, "hadoop f g h", 2.0),
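On the label question: the multiclass learners in spark.mllib expect labels to be the doubles 0, 1, ..., k-1, so adding a 2.0 as above is the right idea as long as the values stay consecutive. If the classes start out as names, a small indexer can assign those values. This is only a plain-Java sketch; `LabelIndexer` and the category names are hypothetical and not part of Spark.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Spark): assigns each distinct class name
// a consecutive double label 0.0, 1.0, ..., k-1 in order of first appearance.
final class LabelIndexer {
    private final Map<String, Double> index = new LinkedHashMap<>();

    // Returns the label for a class name, registering the name on first sight.
    double labelFor(String className) {
        Double existing = index.get(className);
        if (existing != null) {
            return existing;
        }
        double next = index.size();
        index.put(className, next);
        return next;
    }

    // Number of distinct classes seen so far (the k a multiclass learner needs).
    int numClasses() {
        return index.size();
    }

    public static void main(String[] args) {
        LabelIndexer indexer = new LabelIndexer();
        // Category names are made up for illustration.
        for (String category : Arrays.asList("spark", "other", "hadoop", "spark")) {
            System.out.println(category + " -> " + indexer.labelFor(category));
        }
        System.out.println("numClasses = " + indexer.numClasses());
    }
}
```

The resulting label would go in the third argument of `LabeledDocument`, and `numClasses()` gives the class count to pass to the learner.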
Hi!
I want to implement a multiclass classification for documents.
So I have different kinds of text files, and I want to classify them
with Spark MLlib in Java.
Do you have any code examples?
Thanks!
--
*Egyed Zsombor *
Junior Big Data Engineer
I would check out the Pipeline code example:
https://spark.apache.org/docs/latest/ml-guide.html#example-pipeline
On Sat, Aug 29, 2015 at 9:23 PM, Zsombor Egyed egye...@starschema.net
wrote:
> Hi!
> I want to implement a multiclass classification for documents.
> So I have different kinds of text
I think the spark.ml logistic regression currently only supports 0/1
labels. If you need multiclass, I would suggest looking at either the
spark.ml decision trees or, if you don't care too much about pipelines,
the spark.mllib logistic regression after featurizing.
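To make "featurizing" concrete: the usual approach for text is hashed term frequencies, which in Spark is the job of `HashingTF`; the resulting vectors can then go to spark.mllib's `LogisticRegressionWithLBFGS` with `setNumClasses(k)`. The snippet below is only a plain-Java sketch of the hashing idea, not Spark's implementation. `HashedTermFrequency`, the whitespace tokenization, and the choice of 16 buckets are all assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper (not Spark's HashingTF): turns a document into a
// fixed-length vector of token counts using the hashing trick.
final class HashedTermFrequency {

    // Splits on whitespace, hashes each token into one of numFeatures buckets,
    // and counts how many tokens land in each bucket.
    static double[] featurize(String text, int numFeatures) {
        double[] counts = new double[numFeatures];
        for (String token : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty document yields a single empty token
            }
            // floorMod keeps the bucket non-negative even for negative hash codes.
            int bucket = Math.floorMod(token.hashCode(), numFeatures);
            counts[bucket] += 1.0;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Documents taken from the example earlier in the thread.
        System.out.println(Arrays.toString(featurize("a b c d e spark", 16)));
        System.out.println(Arrays.toString(featurize("hadoop f g h", 16)));
    }
}
```

Distinct tokens can collide in the same bucket, so a larger `numFeatures` trades memory for fewer collisions.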
On Sat, Aug 29,