Re: Best way to transform string label into long label for classification problem
Thank you Xinh. That's what I need.

On Tue, Jun 28, 2016 at 17:43, Xinh Huynh wrote:

> Hi Jao,
>
> Here's one option:
> http://spark.apache.org/docs/latest/ml-features.html#stringindexer
> "StringIndexer encodes a string column of labels to a column of label
> indices. The indices are in [0, numLabels), ordered by label frequencies."
>
> Xinh
>
> On Tue, Jun 28, 2016 at 12:29 AM, Jaonary Rabarisoa wrote:
>
>> Dear all,
>>
>> I'm trying to find a way to transform a DataFrame into data that is
>> more suitable for a third-party classification algorithm. The DataFrame has
>> two columns: "feature", represented by a vector, and "label", represented by
>> a string. I want the "label" to be a number in [0, number of classes - 1].
>> Do you have any ideas on how to do it efficiently?
>>
>> Cheers,
>>
>> Jao
Re: Best way to transform string label into long label for classification problem
Hi Jao,

Here's one option:
http://spark.apache.org/docs/latest/ml-features.html#stringindexer
"StringIndexer encodes a string column of labels to a column of label
indices. The indices are in [0, numLabels), ordered by label frequencies."

Xinh

On Tue, Jun 28, 2016 at 12:29 AM, Jaonary Rabarisoa wrote:

> Dear all,
>
> I'm trying to find a way to transform a DataFrame into data that is
> more suitable for a third-party classification algorithm. The DataFrame has
> two columns: "feature", represented by a vector, and "label", represented by
> a string. I want the "label" to be a number in [0, number of classes - 1].
> Do you have any ideas on how to do it efficiently?
>
> Cheers,
>
> Jao
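
To make the quoted sentence concrete, here is a small plain-Python sketch of the mapping it describes: each distinct label gets an index in [0, numLabels), and the most frequent label gets index 0. This is only an illustration of the idea, not Spark's implementation. The function name `build_label_index` and the sample labels are made up for the example, and breaking frequency ties alphabetically is an assumption of this sketch.

```python
from collections import Counter


def build_label_index(labels):
    """Map each distinct string label to an integer in [0, numLabels).

    Labels are ordered by descending frequency, so the most common
    label receives index 0. Ties are broken alphabetically, which is
    an assumption made for this sketch.
    """
    counts = Counter(labels)
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: index for index, label in enumerate(ordered)}


# Hypothetical sample data standing in for the "label" column.
labels = ["cat", "dog", "cat", "bird", "dog", "cat"]

label_index = build_label_index(labels)
encoded = [label_index[label] for label in labels]

print(label_index)
print(encoded)
```

In Spark itself the equivalent should be roughly `StringIndexer(inputCol="label", outputCol="labelIndex")` followed by `fit` and `transform` on the DataFrame. As far as I know the indexed column comes back as a double, so a cast may still be needed if the downstream algorithm expects a long.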