I am trying to improve on the old solution.
Do we have a better text classifier in Spark MLlib now?
Regards,
lmk
Hi,
Sorry to revive this old thread.
What is the current state? Has this problem been solved? How does Spark handle
categorical data now?
Regards,
Ashutosh
This operation requires two transformers:
1) Indexer, which maps string features into categorical features
2) OneHotEncoder, which flattens categorical features into binary features
We are working on the new dataset implementation so that we can easily
express these transformations. Sorry for the late reply!
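To make the two steps concrete, here is a minimal plain-Python sketch of the idea. It does not use Spark, and it is not the Indexer or OneHotEncoder implementation; the function names `fit_indexer` and `one_hot` are made up for illustration, and ordering the indices by frequency is an assumption of this sketch.

```python
from collections import Counter


def fit_indexer(values):
    """Map each distinct string to an integer index (most frequent first)."""
    counts = Counter(values)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    return {value: index for index, value in enumerate(ordered)}


def one_hot(index, size):
    """Return a binary list of length `size` with a single 1 at `index`."""
    vector = [0] * size
    vector[index] = 1
    return vector


# Step 1: string feature -> categorical index.
colors = ["red", "green", "red", "blue", "red", "green"]
mapping = fit_indexer(colors)

# Step 2: categorical index -> binary features.
encoded = [one_hot(mapping[c], len(mapping)) for c in colors]
```

Each distinct string ends up as its own binary column, so a linear model does not read an ordering into the category indices.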
...@gmail.com]
Sent: Wednesday, June 25, 2014 1:27 PM
To: u...@spark.incubator.apache.org
Subject: RE: Prediction using Classification with text attributes in Apache
Spark MLLib
Hi Alexander,
Just one more question on a related note. Should I be following the same
procedure even if my data
Hi,
You need to convert your text to vector space model:
http://en.wikipedia.org/wiki/Vector_space_model
and then pass it to SVM. As far as I know, in previous versions of MLlib there
was a special class for doing this:
Hi Alexander,
Thanks for your prompt response. Earlier I was running this prediction
with Weka only. But we are now moving to a huge dataset and hence to Apache
Spark MLlib. Is there any other way to convert to libSVM format? Or is there
a simpler algorithm that I can use in MLlib?
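For reference, one possible way to do the conversion by hand is sketched below in plain Python: build a vocabulary, count terms per document (a simple vector space model), and write each document as a libSVM-style line of the form `<label> <index>:<value> ...` with 1-based, ascending indices. The helper names (`tokenize`, `build_vocabulary`, `to_libsvm_line`) and the sample documents are made up for illustration. This is only a sketch under those assumptions and is not an MLlib API.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents):
    """Assign each distinct token a 1-based index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens, start=1)}


def to_libsvm_line(label, text, vocabulary):
    """Format one document as '<label> <index>:<count> ...' with ascending indices."""
    counts = {}
    for tok in tokenize(text):
        if tok in vocabulary:  # tokens missing from the vocabulary are skipped
            idx = vocabulary[tok]
            counts[idx] = counts.get(idx, 0) + 1
    features = " ".join(f"{i}:{counts[i]}" for i in sorted(counts))
    return f"{label} {features}".rstrip()


# Hypothetical labelled documents: (label, text).
docs = [(1, "spark is fast"), (0, "weka is small"), (1, "spark spark mllib")]
vocab = build_vocabulary(text for _, text in docs)
lines = [to_libsvm_line(label, text, vocab) for label, text in docs]
```

Raw term counts are the simplest weighting; TF-IDF or feature hashing could be substituted at the counting step without changing the output format.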
Sent: Tuesday, June 24, 2014 3:41 PM
To: u...@spark.incubator.apache.org
Subject: RE: Prediction using Classification with text attributes in Apache
Spark MLLib
Hi Alexander,
Thanks for your prompt response. Earlier I was executing this Prediction using
Weka only. But now we are moving to a huge dataset
On Tue, Jun 24, 2014 at 12:28 PM, Ulanov, Alexander
alexander.ula...@hp.com wrote:
You need to convert your text to vector space model:
http://en.wikipedia.org/wiki/Vector_space_model
and then pass it to SVM. As far as I know, in previous versions of MLlib
there was a special class for doing