I am currently using naive Bayes for text classification.
I prefer NB over SVM because:
- SVM has a long training time
- NB can be trained incrementally
- NB can be fully parallelized
The main decisions you should make while using NB are whether to use tf or tf-idf
weighting, and whether to use binary NB or multinomial NB.
if you classify short
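To make those two decisions concrete, here is a toy sketch in plain Python (not Mahout code; the corpus, labels, and helper names are invented for illustration). binary=True counts only term presence per document, as in binary NB; tf-idf weighting would rescale the counts before training:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, binary=False):
    """Laplace-smoothed NB; binary=True counts term presence, not frequency."""
    vocab = {w for d in docs for w in d.split()}
    counts = defaultdict(Counter)          # class -> term counts
    class_docs = Counter(labels)           # class -> number of documents
    for doc, y in zip(docs, labels):
        words = doc.split()
        counts[y].update(set(words) if binary else words)
    model = {}
    for y in class_docs:
        total = sum(counts[y].values())
        model[y] = (
            math.log(class_docs[y] / len(docs)),                     # log prior
            {w: math.log((counts[y][w] + 1) / (total + len(vocab)))  # log likelihoods
             for w in vocab},
        )
    return model

def classify(model, doc):
    def score(y):
        prior, loglik = model[y]
        # Unseen words are simply skipped here, for brevity.
        return prior + sum(loglik[w] for w in doc.split() if w in loglik)
    return max(model, key=score)

# Invented toy corpus:
docs = ["cheap pills now", "meeting at noon", "cheap meds offer", "lunch meeting today"]
labels = ["spam", "ham", "spam", "ham"]
model = train_nb(docs, labels)
print(classify(model, "cheap offer today"))   # prints "spam"
```

Because training is just per-class counting, this also shows why NB can be incremental (add new counts as documents arrive) and parallel (sum partial counts from separate workers).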
This is not right. The sequential version would have finished long before
this for any reasonable value of k.
I do note, however, that you have set k = 200,000 where you only have
300,000 documents. Depending on which value you set (I don't have the code
handy), this may actually be increased
Hi Gokhan,
Thank you for the clarification.
Does it mean that Mahout is using the mapred API everywhere and there is
no mapreduce API left? As far as I know, code using the mapreduce API needs to be
recompiled for a new Hadoop version, and I remember needing to recompile Mahout
for CDH4 when it first came out.
Thanks, Zoltan
Mahout is using the newer mapreduce API and not the older mapred API.
Was that what you were looking for?
On Wednesday, December 11, 2013 1:53 PM, Zoltan Prekopcsak
preko1...@gmail.com wrote:
I think there are still parts of the code (e.g. in DistributedRowMatrix)
that use the old API.
--sebastian
On 11.12.2013 19:56, Suneel Marthi wrote:
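As background for the API question above (an illustrative fragment, not Mahout source; it needs the Hadoop jars on the classpath to compile): the two APIs are distinguished by their package, which is why the import lines determine which API a class is written against.

```java
// Old "mapred" API: interface-based; Hadoop's compatibility notes state it is
// kept binary compatible between hadoop-1.x and hadoop-2.x.
import org.apache.hadoop.mapred.Mapper;

// New "mapreduce" API: abstract-class-based, same class names in a different
// package; source-compatible but not fully binary-compatible across versions,
// hence the recompilation questions in this thread.
// import org.apache.hadoop.mapreduce.Mapper;
```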
Sebastian,
Are we still using SplitInputJob? It seems to have been replaced by the much newer
SplitInput.
Do you think this needs to be purged from the codebase for 0.9? It's been marked
as deprecated anyway.
On Wednesday, December 11, 2013 2:08 PM, Suneel Marthi
suneel_mar...@yahoo.com wrote:
Hi Zoltan,
I am saying that hadoop2-stable and hadoop1 are binary compatible. I don't know
which version of hadoop is used in cdh4-mr2, but I guess it was a hadoop2 alpha,
since bigtop was at hadoop 2.0.6-alpha last time I checked, which was last week.
Just try it and let us know if you experience
Could you check the following?
Are you sure that your hadoop cluster is hadoop 2.2.0?
Are you sure other dependencies of your project do not have a transitive
dependency on hadoop?
Gokhan
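The transitive-dependency check above can be done with Maven's dependency plugin; a sketch, run from the project root (output depends on your pom, so adjust as needed):

```shell
# Print the project's dependency tree, filtered to hadoop artifacts only.
# The -Dincludes filter takes groupId[:artifactId[:type[:version]]] patterns.
mvn dependency:tree -Dincludes=org.apache.hadoop
```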
On Wed, Dec 11, 2013 at 9:46 PM, Hi There srudamas...@yahoo.com wrote:
I tried to run
Per this link, one notable incompatibility is Counter and CounterGroup.
http://hadoop.apache.org/docs/r2.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html
On Wednesday, December 11, 2013 2:46 PM, Hi There srudamas...@yahoo.com wrote:
I
Here are the full contents of my pom file:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
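If the cluster really is hadoop 2.2.0 (as asked earlier in the thread), the pom should declare a matching client dependency; a sketch only, with the artifact and version to be adjusted to your setup:

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.2.0</version>
</dependency>
```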
In the meantime, you might apply the patch in MAHOUT-1354, build mahout
using mvn package -Phadoop2 -DskipTests=true, use that mahout version and
see if that works
Gokhan
On Wed, Dec 11, 2013 at 10:09 PM, Gokhan Capan gkhn...@gmail.com wrote:
I apologize, Suneel is right, Counter breaks the
Am I able to run `Decision tree` from Mahout in Eclipse without installing it?
Should I `install` Mahout on my system, or download all `jar` dependencies
and include them in lib?
I want to know how the Decision Tree works.
Where can I find the `source code` for the Mahout Decision tree?
--
Thanks