For binary classification, any click-through data (like online ad click-through
data) is extremely unbalanced, on the order of 0.5% positive examples.
Yahoo has some large data sets of this nature, that can be downloaded free for
research purposes from Yahoo Research (I think it's
This means you are running on a headless machine without a monitor. The
program needs to show a window with graphics but can't.
On Mar 9, 2012 6:48 AM, rahul raghavendhra rahulraghavendh...@gmail.com
wrote:
Hi Lance,
I tried as you said, but now I get a new exception
Exception in thread main
In this case, the code in question is the non-distributed code rather
than Hadoop. But yes I agree it will make a perhaps bigger difference
on Hadoop. All of the Hadoop stuff uses integer keys.
On Fri, Mar 9, 2012 at 2:10 AM, Paritosh Ranjan pran...@xebia.com wrote:
Are these identifiers used as
Pat,
MatrixDump expects an input file of Text, MatrixWritable pairs. The matrix that
gets created by RowIdJob is IntWritable, VectorWritable, so you cannot run
MatrixDump to see the contents of the matrix. You need to use seqdumper, as you
had done.
From:
I assume that the other matrix operations will consume and produce
Text, MatrixWritable? If so, how do you create Text, MatrixWritable
from the rowid output, which is IntWritable, VectorWritable?
Also, while we are at it, how do you use vectordump? If you do bin/mahout
vectordump --help you get some
I plan to enable some degree of a mixed environment between R and
Mahout, but it will probably take several months before I get
meaningful coverage of the stuff Mahout produces.
On Wed, Feb 1, 2012 at 7:36 PM, Daniel Quach danqu...@cs.ucla.edu wrote:
I just ran k-means over a set of data and I
No, the matrix multiplication operations all (probably) take
int,vector where int is the row number. There has to be a
universally unique row number. If there is no row number associated
with a row in a distributed matrix op, how can the reducers know which
rows they have?
Rows do not necessarily
Hi,
I have already run RecommenderJob on a Hadoop cluster.
The output of RecommenderJob in HDFS looks like: user_id [item_id:score].
I have copied it out to a file.
But I have no idea how to integrate recommendations in this form into a
web application, which needs rows like: user_id item_id
Does anyone have an idea about this?
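
One possible way to get from the job output to user_id item_id rows is to flatten each line before loading it into whatever store the web application reads from. The sketch below is only an illustration, not part of Mahout: the function name flatten_recommendations and the sample lines are made up, and it assumes each line is a user id, some whitespace, and then a bracketed, comma-separated list of item_id:score pairs, as described in the question.

```python
def flatten_recommendations(lines):
    """Yield (user_id, item_id, score) tuples from lines shaped like
    'user_id<whitespace>[item_id:score,item_id:score]' (assumed format)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Everything before '[' is the user id; the rest is the item list.
        user_id, _, rest = line.partition("[")
        user_id = user_id.strip()
        for pair in rest.rstrip("]").split(","):
            if not pair:
                continue  # user with an empty recommendation list
            item_id, score = pair.split(":")
            yield user_id, item_id.strip(), float(score)


# Hypothetical sample lines in the assumed output format.
sample = ["3\t[103:4.5,102:3.2]", "7\t[101:5.0]"]
rows = list(flatten_recommendations(sample))
for user_id, item_id, score in rows:
    print(user_id, item_id, score)
```

The flattened rows could then be bulk-inserted into a database table keyed by user_id, so the web application only has to look up the items for the current user.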