don't match what I get.
> >
> > I get LLR = 117.
> >
> > This is wildly anomalous, so this pair should definitely be connected.
> > Both items are quite rare (15/300,000 or 20/300,000 rates) but they
> > occur together most of the time that they appear.
>
> If the two events were independent, matrix entropy = row entropy + col
> entropy and LLR = 0.
>
>
> On Wed, Apr 10, 2013 at 10:15 AM, Phoenix Bai wrote:
> > Hi,
> >
> > the counts for two events are:
> >                     Event A     Everything but A
> > Event B             k11 = 7     k12 = 8
> > Everything but B    k21 = 13    k22 = 300,000
Hi,
the counts for two events are:
                    Event A     Everything but A
Event B             k11 = 7     k12 = 8
Everything but B    k21 = 13    k22 = 300,000
According to the code, I get:
rowEntropy = entropy(7,8) + entropy(13, 300,000) = 222
colEntropy = entropy(7,13) + entropy(8, 300,000) = 152
matrixEntropy = entropy(7
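(The snippet cuts off above. For reference, here is a minimal, self-contained
sketch of the computation as Ted describes it, using unnormalized entropies
over raw counts; the class name LlrExample is just for illustration. With
Mahout on the classpath, org.apache.mahout.math.stats.LogLikelihood's
logLikelihoodRatio(k11, k12, k21, k22) should compute the same quantity.
Note that the row and column entropies are taken over the row/column *sums*,
not summed per row as in the calculation above.)

public class LlrExample {

  // xLogX(0) is defined to be 0, so empty cells contribute nothing.
  private static double xLogX(long x) {
    return x == 0 ? 0.0 : x * Math.log(x);
  }

  // Unnormalized entropy of some counts: xLogX(total) - sum of xLogX(count).
  private static double entropy(long... counts) {
    long total = 0;
    double sum = 0.0;
    for (long count : counts) {
      total += count;
      sum += xLogX(count);
    }
    return xLogX(total) - sum;
  }

  public static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
    double rowEntropy = entropy(k11 + k12, k21 + k22);    // over row sums
    double columnEntropy = entropy(k11 + k21, k12 + k22); // over column sums
    double matrixEntropy = entropy(k11, k12, k21, k22);
    // For independent events, matrixEntropy == rowEntropy + columnEntropy,
    // so the ratio is 0.
    return 2.0 * (rowEntropy + columnEntropy - matrixEntropy);
  }

  public static void main(String[] args) {
    // The counts from this thread; prints roughly 117, matching Ted's value.
    System.out.println(logLikelihoodRatio(7, 8, 13, 300000));
  }
}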
. Instead, you can use a mapping to/from 64-bit
> values. See IDMigrator for instance.
>
> On Mon, Apr 8, 2013 at 3:51 AM, Phoenix Bai wrote:
> > Hi All,
> >
> > the input format required for the Mahout recommender is:
> >
> > *userId (long), itemId (long),
Hi All,
The input format required for the Mahout recommender is:
*userId (long), itemId (long), rating (optional)*
while, currently, my input format is:
*userId (UUID, which is 128 bits long), itemId (long), boolean*
So my question is: how can I convert a userId in UUID format to the long
datatype?
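A rough sketch of the IDMigrator approach Sean mentions, assuming Mahout's
MemoryIDMigrator (in org.apache.mahout.cf.taste.impl.model); the class name
UuidMapping and the sample UUID below are just for illustration. The migrator
hashes each string down to a deterministic 64-bit ID (collisions are
theoretically possible when going from 128 bits to 64, but very unlikely) and
can keep the reverse mapping so recommended IDs can be translated back:

import org.apache.mahout.cf.taste.impl.model.MemoryIDMigrator;

public class UuidMapping {
  public static void main(String[] args) throws Exception {
    MemoryIDMigrator migrator = new MemoryIDMigrator();

    // Example UUID-style user ID as it appears in the raw input.
    String uuid = "550e8400-e29b-41d4-a716-446655440000";

    long userId = migrator.toLongID(uuid); // deterministic 64-bit hash
    migrator.storeMapping(userId, uuid);   // remember it for reverse lookup

    // ... write userId,itemId rows as the recommender's input ...

    // After computing recommendations, translate the IDs back:
    String original = migrator.toStringID(userId);
    System.out.println(userId + " -> " + original);
  }
}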
Raju,
like Sebastian said, it is probably due to the default sampling restriction
of the Hadoop-based implementation:
maxPrefsPerUserInItemSimilarity", "max number of preferences to consider
per user in the "
+ "item similarity computation phase, users with more
preferences will be sampled d
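If the default is too low for your data, you can raise the cap when invoking
the job. A hypothetical invocation (the paths and the value 1000 are
placeholders; the flag name is from the option quoted above):

$ mahout recommenditembased -i input/prefs.csv -o output \
    --similarityClassname SIMILARITY_LOGLIKELIHOOD \
    --maxPrefsPerUserInItemSimilarity 1000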
ChineseAnalyzer you'll have to add it as a
> dependency either by modifying the Maven dependencies and rebuilding, or just by
> injecting the ChineseAnalyzer class into the jar (using jar xf, jar cf,
> etc.).
>
>
> Jeremie
>
> 2012/11/21 Phoenix Bai
>
> > Hi All,
>
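For the injection route, something along these lines should work (the jar
file names are placeholders; jar uf updates an existing archive in place):

$ jar xf lucene-analyzers-x.y.z.jar org/apache/lucene/analysis/cn
$ jar uf mahout-examples-job.jar org/apache/lucene/analysis/cn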
a single canopy and you can go smaller until you get a reasonable
> number. There are also T3 and T4 arguments that allow you to specify the T1
> and T2 values used by the reducer.
>
>
> On 11/13/12 7:01 AM, Phoenix Bai wrote:
>
>> Hi All,
>>
>> 1) data size:
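For reference, a hypothetical canopy invocation with the mapper (T1/T2) and
reducer (T3/T4) thresholds mentioned above; all paths and threshold values
are placeholders to experiment with:

$ mahout canopy -i vectors -o canopy-output \
    -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
    -t1 500 -t2 250 -t3 500 -t4 250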
>
> > I imagine the best use of your time and effort is to convince your admins
> > that running a 3-year-old version of Hadoop is a bad idea. Things are
> > only going to get worse...
> > Mat
> > On Sep 13, 2012 7:15 PM, "Phoenix Bai" wrote:
> >
Hi guys,
I am trying to compile my application code using Mahout 0.7 and Hadoop 0.19.
During the compile process, it throws the errors below:
$ hadoop jar cluster-0.0.1-SNAPSHOT-jar-with-dependencies.jar
mahout.sample.ClusterVideos
12/09/13 20:36:18 INFO vectorizer.SparseVectorsFromSequenceFi
In your current Mahout version (0.7?), you should use --input (-i)
instead of --seqDir.
For detailed usage, check out:
$mahout clusterdump -h
On Wed, Sep 5, 2012 at 3:26 PM, javaboom wrote:
> I've tried to use "clusterdump". I followed this manual
> https://cwiki.apache.o
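For example, a hypothetical clusterdump invocation (all paths are
placeholders; -p points clusterdump at the clustered points so it can report
cluster membership):

$ mahout clusterdump -i output/clusters-*-final \
    -d output/dictionary.file-0 -dt sequencefile \
    -p output/clusteredPoints -o clusterdump.txt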
a breakpoint in
> ClusterClassificationDriver.shouldClassify()
> (you'd need to edit it a bit first) you could determine if this was
> removing any of your input points.
>
>
>
> On 8/27/12 10:26 PM, Phoenix Bai wrote:
>
>> Hi Jeff,
>>
>> first of all, thank
> so, then using the directory instead might help:
>
> --pointsDir
> /group/tbdev/zhimo.bmz/mahout/output/videotags-kmeans-clusters/clusteredPoints
> \
>
>
>
>
> On 8/27/12 2:49 AM, Phoenix Bai wrote:
>
>> --pointsDir
>> /group/tbdev/zhi
Hi All,
Good afternoon.
I ran the following three steps and got the clustered data I expected.
My input data is 1124 objects (in key:value format); however, from the
output, I only received 491 objects.
What happened to the other 1124 - 491 = 633 objects?
I checked the options of seq2sparse, kmea
Or, instead of invoking Mahout as "$ hadoop jar mahout-core-0.5.jar ...",
you should try "$ mahout ...".
In $MAHOUT_HOME/bin there is the mahout script, which loads all the
necessary jar files before running any class. The jars required by
Mahout are normally placed in $MAHOUT_HOME/lib.
e.g.
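(a hypothetical k-means run; the algorithm choice, paths, and values are all
placeholders:)

$ mahout kmeans -i vectors -c initial-clusters -o kmeans-output -k 10 -x 20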