Re: log-likelihood ratio value in item similarity calculation

2013-04-12 Thread Phoenix Bai
don't match what I get. I get LLR = 117. This is wildly anomalous, so this pair should definitely be connected. Both items are quite rare (15/300,000 or 20/300,000 rates), but they occur together most of the time that they appear.

Re: log-likelihood ratio value in item similarity calculation

2013-04-10 Thread Phoenix Bai
= row entropy + col entropy, and LLR = 0.

On Wed, Apr 10, 2013 at 10:15 AM, Phoenix Bai wrote:
Hi, the counts for the two events are:

                    Event A     Everything but A
Event B             k11 = 7     k12 = 8
Everything but B    k21 = 13    k22 = 300,000

log-likelihood ratio value in item similarity calculation

2013-04-10 Thread Phoenix Bai
Hi, the counts for the two events are:

                    Event A     Everything but A
Event B             k11 = 7     k12 = 8
Everything but B    k21 = 13    k22 = 300,000

According to the code, I will get:

rowEntropy = entropy(7, 8) + entropy(13, 300,000) = 222
colEntropy = entropy(7, 13) + entropy(8, 300,000) = 152
matrixEntropy = entropy(7
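The arithmetic in this thread can be reproduced in a few lines. Below is a minimal Java sketch of the log-likelihood ratio in the shape used by Mahout's org.apache.mahout.math.stats.LogLikelihood class (written from memory, so treat the helper names and exact structure as assumptions and check the real class). With the counts above (k11=7, k12=8, k21=13, k22=300,000) it produces a value near 117, consistent with the later reply in this thread:

```java
public class LlrSketch {

    // x * ln(x), with the convention 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy of a set of counts:
    // total*ln(total) minus the sum of x*ln(x) over the counts.
    static double entropy(long... counts) {
        long total = 0;
        double sumXLogX = 0.0;
        for (long c : counts) {
            total += c;
            sumXLogX += xLogX(c);
        }
        return xLogX(total) - sumXLogX;
    }

    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22); // marginal row sums
        double colEntropy = entropy(k11 + k21, k12 + k22); // marginal column sums
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Guard against tiny negative results from floating-point rounding.
        if (rowEntropy + colEntropy < matrixEntropy) {
            return 0.0;
        }
        return 2.0 * (rowEntropy + colEntropy - matrixEntropy);
    }

    public static void main(String[] args) {
        // Counts from the thread: k11=7, k12=8, k21=13, k22=300,000.
        System.out.println(logLikelihoodRatio(7, 8, 13, 300000)); // roughly 117
    }
}
```

The key detail is that the row and column entropies are taken over the marginal sums (15 and 300,013; 20 and 300,008), not over the individual cells; computing entropy(7, 8) + entropy(13, 300,000) as in the post above is easy to arrive at by misreading the code and would explain the mismatch.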

Re: How to map UUID to userId in Preference class to use mahout recommender?

2013-04-07 Thread Phoenix Bai
Instead, you can use a mapping to/from 64-bit values. See IDMigrator for instance.

On Mon, Apr 8, 2013 at 3:51 AM, Phoenix Bai wrote:
Hi All, the input format required for mahout recommender is: *userId (long), itemId (long),

How to map UUID to userId in Preference class to use mahout recommender?

2013-04-07 Thread Phoenix Bai
Hi All, the input format required for the mahout recommender is: *userId (long), itemId (long), rating (optional)*, while currently my input format is: *userId (UUID, which is 128 bits long), itemId (long), boolean*. So, my question is, how could I convert a userId in UUID format to the long datatype? e.
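One common way to do this, in the spirit of the IDMigrator approach mentioned in the reply, is to hash the UUID's string form down to 64 bits and keep a side table for reverse lookup. The sketch below is an illustration only, not Mahout's actual implementation; the class and method names are made up for the example:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class UuidIdMigrator {

    // Remembers long -> UUID so recommendations can be mapped back.
    private final Map<Long, UUID> reverse = new HashMap<>();

    // Hash the UUID's string form with MD5 and keep the first 8 bytes
    // as a 64-bit id. Deterministic: the same UUID always maps to the
    // same long, so it can be recomputed across runs.
    public long toLongID(UUID uuid) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] hash = md5.digest(uuid.toString().getBytes(StandardCharsets.UTF_8));
            long id = ByteBuffer.wrap(hash, 0, 8).getLong();
            reverse.put(id, uuid);
            return id;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Reverse lookup for ids seen by toLongID.
    public UUID toUUID(long id) {
        return reverse.get(id);
    }
}
```

Any 128-to-64-bit mapping can in principle collide, but the first 8 bytes of an MD5 digest make that vanishingly unlikely for realistic user counts, and the reverse map lets you recover the original UUID when presenting results.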

Re: Regarding ItemBased Recommendation Results

2013-04-01 Thread Phoenix Bai
Raju, like Sebastian said, it is probably due to the default sampling restriction of the hadoop-based implementation: maxPrefsPerUserInItemSimilarity, the "max number of preferences to consider per user in the item similarity computation phase; users with more preferences will be sampled down"
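The downsampling described above can be pictured with a small sketch (a hypothetical helper, not Mahout's code): when a user has more preferences than the cap, a random subset of size maxPrefsPerUser is kept and the rest are ignored during the similarity computation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class PrefSampling {

    // If the user has no more preferences than the cap, keep them all;
    // otherwise shuffle a copy and keep a random subset of size
    // maxPrefsPerUser. This mirrors the idea behind the
    // maxPrefsPerUserInItemSimilarity option, not its exact code.
    static <T> List<T> samplePrefs(List<T> prefs, int maxPrefsPerUser, Random rng) {
        if (prefs.size() <= maxPrefsPerUser) {
            return prefs;
        }
        List<T> copy = new ArrayList<>(prefs);
        Collections.shuffle(copy, rng);
        return copy.subList(0, maxPrefsPerUser);
    }
}
```

This is why heavy users can see slightly different similarity results between runs: only a sample of their preferences enters the computation.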

Re: seq2sparse -a analyzerClass is throwing: ClassNotFoundException

2012-11-23 Thread Phoenix Bai
ChineseAnalyzer you'll have to add it as a dependency, either by modifying the maven dependencies and rebuilding, or just by injecting the ChineseAnalyzer class into the jar (using jar xf, jar cf, etc.).

Jeremie

2012/11/21 Phoenix Bai
Hi All,

Re: Issue: Canopy is processing extremly slow, what goes wrong?

2012-11-14 Thread Phoenix Bai
a single canopy, and you can go smaller until you get a reasonable number. There are also T3 and T4 arguments that allow you to specify the T1 and T2 values used by the reducer.

On 11/13/12 7:01 AM, Phoenix Bai wrote:
Hi All,

1) data size:
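For intuition on what T1 and T2 control, here is a toy one-dimensional sketch of canopy formation (an illustration only; Mahout's CanopyDriver works on vectors with a pluggable distance measure): points within the loose threshold T1 of a center join its canopy, while points within the tight threshold T2 are removed from further consideration, so large thresholds collapse everything into one canopy and smaller ones yield more.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class CanopySketch {

    // One-dimensional canopy clustering. Requires T2 <= T1.
    static List<List<Double>> canopies(List<Double> points, double t1, double t2) {
        List<Double> remaining = new LinkedList<>(points);
        List<List<Double>> result = new ArrayList<>();
        while (!remaining.isEmpty()) {
            // Take the next remaining point as a new canopy center.
            double center = remaining.get(0);
            List<Double> canopy = new ArrayList<>();
            Iterator<Double> it = remaining.iterator();
            while (it.hasNext()) {
                double p = it.next();
                double d = Math.abs(p - center);
                if (d < t1) {
                    canopy.add(p);   // within the loose threshold: joins the canopy
                }
                if (d < t2) {
                    it.remove();     // within the tight threshold: never a center again
                }
            }
            result.add(canopy);
        }
        return result;
    }
}
```

With points {0, 1, 10, 11}, T1 = 3 and T2 = 2 produce two canopies, one around each cluster; raising T1 and T2 past 11 would merge everything into a single canopy, which is the behavior the reply describes.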

Re: hadoop-0.19 and mahout 0.7: throwing incompatible errors, how can I fix it?

2012-09-21 Thread Phoenix Bai
I imagine the best use of your time and effort is to convince your admins that running a 3-year-old version of hadoop is a bad idea. Things are only going to get worse...

Mat

On Sep 13, 2012 7:15 PM, "Phoenix Bai" wrote:

hadoop-0.19 and mahout 0.7: throwing incompatible errors, how can I fix it?

2012-09-13 Thread Phoenix Bai
Hi guys, I am trying to compile my application code using mahout 0.7 and hadoop 0.19. During the compile process, it throws errors as below:

$ hadoop jar cluster-0.0.1-SNAPSHOT-jar-with-dependencies.jar mahout.sample.ClusterVideos
12/09/13 20:36:18 INFO vectorizer.SparseVectorsFromSequenceFi

Re: Does clusterdump still support option "--seqFileDir"?

2012-09-12 Thread Phoenix Bai
In your current mahout version (0.7?), you should use --input (-i) instead of --seqDir. For the detailed usage, you should check out: $ mahout clusterdump -h

On Wed, Sep 5, 2012 at 3:26 PM, javaboom wrote:
I've tried to use "clusterdump". I followed this manual: https://cwiki.apache.o

Re: does seq2sparse or kmeans filter data ? I am losing data!

2012-08-29 Thread Phoenix Bai
a breakpoint in ClusterClassificationDriver.shouldClassify() (you'd need to edit it a bit first); you could determine if this was removing any of your input points.

On 8/27/12 10:26 PM, Phoenix Bai wrote:
Hi Jeff, first of all, thank

Re: does seq2sparse or kmeans filter data ? I am losing data!

2012-08-27 Thread Phoenix Bai
so, then using the directory instead might help:

--pointsDir /group/tbdev/zhimo.bmz/mahout/output/videotags-kmeans-clusters/clusteredPoints

On 8/27/12 2:49 AM, Phoenix Bai wrote:
--pointsDir /group/tbdev/zhi

does seq2sparse or kmeans filter data ? I am losing data!

2012-08-26 Thread Phoenix Bai
Hi All, good afternoon. I ran the following three steps and got the clustered data I expected. My input data is 1124 objects (in key:value format); however, from the output I only received 491 objects. What happened to the 1124 - 491 = 633 objects? I checked out the options of seq2sparse, kmea

Re: java.lang.NoClassDefFoundError: org/apache/commons/cli2/Option

2012-08-26 Thread Phoenix Bai
Or, instead of invoking mahout in the format "$ hadoop jar mahout-core-0.5.jar", you should try "$ mahout ..". In $MAHOUT_HOME/bin lies the mahout script, which will load all the necessary jar files before running any class. The jars required by mahout are normally put in $MAHOUT_HOME/lib. e.g.