Re: Is any more detailed documentation aout the sgd logistic regression example.

2011-05-07 Thread Ted Dunning
Huh? What program are you talking about? On Fri, May 6, 2011 at 9:36 PM, Xiaobo Gu wrote: > >> > 2. In production mode, don't use csv, you will find most of the time > >> spent > >> > are on parse the csv data and hash them to features. You might encode > the > >> > feature to vector and serial

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Ted Dunning
Use mvn -DskipTests package On Fri, May 6, 2011 at 8:50 PM, Xiaobo Gu wrote: > On Fri, May 6, 2011 at 11:34 PM, Sean Owen wrote: > > I think you'd have to set up release keys and all that to make the > package. > > Does "mvn release:prepare" (without -Prelease) do what you want or am > > I cra

Re: Is any more detailed documentation aout the sgd logistic regression example.

2011-05-07 Thread Xiaobo Gu
trainlogistic and runlogistic 2011/5/7, Ted Dunning : > Huh? > > What program are you talking about? > > On Fri, May 6, 2011 at 9:36 PM, Xiaobo Gu wrote: > >> >> > 2. In production mode, don't use csv, you will find most of the time >> >> spent >> >> > are on parse the csv data and hash them to f

Anyone Experienced in HTTP Logs as Data Source for Recommendations

2011-05-07 Thread Shem Cristobal
Dear All, we are hoping to generate a recommendation from HTTP logs of a certain web site. Is this even advisable? What sort of recommendations have you experienced using such HTTP logs? Thanks a lot! Best regards, @shemcristobal

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Xiaobo Gu
On Sat, May 7, 2011 at 3:30 PM, Ted Dunning wrote: > Use > > mvn -DskipTests package But the above command only creates jar files in separate target directories; it does not assemble them into a releasable layout. I wrote the following lines in a package.sh file, but found several directories are not

Re: Anyone Experienced in HTTP Logs as Data Source for Recommendations

2011-05-07 Thread Sean Owen
As far as Mahout is concerned, you just need input of the form "user,item" (no rating necessary) where those are two numerical identifiers. I imagine each logged request contains something like a user ID and the thing you want to recommend -- video ID, item ID, etc. (If it's not numeric, you'd ha
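
A minimal sketch of turning log lines into the "user,item" form described above. The log layout (a `user=<id>` field followed by a `/item/<id>` path) and the function name `logs_to_prefs` are assumptions made for illustration; they are not from the thread, and a real log would need its own pattern.

```shell
# Hypothetical log format (an assumption, not from the thread):
#   2011-05-07T15:41:00 user=1234 GET /item/5678
# Emits one "user,item" line per matching request, with duplicates removed.
logs_to_prefs() {
  sed -n -E 's#.*user=([0-9]+) .*/item/([0-9]+).*#\1,\2#p' | sort -u
}

# Example run on three sample lines (two of them identical requests)
printf '%s\n' \
  '2011-05-07T15:41:00 user=1234 GET /item/5678' \
  '2011-05-07T15:42:10 user=1234 GET /item/5678' \
  '2011-05-07T15:43:55 user=42 GET /item/7' \
  | logs_to_prefs
```

Lines that do not match the pattern are dropped silently, which is convenient for requests that carry no user or item, but it also hides parsing mistakes, so it is worth counting input and output lines on real data.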

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Sean Owen
It sounds like you want to write your own Maven release target to create the output you're looking for. Otherwise, yes this is what you need to do. 'package' is the right target as far as the project is concerned, and yes you can write a script like this to assemble whatever you want. That's also a

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Xiaobo Gu
On Sat, May 7, 2011 at 9:41 PM, Sean Owen wrote: > It sounds like you want to write your own Maven release target to > create the output you're looking for. Otherwise, yes this is what you > need to do. 'package' is the right target as far as the project is > concerned, and yes you can write a scr

RE: Anyone Experienced in HTTP Logs as Data Source for Recommendations

2011-05-07 Thread Danny Leshem
(18) Reminds me a bit of Derby Bar... but from searching the internet it doesn't look related. (15) That is of course Nanuchka on Lilienblum. -Original Message- From: Shem Cristobal [mailto:shem.cristo...@gmail.com] Sent: Saturday, May 07, 2011 15:41 To: user@mahout.apache.org Subject: Anyone Experienced in HTTP Logs as D

Re: Anyone Experienced in HTTP Logs as Data Source for Recommendations

2011-05-07 Thread Steven Bourke
Hi Shem, I've tried something similar, and it is indeed more than possible. The real problem comes down to how you'll actually interpret user interactions on the site. A user's behavior may vary drastically across different sessions; also, if you are just tracking by IP address you may lose

Re: Anyone Experienced in HTTP Logs as Data Source for Recommendations

2011-05-07 Thread Ted Dunning
Shem, What Steven says is very much correct. I have used web logs several times for recommendations with very good results. I would add to Steven's comment about how to interpret user actions that you really need to think about what action indicates user interest. It is common to use clicks for

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Benson Margulies
I imagine it's -Prelease,mahout_release that is missing from the conversation. The build economizes the expensive process of building full release packages unless you turn on the profile. On Sat, May 7, 2011 at 9:43 AM, Xiaobo Gu wrote: > On Sat, May 7, 2011 at 9:41 PM, Sean Owen wrote: >> It so

Re: Anyone Experienced in HTTP Logs as Data Source for Recommendations

2011-05-07 Thread Benson Margulies
Did you really mean to send this? It's not obviously relevant even if translated into English. 2011/5/7 Danny Leshem : > (18) Reminds me a bit of Derby Bar... but from searching the internet it doesn't look related. > (15) That is of course Nanuchka on Lilienblum. > > -Original Message- > From: Shem Cristobal [mailto:shem.

Re: Is any more detailed documentation aout the sgd logistic regression example.

2011-05-07 Thread Ted Dunning
You can't do that directly. You can use the http address of the file in HDFS. Note also that trainlogistic and runlogistic are intended pretty much only for simple demonstration purposes. For large scale data, you should use AdaptiveLogisticRegression 2011/5/7 Xiaobo Gu > trainlogistic and ru

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Jake Mannix
I think what Xiaobo wants is something totally reasonable: we have, on the website, a full distribution, in .zip, .gz, and .bz2 formats. He wants to be able to produce a -SNAPSHOT form of this as well, from trunk checkouts. I've tried this with mvn -Prelease (requires gpg keys), mvn release:prep

Starting out with Mahout

2011-05-07 Thread Brent Downs
Hey, Hopefully this isn't too asinine a question. I have downloaded Mahout and am trying to build it. When I try "mvn install" everything looks fine until the end. Has anyone seen this particular error? If so, do you know how to rectify it? -BD [INFO] 1 error [INFO] ---

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Ted Dunning
I just tried mvn -DskipTests -Prelease,mahout_release package and got the distribution artifacts in distribution/target I also got a warning about the mahout_release profile. I am trying it again with just -Prelease On Sat, May 7, 2011 at 2:21 PM, Jake Mannix wrote: > I think what Xiaobo

Re: Starting out with Mahout

2011-05-07 Thread Ted Dunning
What size machine? What OS? What Java do you have? On Sat, May 7, 2011 at 2:30 PM, Brent Downs wrote: > > Hey, > Hopefully this isn't too asinine a question. I have downloaded Mahout and am > trying to build it. When I try "mvn install" everything looks fine until the > end. Has anyone seen thi

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Jake Mannix
On Sat, May 7, 2011 at 2:40 PM, Ted Dunning wrote: > I just tried > > mvn -DskipTests -Prelease,mahout_release package > > and got the distribution artifacts in distribution/target > > I also got a warning about the mahout_release profile. I am trying it > again > with just -Prelease > See, w

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Jake Mannix
Wow, now it's working fine, not complaining. Xiaobo: this is the way to build your distribution! mvn -DskipTests -Prelease,mahout_release package and look in distribution/target for your zip/bz2/etc packages. -jake On Sat, May 7, 2011 at 2:45 PM, Jake Mannix wrote: > > > On Sat, May 7,
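
The command above can be wrapped in a small script, sketched here under the assumption that it is run from the root of a Mahout trunk checkout. The function name `build_dist` is hypothetical; the Maven flags and the `distribution/target` output directory are the ones reported in this thread.

```shell
# Sketch only: run from the root of a Mahout trunk checkout.
# build_dist is a hypothetical helper name, not part of Mahout.
build_dist() {
  # Skip tests and enable the release profiles, as reported in the thread
  mvn -DskipTests -Prelease,mahout_release package || return 1
  # Archive names depend on the version, so just list what was produced
  ls distribution/target
}

# Usage:
#   build_dist
```

The function stops before listing anything if Maven fails, so an empty or stale `distribution/target` is not mistaken for a fresh build.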

RE: Starting out with Mahout

2011-05-07 Thread Brent Downs
For Java, I have: java version "1.6.0_17"; Java(TM) SE Runtime Environment (build 1.6.0_17-b04-248); Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01-101, mixed mode). My machine is a MacBook Pro with 4GB of memory and runs OSX 10.6.7 > From: ted.dunn...@gmail.com > Date: Sat, 7 May 2011 14:41:26 -07

Re: Starting out with Mahout

2011-05-07 Thread Dmitriy Lyubimov
The JVM seems a little bit dated (Hadoop, and by extension Mahout, is not recommended for use on update 18, but update 19 and later should be fine), but otherwise I must admit I don't see anything wrong with the setup. On Sat, May 7, 2011 at 2:52 PM, Brent Downs wrote: > > For Java, I have java version "1.6.0_17" Java(

Re: Starting out with Mahout

2011-05-07 Thread Stevo Slavić
Configure MAVEN_OPTS environment variable, to give Maven more memory. Regards, Stevo. On May 7, 2011 11:59 PM, "Dmitriy Lyubimov" wrote: the jvm seems a little bit dated (hadoop and by extension Mahout is not recommended for use on 18 but 19 and on should be fine) but otherwise I must admit i d

Re: Starting out with Mahout

2011-05-07 Thread Ted Dunning
I use a MacBook with 4GB all the time for compiling Mahout with no problems. I use Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326) On Sat, May 7, 2011 at 3:04 PM, Stevo Slavić wrote: > Configure MAVEN_OPTS environment variable, to give Maven more memory. >

Re: Vectorizing arbitrary value types with seq2sparse

2011-05-07 Thread Dmitriy Lyubimov
potentially one might be able to use compound key consisting, essentially, of doc id and the value category and then re-vectorize it (or bastardize seq2sparse) by adding the quantitative feature to the values. yes n-grams might get screwed a little but who cares. Output still might be useful. n-gr

Re: Vectorizing arbitrary value types with seq2sparse

2011-05-07 Thread Dmitriy Lyubimov
compound=composite, sorry. I've been mixing these words up since i was in the first grade. On Sat, May 7, 2011 at 3:13 PM, Dmitriy Lyubimov wrote: > potentially one might be able to use compound key consisting, > essentially, of doc id and the value category and then re-vectorize it > (or bastard

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Ted Dunning
This simpler form works as well: mvn -DskipTests -Prelease package On Sat, May 7, 2011 at 2:52 PM, Jake Mannix wrote: > Wow, now it's working fine, not complaining. > > Xiaobo: this is the way to build your distribution! > > mvn -DskipTests -Prelease,mahout_release package > > and look

Re: Starting out with Mahout

2011-05-07 Thread Jake Mannix
export MAVEN_OPTS="-Xmx1g" and try again. :) On Sat, May 7, 2011 at 2:30 PM, Brent Downs wrote: > > Hey, > Hopefully this isn't too asinine a question. I have downloaded Mahout and am > trying to build it. When I try "mvn install" everything looks fine until the > end. Has anyone seen this parti

RE: Starting out with Mahout

2011-05-07 Thread Brent Downs
export MAVEN_OPTS="-Xmx1g" works! Thanks! -Brent Downs > From: jake.man...@gmail.com > Date: Sat, 7 May 2011 15:33:54 -0700 > Subject: Re: Starting out with Mahout > To: user@mahout.apache.org > > export MAVEN_OPTS="-Xmx1g" > > and try again. :) > > On Sat, May 7, 2011 at 2:30 PM, Brent Downs
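
A small sketch of applying the same fix without overwriting a value that is already set. The helper name `ensure_maven_heap` is hypothetical; the `-Xmx1g` value is the one that worked in this thread.

```shell
# Hypothetical helper: keep any MAVEN_OPTS already set by the user,
# otherwise raise the maximum heap to 1 GB as suggested in the thread.
ensure_maven_heap() {
  export MAVEN_OPTS="${MAVEN_OPTS:--Xmx1g}"
}

# Usage:
#   ensure_maven_heap
#   mvn install
```

Using a default rather than a plain assignment means a larger heap configured elsewhere (for example in a shell profile) is left alone.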

Re: Starting out with Mahout

2011-05-07 Thread Ted Dunning
Ahh... yes. In about u16, the default behavior of java was changed with respect to the default maximum heap size. If you update your java, you won't have to use this option. On Sat, May 7, 2011 at 4:11 PM, Brent Downs wrote: > > export MAVEN_OPTS="-Xmx1g" works! > Thanks! > -Brent Downs > > >

RE: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread XiaoboGu
The command does what I want, but during building, the output contains something like this: [INFO] --- maven-javadoc-plugin:2.7:jar (attach-javadocs) @ mahout-taste-webapp --- [ERROR] Error fetching link: /home/gpadmin/mahout/src-trunk/trunk/buildtools/target/apidocs/package-list. Ignored it.

Re: Which maven command to use to put all the binaries into the distribution layout?

2011-05-07 Thread Jake Mannix
Nope, that's spurious output, don't worry about it. -jake On Sat, May 7, 2011 at 7:56 PM, XiaoboGu wrote: > The command does what I want, but during building, the output contains > something like this: > > [INFO] --- maven-javadoc-plugin:2.7:jar (attach-javadocs) @ > mahout-taste-webapp --- >

Re: Starting out with Mahout

2011-05-07 Thread Dmitriy Lyubimov
No, i did not see that one either on linux or windows -- and i have been running Mahout build pretty frequently. Although i run it on trunk. I guess you are not using 32bit system/java by any chance, are you? On Sat, May 7, 2011 at 2:30 PM, Brent Downs wrote: > > Hey, > Hopefully this isn't too a