Re: 20 Newsgroups Example
(Hope it's OK to break the code freeze just to fix these:)

The SLF4J business -- perhaps related to the update to SLF4J 1.5.6. I discovered some references to 1.5.5 in a few places and fixed those. Update and try again.

PriorityQueue: yeah, it shouldn't be used as a raw type. It seemed clear enough how to fix, so I did. (I wonder why we don't just use java.util.PriorityQueue with a Comparator?)

fullyDelete() -- another deprecation I had fixed in parts of the code; not sure why it popped up again.

The last brings up a side point -- I see a ton of copy-and-paste in the code base. The PQ business was fixed twice in what appeared to be mostly identical classes. fullyDelete() came back, I assume, due to a copy and paste of older code? I know everyone knows it's not optimal to copy and paste; I'm raising the issue and pointing out that, in my experience, it is never an issue that people actually go back and fix if they didn't do it the first time.

2009/3/20 Jeff Eastman j...@windwardsolutions.com:
> I'm trying to run this example and have run the first ant task, but the second one bombs. It looks like there are some classpath problems:
>
> jeff-eastmans-macbook-pro:examples jeff$ ant -f build-deprecated.xml extract-20news-18828
> Buildfile: build-deprecated.xml
Re: 20 Newsgroups Example
Boy, I knew not porting it would bite me. I ran the first command, but not the second. The workaround, obviously, is to just get those things in the classpath. One Maven command that comes in handy from time to time is: mvn dependencies:copy-dependencies. This will download all the project dependencies into your target directory, from which you can then add to a classpath.

I think we should just offer the workaround on the wiki for now and not necessarily have to fix it for 0.1. Of course, fixing it is fine, too. If someone patches it, just make sure you apply it to the 0.1 tag, too.

On Mar 19, 2009, at 8:21 PM, Jeff Eastman wrote:
> I'm trying to run this example and have run the first ant task, but the second one bombs. It looks like there are some classpath problems:
>
> jeff-eastmans-macbook-pro:examples jeff$ ant -f build-deprecated.xml extract-20news-18828
> Buildfile: build-deprecated.xml
> check-files:
> build-core:
> compile:
> [javac] Compiling 262 source files to /Users/jeff/Desktop/mahout-0.1/core/build/classes
> [javac] warning: [path] bad path element /Users/jeff/Desktop/mahout-0.1/core/lib/jfreechart-1.0.6.jar: no such file or directory
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/common/IOUtils.java:20: package org.slf4j does not exist
> [javac] import org.slf4j.Logger;
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/common/IOUtils.java:21: package org.slf4j does not exist
> [javac] import org.slf4j.LoggerFactory;
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/common/IOUtils.java:42: cannot find symbol
> [javac] symbol : class Logger
> [javac] location: class org.apache.mahout.cf.taste.impl.common.IOUtils
> [javac] private static final Logger log = LoggerFactory.getLogger(IOUtils.class);
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/common/RefreshHelper.java:21: package org.slf4j does not exist
> [javac] import org.slf4j.Logger;
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/common/RefreshHelper.java:22: package org.slf4j does not exist
> [javac] import org.slf4j.LoggerFactory;
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/common/RefreshHelper.java:35: cannot find symbol
> [javac] symbol : class Logger
> [javac] location: class org.apache.mahout.cf.taste.impl.common.RefreshHelper
> [javac] private static final Logger log = LoggerFactory.getLogger(RefreshHelper.class);
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/eval/AbstractDifferenceRecommenderEvaluator.java:32: package org.slf4j does not exist
> [javac] import org.slf4j.Logger;
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/eval/AbstractDifferenceRecommenderEvaluator.java:33: package org.slf4j does not exist
> [javac] import org.slf4j.LoggerFactory;
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/eval/AbstractDifferenceRecommenderEvaluator.java:46: cannot find symbol
> [javac] symbol : class Logger
> [javac] location: class org.apache.mahout.cf.taste.impl.eval.AbstractDifferenceRecommenderEvaluator
> [javac] private static final Logger log = LoggerFactory.getLogger(AbstractDifferenceRecommenderEvaluator.class);
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/eval/AverageAbsoluteDifferenceRecommenderEvaluator.java:28: package org.slf4j does not exist
> [javac] import org.slf4j.Logger;
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/eval/AverageAbsoluteDifferenceRecommenderEvaluator.java:29: package org.slf4j does not exist
> [javac] import org.slf4j.LoggerFactory;
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/eval/AverageAbsoluteDifferenceRecommenderEvaluator.java:42: cannot find symbol
> [javac] symbol : class Logger
> [javac] location: class org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator
> [javac] private static final Logger log = LoggerFactory.getLogger(AverageAbsoluteDifferenceRecommenderEvaluator.class);
> [javac] ^
> [javac] /Users/jeff/Desktop/mahout-0.1/core/src/main/java/org/apache/mahout/cf/taste/impl/eval/GenericRecommenderIRStatsEvaluator.java:40: package org.slf4j does
Re: 20 Newsgroups Example
On Mar 20, 2009, at 7:49 AM, Grant Ingersoll wrote:
> One Maven command that comes in handy from time to time is: mvn dependencies:copy-dependencies. This will download all the project dependencies into your target directory, from which you can then add to a classpath.

Sorry, that should have been:

mvn dependency:copy-dependencies

People will probably find http://maven.apache.org/plugins/index.html helpful for knowing what goals are available.
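As a rough sketch of the workaround described above, the jars copied by mvn dependency:copy-dependencies can be joined into a classpath string. The function name build_classpath is made up for illustration, and target/dependency is the plugin's default output directory, which is assumed here rather than taken from the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Mahout): join every jar in a directory
# into a colon-separated classpath string.
build_classpath() {
  local dir="$1" cp="" jar
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    cp="${cp:+$cp:}$jar"
  done
  printf '%s\n' "$cp"
}

# Typical use, assuming Maven is installed and you are in a module directory:
#   mvn dependency:copy-dependencies
#   export CLASSPATH="$(build_classpath target/dependency)"
```

Building the string from whatever jars are present avoids hard-coding version numbers such as the SLF4J ones discussed earlier in the thread.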
Re: 20 Newsgroups Example
OK, I've posted a workaround. Give it a try.

On Mar 20, 2009, at 9:06 AM, Grant Ingersoll wrote:
> On Mar 20, 2009, at 7:49 AM, Grant Ingersoll wrote:
>> One Maven command that comes in handy from time to time is: mvn dependencies:copy-dependencies. This will download all the project dependencies into your target directory, from which you can then add to a classpath.
>
> Sorry, that should have been mvn dependency:copy-dependencies
>
> People will probably find http://maven.apache.org/plugins/index.html helpful for knowing what goals are available.
Re: [VOTE] Mahout 0.1
Just realized, I didn't add my +1, although it seems implied since I produced the candidate. Anyway, +1

-Grant

On Mar 19, 2009, at 5:36 PM, Grant Ingersoll wrote:
> Please review and vote for releasing Mahout 0.1. This is our first release and is all new code.
>
> The artifacts are located in:
> http://people.apache.org/~gsingers/staging-repo/mahout/org/apache/mahout/
>
> The mahout directory contains a tarball/zip of the whole project (for building from source).
> The core, examples and taste-web directories contain the artifacts for each of those components.
> The other directories contain various dependencies and artifacts.
>
> Thanks,
> Grant
Useful? Data Format reader
http://mloss.org/software/view/163/ License is ASF friendly.
Re: [VOTE] Mahout 0.1
Hi guys. Not much activity from me -- really ashamed of it, but swamped in other duties.

Anyway, downloaded mahout-0.1-project.tar.bz2 and (OpenSuSE 10.3):

tar -jxf *.bz2

gives a warning:

tar: A lone zero block at 14473

Running mvn install (Maven 2.0.9) hangs for a long time on one of the test cases and takes a total of 15 minutes, 16 seconds on my machine. I didn't see any lag during packaging.

Dawid

Grant Ingersoll wrote:
> Please review and vote for releasing Mahout 0.1. This is our first release and is all new code.
>
> The artifacts are located in:
> http://people.apache.org/~gsingers/staging-repo/mahout/org/apache/mahout/
>
> The mahout directory contains a tarball/zip of the whole project (for building from source).
> The core, examples and taste-web directories contain the artifacts for each of those components.
> The other directories contain various dependencies and artifacts.
>
> Thanks,
> Grant
Re: [VOTE] Mahout 0.1
On Mar 20, 2009, at 10:02 AM, Dawid Weiss wrote:
> Hi guys. Not much activity from me -- really ashamed of it, but swamped in other duties.
>
> Anyway, downloaded mahout-0.1-project.tar.bz2 and (OpenSuSE 10.3):
>
> tar -jxf *.bz2
>
> gives a warning:
>
> tar: A lone zero block at 14473

I assume you are on a Mac? I get that too, but it always seems to be fine.

> Running mvn install (Maven 2.0.9) hangs for a long time on one of the test cases and takes a total of 15 minutes, 16 seconds on my machine. I didn't see any lag during packaging.

-DskipTests=true can speed things up if you don't care about the tests at any particular point.

> Dawid
>
> Grant Ingersoll wrote:
>> Please review and vote for releasing Mahout 0.1. This is our first release and is all new code.
>>
>> The artifacts are located in:
>> http://people.apache.org/~gsingers/staging-repo/mahout/org/apache/mahout/
>>
>> The mahout directory contains a tarball/zip of the whole project (for building from source).
>> The core, examples and taste-web directories contain the artifacts for each of those components.
>> The other directories contain various dependencies and artifacts.
>>
>> Thanks,
>> Grant
Re: [VOTE] Mahout 0.1
>> tar: A lone zero block at 14473
>
> I assume you are on a Mac? I get that too, but it always seems to be fine.

Nope, it's OpenSuSE (Linux), 64-bit. I've seen these warnings with gzip- and bzip-compressed tar files occasionally, but they never meant anything that would indicate data corruption.

D.
Re: 20 Newsgroups Example
What error do you get? And how are you running it? I am running on a Macbook just fine.

On Mar 20, 2009, at 11:12 AM, Jeff Eastman wrote:
> Good, that works, and the wiki is the place to put the fix. I can't seem to actually bring up hadoop on my macbook (it runs in standalone mode but start-all fails for some port 22 connection-related issue that I don't understand) so I will have to bring up a cluster on EC2 for further testing. I'd like to test all of the examples in the next few days.
>
> Jeff
Re: 20 Newsgroups Example
jeff-eastmans-macbook-pro:hadoop-0.19.1 jeff$ bin/start-all.sh
starting namenode, logging to /Users/jeff/hadoop/hadoop-0.19.1/bin/../logs/hadoop-jeff-namenode-jeff-eastmans-macbook-pro.local.out
localhost: ssh: connect to host localhost port 22: Connection refused
localhost: ssh: connect to host localhost port 22: Connection refused
starting jobtracker, logging to /Users/jeff/hadoop/hadoop-0.19.1/bin/../logs/hadoop-jeff-jobtracker-jeff-eastmans-macbook-pro.local.out
localhost: ssh: connect to host localhost port 22: Connection refused

I'm using the out of the box config

Jeff

Grant Ingersoll wrote:
> What error do you get? And how are you running it? I am running on a Macbook just fine.
Re: GSOC Mentor
Hi guys,

I'm actually interested in your project. I haven't started my proposal yet because I'm still working on my finals, but I'll be writing it soon and will let you guys know of any updates. I'm generally interested in this idea: http://wiki.apache.org/general/SummerOfCode2008#lucene

I had a Machine Learning class but haven't had the chance to implement an algorithm. I have used Lucene previously, and I have a strong interest in Machine Learning, so I thought it would be nice if I could spend my summer implementing a Machine Learning algorithm.

Regards,
Grady

On Fri, Mar 20, 2009 at 4:27 AM, Grant Ingersoll gsing...@apache.org wrote:
> Hey Gang,
>
> The ASF has been accepted to participate in GSOC. If you want to be a mentor, you can now sign up to be one. Just choose to be a part of the ASF.
> http://socghop.appspot.com/program/home/google/gsoc2009
>
> You should also subscribe to code-awa...@a.o for ASF specific info.
>
> Note, you have to be a committer to be a mentor.
>
> -Grant

--
Grady Laksmono
gradyfau...@laksmono.com
www.laksmono.com

I know the plans I have for you, declares the Lord, plans to prosper you and not to harm you, plans to give you hope and a future. ~ Jeremiah 29:11 ~
Re: 20 Newsgroups Example
You need to have passphraseless SSH setup, even for localhost. See http://hadoop.apache.org/core/docs/current/quickstart.html and the section on Pseudo-Distributed Operation

HTH,
Grant

On Mar 20, 2009, at 1:48 PM, Jeff Eastman wrote:
> jeff-eastmans-macbook-pro:hadoop-0.19.1 jeff$ bin/start-all.sh
> starting namenode, logging to /Users/jeff/hadoop/hadoop-0.19.1/bin/../logs/hadoop-jeff-namenode-jeff-eastmans-macbook-pro.local.out
> localhost: ssh: connect to host localhost port 22: Connection refused
> localhost: ssh: connect to host localhost port 22: Connection refused
> starting jobtracker, logging to /Users/jeff/hadoop/hadoop-0.19.1/bin/../logs/hadoop-jeff-jobtracker-jeff-eastmans-macbook-pro.local.out
> localhost: ssh: connect to host localhost port 22: Connection refused
>
> I'm using the out of the box config
>
> Jeff
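The passphraseless setup that the quickstart calls for might look like the sketch below. The function name authorize_key is hypothetical, and the key type and file names in the usage comment are assumptions, so substitute whatever your ssh installation recommends.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already present, then tighten permissions.
authorize_key() {
  local pubkey_file="$1" auth_file="$2" key
  key="$(cat "$pubkey_file")" || return 1
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  if ! grep -qxF -- "$key" "$auth_file"; then
    printf '%s\n' "$key" >> "$auth_file"
  fi
  chmod 600 "$auth_file"
}

# Typical use (assumed file names; -N '' requests an empty passphrase):
#   ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
#   ssh localhost true    # should now succeed without a prompt
```

Keys only help once something is listening: "Connection refused" on port 22 usually means no SSH server is running on the machine, so that has to be enabled as well.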
Re: 20 Newsgroups Example
I enabled rlogin as Sean suggested and that got past the port 22 problem. I can ssh localhost, but I get this now:

jeff-eastmans-macbook-pro:hadoop-0.19.1 jeff$ bin/start-all.sh
starting namenode, logging to /Users/jeff/hadoop/hadoop-0.19.1/bin/../logs/hadoop-jeff-namenode-jeff-eastmans-macbook-pro.local.out
localhost: starting datanode, logging to /Users/jeff/hadoop/hadoop-0.19.1/bin/../logs/hadoop-jeff-datanode-jeff-eastmans-macbook-pro.local.out
localhost: Error: JAVA_HOME is not set.
localhost: starting secondarynamenode, logging to /Users/jeff/hadoop/hadoop-0.19.1/bin/../logs/hadoop-jeff-secondarynamenode-jeff-eastmans-macbook-pro.local.out
localhost: Error: JAVA_HOME is not set.
starting jobtracker, logging to /Users/jeff/hadoop/hadoop-0.19.1/bin/../logs/hadoop-jeff-jobtracker-jeff-eastmans-macbook-pro.local.out
localhost: starting tasktracker, logging to /Users/jeff/hadoop/hadoop-0.19.1/bin/../logs/hadoop-jeff-tasktracker-jeff-eastmans-macbook-pro.local.out
localhost: Error: JAVA_HOME is not set.

JAVA_HOME is set, even when I ssh

Jeff

Grant Ingersoll wrote:
> You need to have passphraseless SSH setup, even for localhost. See http://hadoop.apache.org/core/docs/current/quickstart.html and the section on Pseudo-Distributed Operation
>
> HTH,
> Grant
Re: 20 Newsgroups Example
The NameNode and JobTracker start OK, but the DataNode and SecondaryNameNode have the Java problem.

Jeff Eastman wrote:
> * PGP Signed: 03/20/09 at 11:28:38
>
> Sean Owen wrote:
>> I export JAVA_HOME in my ~/.profile file to make sure it's available even for non-login shells, perhaps that does it? not sure.
>> (I am not a bash expert so the above might not be optimal.)
>
> I have a similar export in my .bash_profile file, and when I ssh to localhost the JAVA_HOME variable is set by that.
>
> * Jeff Eastman j...@windwardsolutions.com
> * 0x6BFF1277
Re: 20 Newsgroups Example
Sean Owen wrote:
> I export JAVA_HOME in my ~/.profile file to make sure it's available even for non-login shells, perhaps that does it? not sure.
> (I am not a bash expert so the above might not be optimal.)

I have a similar export in my .bash_profile file, and when I ssh to localhost the JAVA_HOME variable is set by that.
Re: Useful? Data Format reader
Cool. Their heart is definitely in the right place. The code appears very, very new as of now and they are working on different kinds of stuff than we are.

On Fri, Mar 20, 2009 at 6:52 AM, Grant Ingersoll gsing...@apache.org wrote:
> http://mloss.org/software/view/163/
>
> License is ASF friendly.

--
Ted Dunning, CTO
DeepDyve
Re: 20 Newsgroups Example
I'm out of my depth here. I'm just using the default OS-X bash shell and it uses .bash_profile, not .bashrc. I tried creating .bashrc and it did not solve the problem that Ted's snippet verified: ssh localhost env | grep JAVA returns nothing.

Jeff

Sean Owen wrote:
> Ah, solution -- add this to .bashrc, not .profile. That's the one bash uses for non-interactive shells. Now I remember.
>
> On Fri, Mar 20, 2009 at 7:15 PM, Ted Dunning ted.dunn...@gmail.com wrote:
>> ssh with a command does not log in, but instead works like a subshell command. Try [ssh localhost env | grep JAVA]. That may give different results than ssh to localhost with an interactive shell.
Re: 20 Newsgroups Example
This might help even more:

Additionally, ssh reads ~/.ssh/environment, and adds lines of the format ``VARNAME=value'' to the environment if the file exists and users are allowed to change their environment. For more information, see the PermitUserEnvironment option in sshd_config(5).

You might be able to put JAVA_HOME in this environment file.

On Fri, Mar 20, 2009 at 2:44 PM, Ted Dunning ted.dunn...@gmail.com wrote:
> Does this part of the manual for bash help?
>
> When bash is started non-interactively, to run a shell script, for example, it looks for the variable BASH_ENV in the environment, expands its value if it appears there, and uses the expanded value as the name of a file to read and execute. Bash behaves as if the following command were executed:
>
> if [ -n "$BASH_ENV" ]; then . "$BASH_ENV"; fi
>
> but the value of the PATH variable is not used to search for the file name.
>
> In particular setting BASH_ENV on the local side and then using -o SendEnv=true might do something useful (or not!).
>
> There is also a hadoop-env.sh file in the hadoop configuration that might be useful. You could do something like . ~/.bash_profile there.
>
> None of these are good answers. I have exactly the same problem with lots of my EC2 invoking scripts. There, I build a script, throw it over to the new instance and the first few lines source key profile files. Not a good solution, but at least it works. For Mahout running hadoop, this is a whole lot less viable.

--
Ted Dunning, CTO
DeepDyve
111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
www.deepdyve.com
408-773-0110 ext. 738
858-414-0013 (m)
408-773-0220 (fax)
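A small sketch of the BASH_ENV mechanism from the bash manual excerpt above: a non-interactive bash sources the file that BASH_ENV names before it runs its command. The temporary env file and the JAVA_HOME value are placeholders, not values from the thread.

```shell
#!/usr/bin/env bash
# Write a throwaway env file; the JAVA_HOME value is a placeholder.
env_file="$(mktemp)"
printf 'export JAVA_HOME=/path/to/java/home\n' > "$env_file"

# Without BASH_ENV, a non-interactive child bash only sees what it inherits,
# so JAVA_HOME is cleared in a subshell first.
without="$(unset JAVA_HOME BASH_ENV; bash -c 'echo "${JAVA_HOME:-unset}"')"

# With BASH_ENV set, the child bash sources the file before running the command.
with="$(unset JAVA_HOME; BASH_ENV="$env_file" bash -c 'echo "${JAVA_HOME:-unset}"')"

echo "without BASH_ENV: $without"
echo "with BASH_ENV:    $with"
rm -f "$env_file"
```

Two caveats on the other suggestions. ssh's SendEnv option takes variable names, and the server has to permit them through AcceptEnv, so it is not an on/off switch. For Hadoop itself, an export JAVA_HOME=... line in conf/hadoop-env.sh is probably the more direct route, since the start scripts read that file on each node.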