Nice protocol, Drew. I've been able to get partway into it with the
tar.gz release: it builds and runs all tests just fine. My problem is
getting Hadoop to run pseudo-distributed so I can go further. For some
reason I am not able to start a JobTracker, and the configuration
material on the Hadoop wiki is out of date. It may be my Mac
configuration too, IDK. In case anybody can help, here is my JobTracker log:
2010-03-14 14:02:17,183 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG: host = jeff-eastmans-macbook-pro.local/192.168.1.114
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2010-03-14 14:02:17,624 INFO org.apache.hadoop.mapred.JobTracker: Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
2010-03-14 14:02:18,181 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.RuntimeException: Not a host:port pair: local
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:136)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:123)
    at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:1807)
    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1579)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:183)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:175)
    at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3702)
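A note on the trace above: the message "Not a host:port pair: local"
suggests the JobTracker address is still the literal string "local",
which, as far as I know, is what mapred.job.tracker defaults to when
conf/mapred-site.xml does not override it. The 0.20 quick-start sets
that property to localhost:9001 for pseudo-distributed mode. Below is a
minimal Python sketch of that kind of check. parse_host_port is a
hypothetical helper written for illustration, not Hadoop's NetUtils
code, and the -1 "no default port" convention is an assumption.

```python
from typing import Tuple


def parse_host_port(target: str, default_port: int = -1) -> Tuple[str, int]:
    """Split a 'host:port' string into (host, port).

    Hypothetical helper for illustration only. It reproduces the failure
    seen in the log: a value without a colon is rejected when no default
    port is available.
    """
    host, sep, port = target.rpartition(":")
    if not sep:
        # No colon found: fall back to the default port, if one was given.
        if default_port == -1:
            raise RuntimeError(f"Not a host:port pair: {target}")
        return target, default_port
    return host, int(port)


if __name__ == "__main__":
    # "local" is the value in the log; "localhost:9001" is the quick-start value.
    for value in ("local", "localhost:9001"):
        try:
            print(value, "->", parse_host_port(value))
        except RuntimeError as err:
            print(value, "->", err)
```

If this reading is right, the daemon never gets as far as opening a
socket, so the problem would be in the configuration file rather than
in the Mac's network setup.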
Drew Farris wrote:
Finally had a chance to take a look. Although I could probably spend
some more time running the different algorithms, I think it looks
good.
+1 for release.
Here's what I've tried:
Source release:
- Untarred mahout-0.3-src.tar.bz2
- Built using 'mvn clean install' from an empty repo and default settings
- Executed bin/mahout vectordump -s file and was able to successfully
dump a vector file generated from an older version of Mahout.
- Executed bin/mahout
org.apache.mahout.utils.nlp.collocations.llr.CollocDriver (locally)
- Set HADOOP_HOME, HADOOP_CONF_DIR and was able to use bin/mahout to
run kmeans on the cluster
Binary release:
- Moved existing mahout 0.3 directory from source release out of the way.
- Untarred mahout-0.3.tar.bz2 (binary release)
- Ran a vector dump as above
- Used bin/mahout to run kmeans clustering (locally)
- Set HADOOP_HOME, HADOOP_CONF_DIR and was able to use bin/mahout to
run kmeans on the cluster.
Environment:
- Ubuntu 9.04 i386
- JDK 1.6.0_16
- Maven 2.2.1
- Ant 1.7.1
- hadoop 0.20.1
On Thu, Mar 11, 2010 at 9:01 PM, Benson Margulies <bimargul...@gmail.com> wrote:
Proposed Mahout release 0.3 artifacts can be found at:
https://repository.apache.org/content/repositories/orgapachemahout-005/
A report of issues addressed in this release can be found at
https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&&pid=12310751&status=5&status=6&fixfor=12314281&sorter/field=issuekey&sorter/order=DESC
.
Please vote to release or not.