question about released version id

2009-03-02 Thread 鞠適存
hi, I wonder how the hadoop version number is decided. The HowToRelease page on the hadoop web site just describes the process for a new release but doesn't mention the rules for assigning the version number. Are there any criteria for the version number? For example, under what condition the next version of

Re: Potential race condition (Hadoop 18.3)

2009-03-02 Thread Ryan Shih
Koji - That looks like it did the trick - we're smooth sailing now. Thanks a lot! On Mon, Mar 2, 2009 at 2:02 PM, Ryan Shih wrote: > Koji - That makes a lot of sense. The two tasks are probably stepping over > each other. I'll give it a try and let you know how it goes. > > Malcolm - if you turn

Re: OutOfMemory error processing large amounts of gz files

2009-03-02 Thread Runping Qi
Your job tracker out-of-memory problem may be related to https://issues.apache.org/jira/browse/HADOOP-4766 Runping On Mon, Mar 2, 2009 at 4:29 PM, bzheng wrote: > > Thanks for all the info. Upon further investigation, we are dealing with > two > separate issues: > > 1. problem processing a l

Re: Jobs run slower and slower

2009-03-02 Thread Runping Qi
Your problem may be related to https://issues.apache.org/jira/browse/HADOOP-4766 Runping On Mon, Mar 2, 2009 at 4:46 PM, Sean Laurent wrote: > Hi all, > I'm conducting some initial tests with Hadoop to better understand how well > it will handle and scale with some of our specific problems. As

Jobs run slower and slower

2009-03-02 Thread Sean Laurent
Hi all, I'm conducting some initial tests with Hadoop to better understand how well it will handle and scale with some of our specific problems. As a result, I've written some M/R jobs that are representative of the work we want to do. I then run the jobs multiple times in a row (sequentially) to g

Re: OutOfMemory error processing large amounts of gz files

2009-03-02 Thread bzheng
Thanks for all the info. Upon further investigation, we are dealing with two separate issues: 1. problem processing a lot of gz files: we have tried the hadoop.native.lib setting and it makes little difference. however, this is not that big a deal since we can use multiple jobs, each processin

RE: Potential race condition (Hadoop 18.3)

2009-03-02 Thread Malcolm Matalka
Ryan, Thanks a lot. I need to do some more investigation but I believe that solved my problem. One question though. Should my Combine Input Records always be less than or equal to my Map output records? I appear to be seeing a Combine Input amount larger than my Map output amount. Thanks ag

Re: Potential race condition (Hadoop 18.3)

2009-03-02 Thread Ryan Shih
I'm not sure what your accum.extend(m) does, but in 18, the value records are reused (unlike previous versions, where a new copy was made). So if you are storing a reference to your values, note that they will all point to the same thing unless you make a copy of it. Try: Aggregate
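
A minimal sketch of the copy-on-read pattern Ryan describes, against the 0.18-style mapred API. Text stands in for the poster's own AggregateRecord class; the point is that the object handed back by the values iterator is reused between iterations, so anything kept past the current iteration needs its own copy (here via WritableUtils.clone):

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableUtils;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class CopyingReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  private JobConf conf;

  @Override
  public void configure(JobConf job) {
    this.conf = job;
  }

  public void reduce(Text key, Iterator<Text> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    List<Text> kept = new ArrayList<Text>();
    while (values.hasNext()) {
      // The framework reuses the object behind values.next(), so clone it
      // (serialize + deserialize) before holding a reference past this iteration.
      kept.add(WritableUtils.clone(values.next(), conf));
    }
    // ... aggregate the copies here, then emit the result.
    output.collect(key, new Text(Integer.toString(kept.size())));
  }
}
```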

RE: Potential race condition (Hadoop 18.3)

2009-03-02 Thread Malcolm Matalka
Sure. Note: I am using my own class for keys and values in this. The key is called StringArrayWritable and it implements WritableComparable. The value is called AggregateRecord and it implements Writable. I have done some debugging and here is what I have found: While running in local mode I ge

Re: Issues installing FUSE_DFS

2009-03-02 Thread Brian Bockelman
Hey Matthew, We use the following command on 0.19.0: fuse_dfs -oserver=hadoop-name -oport=9000 /mnt/hadoop -oallow_other -ordbuffer=131072 Brian On Mar 2, 2009, at 4:12 PM, Hyatt, Matthew G wrote: When we try to mount the dfs from fuse we are getting the following errors. Has anyone seen

Issues installing FUSE_DFS

2009-03-02 Thread Hyatt, Matthew G
When we try to mount the dfs from fuse we are getting the following errors. Has anyone seen this issue in the past? This is on version 0.19.0 [r...@socdvmhdfs1]# fuse_dfs dfs://socdvmhdfs1:9000 /hdfs port=9000,server=socdvmhdfs1 fuse-dfs didn't recognize /hdfs,-2 [r...@socdvmhdfs1]# df -h Fil

Re: Potential race condition (Hadoop 18.3)

2009-03-02 Thread Ryan Shih
Koji - That makes a lot of sense. The two tasks are probably stepping over each other. I'll give it a try and let you know how it goes. Malcolm - if you turned off speculative execution and are still getting the problem, it doesn't sound the same. Do you want to do a cut&paste of your reduce code

Re: Shuffle speed?

2009-03-02 Thread hc busy
There are a few things that caused this to happen to me earlier on. Make sure to check that it actually makes progress. Sometimes, slowness is the result of negative progress: it gets to, say, 10% complete on reduce, and then drops back down to 5%... In that case the output can output that line with the s

RE: Potential race condition (Hadoop 18.3)

2009-03-02 Thread Malcolm Matalka
I have a situation which may be related. I am running hadoop 0.18.1. I am on a cluster with 5 machines and testing on a very small input of 10 lines. The mapper produces either one or zero outputs per line of input, yet somehow I get 18 lines of output from the reducer. For example I have one input where th

Ant Build fails on Eclipse 3.4 and Ant 1.7 (Windows)

2009-03-02 Thread Aviad sela
I am having problems building the project for release 0.19.1. I am using Eclipse 3.4 and Ant 1.7 and receive an error compiling core classes: compile-core-classes: BUILD FAILED D:\Work\AviadWork\workspace\cur\WSAD\Hadoop_Core_19_1\Hadoop\build.xml:302: java.lang.ExceptionInInitializerError Thi

RE: Potential race condition (Hadoop 18.3)

2009-03-02 Thread Koji Noguchi
Ryan, If you're using getOutputPath, try replacing it with getWorkOutputPath. http://hadoop.apache.org/core/docs/r0.18.3/api/org/apache/hadoop/mapred/FileOutputFormat.html#getWorkOutputPath(org.apache.hadoop.mapred.JobConf) Koji -Original Message- From: Ryan Shih [mailto:ryan.s...@gma
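
A minimal sketch of Koji's suggestion, against the 0.18 mapred API: side-effect files go under the task attempt's working output directory returned by getWorkOutputPath, which the framework promotes to the final job output directory only if that attempt succeeds, so an original attempt and a speculative one cannot clobber each other. The file name here is illustrative.

```java
import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;

public class SideEffectFiles {
  public static FSDataOutputStream openSideFile(JobConf conf) throws IOException {
    // getWorkOutputPath resolves to a per-task-attempt directory; its contents
    // are moved to the real output path only when the attempt completes successfully.
    Path workDir = FileOutputFormat.getWorkOutputPath(conf);
    FileSystem fs = workDir.getFileSystem(conf);
    return fs.create(new Path(workDir, "side-effect.dat"));
  }
}
```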

Re: [ANNOUNCE] Hadoop release 0.19.1 available

2009-03-02 Thread Aviad sela
Nigel, Thanks, I have extracted the new project. However, I am having problems building the project. I am using Eclipse 3.4 and Ant 1.7 and receive an error compiling core classes: compile-core-classes: BUILD FAILED D:\Work\AviadWork\workspace\cur\WSAD\Hadoop_Core_19_1\Hadoop\build.xml:302: j

Re: MapReduce jobs with expensive initialization

2009-03-02 Thread Owen O'Malley
On Mar 2, 2009, at 3:03 AM, Tom White wrote: I believe the static singleton approach outlined by Scott will work since the map classes are in a single classloader (but I haven't actually tried this). Even easier, you should just be able to do it with static initialization in the Mapper clas
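
A minimal sketch of the static-initialization approach Owen mentions, using the 0.18-style mapred API. ExpensiveResource is a hypothetical stand-in for whatever is costly to build (a dictionary, a model, a connection); it is constructed once when the class is loaded in the task JVM and shared by every map() call in that JVM, rather than being rebuilt per record or per task.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class ExpensiveInitMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  // Hypothetical placeholder for the expensive resource (dictionary, model, ...).
  static class ExpensiveResource {
    static ExpensiveResource load() {
      // imagine a slow load (DistributedCache file, remote lookup, ...) here
      return new ExpensiveResource();
    }
    String lookup(String s) { return s; }
  }

  // Initialized once when the class is loaded in the task JVM, not per record.
  private static final ExpensiveResource RESOURCE = ExpensiveResource.load();

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    output.collect(new Text(RESOURCE.lookup(value.toString())), value);
  }
}
```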

Announcing CloudBase-1.2.1 release

2009-03-02 Thread Tarandeep Singh
Hi, We have just released 1.2.1 version of CloudBase on sourceforge- http://cloudbase.sourceforge.net [ CloudBase is a data warehouse system built on top of Hadoop's Map-Reduce architecture. It uses ANSI SQL as its query language and comes with a JDBC driver. It is developed by Business.com and i

Potential race condition (Hadoop 18.3)

2009-03-02 Thread Ryan Shih
Hi - I'm not sure yet, but I think I might be hitting a race condition in Hadoop 18.3. What seems to happen is that in the reduce phase, some of my tasks perform speculative execution but when the initial task completes successfully, it sends a kill to the new task started. After all is said and do

can some one provide some use cases for ChainMapper & ChainReducer

2009-03-02 Thread Nick Cen
From the api docs I see that we can make a chained conversion like this: (m1,k1) -> (m2,k2) -> ... -> (mn,kn), but I cannot find a use case for why we need this chain rather than a direct conversion from (m1,k1) -> (mn,kn). Can someone provide an example? Thanks in advance. -- http://daily.appsp
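
One illustrative use case, sketched against the mapred.lib ChainMapper/ChainReducer API that appeared around 0.19: keep each map stage small and reusable (tokenize, then filter) and still pay for only one shuffle/sort, since the chained mappers run back to back inside the same map task. All mapper/reducer classes here are made up for the example, and input/output paths and formats would still need to be set as usual.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.ChainMapper;
import org.apache.hadoop.mapred.lib.ChainReducer;

public class ChainExample {

  // Stage 1: split each line into (word, 1) pairs.
  public static class TokenizeMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, LongWritable> {
    private static final LongWritable ONE = new LongWritable(1);
    public void map(LongWritable key, Text line,
                    OutputCollector<Text, LongWritable> out, Reporter r)
        throws IOException {
      StringTokenizer tok = new StringTokenizer(line.toString());
      while (tok.hasMoreTokens()) {
        out.collect(new Text(tok.nextToken()), ONE);
      }
    }
  }

  // Stage 2: drop short tokens; keeping it separate lets it be reused elsewhere.
  public static class FilterMapper extends MapReduceBase
      implements Mapper<Text, LongWritable, Text, LongWritable> {
    public void map(Text word, LongWritable count,
                    OutputCollector<Text, LongWritable> out, Reporter r)
        throws IOException {
      if (word.getLength() > 2) {
        out.collect(word, count);
      }
    }
  }

  // Final stage: sum counts per word.
  public static class SumReducer extends MapReduceBase
      implements Reducer<Text, LongWritable, Text, LongWritable> {
    public void reduce(Text word, Iterator<LongWritable> counts,
                       OutputCollector<Text, LongWritable> out, Reporter r)
        throws IOException {
      long sum = 0;
      while (counts.hasNext()) {
        sum += counts.next().get();
      }
      out.collect(word, new LongWritable(sum));
    }
  }

  public static JobConf buildJob() {
    JobConf job = new JobConf(ChainExample.class);
    // The two map stages run inside the same map task, so there is still only
    // one shuffle/sort before the reduce.
    ChainMapper.addMapper(job, TokenizeMapper.class,
        LongWritable.class, Text.class, Text.class, LongWritable.class,
        true, new JobConf(false));
    ChainMapper.addMapper(job, FilterMapper.class,
        Text.class, LongWritable.class, Text.class, LongWritable.class,
        true, new JobConf(false));
    ChainReducer.setReducer(job, SumReducer.class,
        Text.class, LongWritable.class, Text.class, LongWritable.class,
        true, new JobConf(false));
    return job;
  }
}
```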

Re: Eclipse plugin

2009-03-02 Thread Arijit Mukherjee
Hi All, I'm having some trouble using the eclipse plugin on a Windows XP machine to connect to the HDFS (hadoop 0.19.0) on a linux server - I'm getting an error:null message, although the port number etc. are correct. Can this be related to the user information? I've set it to the hadoop user o

Re: What's the cause of this Exception

2009-03-02 Thread Nick Cen
Hi, Just to provide more info: by setting "mapred.job.tracker" to local, which makes the program run locally, everything works fine, but switching to the fully distributed cluster, the exception appears. 2009/3/2 Nick Cen > Hi, > > I have set the separator value, but the same exception is thrown. > As i take the fir
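
A small sketch of the toggle Nick is describing, with a placeholder JobTracker address: "local" selects the in-process LocalJobRunner (handy for debugging), while a host:port points at the real cluster, where classpath and serialization differences can surface exceptions that local mode hides.

```java
import org.apache.hadoop.mapred.JobConf;

public class ModeToggle {
  public static JobConf configure(boolean debugLocally) {
    JobConf conf = new JobConf(ModeToggle.class);
    if (debugLocally) {
      // Run map/reduce in a single JVM via the LocalJobRunner.
      conf.set("mapred.job.tracker", "local");
    } else {
      // Placeholder address for the real JobTracker of the cluster.
      conf.set("mapred.job.tracker", "jobtracker.example.com:9001");
    }
    return conf;
  }
}
```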

Re: How does NVidia GPU compare to Hadoop/MapReduce

2009-03-02 Thread Steve Loughran
Dan Zinngrabe wrote: On Fri, Feb 27, 2009 at 11:21 AM, Doug Cutting wrote: I think they're complementary. Hadoop's MapReduce lets you run computations on up to thousands of computers potentially processing petabytes of data. It gets data from the grid to your computation, reliably stores outp

Re: MapReduce jobs with expensive initialization

2009-03-02 Thread Tom White
On any particular tasktracker slot, task JVMs are shared only between tasks of the same job. When the job is complete the task JVM will go away. So there is certainly no sharing between jobs. I believe the static singleton approach outlined by Scott will work since the map classes are in a single