Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.

2013-08-12 Thread lei liu
The fsimage file size is 1658934155.
2013/8/13 Harsh J
> How large are your checkpointed fsimage files?
>
> On Mon, Aug 12, 2013 at 3:42 PM, lei liu wrote:
> > When Standby Namenode is doing checkpoint, upload the image file to Active
> > NameNode, the Active NameNode is very slow. What is r

Re: Unable to load native-hadoop library for your platform

2013-08-12 Thread Harsh J
If you use tarballs, you will need to build native libraries for your OS. Follow instructions for native libraries under src/BUILDING.txt of a source tarball. Alternatively, pick up Apache Hadoop packages from Apache Bigtop for 2.0.5-alpha and they'll come with pre-built, proper native libraries.

Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.

2013-08-12 Thread Harsh J
How large are your checkpointed fsimage files?
On Mon, Aug 12, 2013 at 3:42 PM, lei liu wrote:
> When Standby Namenode is doing checkpoint, upload the image file to Active
> NameNode, the Active NameNode is very slow. What is reason result to the
> Active NameNode is slow?
>
> Thanks,
>
> LiuL

Re: Jobtracker page hangs ..again.

2013-08-12 Thread Harsh J
If you're not already doing it, run a local name caching daemon (such as nscd, etc.) on each cluster node. Hadoop does a lot of lookups, and a local cache would help reduce the load on your DNS.
On Tue, Aug 13, 2013 at 3:09 AM, Patai Sangbutsarakum wrote:
> Update, after adjust the network

Re: Hardware Selection for Hadoop

2013-08-12 Thread Sambit Tripathy
I understand. But sometimes there is a lock-in with a particular vendor and you are not allowed to put the servers inside the corporate data center if they are procured from another vendor. The idea is to start from the basics and then grow. You can tell me some numbers in $s if you have, preferred ;

Re: Hardware Selection for Hadoop

2013-08-12 Thread Chris Embree
As we always say in Technology... it depends! What country are you in? That makes a difference. How much buying power do you have? I work for a Fortune 100 Company and we -- absurdly -- pay about 60% off retail when we buy servers. Are you buying a bunch at once? Your best bet is to contact 3 or

Re: Hardware Selection for Hadoop

2013-08-12 Thread Sambit Tripathy
Any rough ideas how much this would cost? Actually I kinda require a budget approval and need to put some rough figures in $ on the paper. Help!
1. 6 x 2 TB hard disk JBOD, 2 quad cores, 24-48 GB RAM.
2. 1 rack mount unit
3. 1 GbE switch for the rack
4. 10 GbE switch for the network
Regards, Samb

Re: How to tune fileSystem.listFiles("/", true) if you like walk though almost all files

2013-08-12 Thread Christian Schneider
Hi, I found out that it works much faster with fileSystem.listStatus() and a recursion by hand.
listFiles = 4021 files in 14.27 s
listStatus = 4021 files in 364.3 ms
Currently I have only tested it on localhost. Tomorrow I will check it against the cluster.
public class Main { static AtomicInteger count =

Re: Jobtracker page hangs ..again.

2013-08-12 Thread Patai Sangbutsarakum
Update: after adjusting the network routing, DNS query latency is in microseconds, as it is supposed to be. The issue is completely solved. The JobTracker page no longer hangs when launching a job with 100k mappers. Cheers,
On Mon, Aug 12, 2013 at 1:29 PM, Patai Sangbutsarakum <silvianhad...@gmail.com> wrote:
> Ok, a

Re: Jobtracker page hangs ..again.

2013-08-12 Thread Patai Sangbutsarakum
Ok, after some sweat, I think I found the root cause, but I still need another team to help me fix it. It lies in the DNS. For each tip:task line, I saw a DNS query to the DNS server in the tcpdump, and the timestamps seem to match.
2013-08-11 20:39:23,493 INFO org.apache.hadoop.mapred.

How to tune fileSystem.listFiles("/", true) if you like walk though almost all files

2013-08-12 Thread Christian Schneider
Hi, is there a way to tune this? I walk through the files with:
RemoteIterator<LocatedFileStatus> listFiles = fileSystem.listFiles(new Path(uri), true);
while (listFiles.hasNext()) { listFiles.next(); }
I need to get some information about those files, therefore I would like to scan them all. Is there any way to tun

Re: Hosting Hadoop

2013-08-12 Thread alex bohr
I've had good experience running a large hadoop cluster on EC2 instances. After almost 1 year we haven't had any significant down time, just lost a small # of data nodes. I don't think EMR is an ideal solution if your cluster will be running 24/7. But for running a large cluster, I don't see how

Re: How to import custom Python module in MapReduce job?

2013-08-12 Thread Andrei
For some reason using -archives option leads to "Error in configuring object" without any further information. However, I found out that -files option works pretty well for this purpose. I was able to run my example as follows. 1. I put `main.py` and `lib.py` into `app` directory. 2. In `main.py`

Re: What is the resolution for HADOOP-9346

2013-08-12 Thread Sathwik B P
Seems like the hadoop builds are failing for the same reason: https://builds.apache.org/job/Hadoop-Yarn-trunk/299/ Is there a fix coming soon? Do we fall back on protoc 2.4.1?
On Mon, Aug 12, 2013 at 10:38 AM, Sathwik B P wrote:
> Hi Guys,
>
> I upgraded protoc to 2.5.0 in order to build another

Re: How exactly Oozie works internally?

2013-08-12 Thread Wellington Chevreuil
I had a similar issue before... I'm not sure, but I think in my case oozie was always connecting through ssh as the oozie user, even if I was running it as a different user. If it's not a big effort for you, I would recommend you try creating the oozie user on your edge node and give it all required rights t

What is the resolution for HADOOP-9346

2013-08-12 Thread Sathwik B P
Hi Guys, I upgraded protoc to 2.5.0 in order to build another apache project. Now I am not able to build hadoop-common trunk. What is the resolution for HADOOP-9346? regards, sathwik

Re: How to import custom Python module in MapReduce job?

2013-08-12 Thread Binglin Chang
Maybe you didn't specify a symlink name in your command line, so the symlink name will be just lib.jar, and I am not sure how you import the lib module in your main.py file. Please try this: put main.py and lib.py in the same jar file, e.g. app.zip *-archives hdfs://hdfs-namenode/user/me/app.zip#app* -mapper "app/ma

when Standby Namenode is doing checkpoint, the Active NameNode is slow.

2013-08-12 Thread lei liu
When the Standby NameNode is doing a checkpoint and uploads the image file to the Active NameNode, the Active NameNode becomes very slow. What causes the Active NameNode to slow down? Thanks, LiuLei

Re: How to import custom Python module in MapReduce job?

2013-08-12 Thread Andrei
Hi Binglin, thanks for your explanation, now it makes sense. However, I'm not sure how to implement the suggested method. First of all, I found out that the `-cacheArchive` option is deprecated, so I had to use `-archives` instead. I put my `lib.py` into directory `lib` and then zipped it to `lib.zip`.

Re: How exactly Oozie works internally?

2013-08-12 Thread Kasa V Varun Tej
Hi WC, I'm triggering the job as the root user and I want to run some command on the edge node. Yes, I made sure of the permissions. Thanks, Kasa
On Mon, Aug 12, 2013 at 3:07 PM, Wellington Chevreuil <wellington.chevre...@gmail.com> wrote:
> Hi Kasa,
>
> did you create the oozie user on the targe

Re: How exactly Oozie works internally?

2013-08-12 Thread Wellington Chevreuil
Hi Kasa, did you create the oozie user on the target ssh server, and does it have all the rights needed to execute what it should on the target server? Regards, Wellington.
2013/8/12 Kasa V Varun Tej
> Folks,
>
> I have been working on this oozie SSH action from past 2 days. I'm unable
> to imple

fair scheduler :: reducer preemption

2013-08-12 Thread Ravi Shetye
Hi folks, I have a Hadoop cluster running the Fair Scheduler (hadoop.apache.org/docs/stable/fair_scheduler.html) with preemption set to true. The scheduler's preemption policy works well for mappers, but the reducers are not getting preempted. Any thoughts on this? 1) is reducer preemption not supposed to

Re: DefaultResourceCalculator class not found, ResourceManager fails to start.

2013-08-12 Thread Rob Blah
Problem solved. Thank you for your help. @Ted Yu Other issues were my mistakes. I have a dedicated script which updates/builds/"deploys" YARN from sources. I was starting the NN with the "-upgrade" option, which desynchronized the NN version and also led to a broken DN. Quick NN format and deletion of DN d

Re: How to import custom Python module in MapReduce job?

2013-08-12 Thread Binglin Chang
Hi, the problem seems to be caused by symlinks: Hadoop uses a file cache, so every file is in fact a symlink.
lrwxrwxrwx 1 root root 65 Aug 12 15:22 lib.py -> /root/hadoop3/data/nodemanager/usercache/root/filecache/13/lib.py
lrwxrwxrwx 1 root root 66 Aug 12 15:23 main.py -> /root/hadoop3/data/nodemanag
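A minimal Python sketch of why a symlink layout like this can break `import lib`, assuming CPython on a POSIX system: the interpreter resolves symlinks when it derives the script directory for `sys.path[0]`, so a `main.py` reached through a symlink looks for `lib.py` next to its real location in the cache, not in the working directory. The directory names, the `greet` function, and the `run_through_symlinks` helper are hypothetical, and adding the working directory to `sys.path` is only one possible workaround, not necessarily what this thread settled on.

```python
import os
import subprocess
import sys
import tempfile


def run_through_symlinks(add_cwd_to_path: bool) -> int:
    """Run main.py via a symlink and return the interpreter's exit code.

    The layout imitates a task working directory whose files are symlinks
    into separate cache directories (hypothetical names throughout).
    """
    with tempfile.TemporaryDirectory() as root:
        cache_lib = os.path.join(root, "filecache", "13")
        cache_main = os.path.join(root, "filecache", "14")
        work = os.path.join(root, "work")
        for directory in (cache_lib, cache_main, work):
            os.makedirs(directory)

        # The helper module lives in its own cache directory.
        with open(os.path.join(cache_lib, "lib.py"), "w") as f:
            f.write("def greet():\n    return 'hello'\n")

        # Optional workaround: put the working directory on sys.path first.
        fix = "import os, sys\nsys.path.insert(0, os.getcwd())\n" if add_cwd_to_path else ""
        with open(os.path.join(cache_main, "main.py"), "w") as f:
            f.write(fix + "import lib\nprint(lib.greet())\n")

        # The working directory only contains symlinks to the cached files.
        os.symlink(os.path.join(cache_lib, "lib.py"), os.path.join(work, "lib.py"))
        os.symlink(os.path.join(cache_main, "main.py"), os.path.join(work, "main.py"))

        result = subprocess.run(
            [sys.executable, "main.py"], cwd=work, capture_output=True, text=True
        )
        return result.returncode


if __name__ == "__main__":
    print("exit code without workaround:", run_through_symlinks(False))
    print("exit code with workaround:", run_through_symlinks(True))
```

Placing `main.py` and `lib.py` in the same real directory avoids the mismatch without touching `sys.path`.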

How exactly Oozie works internally?

2013-08-12 Thread Kasa V Varun Tej
Folks, I have been working on this oozie SSH action for the past 2 days. I'm unable to implement anything using the SSH action. I'm facing some permissions issues, so I thought that if someone could provide me with some information on how it actually works, it might help me debug the issues I'm facing. Task i want