Re: number of map and reduce task does not change in M/R program

2013-10-21 Thread Dieter De Witte
Anseh, Let's assume that your job is fully scalable, then it should take: 100 000 000 / 600 000 times the amount of time of the first job, which is 1000 / 6 ≈ 167 times longer. This is the ideal case; in practice it will probably be something like 200 times. Also try using units in your questions + scientific
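
The scaling estimate above can be checked with a short sketch. It assumes perfectly linear scaling, as the message does; the function name `ideal_slowdown` and the 20% overhead factor are illustrative assumptions, the latter chosen only to reproduce the rough "about 200 times" guess.

```python
def ideal_slowdown(new_size: int, old_size: int) -> float:
    """Runtime ratio under the assumption that the job scales linearly."""
    return new_size / old_size


# Input sizes quoted in the message (the original post gives no units).
ratio = ideal_slowdown(100_000_000, 600_000)

# Hypothetical overhead factor, not a measured value.
pessimistic = ratio * 1.2

print(f"ideal slowdown: {ratio:.1f}x")
print(f"with 20% overhead: {pessimistic:.0f}x")
```

The ideal ratio is 1000 / 6; anything beyond that depends on cluster overheads that only a measurement can pin down.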

Re: CDH4.4 and HBASE-8912 issue

2013-10-21 Thread Boris Emelyanov
Boris, what does hbck say? We have had this issue a couple of times before. To fix it I had to stop the cluster, run the offline meta repair tool, and delete zk-store on each zk quorum node. Offline Meta repair tool will not work if there are inconsistencies in HBase - you better try hbase hbck -fixAll

Re: CDH4.4 and HBASE-8912 issue

2013-10-21 Thread Samir Ahmic
Hi, Boris. Did you check the RS logs? There should be an exception explaining why assignment failed. Can you paste that exception? Cheers :) On Mon, Oct 21, 2013 at 9:53 AM, Boris Emelyanov emelya...@post.km.ru wrote: Boris, what does hbck say? We have had this issue a couple times before. To fix

Re: CDH4.4 and HBASE-8912 issue

2013-10-21 Thread Boris Emelyanov
On 21.10.2013 12:17, Samir Ahmic wrote: Hi, Boris. Did you check the RS logs? There should be an exception explaining why assignment failed. Can you paste that exception? Cheers :) On Mon, Oct 21, 2013 at 9:53 AM, Boris Emelyanov emelya...@post.km.ru wrote:

disk errors during reducer's sorting phase

2013-10-21 Thread Dieter De Witte
Hi all, I am currently pushing the limits of my hadoop cluster. Unfortunately I am a bit confused about the memory requirements during the copy phase and the sort phases. I have made an effort to fully explain my setup and problems at the following link:

Regarding CDR Data

2013-10-21 Thread Aijas Mohammed
Dear All, Please let me know how to get CDR sample data files, and how to deploy and analyze CDR (call detail records) files on a Hadoop cluster. Thanks and regards, Aijas Mohammed Ext:- 1148

Re: Regarding CDR Data

2013-10-21 Thread Nitin Pawar
Are you sure you wanted to send this mail to common hadoop users and dev? If you want to put files on a Hadoop cluster, there are ways like the hadoop CLI, Java client, WebHDFS etc. What analysis you want to do is in your brain, not really sure what help you need on that. How to download CDR data, you

Re: CDH4.4 and HBASE-8912 issue

2013-10-21 Thread Samir Ahmic
I can't see anything wrong in your logs, but the fact that you trigger this issue by running the balancer makes me think that some of your RS may have a problem. Here is what I would do in this situation: 1. Make sure that system time, OS configuration, hadoop/HBase configuration is synced on all

RE: Regarding CDR Data

2013-10-21 Thread Aijas Mohammed
Dear ALL, I want to ANALYZE Call Detail Records Data. For Example CDR Analysis System:
* Capture for extended periods of time from hours to months
* Once calls are captured, a search for calls of interest can be performed while live capturing continues
* Drill-down to problem calls

Time taken for starting AMRMClientAsync

2013-10-21 Thread Krishna Kishore Bonagiri
Hi, I am seeing the following call to start() on AMRMClientAsync taking from 0.9 to 1 second. Why does it take that long? Is there a way to reduce it, I mean does it depend on any of the interval parameters or so in configuration files? I have tried reducing the value of the first argument below

RE: temporary file locations for YARN applications

2013-10-21 Thread John Lilley
Thanks again. This gives me a lot of options; we will see what works. Do you know if there are any permissions issues if we directly access the folders of LOCAL_DIR_ENV? Regarding LocalDirAllocator, I see its constructor: LocalDirAllocator(String contextCfgItemName) and a note mentioning that

Re: Time taken for starting AMRMClientAsync

2013-10-21 Thread Alejandro Abdelnur
Hi Krishna, Those 900 ms seem consistent with the numbers we found while doing some benchmarks in the context of Llama: http://cloudera.github.io/llama/ We found that the first application master created from a client process takes around 900 ms to be ready to submit resource requests.

Re: temporary file locations for YARN applications

2013-10-21 Thread Harsh J
The dirs in that env-var are app-specific and are for the app's user to utilize. You shouldn't have any permission issues working within them. The LocalDirAllocator is still somewhat MR-bound but you should still be able to make it work by giving it a config with the values it needs. On Mon, Oct
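
As a rough illustration of working within those app-specific directories, the sketch below reads a comma-separated directory list from an environment variable and creates a scratch file in the first writable entry. The variable name `LOCAL_DIRS`, the comma-separated format, and both helper names are assumptions made for this example; check the constant your YARN version actually exports before relying on it.

```python
import os
import tempfile
from typing import List


def container_local_dirs(env_var: str = "LOCAL_DIRS") -> List[str]:
    """Split the (assumed) comma-separated directory list from the environment."""
    raw = os.environ.get(env_var, "")
    return [d.strip() for d in raw.split(",") if d.strip()]


def make_scratch_file(env_var: str = "LOCAL_DIRS") -> str:
    """Create an empty temp file in the first existing, writable local dir."""
    for d in container_local_dirs(env_var):
        if os.path.isdir(d) and os.access(d, os.W_OK):
            fd, path = tempfile.mkstemp(prefix="scratch-", dir=d)
            os.close(fd)  # caller reopens the path as needed
            return path
    raise RuntimeError(f"no writable directory listed in ${env_var}")
```

Skipping unusable entries rather than failing on the first one matters when a node has several disks and one of them is full or offline.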

issue in hdfs when starting namenode in hadoop 2.1

2013-10-21 Thread Kazue Watanabe
Hi, I installed Hadoop 2.1 from this site: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.5.0/bk_installing_manually_book/content/rpm-chap13.html And have followed the installation guide. I am at the section for formatting and starting HDFS. However I am getting this error saying $JAVA

Re: temporary file locations for YARN applications

2013-10-21 Thread Jian He
This post might help a bit. http://hortonworks.com/blog/management-of-application-dependencies-in-yarn/ Thanks, Jian On Mon, Oct 21, 2013 at 11:11 AM, Harsh J ha...@cloudera.com wrote: The dirs in that env-var are app-specific and are for the app's user to utilize. You shouldn't have any

Re: Regarding CDR Data

2013-10-21 Thread Jitendra Yadav
Hi, Due to some security concerns I can't share the real-time CDR logs, but as an alternative you can create your own script that will generate dummy CDR records for your analysis. The link below might be helpful. http://www.gedis-studio.com/online-call-detail-records-cdr-generator.html Regards
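
A minimal sketch of such a dummy-record script follows. The field names (`caller`, `callee`, `start_time`, `duration_s`), the phone-number format, and the value ranges are all invented for illustration; real CDR schemas vary by vendor and are not described in this thread.

```python
import csv
import io
import random
from datetime import datetime, timedelta

# Hypothetical schema, chosen only for this example.
FIELDS = ["caller", "callee", "start_time", "duration_s"]


def generate_cdrs(n, seed=0):
    """Yield n fake call detail records; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    day_start = datetime(2013, 10, 21)
    for _ in range(n):
        start = day_start + timedelta(seconds=rng.randrange(86400))
        yield {
            "caller": f"555{rng.randrange(10**7):07d}",
            "callee": f"555{rng.randrange(10**7):07d}",
            "start_time": start.isoformat(),
            "duration_s": rng.randint(1, 3600),
        }


def to_csv(records):
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(generate_cdrs(3)), end="")
```

The resulting CSV can be written to a local file and then copied onto the cluster with whichever upload route suits the setup.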

RE: temporary file locations for YARN applications

2013-10-21 Thread John Lilley
Thanks, sounds like LOCAL_DIR_ENV is the way to go. john -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: Monday, October 21, 2013 12:11 PM To: user@hadoop.apache.org Subject: Re: temporary file locations for YARN applications The dirs in that env-var are app-specific

ResourceManager webapp code runs OOM

2013-10-21 Thread Prashant Kommireddi
Hello, We are noticing the RM running out of memory in the webapp code. It happens in org.apache.hadoop.yarn.server.resourcemanager.webapp.AppsBlock.renderBlock(Block html). The StringBuilder object appsTableData grows too large in this case while appending AppInfo. Ignoring the heap size (this

RE: temporary file locations for YARN applications

2013-10-21 Thread John Lilley
Right, that's very useful for ensuring that copies of read-only data are available to all nodes. We do use LocalResources for the transport of our executable environment to the nodes. Cheers, John From: Jian He [mailto:j...@hortonworks.com] Sent: Monday, October 21, 2013 12:22 PM To:

Hadoop-2.2 getting started, how to init a HDFS server on single node?

2013-10-21 Thread Jerome
Dear all, I was using hadoop-1.2 for some projects and am very enthusiastic about it. Now I want to switch to the new version 2.2, with YARN. But reading the Getting Started document, I'm facing a chicken-and-egg problem: in the Setting up a Single Node Cluster, it is assuming that we get HDFS

Hadoop 2.2.0 MR tasks failing

2013-10-21 Thread Robert Dyer
I recently set up a 2.2.0 test cluster. For some reason, all of my MR jobs are failing. The maps and reduces all run to completion, without any errors. Yet the app is marked failed and there is no final output. Any ideas? Application Type: MAPREDUCE State: FINISHED FinalStatus: FAILED

Re: Hadoop 2.2.0 MR tasks failing

2013-10-21 Thread Arun C Murthy
If you follow the links on the web-ui to the logs of the map/reduce tasks, what do you see there? Arun On Oct 21, 2013, at 9:55 PM, Robert Dyer psyb...@gmail.com wrote: I recently setup a 2.2.0 test cluster. For some reason, all of my MR jobs are failing. The maps and reduces all run to