RE: CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Shiyuan Xiao
Because we are now running the application against local disk, I can't give the "top" command's output from the run with HDFS. But we used "top" and "pidstat" to check the CPU utilization of our application, and I can confirm the CPU utilization of our application was increasing and the C

HDFS rollingUpgrade failed due to unexpected storage info

2014-09-01 Thread sam liu
Hi Experts, According to section 'Upgrading Non-Federated Clusters' of http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html, I tried to upgrade hadoop 2.2.0 to hadoop 2.4.1. However, I failed on step 2.2 'Start NN2 as standby with the "-rollingUpgrade starte

Re: CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Gordon Wang
Because you are using a one-node pseudo-distributed cluster. When the HDFS client writes data to HDFS, the client computes the data chunk checksum and the datanode verifies it. This costs CPU shares. You can monitor the CPU usage of each process. I guess the NameNode CPU usage is OK. But the client process and
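Gordon's point can be checked directly: HDFS writes go through a checksumming output stream on the client. A minimal A/B sketch (assumes Hadoop 2.x client libraries on the classpath and a reachable HDFS; the path and sizes are illustrative) that disables client-side write checksums so the remaining client CPU can be compared against a normal run:

```java
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ChecksumCpuProbe {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // For a CPU comparison ONLY: skip client-side checksum computation
        // on write. Do not do this in production -- it disables corruption
        // detection for the written data.
        fs.setWriteChecksum(false);
        byte[] chunk = new byte[64 * 1024];
        try (OutputStream out = fs.create(new Path("/tmp/cpu-probe"))) {
            for (int i = 0; i < 1024; i++) {
                out.write(chunk); // 64 MB total
            }
        }
        fs.close();
    }
}
```

Run it once as-is and once with the `setWriteChecksum(false)` line removed, watching the client PID with `pidstat`; the difference is roughly the checksum cost.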

Replication factor affecting write performance

2014-09-01 Thread Laurens Bronwasser
Hi, We have a setup with two clusters. One cluster shows a very strong degradation when we increase the replication factor. Another cluster shows hardly any degradation with an increased replication factor. Any idea how to find the bottleneck in the slower cluster?
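Since an HDFS write pipelines through `replication` datanodes in series, write latency should grow only mildly with the replication factor on a healthy network; a steep slope usually points at a slow link or disk in the pipeline. A minimal micro-benchmark sketch to run on both clusters (assumes a Hadoop 2.x client config on the classpath; paths and sizes are illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationWriteBench {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        byte[] block = new byte[1 << 20]; // 1 MB buffer
        for (short repl : new short[] {1, 2, 3}) {
            Path p = new Path("/tmp/repl-bench-" + repl);
            long start = System.nanoTime();
            // create(path, overwrite, bufferSize, replication, blockSize)
            try (FSDataOutputStream out =
                     fs.create(p, true, 4096, repl, 128L << 20)) {
                for (int i = 0; i < 256; i++) {
                    out.write(block); // 256 MB per file
                }
            }
            long ms = (System.nanoTime() - start) / 1000000L;
            System.out.println("replication=" + repl + " took " + ms + " ms");
        }
        fs.close();
    }
}
```

Comparing the replication=1 numbers isolates local disk speed; the jump from 1 to 3 isolates the network pipeline.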

Re: Replication factor affecting write performance

2014-09-01 Thread Laurens Bronwasser
And now with the right label on the Y-axis. From: Microsoft Office User <laurens.bronwas...@imc.nl> Date: Monday, September 1, 2014 at 9:56 AM To: "user@hadoop.apache.org" <user@hadoop.apache.org> Cc: Julien

Re: Error when running WordCount Hadoop program in Eclipse

2014-09-01 Thread alex
You can try this: create a log4j.properties file in the src directory or on the classpath:
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c]

RE: CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Shiyuan Xiao
Yes, the client process used the most CPU shares. But could you please help explain why the CPU utilization kept increasing? We are sure that the rate of data provisioned into HDFS was stable. Thanks BR/Shiyuan From: Gordon Wang [mailto:gw...@pivotal.io] Sent: September 1, 2014 15:48 To: user@hado

Re: total number of map tasks

2014-09-01 Thread Chris MacKenzie
Thanks for the update ;O) Regards, Chris MacKenzie Expert in all aspects of photography telephone: 0131 332 6967 email: stu...@chrismackenziephotography.co.uk corporate: www.chrismackenziephotography.co.uk

toolrunner issue

2014-09-01 Thread rab ra
Hello, I'm having an issue running one simple MapReduce job. A portion of the code is below. It gives a warning that Hadoop command-line parsing was not performed, even though the class implements the Tool interface. Any clue? public static void main(String[] args) throws Exception {

Re: toolrunner issue

2014-09-01 Thread unmesha sreeveni
public class MyClass extends Configured implements Tool {
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
int res = ToolRunner.run(conf, new MyClass(), args);
System.exit(res);
}
@Override
public int run(String[] args) throws Exception {
// T
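The snippet above can be sketched in full as a minimal skeleton (the class name and the empty run() body are placeholders). ToolRunner.run() performs the generic Hadoop option parsing (-D, -files, -libjars, ...), which is what silences the "command line parsing was not performed" warning:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyClass extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        // Hand the job to ToolRunner instead of calling run() directly;
        // ToolRunner strips the generic options and passes the rest on.
        int res = ToolRunner.run(new Configuration(), new MyClass(), args);
        System.exit(res);
    }

    @Override
    public int run(String[] args) throws Exception {
        // Use getConf() here rather than a fresh Configuration(), so any
        // -D options given on the command line are actually picked up.
        Configuration conf = getConf();
        // ... set up and submit the Job here ...
        return 0;
    }
}
```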

Installing Hadoop from .deb installer

2014-09-01 Thread Tomas Delvechio
Hello, I found an official Hadoop 1.2.1 .deb installer, and I'm trying to install it on Ubuntu 12.04 (I previously installed the JDK). After the installation, I found these scripts: * hadoop * hadoop-daemon.sh * hadoop-setup-applications.sh * hadoop-setup-hdfs.sh * hadoop-validate-setup.sh * ha

Re: toolrunner issue

2014-09-01 Thread rab ra
Yes, it's my bad. You're right. Thanks On 1 Sep 2014 17:11, "unmesha sreeveni" wrote: > public class MyClass extends Configured implements Tool{ > public static void main(String[] args) throws Exception { > Configuration conf = new Configuration(); > int res = ToolRunner.run(conf, new MyClass(), a

cannot start tasktracker because java.lang.NullPointerException

2014-09-01 Thread Dereje Teklu
I created a single-node cluster using Hadoop 1.2.1 following "Running Hadoop on Ubuntu Linux (single node cluster)". However, while trying to run the single-node cluster I found the error: cannot start tasktracker because j

Re: cannot start tasktracker because java.lang.NullPointerException

2014-09-01 Thread Harsh J
It appears you have made changes to the source and recompiled it. The actual release source line 247 of the failing class can be seen at https://github.com/apache/hadoop-common/blob/release-1.2.1/src/mapred/org/apache/hadoop/mapred/TaskTracker.java#L247, which can never end in an NPE. You need to f

Re: Tez and MapReduce

2014-09-01 Thread Alexander Pivovarov
E.g., in Hive, to switch engines: set hive.execution.engine=mr; or set hive.execution.engine=tez; Tez is faster, especially on complex queries. On Aug 31, 2014 10:33 PM, "Adaryl "Bob" Wakefield, MBA" < adaryl.wakefi...@hotmail.com> wrote: > Can Tez and MapReduce live together and get along in the s

unsubscribe

2014-09-01 Thread Stanislaw Vasiljev
unsubscribe

Unsubscribe

2014-09-01 Thread Sankar
Sent from my iPhone

Re: Tez and MapReduce

2014-09-01 Thread jay vyas
Yes, as an example of running a MapReduce job followed by a Tez job, you can see our last post on this: https://blogs.apache.org/bigtop/entry/testing_apache_tez_with_apache . In the Bigtop/Tez testing blog post you can see that it is easy to confirm on the web UI that Tez is being used. From TezClent.j

Re: Tez and MapReduce

2014-09-01 Thread Bing Jiang
By the way, mapreduce.framework.name can be set to yarn or yarn-tez. It will make a difference. 2014-09-02 8:24 GMT+08:00 jay vyas : > Yes as an example of running a mapreduce job followed by a tez you can see > our last post on this > https://blogs.apache.org/bigtop/entry/testing_apache_tez_with_ap
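The mapreduce.framework.name setting can also be applied per job in code. A minimal sketch (the job name is illustrative; "yarn-tez" assumes the Tez MR-compatibility shim and tez-site.xml are installed on the cluster, otherwise use "yarn" for plain MapReduce on YARN):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class FrameworkSwitch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // "yarn" = plain MapReduce on YARN; "yarn-tez" = run MR jobs on Tez.
        conf.set("mapreduce.framework.name", "yarn-tez");
        Job job = Job.getInstance(conf, "framework-switch-demo");
        // ... configure mapper/reducer, input and output paths, then submit ...
    }
}
```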

Re: Replication factor affecting write performance

2014-09-01 Thread Stanley Shi
What's the network setup and topology? Also, the size of the cluster? On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser < laurens.bronwas...@imc.nl> wrote: > And now with the right label on the Y-axis. > > > From: Microsoft Office User > Date: Monday, September 1, 2014 at 9:56 AM > To: "use

RE: Replication factor affecting write performance

2014-09-01 Thread mike Zarrin
Unsubscribe From: Stanley Shi [mailto:s...@pivotal.io] Sent: Monday, September 01, 2014 7:31 PM To: user@hadoop.apache.org Cc: Julien Lehuen; Tyler McDougall Subject: Re: Replication factor affecting write performance What's the network setup and topology? Also, the size of the cluster

RE: Replication factor affecting write performance

2014-09-01 Thread Vimal Jain
Mike, please send email to user-unsubscr...@hbase.apache.org. Don't spam the entire mailing list. Unsubscribe *From:* Stanley Shi [mailto:s...@pivotal.io] *Sent:* Monday, September 01, 2014 7:31 PM *To:* user@hadoop.apache.org *Cc:* Julien Lehuen; Tyler McDougall *Subject:* Re: Replication factor affe

Re: Hadoop 2.5.0 unit tests failures

2014-09-01 Thread Ray Chiang
Just as a quick follow up, you can also search the JIRAs to see which tests are already known to be on the flakier side (e.g. race conditions like Zhijie mentions, or some similar hard-to-replicate reason). -Ray On Sun, Aug 31, 2014 at 11:40 PM, Zhijie Shen wrote: > Hi Rajat, > > It is the si

Re: Hadoop InputFormat - Processing large number of small files

2014-09-01 Thread rab ra
Hi, I tried to use your CombineFileInputFormat implementation. However, I get the following exception: 'not org.apache.hadoop.mapred.InputFormat'. I am using Hadoop 2.4.1, and it looks like it expects the older interface, as it does not accept 'org.apache.hadoop.mapreduce.lib.input.Co
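The exception is the old-vs-new API mismatch: the new-API Job.setInputFormatClass only accepts subclasses of org.apache.hadoop.mapreduce.InputFormat, so a mapred.* combine format is rejected. For the many-small-files case, a minimal sketch using the built-in new-API CombineTextInputFormat from Hadoop 2.x (mapper setup and paths are illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SmallFilesJob {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "small-files");
        // New-API combine format: packs many small files into few splits.
        job.setInputFormatClass(CombineTextInputFormat.class);
        // Cap each combined split at ~128 MB of small files.
        CombineTextInputFormat.setMaxInputSplitSize(job, 128L << 20);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // ... set mapper/reducer classes here ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Mixing org.apache.hadoop.mapred.* and org.apache.hadoop.mapreduce.* classes in one job is the usual cause of this error; pick one API throughout.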

Any issue with large concurrency due to single active instance of YARN Resource Manager?

2014-09-01 Thread bo yang
Hi guys, I am wondering how many concurrent jobs a single Resource Manager might be able to manage. Following is my understanding; please correct me if I am wrong. Let's say we have 1000 concurrent jobs running. The Resource Manager will have 1000 records in memory to manage these jobs. And it will