write to most datanode fail quickly

2014-10-14 Thread sunww
Hi
I'm using HBase with about 20 regionservers, and one regionserver quickly
started failing to write to most of the datanodes, which finally caused that
regionserver to die, while the other regionservers are OK.

The logs look like this:

java.io.IOException: Bad response ERROR for block BP-165080589-132.228.248.11-1371617709677:blk_5069077415583579127_39339217 from datanode 132.228.248.20:50010
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:681)
2014-10-13 09:23:01,227 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-165080589-132.228.248.11-1371617709677:blk_5069077415583579127_39339217 in pipeline 132.228.248.17:50010, 132.228.248.20:50010, 132.228.248.41:50010: bad datanode 132.228.248.20:50010
2014-10-13 09:23:32,021 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-165080589-132.228.248.11-1371617709677:blk_5069077415583579127_39339415
java.io.IOException: Bad response ERROR for block BP-165080589-132.228.248.11-1371617709677:blk_5069077415583579127_39339415 from datanode 132.228.248.41:50010
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:681)

then several "firstBadLink" errors:

2014-10-13 09:23:33,390 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink as 132.228.248.18:50010
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1090)

then several "Failed to add a datanode" errors:

2014-10-13 09:23:44,331 WARN org.apache.hadoop.hdfs.DFSClient: Error while syncing
java.io.IOException: Failed to add a datanode.  User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT.  (Nodes: current=[132.228.248.17:50010, 132.228.248.35:50010], original=[132.228.248.17:50010, 132.228.248.35:50010])

The full log is at http://paste2.org/xfn16jm2

Any suggestion will be appreciated. Thanks.
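The policy named in that last exception is controlled by two client-side
properties in hdfs-site.xml. A minimal sketch with illustrative values
(relaxing the policy only hides failing datanodes in the write pipeline,
it does not fix them):

  <!-- hdfs-site.xml on the client (e.g. the regionserver) -->
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
  </property>
  <property>
    <!-- DEFAULT, ALWAYS or NEVER; NEVER stops the client from trying to
         replace a bad datanode in the pipeline -->
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>NEVER</value>
  </property>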

Re: write to most datanode fail quickly

2014-10-14 Thread Ted Yu
Which Hadoop release are you using ?

Have you run fsck ?
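For reference, a quick health check from a NameNode host looks roughly like
this (the last path is only an example):

  # overall filesystem summary
  hdfs fsck /

  # list files that currently have corrupt blocks, if any
  hdfs fsck / -list-corruptfileblocks

  # detail for one suspect file
  hdfs fsck /hbase/data/some/file -files -blocks -locations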

Cheers

On Oct 14, 2014, at 2:31 AM, sunww spe...@outlook.com wrote:



RE: write to most datanode fail quickly

2014-10-14 Thread sunww

I'm using Hadoop 2.0.0 and have not run fsck. Only one regionserver has
these DFS logs, which is strange.

Thanks
CC: user@hadoop.apache.org
From: yuzhih...@gmail.com
Subject: Re: write   to most datanode fail quickly
Date: Tue, 14 Oct 2014 02:43:26 -0700
To: user@hadoop.apache.org

Which Hadoop release are you using ?
Have you run fsck ?
Cheers

Re: write to most datanode fail quickly

2014-10-14 Thread Ted Yu
Can you check NameNode log for 132.228.48.20 ?

Have you turned on short circuit read ?
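For context, short-circuit reads in Hadoop 2 are usually enabled with
something like the following in hdfs-site.xml on the datanodes and clients;
the socket path below is just a common example:

  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>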

Cheers

On Oct 14, 2014, at 3:00 AM, sunww spe...@outlook.com wrote:

 


RE: write to most datanode fail quickly

2014-10-14 Thread sunww
Hi
dfs.client.read.shortcircuit is true.
This is the namenode log at that moment: http://paste2.org/U0zDA9ms
It seems there is nothing special in the namenode log.

Thanks
CC: user@hadoop.apache.org
From: yuzhih...@gmail.com
Subject: Re: write   to most datanode fail quickly
Date: Tue, 14 Oct 2014 03:09:24 -0700
To: user@hadoop.apache.org

Can you check NameNode log for 132.228.48.20 ?
Have you turned on short circuit read ?
Cheers
On Oct 14, 2014, at 3:00 AM, sunww spe...@outlook.com wrote:






The filesystem under path '/' has n CORRUPT files

2014-10-14 Thread Margusja

Hi

I am playing with hadoop-2 filesystem. I have two namenodes with HA and 
six datanodes.

I tried different configurations and killed namenodes and so on...
Now I have a situation where most of my data is there, but some corrupted
blocks exist.
hdfs fsck / gives me loads of under-replicated blocks. Will they recover?
My replication factor is 3.

Filesystem status: HEALTHY
Via the Web UI I see a "many missing blocks" message.

hdfs fsck / -list-corruptfileblocks gives me many corrupted blocks.
For example: blk_1073745897  /user/hue/cdr/2014/12/10/table10.csv
[hdfs@bigdata1 dfs]$ hdfs fsck /user/hue/cdr/2014/12/10/table10.csv -files -locations -blocks

Connecting to namenode via http://namenode1:50070
FSCK started by hdfs (auth:SIMPLE) from /192.168.81.108 for path 
/user/hue/cdr/2014/12/10/table10.csv at Tue Oct 14 16:51:55 EEST 2014

Path '/user/hue/cdr/2014/12/10/table10.csv' does not exist

As I understand it, there is nothing to do there.
I tried to delete it:
[hdfs@bigdata1 dfs]$ hdfs dfs -rm /user/hue/cdr/2014/12/10/table10.csv
rm: `/user/hue/cdr/2014/12/10/table10.csv': No such file or directory

So what should I do?

--
Best regards, Margus (Margusja) Roo
+372 51 48 780
http://margus.roo.ee
http://ee.linkedin.com/in/margusroo
skype: margusja
ldapsearch -x -h ldap.sk.ee -b c=EE (serialNumber=37303140314)
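The commands that usually apply in this situation look roughly like the
sketch below; the path is an example, and -delete is only appropriate when
the data is recoverable from somewhere else or expendable:

  # re-check which files currently have corrupt or missing blocks
  hdfs fsck / -list-corruptfileblocks

  # under-replicated blocks are re-replicated automatically over time;
  # re-asserting the replication factor on a path can nudge this along
  hdfs dfs -setrep -w 3 /user/hue/cdr

  # for files whose blocks are truly unrecoverable
  hdfs fsck / -move     # move affected files to /lost+found
  hdfs fsck / -delete   # or delete them outright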



C++ development framework under Hadoop

2014-10-14 Thread Y. Z.

Hi Experts,

I'm going to do some computation-intensive operations under the Hadoop
framework. I'm wondering which is the best way to code in C++ under the
Hadoop framework? I'm aware of three options: Hadoop Streaming, Hadoop
Pipes, and Hadoop C++ Extension. I heard that Hadoop Pipes has been, or will
be, deprecated in Hadoop 2.*. I'm also not sure whether Hadoop C++ Extension
is still well maintained. Meanwhile, Hadoop Streaming has high I/O overhead.


What are your opinions? Thanks!

--
Sincerely,
Y. Z.



Trying MapReduce with MRUnit and MultipleOutput.

2014-10-14 Thread gortiz
I'm trying to test some MapReduce jobs with MRUnit 1.1.0, but I don't get
any results.


  The code that I execute is:

mapTextDriver.withInput(new LongWritable(1), new Text(content));
List<Pair<NullWritable, Text>> outputs = mapTextDriver.run();

But I never get any output; the list always has size 0.

I was reading the JIRA https://issues.apache.org/jira/browse/MRUNIT-13,
where they added this feature to MRUnit, but I don't know what I'm
missing.


I included the annotations:

@RunWith(..)
@PrepareForTest(..)

I also tried executing the test like this:
   mapTextDriver.runTest();
and it works. But I would rather get a list of results and analyze them.
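For comparison, the basic MRUnit 1.1.0 map-driver pattern (without
MultipleOutputs) is sketched below; MyMapper and the key/value types are
placeholders. As far as I understand MRUNIT-13, output written through
MultipleOutputs is verified separately and does not come back from run(),
which might explain the empty list if the mapper only writes through
MultipleOutputs.

  import java.util.List;

  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.NullWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mrunit.mapreduce.MapDriver;
  import org.apache.hadoop.mrunit.types.Pair;
  import org.junit.Test;

  public class MyMapperTest {

      @Test
      public void mapperEmitsRecords() throws Exception {
          // MyMapper is a placeholder for the mapper under test
          MapDriver<LongWritable, Text, NullWritable, Text> driver =
                  MapDriver.newMapDriver(new MyMapper());

          driver.withInput(new LongWritable(1), new Text("some input line"));

          // run() returns what the mapper wrote via context.write(...)
          List<Pair<NullWritable, Text>> outputs = driver.run();
          // assert on outputs here, e.g. outputs.size(), outputs.get(0)
      }
  }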




Re: write to most datanode fail quickly

2014-10-14 Thread Ted Yu
132.228.48.20 didn't show up in the snippet (it spans only 3 minutes) you
posted.

I don't see an error or exception either.

Perhaps search in a wider scope.

On Tue, Oct 14, 2014 at 5:36 AM, sunww spe...@outlook.com wrote:





Building 1.2.1

2014-10-14 Thread buddhika chamith
Hi All,

Apologies if this has been addressed somewhere already, but I couldn't find
relevant information on building from source at [1]. I have downloaded
1.2.1 from [2]. Any pointers are appreciated. (I am on OS X, but I could
switch to Linux if required, which I expect to be the case.)

Regards
Bud

[1] http://hadoop.apache.org/docs/r1.2.1/
[2] http://mirrors.sonic.net/apache/hadoop/common/hadoop-1.2.1/


Re: Building 1.2.1

2014-10-14 Thread Ray Chiang
Try the instructions for branch-1 at

https://wiki.apache.org/hadoop/QwertyManiac/BuildingHadoopTrunk

-Ray
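From memory, branch-1 (and the 1.2.1 release tarball) uses an Ant build
rather than Maven, so it is roughly the following; the exact targets are
listed on the wiki page above, which should be treated as authoritative:

  # assumes JDK 6/7 and Apache Ant are installed
  cd hadoop-1.2.1
  ant clean jar      # compile and build the core jars
  ant binary         # assemble a binary distribution (skips the forrest docs)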


On Tue, Oct 14, 2014 at 11:08 AM, buddhika chamith chamibuddh...@gmail.com
wrote:




Unmanaged AM in secure cluster

2014-10-14 Thread Sevada Abraamyan
Hi,

I am trying to find out if the YARN unmanaged AM can be used in a secure
(Kerberized) cluster. I stumbled upon a ticket from 2012 suggesting that
there might not yet be a way for tokens to be passed to an unmanaged AM:
https://issues.apache.org/jira/browse/YARN-937. However, I am aware that
Llama is using the unmanaged AM and they've been able to use it in a secure
cluster. Maybe this ticket is no longer in sync with the current state of
the project?

Can someone with deeper understanding of Yarn internals please clear up
this issue for me?

Thanks!


RE: write to most datanode fail quickly

2014-10-14 Thread sunww
Hi
The correct IP is 132.228.248.20. I checked the HDFS log on the dead
regionserver; it has some error messages, maybe they're useful:
http://paste2.org/NwpcaGVv
Thanks

Date: Tue, 14 Oct 2014 10:28:31 -0700
Subject: Re: write to most datanode fail quickly
From: yuzhih...@gmail.com
To: user@hadoop.apache.org

132.228.48.20 didn't show up in the snippet (spanning 3 minutes only) you 
posted.

I don't see error or exception either.
Perhaps search in wider scope.
On Tue, Oct 14, 2014 at 5:36 AM, sunww spe...@outlook.com wrote:




Re: C++ development framework under Hadoop

2014-10-14 Thread Azuryy Yu
Hadoop Streaming is the best option for you. It doesn't have high I/O
overhead unless you add heavy I/O in your C++ code yourself.

Hadoop Streaming uses the built-in MapReduce; it just redirects the
input/output streams to and from your C++ application.
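To make that concrete, a streaming job driving a native C++ mapper and
reducer looks roughly like this; the jar path varies by version and
distribution, and my_mapper/my_reducer are placeholder binaries compiled
for the cluster nodes:

  hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
      -input  /data/in \
      -output /data/out \
      -mapper  ./my_mapper \
      -reducer ./my_reducer \
      -file my_mapper \
      -file my_reducer    # ship the binaries to the task nodes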


On Tue, Oct 14, 2014 at 10:33 PM, Y. Z. zhaoyansw...@gmail.com wrote:





Re: C++ development framework under Hadoop

2014-10-14 Thread Y Z

Thanks, Azuryy!

I found some examples about Pipes. Is Hadoop Pipes still supported in
Hadoop 2.2?


Sincerely,
Yongan

On 10/14/2014 11:20 PM, Azuryy Yu wrote:






Re: C++ development framework under Hadoop

2014-10-14 Thread Azuryy Yu
Yes, Hadoop Pipes is still supported in v2.
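For reference, a Pipes job is typically submitted like this (the classic
wordcount example from the docs; the binary and paths are placeholders):

  # push the compiled C++ binary to HDFS, then run it through Pipes
  hadoop fs -put bin/wordcount /examples/bin/wordcount

  hadoop pipes \
      -D hadoop.pipes.java.recordreader=true \
      -D hadoop.pipes.java.recordwriter=true \
      -input  /examples/input \
      -output /examples/output \
      -program /examples/bin/wordcount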

On Wed, Oct 15, 2014 at 11:33 AM, Y Z zhaoyansw...@gmail.com wrote:







Re: C++ development framework under Hadoop

2014-10-14 Thread Y Z

Thanks!:)

Sincerely,
Yongan

On 10/14/2014 11:38 PM, Azuryy Yu wrote:

yes. it always supports hadoop pipe in v2.
