Service Level Authorization

2014-02-20 Thread Juan Carlos
Where can I find some information about ACLs? I could only find what is
available at
http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/ServiceLevelAuth.html,
which isn't very detailed.
Regards

Juan Carlos Fernández Rodríguez
Consultor Tecnológico

Telf: +34918105294
Móvil: +34639311788

CEDIANT
Centro para el Desarrollo, Investigación y Aplicación de Nuevas Tecnologías
HPC Business Solutions



Reg:Hive query with mapreduce

2014-02-20 Thread Ranjini Rathinam
Hi,

How can I implement Hive queries such as

select * from table comp;

select empId from comp where sal > 12000;

in MapReduce?

I need to use these queries in MapReduce code. How can I implement the above
queries in the code using MapReduce in Java?


Please provide the sample code.

Thanks in advance for the support

Regards

Ranjini


Re: Reg:Hive query with mapreduce

2014-02-20 Thread Nitin Pawar
try this

http://ysmart.cse.ohio-state.edu/online.html


On Thu, Feb 20, 2014 at 5:55 PM, Ranjini Rathinam ranjinibe...@gmail.com wrote:

 Hi,

 How to implement the Hive query such as

 select * from table comp;

 select empId from comp where sal > 12000;

 in mapreduce.

 Need to use this query in mapreduce code. How to implement the above query
 in the code using mapreduce , JAVA.


 Please provide the sample code.

 Thanks in advance for the support

 Regards

 Ranjini








-- 
Nitin Pawar


Re: Service Level Authorization

2014-02-20 Thread Alex Nastetsky
Juan,

What kind of information are you looking for? The service level ACLs are
for limiting which services can communicate under certain protocols, by
username or user group.

Perhaps you are looking for client level ACL, something like the MapReduce
ACLs?
https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Job+Authorization
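
For reference, service-level authorization is switched on in core-site.xml and
the ACLs themselves live in hadoop-policy.xml. A minimal sketch (the users
alice, bob and the group datausers are just placeholders, not anything from
your cluster):

<!-- core-site.xml -->
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>

<!-- hadoop-policy.xml: only users alice,bob and members of group datausers
     may talk to the NameNode over the client protocol -->
<property>
  <name>security.client.protocol.acl</name>
  <value>alice,bob datausers</value>
</property>

After editing hadoop-policy.xml you should be able to reload it without a
restart via hadoop dfsadmin -refreshServiceAcl (and yarn rmadmin
-refreshServiceAcl for the ResourceManager).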

Alex.


2014-02-20 4:58 GMT-05:00 Juan Carlos jcfernan...@cediant.es:

 Where could I find some information about ACL? I only could find the
 available in
 http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/ServiceLevelAuth.html,
  which isn't so detailed.
 Regards





Re: Service Level Authorization

2014-02-20 Thread Juan Carlos
Yes, that is what I'm looking for, but I couldn't find this information for
Hadoop 2.2.0. I saw that mapreduce.cluster.acls.enabled is now the parameter
to use, but I don't know how to set my ACLs.
I'm using the capacity scheduler and I've created 3 new queues: test (which is
under root, at the same level as default) and test1 and test2, which are
under test. As I said, I enabled mapreduce.cluster.acls.enabled in
mapred-site.xml and later added the parameter
yarn.scheduler.capacity.root.test1.acl_submit_applications with value
jcfernandez. If I submit a job to queue test1 with user hadoop, it is
still allowed to run.
What is my error?


2014-02-20 16:41 GMT+01:00 Alex Nastetsky anastet...@spryinc.com:

 Juan,

 What kind of information are you looking for? The service level ACLs are
 for limiting which services can communicate under certain protocols, by
 username or user group.

 Perhaps you are looking for client level ACL, something like the MapReduce
 ACLs?
 https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Job+Authorization

 Alex.


 2014-02-20 4:58 GMT-05:00 Juan Carlos jcfernan...@cediant.es:

 Where could I find some information about ACL? I only could find the
 available in
 http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/ServiceLevelAuth.html,
  which isn't so detailed.
 Regards






Re: Service Level Authorization

2014-02-20 Thread Alex Nastetsky
If your test1 queue is under the test queue, then you have to specify the full
path in the property name:

yarn.scheduler.capacity.root.test.test1.acl_submit_applications (you are
missing the .test. level)

Also, if your hadoop user is a member of the user group hadoop, note that
group is the default value of mapreduce.cluster.administrators in
mapred-site.xml. Users of that group can submit jobs to and administer all
queues.
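
Concretely, a sketch of the relevant capacity-scheduler.xml entries for your
queue layout (the acl_administer_queue entry is my own addition; adjust the
user list as needed):

<property>
  <name>yarn.scheduler.capacity.root.test.test1.acl_submit_applications</name>
  <value>jcfernandez</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.test.test1.acl_administer_queue</name>
  <value>jcfernandez</value>
</property>

After changing the file, refresh the queues with yarn rmadmin -refreshQueues.
Also keep in mind that a user allowed on a parent queue is allowed on its
children (root's submit ACL defaults to *), so you may need to restrict root
and root.test as well.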


On Thu, Feb 20, 2014 at 11:28 AM, Juan Carlos juc...@gmail.com wrote:

 Yes, that is what I'm looking for, but I couldn't find this information
 for hadoop 2.2.0. I saw mapreduce.cluster.acls.enabled it's now the
 parameter to use. But I don't know how to set my ACLs.
 I'm using capacity schedurler and I've created 3 new queues test (which is
 under root at the same level as default) and test1 and test2, which are
 under test. As I said, I enabled mapreduce.cluster.acls.enabled in
 mapred-site.xml and later added the parameter
 yarn.scheduler.capacity.root.test1.acl_submit_applications with value
 jcfernandez . If I submit a job to queue test1 with user hadoop, it
 allows it to run it.
 Which is my error?


 2014-02-20 16:41 GMT+01:00 Alex Nastetsky anastet...@spryinc.com:

 Juan,

 What kind of information are you looking for? The service level ACLs are
 for limiting which services can communicate under certain protocols, by
 username or user group.

 Perhaps you are looking for client level ACL, something like the
 MapReduce ACLs?
 https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Job+Authorization

 Alex.


 2014-02-20 4:58 GMT-05:00 Juan Carlos jcfernan...@cediant.es:

 Where could I find some information about ACL? I only could find the
 available in
 http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/ServiceLevelAuth.html,
  which isn't so detailed.
 Regards







har file globbing problem

2014-02-20 Thread Dan Buchan
We have a dataset of ~8 million files of about 0.5 to 2 MB each, and we're
having trouble getting them analysed after building a har file.

The files are already in a pre-existing directory structure, with two
nested sets of dirs and 20-100 PDFs at the bottom of each leaf of the dir
tree.

user-hadoop-/all_the_files/*/*/*.pdf

It was trivial to move these to hdfs and to build a har archive; I used the
following command to make the archive

bin/hadoop archive -archiveName test.har -p /user/hadoop/
all_the_files/*/*/ /user/hadoop/

Listing the contents of the har (bin/hadoop fs -lsr
har:///user/hadoop/epc_test.har) shows everything as I'd expect.

When we come to run the hadoop job with this command, trying to wildcard
the archive:

bin/hadoop jar My.jar har:///user/hadoop/test.har/all_the_files/*/*/ output

it fails with the following exception

Exception in thread "main" java.lang.IllegalArgumentException: Can not
create a Path from an empty string

Running the job with the non-archived files is fine i.e:

bin/hadoop jar My.jar all_the_files/*/*/ output

However this only works for our modest test set of files. Any substantial
number of files quickly makes the namenode run out of memory.

Can you use file globs with the har archives? Is there a different way to
build the archive to just include the files which I've missed?
I appreciate that a sequence file might be a better fit for this task but
I'd like to know the solution to this issue if there is one.

-- 
 

t.  020 7739 3277
a. 131 Shoreditch High Street, London E1 6JE


datanode is slow

2014-02-20 Thread lei liu
I use HBase 0.94 and CDH4. There are 25729 TCP connections on one
machine, for example:
hadoop@apayhbs081 ~ $ netstat -a | wc -l
25729

The Linux configuration (limits) is:
   soft  core      0
   hard  rss       1
   hard  nproc     20
   soft  nproc     20
   hard  nproc     50
   hard  nproc     0
         maxlogins 4
         nproc     20480
         nofile    204800


When there are 25729 TCP connections on one machine, the datanode is very
slow.
How can I resolve this?


Re: Reg:Hive query with mapreduce

2014-02-20 Thread Shekhar Sharma
Assuming you are using TextInputFormat and your data set is comma-separated,
where the second column is empId and the third column is salary, your map
function would look like this:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class FooMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    @Override
    public void map(LongWritable offset, Text empRecord, Context context)
            throws IOException, InterruptedException {
        String[] splits = empRecord.toString().split(",");
        double salary = Double.parseDouble(splits[2]);
        if (salary > 12000) {
            // emit only empId (select empId from comp where sal > 12000)
            context.write(new Text(splits[1]), NullWritable.get());
        }
    }
}


Set the number of reduce tasks to zero.

The number of output files will be equal to the number of map tasks in this
case. If you want a single output file:

(1) Set mapred.min.split.size equal to the file size or some bigger value
like Long.MAX_VALUE. It will spawn only one map task and you will get
one output file.
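
For completeness, a minimal driver for a map-only job like this might look
like the sketch below (class name and path handling are placeholders, not
part of the original suggestion):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class FooDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "emp-filter");
        job.setJarByClass(FooDriver.class);
        job.setMapperClass(FooMapper.class);
        job.setNumReduceTasks(0);   // map-only: mappers write the output directly
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}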




Regards,
Som Shekhar Sharma
+91-8197243810


On Thu, Feb 20, 2014 at 5:55 PM, Ranjini Rathinam ranjinibe...@gmail.com wrote:

 Hi,

 How to implement the Hive query such as

 select * from table comp;

 select empId from comp where sal > 12000;

 in mapreduce.

 Need to use this query in mapreduce code. How to implement the above query
 in the code using mapreduce , JAVA.


 Please provide the sample code.

 Thanks in advance for the support

 Regards

 Ranjini







Re: datanode is slow

2014-02-20 Thread Haohui Mai
It looks like your datanode is overloaded. You can scale your system by
adding more datanodes.

You can also try tightening the admission control to recover. You can lower
dfs.datanode.max.transfer.threads so that the datanode accepts fewer
concurrent requests (which also means that it serves fewer clients).
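
For example, in hdfs-site.xml on the datanodes (the value below is only
illustrative; pick something below your current setting):

<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>2048</value>
</property>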

~Haohui



On Thu, Feb 20, 2014 at 8:44 AM, lei liu liulei...@gmail.com wrote:

 I use Hbase0.94 and CDH4. There are 25729 tcp connections in one
 machine,example:
 hadoop@apayhbs081 ~ $ netstat -a | wc -l
 25729

 The linux configration is :
softcore0
hardrss 1
hardnproc   20
softnproc   20
hardnproc   50
hardnproc   0
maxlogins   4
nproc  20480
 nofile 204800


 When there are 25729 tcp connections in one machine, the datanode is very
 slow.
 How can I resolve the question?







any optimize suggestion for high concurrent write into hdfs?

2014-02-20 Thread ch huang
hi, maillist:
  Is there any optimization for a large number of concurrent writes into HDFS
at the same time? Thanks.


history server for 2 clusters

2014-02-20 Thread Anfernee Xu
Hi,

I'm on the 2.2.0 release and I have an HDFS cluster which is shared by 2
YARN (MR) clusters, and a single shared history server. I can see the job
summary for all jobs from the history server UI, and I can also see task
logs for jobs running in one cluster, but if I want to see logs for jobs
running in the other cluster, it shows me the error below:

Logs not available for attempt_1392933787561_0024_m_00_0. Aggregation
may not be complete, Check back later or try the nodemanager at
slc03jvt.mydomain.com:31303

Here's my configuration:

Note: my history server is running on RM node of the MR cluster where I can
see the log.


mapred-site.xml:

<property>
  <name>mapreduce.jobhistory.address</name>
  <value>slc00dgd:10020</value>
  <description>MapReduce JobHistory Server IPC host:port</description>
</property>

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>slc00dgd:19888</value>
  <description>MapReduce JobHistory Server Web UI host:port</description>
</property>

yarn-site.xml:

<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>

<property>
  <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
  <value>dc</value>
</property>

The above configuration is almost the same for both clusters; the only
difference is yarn.nodemanager.remote-app-log-dir-suffix, which has a
different value on each cluster.



-- 
--Anfernee


issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread ch huang
hi, maillist:
  I see the following info in my HDFS log, and the block belongs to
a file written by Scribe. I do not know why.
Is there any limit in the HDFS system?

2014-02-21 10:33:30,235 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: opReadBlock
BP-1043055049-192.168.11.11-1382442676609:blk_-8536558734938003208_3823240
received exc
eption java.io.IOException: Replica gen stamp < block genstamp,
block=BP-1043055049-192.168.11.11-1382442676609:blk_-8536558734938003208_3823240,
replica=ReplicaWaitingToBeRecov
ered, blk_-8536558734938003208_3820986, RWR
  getNumBytes() = 35840
  getBytesOnDisk()  = 35840
  getVisibleLength()= -1
  getVolume()   = /data/4/dn/current
  getBlockFile()=
/data/4/dn/current/BP-1043055049-192.168.11.11-1382442676609/current/rbw/blk_-8536558734938003208
  unlinked=false
2014-02-21 10:33:30,235 WARN
org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(192.168.11.12,
storageID=DS-754202132-192.168.11.12-50010-1382443087835, infoP
ort=50075, ipcPort=50020,
storageInfo=lv=-40;cid=CID-0e777b8c-19f3-44a1-8af1-916877f2506c;nsid=2086828354;c=0):Got
exception while serving BP-1043055049-192.168.11.11-1382442676
609:blk_-8536558734938003208_3823240 to /192.168.11.15:56564
java.io.IOException: Replica gen stamp < block genstamp,
block=BP-1043055049-192.168.11.11-1382442676609:blk_-8536558734938003208_3823240,
replica=ReplicaWaitingToBeRecovered, b
lk_-8536558734938003208_3820986, RWR
  getNumBytes() = 35840
  getBytesOnDisk()  = 35840
  getVisibleLength()= -1
  getVolume()   = /data/4/dn/current
  getBlockFile()=
/data/4/dn/current/BP-1043055049-192.168.11.11-1382442676609/current/rbw/blk_-8536558734938003208
  unlinked=false
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:205)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:326)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
at java.lang.Thread.run(Thread.java:744)
2014-02-21 10:33:30,236 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver
error processing READ_BLOCK operation  src: /192.168.11.15:56564 dest: /
192.168.11.12:50010
java.io.IOException: Replica gen stamp < block genstamp,
block=BP-1043055049-192.168.11.11-1382442676609:blk_-8536558734938003208_3823240,
replica=ReplicaWaitingToBeRecovered, blk_-8536558734938003208_3820986, RWR
  getNumBytes() = 35840
  getBytesOnDisk()  = 35840
  getVisibleLength()= -1
  getVolume()   = /data/4/dn/current
  getBlockFile()=
/data/4/dn/current/BP-1043055049-192.168.11.11-1382442676609/current/rbw/blk_-8536558734938003208
  unlinked=false
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:205)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:326)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:92)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:64)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
at java.lang.Thread.run(Thread.java:744)


No job shown in Hadoop resource manager web UI when running jobs in the cluster

2014-02-20 Thread Chen, Richard
Dear group,

I compiled hadoop 2.2.0 for x64 and am running it on a cluster. When I do hadoop job
-list or hadoop job -list all, it throws an NPE like this:
14/01/28 17:18:39 INFO Configuration.deprecation: session.id is deprecated. 
Instead, use dfs.metrics.session-id
14/01/28 17:18:39 INFO jvm.JvmMetrics: Initializing JVM Metrics with 
processName=JobTracker, sessionId=
Exception in thread "main" java.lang.NullPointerException
at org.apache.hadoop.mapreduce.tools.CLI.listJobs(CLI.java:504)
at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:312)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1237)
Also, the Hadoop web apps such as JobHistory (I turned on the JobHistory server)
show no job running and no job finishing, although I was running jobs.
Please help me to solve this problem.
Thanks!!

Richard Chen



Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread Ted Yu
Which hadoop release are you using ?

Cheers


On Thu, Feb 20, 2014 at 8:57 PM, ch huang justlo...@gmail.com wrote:

 hi,maillist:
   i see the following info in my hdfs log ,and the block belong to
 the file which write by scribe ,i do not know why
 is there any limit in hdfs system ?




Re: any optimize suggestion for high concurrent write into hdfs?

2014-02-20 Thread Chen Wang
Ch,
you may consider using Flume, as it already has a sink that can write to
HDFS. What I did is set up a Flume agent listening on an Avro source, and
then sink to HDFS. Then in my application, I just send my data to the Avro
socket.
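
A minimal agent configuration along those lines might look like the sketch
below (agent, channel, port and path names are just placeholders):

# flume.conf: Avro source -> memory channel -> HDFS sink
agent1.sources  = avroSrc
agent1.channels = memCh
agent1.sinks    = hdfsSink

agent1.sources.avroSrc.type = avro
agent1.sources.avroSrc.bind = 0.0.0.0
agent1.sources.avroSrc.port = 41414
agent1.sources.avroSrc.channels = memCh

agent1.channels.memCh.type = memory
agent1.channels.memCh.capacity = 10000

agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode:8020/flume/events
agent1.sinks.hdfsSink.hdfs.fileType = DataStream
agent1.sinks.hdfsSink.channel = memCh
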
Chen


On Thu, Feb 20, 2014 at 5:07 PM, ch huang justlo...@gmail.com wrote:

 hi,maillist:
   is there any optimize for large of write into hdfs in same time
 ? thanks



Re: any optimize suggestion for high concurrent write into hdfs?

2014-02-20 Thread Suresh Srinivas
Another alternative is to write block-sized chunks into multiple HDFS files
concurrently, followed by a concat of all of those into a single file.
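
A rough sketch of that with the FileSystem API (paths are placeholders; the
concat call is implemented by HDFS, and it relies on the "block sized chunks"
requirement mentioned above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ConcatParts {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // Parts written concurrently by separate writers (placeholder paths).
        Path target = new Path("/data/out/part-0");
        Path[] rest = {
            new Path("/data/out/part-1"),
            new Path("/data/out/part-2")
        };

        // HDFS concat moves the source files' blocks onto the end of the
        // target file; the source files disappear afterwards.
        fs.concat(target, rest);
    }
}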

Sent from phone

 On Feb 20, 2014, at 8:15 PM, Chen Wang chen.apache.s...@gmail.com wrote:
 
 Ch,
 you may consider using flume as it already has a flume sink that can sink to 
 hdfs. What I did is to set up a flume listening on an Avro sink, and then 
 sink to hdfs. Then in my application, i just send my data to avro socket.
 Chen
 
 
 On Thu, Feb 20, 2014 at 5:07 PM, ch huang justlo...@gmail.com wrote:
 hi,maillist:
   is there any optimize for large of write into hdfs in same time ? 
 thanks
 



Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread Anurag Tangri
Did you check your Unix open file limit and datanode xceiver value?

Are they too low for the number of blocks / amount of data in your cluster?
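
For reference, the xceiver limit is set in hdfs-site.xml on the datanodes; a
sketch (the value is only illustrative):

<property>
  <!-- dfs.datanode.max.transfer.threads is the newer name for this setting -->
  <name>dfs.datanode.max.xcievers</name>
  <value>8192</value>
</property>

The open file limit for the user running the datanode can be checked with
ulimit -n.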

Thanks,
Anurag Tangri

 On Feb 20, 2014, at 6:57 PM, ch huang justlo...@gmail.com wrote:
 
 hi,maillist:
   i see the following info in my hdfs log ,and the block belong to 
 the file which write by scribe ,i do not know why
 is there any limit in hdfs system ?
  


Re: history server for 2 clusters

2014-02-20 Thread Vinod Kumar Vavilapalli
Interesting use case and setup. We never had this use case in mind so far; we
have so far assumed one history server per YARN cluster. You may be running
into issues where this assumption does not hold.

Why do you need two separate YARN clusters for the same underlying data on
HDFS? And if that can't change, why can't you have two history servers?

+Vinod

On Feb 20, 2014, at 6:08 PM, Anfernee Xu anfernee...@gmail.com wrote:

 Hi,
 
 I'm at 2.2.0 release and I have a HDFS cluster which is shared by 2 YARN(MR) 
 cluster, also I have a single shared history server, what I'm seeing is I can 
 see all job summary for all jobs from history server UI, I also can see task 
 log for jobs running in one cluster, but if I want to see log for jobs 
 running in another cluster, it showed me below error
 
 Logs not available for attempt_1392933787561_0024_m_00_0. Aggregation may 
 not be complete, Check back later or try the nodemanager at 
 slc03jvt.mydomain.com:31303 
 
 
 
 
 -- 
 --Anfernee




Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread ch huang
Hi, I use CDH 4.4.

On Fri, Feb 21, 2014 at 12:04 PM, Ted Yu yuzhih...@gmail.com wrote:

 Which hadoop release are you using ?

 Cheers


 On Thu, Feb 20, 2014 at 8:57 PM, ch huang justlo...@gmail.com wrote:

  hi,maillist:
   i see the following info in my hdfs log ,and the block belong
 to the file which write by scribe ,i do not know why
 is there any limit in hdfs system ?






Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread ch huang
I use the default value; it seems the value is 4096.

I also checked the hdfs user's limits, and they are large enough:

-bash-4.1$ ulimit -a
core file size  (blocks, -c) 0
data seg size   (kbytes, -d) unlimited
scheduling priority (-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 514914
max locked memory   (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files  (-n) 32768
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority  (-r) 0
stack size  (kbytes, -s) 10240
cpu time   (seconds, -t) unlimited
max user processes  (-u) 65536
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited


On Fri, Feb 21, 2014 at 12:25 PM, Anurag Tangri anurag_tan...@yahoo.com wrote:

  Did you check your unix open file limit and data node xceiver value ?

 Is it too low for the number of blocks/data in your cluster ?

 Thanks,
 Anurag Tangri

 On Feb 20, 2014, at 6:57 PM, ch huang justlo...@gmail.com wrote:

   hi,maillist:
   i see the following info in my hdfs log ,and the block belong to
 the file which write by scribe ,i do not know why
 is there any limit in hdfs system ?





Re: Capacity Scheduler capacity vs. maximum-capacity

2014-02-20 Thread Vinod Kumar Vavilapalli

Yes, it does take those extra resources away and give them back to queue B.
How quickly it takes them away depends on whether preemption is enabled or
not. If preemption is not enabled, it 'takes away' as and when containers
from queue A start finishing.
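
For reference, a sketch of the yarn-site.xml settings that turn on the
capacity scheduler's preemption monitor in recent 2.x releases (the policy
class below is the stock one; its tuning knobs are omitted here):

<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>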

+Vinod

On Feb 19, 2014, at 5:35 PM, Alex Nastetsky anastet...@spryinc.com wrote:

 Will the scheduler take away the 10% from queue B and give it back to queue A 
 even if queue B needs it? If not, it would seem that the scheduler is 
 reneging on its guarantee.




Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread ch huang
One more question: if I need to add the datanode xceiver value,
do I need to add it to my NN config file?



On Fri, Feb 21, 2014 at 12:25 PM, Anurag Tangri anurag_tan...@yahoo.com wrote:

  Did you check your unix open file limit and data node xceiver value ?

 Is it too low for the number of blocks/data in your cluster ?

 Thanks,
 Anurag Tangri

 On Feb 20, 2014, at 6:57 PM, ch huang justlo...@gmail.com wrote:

   hi,maillist:
   i see the following info in my hdfs log ,and the block belong to
 the file which write by scribe ,i do not know why
 is there any limit in hdfs system ?





Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread ch huang
I changed the config on all datanodes, adding dfs.datanode.max.xcievers with
value 131072, and restarted all DNs; still no use.

On Fri, Feb 21, 2014 at 12:25 PM, Anurag Tangri anurag_tan...@yahoo.com wrote:

  Did you check your unix open file limit and data node xceiver value ?

 Is it too low for the number of blocks/data in your cluster ?

 Thanks,
 Anurag Tangri

 On Feb 20, 2014, at 6:57 PM, ch huang justlo...@gmail.com wrote:

   hi,maillist:
   i see the following info in my hdfs log ,and the block belong to
 the file which write by scribe ,i do not know why
 is there any limit in hdfs system ?
