Service Level Authorization

2014-02-20 Thread Juan Carlos
Where could I find some information about ACLs? I could only find what is available at http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/ServiceLevelAuth.html, which isn't very detailed. Regards, Juan Carlos Fernández Rodríguez, Consultor Tecnológico. Telf: +34918105294 Móvil:

Reg:Hive query with mapreduce

2014-02-20 Thread Ranjini Rathinam
Hi, how do I implement Hive queries such as select * from comp; and select empId from comp where sal > 12000; in MapReduce? I need to use these queries in MapReduce code, in Java. Please provide sample code. Thanks in advance for

Re: Reg:Hive query with mapreduce

2014-02-20 Thread Nitin Pawar
Try this: http://ysmart.cse.ohio-state.edu/online.html On Thu, Feb 20, 2014 at 5:55 PM, Ranjini Rathinam ranjinibe...@gmail.com wrote: Hi, how do I implement Hive queries such as select * from comp; and select empId from comp where sal > 12000; in MapReduce? I need to use these queries in

Re: Service Level Authorization

2014-02-20 Thread Alex Nastetsky
Juan, what kind of information are you looking for? The service-level ACLs limit, by username or user group, which clients and services may communicate with each Hadoop service over a given protocol. Perhaps you are looking for client-level ACLs, something like the MapReduce ACLs?
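As a rough illustration of what that looks like in Hadoop 2.x (user and group names below are placeholders, not anything from this thread): service-level authorization is switched on in core-site.xml and the per-protocol ACLs live in hadoop-policy.xml, each value being a comma-separated user list, a space, then a comma-separated group list, with * meaning everyone.

    <!-- core-site.xml: enable service-level authorization -->
    <property>
      <name>hadoop.security.authorization</name>
      <value>true</value>
    </property>

    <!-- hadoop-policy.xml: e.g. restrict the HDFS client protocol
         to users alice and bob plus the hadoop group -->
    <property>
      <name>security.client.protocol.acl</name>
      <value>alice,bob hadoop</value>
    </property>

After editing hadoop-policy.xml, the ACLs can be reloaded without a restart via hdfs dfsadmin -refreshServiceAcl (and yarn rmadmin -refreshServiceAcl for the ResourceManager).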

Re: Service Level Authorization

2014-02-20 Thread Juan Carlos
Yes, that is what I'm looking for, but I couldn't find this information for hadoop 2.2.0. I saw that mapreduce.cluster.acls.enabled is now the parameter to use, but I don't know how to set my ACLs. I'm using the capacity scheduler and I've created 3 new queues: test (which is under root at the same level

Re: Service Level Authorization

2014-02-20 Thread Alex Nastetsky
If your test1 queue is under the test queue, then you have to specify the path in the same way: yarn.scheduler.capacity.root.test.test1.acl_submit_applications (you are missing the test level). Also, if your hadoop user is a member of the user group hadoop, that is the default value of the
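A minimal sketch of that property in capacity-scheduler.xml, with placeholder users and groups (the ACL value format is a comma-separated user list, a space, then a comma-separated group list; * allows everyone):

    <!-- capacity-scheduler.xml: ACLs for queue root.test.test1 -->
    <property>
      <name>yarn.scheduler.capacity.root.test.test1.acl_submit_applications</name>
      <value>alice,bob hadoop</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.test.test1.acl_administer_queue</name>
      <value>alice hadoop-admins</value>
    </property>

Queue ACLs are only enforced when yarn.acl.enable is true in yarn-site.xml, and they can be reloaded with yarn rmadmin -refreshQueues.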

har file globbing problem

2014-02-20 Thread Dan Buchan
We have a dataset of ~8 million files, about 0.5 to 2 MB each, and we're having trouble getting them analysed after building a har file. The files are already in a pre-existing directory structure, with two nested sets of dirs and 20-100 PDFs at the bottom of each leaf of the dir tree.
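For reference, the har workflow in question is roughly the following sketch (archive name, parent directory, and glob are placeholders): the archive is built once with a MapReduce job, and files inside it are then addressed through har URIs, to which globs apply.

    # build the archive from the existing directory tree (runs a MapReduce job)
    hadoop archive -archiveName pdfs.har -p /data/pdfs dir1 dir2 /archives

    # list files inside the archive; the glob is evaluated against in-archive paths
    hdfs dfs -ls 'har:///archives/pdfs.har/dir1/*'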

datanode is slow

2014-02-20 Thread lei liu
I use HBase 0.94 and CDH4. There are 25729 TCP connections on one machine, for example: hadoop@apayhbs081 ~ $ netstat -a | wc -l 25729 The Linux limits configuration is: soft core 0, hard rss 1, hard nproc 20, soft nproc 20

Re: Reg:Hive query with mapreduce

2014-02-20 Thread Shekhar Sharma
Assuming you are using TextInputFormat and your data set is comma-separated, where the second column is empId and the third column is salary, your map function would look like this: public class FooMapper extends Mapper<LongWritable, Text, Text, NullWritable> { public void map(LongWritable offset,
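A fuller sketch along those lines (not the original poster's code; the comma delimiter, the column positions, and the sal > 12000 comparison are assumptions, since the operator was lost in the quoted query):

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-only equivalent of "select empId from comp where sal > 12000":
    // project the empId column and keep only rows passing the salary filter.
    public class FooMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

        private final Text empId = new Text();

        @Override
        public void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] cols = line.toString().split(",");
            if (cols.length < 3) {
                return; // skip malformed rows
            }
            double sal = Double.parseDouble(cols[2].trim()); // third column: salary
            if (sal > 12000) {
                empId.set(cols[1].trim());                   // second column: empId
                context.write(empId, NullWritable.get());
            }
        }
    }

Run it as a map-only job (job.setNumReduceTasks(0)) and the output files contain exactly the projected, filtered rows.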

Re: datanode is slow

2014-02-20 Thread Haohui Mai
It looks like your datanode is overloaded. You can scale your system by adding more datanodes. You can also try tightening the admission control to recover: lower dfs.datanode.max.transfer.threads so that the datanode accepts fewer concurrent requests (but which also means that
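If you want to experiment with that knob, it is a datanode-side setting in hdfs-site.xml (the value below is only an example; 4096 is the usual default in 2.x, and the datanodes need a restart to pick it up):

    <!-- hdfs-site.xml on the datanodes: cap on concurrent block transfer threads -->
    <property>
      <name>dfs.datanode.max.transfer.threads</name>
      <value>2048</value>
    </property>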

any optimize suggestion for high concurrent write into hdfs?

2014-02-20 Thread ch huang
Hi mailing list: is there any optimization for a large number of concurrent writes into HDFS at the same time? Thanks

history server for 2 clusters

2014-02-20 Thread Anfernee Xu
Hi, I'm on the 2.2.0 release and I have an HDFS cluster which is shared by 2 YARN (MR) clusters, plus a single shared history server. I can see the job summaries for all jobs in the history server UI, and I can also see task logs for jobs running in one cluster, but if I want to see logs

issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread ch huang
Hi mailing list: I see the following info in my HDFS log, and the block belongs to a file written by Scribe. I don't know why; is there any limit in the HDFS system? 2014-02-21 10:33:30,235 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opReadBlock

No job shown in Hadoop resource manager web UI when running jobs in the cluster

2014-02-20 Thread Chen, Richard
Dear group, I compiled hadoop 2.2.0 x64 and am running it on a cluster. When I do hadoop job -list or hadoop job -list all, it throws an NPE like this: 14/01/28 17:18:39 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 14/01/28 17:18:39 INFO

Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread Ted Yu
Which hadoop release are you using? Cheers On Thu, Feb 20, 2014 at 8:57 PM, ch huang justlo...@gmail.com wrote: hi,maillist: i see the following info in my hdfs log ,and the block belong to the file which write by scribe ,i do not know why is there any limit in hdfs system ?

Re: any optimize suggestion for high concurrent write into hdfs?

2014-02-20 Thread Chen Wang
Ch, you may consider using Flume, as it already has a sink that can write to HDFS. What I did was to set up a Flume agent listening on an Avro source, with an HDFS sink, and then in my application I just send my data to the Avro socket. Chen On Thu, Feb 20, 2014 at 5:07 PM, ch huang justlo...@gmail.com
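A minimal sketch of that setup as a Flume 1.x agent configuration (agent name, port, and HDFS path are placeholders):

    # avro source -> memory channel -> hdfs sink
    agent1.sources  = avroSrc
    agent1.channels = memCh
    agent1.sinks    = hdfsSink

    agent1.sources.avroSrc.type = avro
    agent1.sources.avroSrc.bind = 0.0.0.0
    agent1.sources.avroSrc.port = 41414
    agent1.sources.avroSrc.channels = memCh

    agent1.channels.memCh.type = memory
    agent1.channels.memCh.capacity = 10000

    agent1.sinks.hdfsSink.type = hdfs
    agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode:8020/flume/events
    agent1.sinks.hdfsSink.hdfs.fileType = DataStream
    agent1.sinks.hdfsSink.channel = memCh

The application side can then push events with Flume's RPC client (org.apache.flume.api.RpcClientFactory) or any Avro client that speaks the same protocol.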

Re: any optimize suggestion for high concurrent write into hdfs?

2014-02-20 Thread Suresh Srinivas
Another alternative is to write block-sized chunks into multiple HDFS files concurrently, followed by a concat of all of them into a single file. Sent from phone On Feb 20, 2014, at 8:15 PM, Chen Wang chen.apache.s...@gmail.com wrote: Ch, you may consider using flume as it already has a flume
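A rough sketch of that approach, assuming the FileSystem is HDFS (which supports concat); paths, chunk count, and payload size are placeholders, and HDFS places restrictions on the block sizes and replication of the files being concatenated:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Write several chunks to separate HDFS files in parallel, then stitch
    // them together with FileSystem#concat so the end result is one file.
    public class ChunkedWriter {

        public static void main(String[] args) throws Exception {
            final Configuration conf = new Configuration();
            final FileSystem fs = FileSystem.get(conf);

            final int chunkCount = 4;
            final Path dir = new Path("/tmp/chunked");   // placeholder directory
            fs.mkdirs(dir);

            final Path[] chunks = new Path[chunkCount];
            Thread[] writers = new Thread[chunkCount];
            for (int i = 0; i < chunkCount; i++) {
                final Path chunk = new Path(dir, "part-" + i);
                chunks[i] = chunk;
                writers[i] = new Thread(new Runnable() {
                    public void run() {
                        try (FSDataOutputStream out = fs.create(chunk)) {
                            out.write(new byte[64 * 1024 * 1024]); // block-sized payload
                        } catch (Exception e) {
                            throw new RuntimeException(e);
                        }
                    }
                });
                writers[i].start();
            }
            for (Thread t : writers) {
                t.join();
            }

            // Append part-1..part-(n-1) onto part-0; the sources are removed afterwards.
            Path[] sources = new Path[chunkCount - 1];
            System.arraycopy(chunks, 1, sources, 0, sources.length);
            fs.concat(chunks[0], sources);
        }
    }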

Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread Anurag Tangri
Did you check your Unix open file limit and datanode xceiver value? Is it too low for the number of blocks/data in your cluster? Thanks, Anurag Tangri On Feb 20, 2014, at 6:57 PM, ch huang justlo...@gmail.com wrote: hi,maillist: i see the following info in my hdfs log ,and

Re: history server for 2 clusters

2014-02-20 Thread Vinod Kumar Vavilapalli
Interesting use-case and setup. We haven't had this use-case in mind so far; we have assumed a history server per YARN cluster, so you may be running into some issues where this assumption is not valid. Why do you need two separate YARN clusters for the same underlying data on HDFS? And if that

Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread ch huang
Hi, I use CDH 4.4. On Fri, Feb 21, 2014 at 12:04 PM, Ted Yu yuzhih...@gmail.com wrote: Which hadoop release are you using? Cheers On Thu, Feb 20, 2014 at 8:57 PM, ch huang justlo...@gmail.com wrote: hi,maillist: i see the following info in my hdfs log ,and the block belong to

Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread ch huang
I use the default value; it seems the value is 4096. I also checked the hdfs user's limits, and they're large enough: -bash-4.1$ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited

Re: Capacity Scheduler capacity vs. maximum-capacity

2014-02-20 Thread Vinod Kumar Vavilapalli
Yes, it does take those extra resources away, back to queue B. How quickly it takes them away depends on whether preemption is enabled or not. If preemption is not enabled, it 'takes away' as and when containers from queue A start finishing. +Vinod On Feb 19, 2014, at 5:35 PM, Alex Nastetsky
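For context, a sketch of how capacity-scheduler preemption is switched on in releases that ship the proportional preemption monitor (treat the snippet as illustrative rather than a drop-in config):

    <!-- yarn-site.xml: let the RM run a scheduling monitor that preempts containers
         from over-capacity queues instead of waiting for them to finish naturally -->
    <property>
      <name>yarn.resourcemanager.scheduler.monitor.enable</name>
      <value>true</value>
    </property>
    <property>
      <name>yarn.resourcemanager.scheduler.monitor.policies</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
    </property>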

Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread ch huang
One more question: if I need to change the datanode xceiver value, do I need to add it to my NN config file? On Fri, Feb 21, 2014 at 12:25 PM, Anurag Tangri anurag_tan...@yahoo.com wrote: Did you check your unix open file limit and data node xceiver value? Is it too low for the number of

Re: issue about write append into hdfs ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ch12:50010:DataXceiver error processing READ_BLOCK operation

2014-02-20 Thread ch huang
I changed the config on all datanodes to add dfs.datanode.max.xcievers with a value of 131072 and restarted all DNs; still no use. On Fri, Feb 21, 2014 at 12:25 PM, Anurag Tangri anurag_tan...@yahoo.com wrote: Did you check your unix open file limit and data node xceiver value? Is it too low for the number of