RE: Lots of warning messages and exception in namenode logs

2017-06-22 Thread omprakash
Hi Arpit, I will enable the settings as suggested and will post the results. I am just curious about setting the NameNode RPC service port. As I have checked in the hdfs-site.xml properties, dfs.namenode.rpc-address is already set, and it will also be the default value for the RPC service port. Does

Re: How to monitor YARN application memory per container?

2017-06-22 Thread Miklos Szegedi
Hello, MAPREDUCE-6829 was about showing the peak memory usage for mapreduce. Here are some of the new counters: [root@42e243b8cf16 hadoop]# bin/yarn jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-jar pi 1 1000 Number of Maps = 1 Samples per Map = 1000 ... Peak Map Physical

Re: Lots of warning messages and exception in namenode logs

2017-06-22 Thread Ravi Prakash
Hi Omprakash! How big are your disks? Just 20 GB? Just out of curiosity, are these SSDs? In addition to Arpit's reply, I'm also concerned about the number of under-replicated blocks you have: Under replicated blocks: 141863 When there are fewer replicas for a block than there are supposed to be

Re: How to monitor YARN application memory per container?

2017-06-22 Thread Jasson Chenwei
Hi, please take a look at Timeline Server 2, which supports aggregating NodeManager-side info into HBase. This info includes both node-level info (e.g., node memory usage, CPU usage) and container-level info (e.g., container memory usage and container CPU usage). I am currently trying to

Re: Different input format based on files names in driver code

2017-06-22 Thread vivek
Thanks! On Jun 22, 2017 20:15, "Erik Krogen" wrote: > You would need to write a custom InputFormat which would return an > appropriate RecordReader based on the file format involved in each > InputSplit. You can have InputFormat#getSplits load InputSplits for both > file

Re: Different input format based on files names in driver code

2017-06-22 Thread Erik Krogen
You would need to write a custom InputFormat which would return an appropriate RecordReader based on the file format involved in each InputSplit. You can have InputFormat#getSplits load InputSplits for both file types and have InputFormat#createRecordReader() delegate to the two different
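
The delegation idea described above can be sketched in plain Java. This is a minimal illustration only: it does not use the Hadoop API, and every name in it (SketchRecordReader, CsvSketchReader, TsvSketchReader, DelegatingInputFormatSketch, and the .csv/.tsv extensions) is a hypothetical stand-in chosen for the example. In a real job the same dispatch would presumably live inside a custom InputFormat's createRecordReader(), keyed on the path of the file behind each InputSplit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a RecordReader: turns the raw lines of one split into records.
interface SketchRecordReader {
    List<String> read(List<String> rawLines);
}

// Hypothetical reader for comma-separated files: keeps the first field of each line.
class CsvSketchReader implements SketchRecordReader {
    public List<String> read(List<String> rawLines) {
        List<String> records = new ArrayList<>();
        for (String line : rawLines) {
            records.add("csv:" + line.split(",")[0]);
        }
        return records;
    }
}

// Hypothetical reader for tab-separated files: keeps the first field of each line.
class TsvSketchReader implements SketchRecordReader {
    public List<String> read(List<String> rawLines) {
        List<String> records = new ArrayList<>();
        for (String line : rawLines) {
            records.add("tsv:" + line.split("\t")[0]);
        }
        return records;
    }
}

// Hypothetical stand-in for the custom InputFormat: one entry point that
// picks a reader per split based on the split's file name.
class DelegatingInputFormatSketch {
    static SketchRecordReader createRecordReader(String splitPath) {
        if (splitPath.endsWith(".csv")) {
            return new CsvSketchReader();
        }
        if (splitPath.endsWith(".tsv")) {
            return new TsvSketchReader();
        }
        throw new IllegalArgumentException("Unsupported file type: " + splitPath);
    }

    public static void main(String[] args) {
        // Both file types go through the same entry point; only the reader differs.
        System.out.println(createRecordReader("input/a.csv").read(Arrays.asList("x,y")));
        System.out.println(createRecordReader("input/b.tsv").read(Arrays.asList("p\tq")));
    }
}
```

The point of the design is that the driver configures a single input format, and the choice of parser is deferred until a split's file name is known.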

Re: How to monitor YARN application memory per container?

2017-06-22 Thread Shmuel Blitz
Hi, Thanks for your response. We are using CDH, and our version doesn't support the solutions above. Also, ATS is not relevant for us now. We have decided to turn on JMX for all our jobs (Spark/Hadoop MapReduce) and use jmap to collect the data and send it to Datadog. Shmuel On Thu, Jun

Re: Lots of warning messages and exception in namenode logs

2017-06-22 Thread Arpit Agarwal
Hi Omprakash, Your description suggests DataNodes cannot send timely reports to the NameNode. You can check it by looking for ‘stale’ DataNodes in the NN web UI when this situation is occurring. A few ideas: * Try increasing the NameNode RPC handler count a bit (set