How to see total pending containers ?

2014-06-27 Thread Ashwin Shankar
Hi,
Is there a way to see the total number of pending containers in a cluster, so
that we know how far behind we are with ETL?

There is a pending containers field on the scheduler page under the dr. who
table, but it is always zero.

-- 
Thanks,
Ashwin


Why resource requests are normalized in RM ?

2014-06-11 Thread Ashwin Shankar
Hi,
Does anyone know why resource requests from AMs are normalized to
be multiples of yarn.scheduler.minimum-allocation-mb, which is 1G
by default?
Also, is there any problem with reducing yarn.scheduler.minimum-allocation-mb
to less than 1G?

  /**
   * Utility method to normalize a list of resource requests, by insuring that
   * the memory for each request is a multiple of minMemory and is not zero.
   */
SchedulerUtils.normalizeRequests()
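
As a rough illustration of what "normalizing" means in the comment above, the sketch below rounds a requested memory size up to the next multiple of the minimum allocation. It is a simplified stand-in written for this thread, not the actual SchedulerUtils code; the function name normalize_memory_mb and the 8192 MB cap are assumptions made for the example.

```python
def normalize_memory_mb(requested_mb, min_mb=1024, max_mb=8192):
    """Round a memory request up to a multiple of min_mb, capped at max_mb.

    Hypothetical helper for illustration only; it is not the Hadoop
    implementation. min_mb plays the role of
    yarn.scheduler.minimum-allocation-mb; max_mb is an assumed cap.
    """
    if min_mb <= 0:
        raise ValueError("min_mb must be positive")
    # A request of zero (or less) still gets one minimum-sized allocation.
    requested_mb = max(requested_mb, min_mb)
    # Ceiling division: number of min_mb-sized units needed.
    units = -(-requested_mb // min_mb)
    return min(units * min_mb, max_mb)


if __name__ == "__main__":
    for ask in (0, 512, 1024, 1500, 9000):
        print(ask, "->", normalize_memory_mb(ask))
```

In this sketch a 1500 MB request becomes 2048 MB with a 1024 MB minimum, but only 1536 MB with a 256 MB minimum, which suggests why a smaller minimum could reduce the memory rounded away per container.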
-- 
Thanks,
Ashwin


assignMultiple and continuous scheduling in Fair scheduler

2014-04-30 Thread Ashwin Shankar
Hi,
I see these two knobs in the Fair scheduler: 'assignMultiple'
and 'continuous scheduling'.

1. Are there performance benefits to using them? What are the cons?
2. Also, is there any problem with 'continuous scheduling'? I'm asking
because it is not mentioned in the FS doc.

-- 
Thanks,
Ashwin


Re: Differences between HistoryServer and Yarn TimeLine server?

2014-04-25 Thread Ashwin Shankar
Thanks Zhijie!
I have a few more questions:
1. I played around with the timeline server UI today, which showed the
generic application history details, but I couldn't find any page for
application-specific data. Is the expectation that every application
needs to build its own UI using the exposed REST APIs and somehow install
it with the timeline server? Or am I missing something?
2. Are there REST APIs for accessing both generic and framework-specific
data in 2.4.0?
3. Is there an approximate timeframe for the timeline server to be feature
complete?
4. Tez doesn't have any job history UI. Is there any work being done to
integrate Tez with the timeline server? If not, is the timeline server
ready for such integration in case someone wants to pick this up?

Thanks,
Ashwin



On Thu, Apr 24, 2014 at 12:00 AM, Zhijie Shen zs...@hortonworks.com wrote:

 Ashwin,

 YARN-321 focuses on the issue in the scope of generic application history
 service, while YARN-1530 covers the framework specific data service. And
 yes, the timeline server is going to cover both.

 We haven't had such a Jira before, but it is described in YARN-321's design
 doc. Anyway, I have opened a Jira (MAPREDUCE-5858) to track this issue.


 On Wed, Apr 23, 2014 at 11:25 PM, Ashwin Shankar ashwinshanka...@gmail.com wrote:

 Hi Zhijie,
 There seem to be two umbrella Jiras for this, YARN-321 and YARN-1530. Can
 you please let me know what the difference is? Is the timeline server
 finally going to be YARN-321 + YARN-1530?

 You mentioned that MR is going to be integrated with the timeline server.
 Is there a Jira I can watch?

 Thanks,
 Ashwin


 On Wed, Apr 23, 2014 at 10:15 PM, Zhijie Shen zs...@hortonworks.com wrote:

 Sam,

 You're right. We can definitely integrate MapReduce to use the timeline
 server to store and serve its specific data, and this is actually our plan.

 However, it's a big move, and we still need time to get it done. In
 addition, not to disturb the users that are currently relying on JHS for MR
 job information, we cannot simply remove JHS from Hadoop.


 On Wed, Apr 23, 2014 at 8:15 PM, sam liu samliuhad...@gmail.com wrote:

 Zhijie,

 It is much clearer to me now. Thanks a lot!

 As I understand it, besides the previous Job History Server, Hadoop now
 has a new timeline server which can store both the generic YARN
 application history and the framework-specific information. However, I
 think the timeline server also includes the functions of the Job History
 Server, because it can store framework-specific information (including,
 of course, that of the MapReduce framework). In other words, the Job
 History Server is no longer necessary. If that's the case, why does Hadoop
 still include the Job History Server?


 2014-04-23 12:56 GMT+08:00 Zhijie Shen zs...@hortonworks.com:

 In Hadoop 2.4, we have delivered the timeline server at a preview
 stage, which can already serve some generic YARN application history as
 well as the framework-specific information. Due to the development
 logistics, we have created two concepts: History Server and Timeline
 Server. Put simply, you can think of the history server as the service for
 the generic YARN application information, and the timeline server as the
 service for the framework-specific information. Importantly, we have just
 one daemon, which includes both services, and which we'd like to call the
 timeline server (unfortunately, the confusing thing is that the command to
 start the daemon is historyserver). We are continuing to work on the
 timeline server to integrate these two parts, including refactoring the
 names.

 BTW, if you mean the MapReduce JobHistoryServer by HistoryServer, it's a
 different daemon, which serves the historic information of MapReduce jobs
 only.


 On Tue, Apr 22, 2014 at 8:44 PM, sam liu samliuhad...@gmail.com wrote:

 Hi Experts,

 I am confused about these two concepts. Could you help explain the
 differences?

 Thanks!




 --
 Zhijie Shen
 Hortonworks Inc.
 http://hortonworks.com/

 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or
 entity to which it is addressed and may contain information that is
 confidential, privileged and exempt from disclosure under applicable law.
 If the reader of this message is not the intended recipient, you are hereby
 notified that any printing, copying, dissemination, distribution,
 disclosure or forwarding of this communication is strictly prohibited. If
 you have received this communication in error, please contact the sender
 immediately and delete it from your system. Thank You.






Re: Differences between HistoryServer and Yarn TimeLine server?

2014-04-24 Thread Ashwin Shankar
Hi Zhijie,
There seem to be two umbrella Jiras for this, YARN-321 and YARN-1530. Can you
please let me know what the difference is? Is the timeline server finally
going to be YARN-321 + YARN-1530?

You mentioned that MR is going to be integrated with the timeline server. Is
there a Jira I can watch?

Thanks,
Ashwin






-- 
Thanks,
Ashwin


Re: Setting debug log level for individual daemons

2014-04-16 Thread Ashwin Shankar
Yes, thank you Stanley!

Ashwin


On Tue, Apr 15, 2014 at 8:01 PM, Stanley Shi s...@gopivotal.com wrote:

 Is this what you are looking for?

 http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/CommandsManual.html#daemonlog

 Regards,
 *Stanley Shi,*
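
The daemonlog command documented on that page changes a logger's level in a running daemon through its HTTP port, so no restart is needed. A minimal sketch of assembling the command follows; the helper name, the host name rm.example.com and the port 8088 are placeholders chosen for illustration, not values from this thread.

```python
def daemonlog_setlevel_cmd(host_port, logger, level):
    """Build the argv for `hadoop daemonlog -setlevel`.

    host_port is the daemon's HTTP address (host:port), logger is a class
    or package name, level is a log4j level such as DEBUG or INFO.
    Hypothetical helper for illustration.
    """
    return ["hadoop", "daemonlog", "-setlevel", host_port, logger, level]


if __name__ == "__main__":
    cmd = daemonlog_setlevel_cmd(
        "rm.example.com:8088",  # placeholder ResourceManager web address
        "org.apache.hadoop.yarn.server.resourcemanager",
        "DEBUG",
    )
    print(" ".join(cmd))
    # To actually run it on a machine with the hadoop CLI and a reachable
    # daemon, something like the following should work:
    # import subprocess; subprocess.run(cmd, check=True)
```

Because only the targeted daemon's logger is touched, the other Hadoop daemons keep their configured levels. The change is made in the running process, so it would presumably need to be reapplied after a restart.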



 On Wed, Apr 16, 2014 at 2:06 AM, Ashwin Shankar ashwinshanka...@gmail.com wrote:

 Thanks Gordon and Stanley, but this would require us to bounce the
 process.
 Is there a way to change log levels without bouncing the process?



 On Tue, Apr 15, 2014 at 3:23 AM, Gordon Wang gw...@gopivotal.com wrote:

 Put the following line in the log4j setting file.

 log4j.logger.org.apache.hadoop.yarn.server.resourcemanager=DEBUG,console


 On Tue, Apr 15, 2014 at 8:33 AM, Ashwin Shankar ashwinshanka...@gmail.com wrote:

 Hi,
 How do we set the log level to debug for, let's say, only the Resource
 Manager and not the other Hadoop daemons?

 --
 Thanks,
 Ashwin





 --
 Regards
 Gordon Wang




 --
 Thanks,
 Ashwin






-- 
Thanks,
Ashwin


Re: Setting debug log level for individual daemons

2014-04-15 Thread Ashwin Shankar
Thanks Gordon and Stanley, but this would require us to bounce the process.
Is there a way to change log levels without bouncing the process?







-- 
Thanks,
Ashwin


Setting debug log level for individual daemons

2014-04-14 Thread Ashwin Shankar
Hi,
How do we set the log level to debug for, let's say, only the Resource
Manager and not the other Hadoop daemons?

-- 
Thanks,
Ashwin


Resetting dead datanodes list

2014-04-11 Thread Ashwin Shankar
Hi,
Hadoop-1's namenode UI displays dead datanodes
even if those instances have been terminated and are no longer part of the
cluster.
Is there a way to reset the dead datanode list without bouncing the namenode?

This would help my nightly script, which parses the HTML page, terminates
dead datanodes and resizes the cluster.
-- 
Thanks,
Ashwin


Job fails if I change HADOOP_USER_NAME

2014-03-21 Thread Ashwin Shankar
Hi,
I'm writing a new feature in Fair scheduler and wanted to test it out
by running jobs submitted by different users from my laptop.

My sleep job runs fine as long as the user name is my Mac user name.
If I change my Hadoop user name by setting HADOOP_USER_NAME,
my jobs fail with the exception
org.apache.hadoop.util.Shell$ExitCodeException.
I also tried creating a new user account on my laptop and running a job as
that user, but I get the same exception.

Please let me know if any of you have come across this.
I tried changing the ulimit for max processes (to 1024), but that doesn't
solve the problem.

Here is the stack trace :

Job job_1395389889916_0001 failed with state FAILED due to: Application
application_1395389889916_0001 failed 3 times due to AM Container for
appattempt_1395389889916_0001_03 exited with  exitCode: 1 due to:
Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)

-- 
Thanks,
Ashwin


Re: Job fails if I change HADOOP_USER_NAME

2014-03-21 Thread Ashwin Shankar
Hi Rohit,
How do I enable debug for the AM container logs, and to which location are
they written?
I tried changing log4j.properties and can see DEBUG messages for the RM, NM,
etc., but I don't see any AM-related debug logs.

Thanks,
Ashwin


On Fri, Mar 21, 2014 at 3:05 AM, Rohith Sharma K S rohithsharm...@huawei.com wrote:

 Hi,

 The stack trace below is generic for any failure to launch the AM. Can you
 check the AM container logs to get the proper stack trace?

 Thanks & Regards
 Rohith Sharma K S







-- 
Thanks,
Ashwin