[jira] [Updated] (OOZIE-1864) Improve child job id aggregation logic

2014-05-30 Thread Purshotam Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Purshotam Shah updated OOZIE-1864:
--

Description: 
Current child job ID aggregation logic
Once the launcher job finishes submitting the child job (or jobs, in the case
of Pig), it writes the job ID to a file.

On the Oozie server side, we collect child IDs in two ways:
1. As soon as we submit the launcher job, we check whether it has terminated.
If it has, we read the child IDs from the file and store them in the DB. When
a kill command is issued, we kill all child jobs.
2. We have a timer task (ActionCheckerService) that keeps checking the status
of all running actions; if a launcher job has terminated, it updates the DB
with the child IDs.

A job end notification is rejected if the action is not running.

Assume the launcher is killed after it has submitted a child job.
The child job will never be killed.


To fix this, we should do the following:

1. If Oozie receives a job end notification and the launcher job was killed,
collect all child jobs and kill any that have not already been killed.

2. Have better logic for collecting child job IDs. The launcher job can call
callbackServlet (maybe periodically) to update the child job IDs. This could
be useful for Pig jobs. Currently, we report child jobs only when the launcher
job completes.
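
To make fix 1 concrete, here is a minimal sketch of the cleanup step. It is
not Oozie code: OrphanedChildCleanup, Cluster, JobState, InMemoryCluster and
onLauncherEnd are hypothetical names invented for illustration, and the
in-memory cluster stands in for whatever the server would really query. The
sketch also assumes that "not killed" means "still running", since a child
that has already finished cannot be killed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of fix 1; none of these types exist in Oozie. */
final class OrphanedChildCleanup {

    enum JobState { RUNNING, SUCCEEDED, FAILED, KILLED }

    /** Assumed stand-in for the cluster: look up a job's state and kill a job. */
    interface Cluster {
        JobState stateOf(String jobId);
        void kill(String jobId);
    }

    /**
     * Handles a job end notification for a launcher.
     * If the launcher was killed, kills every child that is still running
     * and returns the IDs of the children killed by this call.
     */
    static List<String> onLauncherEnd(Cluster cluster, String launcherId, List<String> childIds) {
        List<String> killed = new ArrayList<>();
        if (cluster.stateOf(launcherId) != JobState.KILLED) {
            return killed; // launcher ended some other way: nothing to clean up here
        }
        for (String childId : childIds) {
            // Assumption: only running children need (and accept) a kill.
            if (cluster.stateOf(childId) == JobState.RUNNING) {
                cluster.kill(childId);
                killed.add(childId);
            }
        }
        return killed;
    }

    /** In-memory Cluster used only to exercise the sketch. */
    static final class InMemoryCluster implements Cluster {
        private final Map<String, JobState> states = new LinkedHashMap<>();

        void put(String jobId, JobState state) { states.put(jobId, state); }

        @Override
        public JobState stateOf(String jobId) { return states.get(jobId); }

        @Override
        public void kill(String jobId) { states.put(jobId, JobState.KILLED); }
    }
}
```

The sketch takes the child IDs as an argument; in the scenario above, the
server would first have to collect them (for example from the file the
launcher wrote), which is the part fix 2 is meant to improve.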

 Improve child job id aggregation logic
 -

 Key: OOZIE-1864
 URL: https://issues.apache.org/jira/browse/OOZIE-1864
 Project: Oozie
  Issue Type: Bug
Reporter: Purshotam Shah




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (OOZIE-1864) Improve child job id aggregation logic

2014-05-30 Thread Purshotam Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Purshotam Shah updated OOZIE-1864:
--

Description: 
Improve child job ID aggregation logic


Current child job ID aggregation logic
Once the launcher job finishes submitting the child job (or jobs, in the case
of Pig), it writes the job ID to a file.

On the Oozie server side, we collect child IDs in two ways:
1. As soon as we submit the launcher job, we check whether it has terminated.
If it has, we read the child IDs from the file and store them in the DB. When
a kill command is issued, we kill all child jobs.
2. We have a timer task (ActionCheckerService) that keeps checking the status
of all running actions; if a launcher job has terminated, it updates the DB
with the child IDs.

A job end notification is rejected if the action is not running.

Assume the launcher is killed after it has submitted a child job.
The child job will never be killed.


To fix this, we should do the following:

1. If Oozie receives a job end notification and the launcher job was killed,
collect all child jobs and kill any that have not already been killed.
2. Have better logic for collecting child job IDs. The launcher job can call
callbackServlet (maybe periodically) to update the child job IDs. This could
be useful for Pig jobs. Currently, we report child jobs only when the launcher
job completes.
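
Fix 2 could look roughly like the sketch below. It is only an illustration:
ChildIdReporter and reportOnce are hypothetical names, the Supplier stands in
for however the launcher learns which child jobs it has submitted, and the
Consumer stands in for the call to the server callback, whose real interface
is not described here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch of fix 2; not Oozie's launcher or callback API. */
final class ChildIdReporter {
    // Assumed source of the child job IDs the launcher has submitted so far.
    private final Supplier<List<String>> submittedIds;
    // Assumed stand-in for the call that delivers IDs to the server.
    private final Consumer<List<String>> callback;
    // IDs already delivered, so each ID is sent only once.
    private final Set<String> alreadyReported = new LinkedHashSet<>();

    ChildIdReporter(Supplier<List<String>> submittedIds, Consumer<List<String>> callback) {
        this.submittedIds = submittedIds;
        this.callback = callback;
    }

    /**
     * Sends the child IDs that have not been reported yet and returns how
     * many were sent. IDs are marked as reported only after the callback
     * returns, so a failed call is retried on the next invocation.
     */
    synchronized int reportOnce() {
        List<String> fresh = new ArrayList<>();
        for (String id : submittedIds.get()) {
            if (!alreadyReported.contains(id) && !fresh.contains(id)) {
                fresh.add(id);
            }
        }
        if (!fresh.isEmpty()) {
            callback.accept(fresh);
            alreadyReported.addAll(fresh);
        }
        return fresh.size();
    }
}
```

A launcher could invoke reportOnce() at a fixed interval, for example from a
java.util.concurrent.ScheduledExecutorService, and once more just before it
exits. With that, the server would know about the children of a multi-job
Pig script even if the launcher is killed partway through.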
 




