> On Feb. 8, 2014, 12:45 a.m., Rohini Palaniswamy wrote:
> > This would not work. Need to rethink the approach.
> > 
> > 1) Using zookeeper server index to load pending SLA into memory
> >      When server 1 starts it will get everything loaded into memory. When 
> > server 2 starts it will load half of what server 1 already loaded into 
> > memory. If server 1 was restarted, it will again load up same half as what 
> > server 2 loaded and other half will not be loaded by both servers. So this 
> > is not a good way to do things as SLA will not be processed at all for some 
> > jobs or notifications will be sent out twice for some jobs.
> > 2) Removing from the cached map if the status is old or adding to the map 
> > if it is not there when getting a job status notification.
> >     Job status notifications are not guaranteed and can be lost if the 
> > queue is full. We process SLA irrespective of that by checking against DB 
> > periodically as SLA notifications are important. If you remove from the map 
> > once, it may never be added again and hence SLA processing will not be done 
> > for that job. So removing from the map is a bad idea.

Here is the modified design based on Rohini's comments:

Each Oozie instance loads the whole slaMap on restart, so that no SLA event is 
missed regardless of when each instance starts. As a result, the regular 
iteration over slaMap needs to take a ZKLock on the job/action ID while 
processing each SLA event, since multiple Oozie instances could otherwise 
update the DB for the SLA of the same job/action concurrently. The DB cost 
could be reduced by lengthening the slaMap iteration interval (currently every 
30 sec) when multiple servers are running.
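
A rough sketch of the per-job locking idea, not the actual patch. The 
ReentrantLock map is only an in-JVM stand-in for a ZK lock keyed by job/action 
ID, the ConcurrentHashMap stands in for the SLA summary row in the DB, and the 
class and method names are made up for illustration. The point is that the DB 
state is re-read after the lock is taken, so a second server becomes a no-op.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: serialize SLA processing per job/action ID. */
class PerJobSlaLockSketch {
    // Assumption: in-JVM stand-in for a ZooKeeper lock keyed by job/action ID.
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    // Assumption: stand-in for the DB; jobId -> 1 once the event has been handled.
    private final ConcurrentHashMap<String, Integer> db = new ConcurrentHashMap<>();

    /** Returns true if this caller did the update, false if it was already done. */
    boolean processSlaEvent(String jobId) {
        ReentrantLock lock = locks.computeIfAbsent(jobId, id -> new ReentrantLock());
        lock.lock();
        try {
            // Re-read the state only after the lock is held.
            if (db.getOrDefault(jobId, 0) != 0) {
                return false; // another instance already processed this event
            }
            db.put(jobId, 1);
            return true;
        } finally {
            lock.unlock();
        }
    }

    int eventProcessed(String jobId) {
        return db.getOrDefault(jobId, 0);
    }
}
```

With a real ZK lock the shape would be the same: acquire on the job/action ID, 
re-check the DB, update, release.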

eventProcessed in the SLASummary table is used as the source of truth for the 
progress of SLA notification. After an Oozie instance starts, slaMap can become 
inconsistent across instances (a job event fired on one Oozie server but not on 
the others), so we need to check eventProcessed in the DB (in 
SLACalculatorMemory.updateJobSla() or addJobStatus) to avoid sending a duplicate 
SLA message or missing one. We also need a safeguard before incrementing 
eventProcessed for the case where queueing the SLA event fails because 
eventQueue is full: a job event can be lost, but an SLA event should not be.
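
A minimal sketch of the safeguard, under assumptions: the map stands in for 
eventProcessed in the DB, it is modeled here as a simple count of stages 
already notified (the real column encoding may differ), and all names are 
hypothetical. eventProcessed is only advanced after the event has actually 
been queued, so a full queue leaves the event to be retried on the next 
iteration.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: advance eventProcessed only after a successful enqueue. */
class EventProcessedGuardSketch {
    // Assumption: stand-in for eventProcessed in the DB, as a count of stages sent.
    private final Map<String, Integer> eventProcessedByJob = new ConcurrentHashMap<>();
    private final BlockingQueue<String> eventQueue;

    EventProcessedGuardSketch(int queueCapacity) {
        this.eventQueue = new ArrayBlockingQueue<>(queueCapacity);
    }

    /** Returns true only if the SLA event for this stage was queued by this call. */
    synchronized boolean notifyOnce(String jobId, int stage) {
        int processed = eventProcessedByJob.getOrDefault(jobId, 0);
        if (processed > stage) {
            return false; // already sent (possibly by another server): no duplicate
        }
        if (!eventQueue.offer(jobId + ":" + stage)) {
            return false; // queue full: do not advance, retry on next iteration
        }
        eventProcessedByJob.put(jobId, stage + 1);
        return true;
    }

    int eventProcessed(String jobId) {
        return eventProcessedByJob.getOrDefault(jobId, 0);
    }

    /** Removes and returns the next queued event, or null if the queue is empty. */
    String poll() {
        return eventQueue.poll();
    }
}
```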


- Ryota


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17720/#review33989
-----------------------------------------------------------


On Feb. 7, 2014, 10:17 p.m., Ryota Egashira wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17720/
> -----------------------------------------------------------
> 
> (Updated Feb. 7, 2014, 10:17 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1678
>     https://issues.apache.org/jira/browse/OOZIE-1678
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/OOZIE-1678
> 
> 
> Diffs
> -----
> 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/SLARegistrationQueryExecutor.java
>  e3b115f 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/SLASummaryQueryExecutor.java 
> 79d11ed 
>   core/src/main/java/org/apache/oozie/service/JPAService.java aba8709 
>   core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 618d899 
>   core/src/main/java/org/apache/oozie/sla/SLARegistrationBean.java a2260a4 
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java 0a70326 
>   core/src/main/java/org/apache/oozie/sla/service/SLAService.java 2458e69 
>   
> core/src/test/java/org/apache/oozie/executor/jpa/TestSLARegistrationQueryExecutor.java
>  00fb677 
>   
> core/src/test/java/org/apache/oozie/executor/jpa/TestSLASummaryQueryExecutor.java
>  2e170a4 
>   core/src/test/java/org/apache/oozie/service/TestHASLAService.java 
> PRE-CREATION 
>   core/src/test/java/org/apache/oozie/test/ZKXTestCase.java 7bebaf0 
> 
> Diff: https://reviews.apache.org/r/17720/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Ryota Egashira
> 
>
