[ 
https://issues.apache.org/jira/browse/GRIFFIN-293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Guo resolved GRIFFIN-293.
---------------------------------
    Resolution: Fixed

Issue resolved by pull request 541
[https://github.com/apache/griffin/pull/541]

> [Service] livy.need.queue=true
> ------------------------------
>
>                 Key: GRIFFIN-293
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-293
>             Project: Griffin
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Nevena Veljkovic
>            Priority: Critical
>             Fix For: 0.6.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> While running Griffin in several production environments, with x10 jobs
> starting at the same hour, minute, and second, we found that when 2 (or more)
> Griffin jobs start concurrently, they are not all submitted and executed to the
> end: the last one is submitted multiple times and the rest are never submitted.
> Example:
>  2 jobs, "beta_node_metrics_fact" and "beta_node_master_dimension_device",
> whose start times differ by 1 millisecond:
> {code:java}
> 2019-09-28 14:00:37.090 INFO 2732 --- [ryBean_Worker-4] 
> o.a.g.c.j.SparkSubmitJob [203] : {
>  "measure.type" : "griffin",
>  "id" : 60560,
>  "name" : "beta_node_metrics_fact",
> 2019-09-28 14:00:37.091 INFO 2732 --- [ryBean_Worker-5] 
> o.a.g.c.j.SparkSubmitJob [203] : {
>  "measure.type" : "griffin",
>  "id" : 63751,
>  "name" : "beta_node_master_dimension_device",
> {code}
> Livy submitted 2 jobs/tasks, and both contained
> "beta_node_master_dimension_device".
> That is why we decided to use the setting "livy.need.queue=true".
>  During testing we found that queueing does not work at all, because
> LivyTaskSubmitHelper's member sparkSubmitJob was not instantiated:
>  
> [https://github.com/apache/griffin/blob/master/service/src/main/java/org/apache/griffin/core/job/LivyTaskSubmitHelper.java#L64]
> We fixed this and continued testing.
> We then found that curConcurrentTaskNum is not decremented when tasks
> finish (state SUCCESS or DEAD):
>  
> [https://github.com/apache/griffin/blob/master/service/src/main/java/org/apache/griffin/core/job/JobServiceImpl.java#L632-L633]
> We fixed this also.
>  
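
For readers following the two fixes described above, here is a minimal sketch of the general pattern, not the code from pull request 541 and not Griffin's actual LivyTaskSubmitHelper. The class BoundedSubmitQueue, its methods, and the Consumer used as a stand-in for the real submit call are all hypothetical. It assumes the intended behaviour is: the submit dependency must be present before anything is queued, and the running-task counter must go down whenever a task reaches SUCCESS or DEAD so that waiting jobs can be released.

```java
import java.util.ArrayDeque;
import java.util.Objects;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical illustration only; names do not come from the Griffin code base. */
final class BoundedSubmitQueue {
    private final int maxConcurrent;
    // Stand-in for the component that actually posts a job to Livy (assumption).
    private final Consumer<String> submitter;
    private final Queue<String> waiting = new ArrayDeque<>();
    // Named after the counter mentioned in the issue; the logic here is a sketch.
    private int curConcurrentTaskNum = 0;

    BoundedSubmitQueue(int maxConcurrent, Consumer<String> submitter) {
        if (maxConcurrent < 1) {
            throw new IllegalArgumentException("maxConcurrent must be >= 1");
        }
        this.maxConcurrent = maxConcurrent;
        // Fail at construction time instead of silently holding a null dependency.
        this.submitter = Objects.requireNonNull(submitter, "submitter");
    }

    /** Queue a job and submit it immediately if a slot is free. */
    synchronized void enqueue(String jobName) {
        waiting.add(jobName);
        drain();
    }

    /** Report a task state; terminal states free a slot for the next waiting job. */
    synchronized void onStateChange(String state) {
        if ("SUCCESS".equals(state) || "DEAD".equals(state)) {
            if (curConcurrentTaskNum > 0) {
                curConcurrentTaskNum--;
            }
            drain();
        }
    }

    synchronized int running() {
        return curConcurrentTaskNum;
    }

    // Submit waiting jobs while fewer than maxConcurrent tasks are running.
    private void drain() {
        while (curConcurrentTaskNum < maxConcurrent && !waiting.isEmpty()) {
            curConcurrentTaskNum++;
            submitter.accept(waiting.poll());
        }
    }
}
```

With maxConcurrent set to 1, enqueueing the two job names from the example submits only the first; the second is released once onStateChange("SUCCESS") or onStateChange("DEAD") is reported. Without the decrement, the second job would wait forever.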



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
