[ https://issues.apache.org/jira/browse/KYLIN-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
WangSheng updated KYLIN-4802:
-----------------------------
    Description: 
I ran into a problem when using DistributedScheduler on two nodes; my current cluster version is 2.6.6. While the "Build N-Dimension Cuboid : level 4" step was executing, I found that both nodes submitted an MR job for it. One node submitted an MR job first and then went on to execute the following steps. While it was executing the "Convert Cuboid Data to HFile" step, the other node submitted an MR job for the "Build N-Dimension Cuboid : level 4" step again. This caused data to go missing when the file was generated, and after the job completed, queries on level 4 returned empty results.

In the picture, we can see that the finish time of the "Build N-Dimension Cuboid : level 4" step is later than the start time of the "Convert Cuboid Data to HFile" step:

!kylin01.png|width=264,height=280!

These two pictures are from the two nodes' Kylin.log. We can see that two MR jobs were submitted for step 10, which is the "Build N-Dimension Cuboid : level 4" step:

!kylin02.png|width=324,height=78!

!kylin03.png|width=375,height=164!

  was:
I met a problem when using DistributedScheduler in two node, my current cluster version is 2.6.6. When executing "Build N-Dimension Cuboid : level 4" step, I found this step submitted MR job in both two nodes. One node submitted a MR first, and then executed following steps, when executing "Convert Cuboid Data to HFile" step, another node submitted a MR for "Build N-Dimension Cuboid : level 4" step again. And this caused data missing when generated File. And after this job completed, query on level 4 returns empty.
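The symptom reported in the description (an earlier step whose recorded finish time is later than the start time of a later step) can be checked mechanically from step timestamps. The following is only a minimal sketch of that check, not Kylin code: the `Step` record, the `find_overlapping_steps` helper, the sequence number of the HFile step, and all timestamps are hypothetical and made up for illustration. Only the step names and the fact that the level 4 step is step 10 come from the report.

```python
from datetime import datetime
from typing import List, NamedTuple, Tuple


class Step(NamedTuple):
    """Hypothetical record of one job step; not a Kylin data structure."""
    seq: int          # position of the step within the job
    name: str
    start: datetime   # recorded start time
    end: datetime     # recorded finish time


def find_overlapping_steps(steps: List[Step]) -> List[Tuple[str, str]]:
    """Return (earlier, later) name pairs where the earlier step
    finished after the later step had already started."""
    ordered = sorted(steps, key=lambda s: s.seq)
    overlaps = []
    for i, earlier in enumerate(ordered):
        for later in ordered[i + 1:]:
            # In a strictly sequential job this should never be true.
            if earlier.end > later.start:
                overlaps.append((earlier.name, later.name))
    return overlaps


# Illustrative timestamps only; they are not taken from the attached pictures.
example = [
    Step(10, "Build N-Dimension Cuboid : level 4",
         datetime(2020, 1, 1, 10, 0), datetime(2020, 1, 1, 10, 40)),
    Step(12, "Convert Cuboid Data to HFile",  # sequence number is assumed
         datetime(2020, 1, 1, 10, 20), datetime(2020, 1, 1, 10, 35)),
]
print(find_overlapping_steps(example))
```

A non-empty result would point to the kind of duplicate execution described above, although it does not show which node submitted the second MR job; the Kylin.log of each node is still needed for that.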
> “Build N-Dimension Cuboid” execute twice when using DistributedScheduler
> ------------------------------------------------------------------------
>
>                 Key: KYLIN-4802
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4802
>             Project: Kylin
>          Issue Type: Bug
>   Affects Versions: v2.6.6
>            Reporter: WangSheng
>            Priority: Major
>         Attachments: kylin01.png, kylin02.png, kylin03.png