[ 
https://issues.apache.org/jira/browse/DRILL-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179143#comment-15179143
 ] 

Jacques Nadeau commented on DRILL-4325:
---------------------------------------

[~vicky], would you be willing to run your same oversaturation test and see if 
our stability is better with this simple patch?

https://github.com/jacques-n/drill/tree/DRILL-4466b

I'd like to see if we can help the kernel scheduler enough that it runs work at 
a larger quantum. This won't solve the gross over-parallelization issue 
directly but it may help the system context switch less.

In reality, a change in scheduling won't fix the core problem of too many 
simultaneous tasks. No matter the threading model, having 4000 tasks 
competing for ~40 logical cores is going to mean slow progress. Clearly we need 
to increase the switch quantum in these cases so we make forward progress 
(which my small patch will hopefully help with). However, if we target a 
quantum of 100ms, each core is shared by ~100 tasks, so a task would wait 
roughly 10s between each 100ms of work. In other words, we can't schedule this 
many tasks and expect speedy forward progress. We need to enable inbound 
controls and also reduce parallelization on a heavily loaded node.
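
To make the arithmetic above concrete, here is a rough back-of-the-envelope 
sketch. It assumes an idealized round-robin scheduler in which every task is 
always runnable, tasks are spread evenly across cores, and context-switch cost 
is ignored. The function names are made up for illustration; they are not part 
of Drill or of the patch.

```python
def wait_between_slices_ms(tasks: int, cores: int, quantum_ms: float) -> float:
    """Approximate time a task waits between two of its own time slices.

    Assumes idealized round-robin: every task sharing a core gets exactly
    one quantum per round, and no task ever blocks.
    """
    tasks_per_core = tasks / cores
    # While one task waits, each of the other tasks on its core runs once.
    return max(0.0, (tasks_per_core - 1) * quantum_ms)


def cpu_share(tasks: int, cores: int) -> float:
    """Fraction of wall-clock time each task spends actually running."""
    return min(1.0, cores / tasks)


if __name__ == "__main__":
    # Numbers from the comment: 4000 tasks, ~40 logical cores, 100ms quantum.
    wait_ms = wait_between_slices_ms(4000, 40, 100)
    share = cpu_share(4000, 40)
    print(f"wait between slices: {wait_ms / 1000:.1f}s")
    print(f"CPU share per task:  {share:.1%}")
```

Under these assumptions each task gets about 1% of a core and waits just under 
10s between slices. A larger quantum reduces context switching, but the 
per-task share stays the same, which is why limiting the number of 
simultaneous tasks is still needed.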

> ForemanException: One or more nodes lost connectivity during query
> ------------------------------------------------------------------
>
>                 Key: DRILL-4325
>                 URL: https://issues.apache.org/jira/browse/DRILL-4325
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.5.0
>            Reporter: Victoria Markman
>         Attachments: drillbit.log.133, drillbit.log.134, drillbit.log.135, 
> drillbit.log.136, stats.133.tar, stats.134.tar, stats.135.tar, stats.136.tar, 
> zookeeper.log
>
>
> The picture pretty much looks like this: a bunch of queries are running 
> (usually something more involved than simple functional tests), usually 
> tpch or tpcds with lots of major fragments, like query74 from tpcds. 
> Zookeeper decides that a particular node is dead, and queries that were 
> running at the time of the connection loss are failed by Drill (which is 
> correct behavior, I think).
> It seems that I can reliably reproduce this issue when I bump up the number 
> of concurrently running queries and make all of them go to the same foreman 
> node (I don't mean to imply that planning is to blame; it just seems to 
> reproduce more easily this way).
> On my 4 node cluster I can pretty much reproduce this problem reliably by 
> running: 
> run.sh -s Advanced/tpcds/tpcds_sf100/original -g smoke -t 600 -n 10 
> {code}
> 2016-01-28 16:30:20,146 [29554d63-b478-6bae-f0f6-435d9f33ffdf:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d63-b478-6bae-f0f6-435d9f33ffdf: select * from sys.version
> 2016-01-28 16:30:22,844 [29554d61-2789-babb-54e5-22b701bf2f64:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d61-2789-babb-54e5-22b701bf2f64: select * from sys.drillbits
> 2016-01-28 16:30:23,281 [29554d60-5bbd-dae1-c38d-21708ad37fbe:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d60-5bbd-dae1-c38d-21708ad37fbe: alter system set 
> `planner.enable_decimal_data_type` = true
> 2016-01-28 16:30:24,889 [29554d5e-d243-6299-3103-58b180135854:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5e-d243-6299-3103-58b180135854: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:30:24,931 [29554d5e-b395-14aa-42a4-f6f248059363:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5e-b395-14aa-42a4-f6f248059363: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:30:24,964 [29554d5f-24ac-cf00-714c-7419d3894af0:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5f-24ac-cf00-714c-7419d3894af0: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:30:24,998 [29554d5e-ae92-6306-3495-be5cb7f98139:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5e-ae92-6306-3495-be5cb7f98139: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:30:25,040 [29554d5e-1a20-3d6d-143b-0ee3bcd4aa11:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5e-1a20-3d6d-143b-0ee3bcd4aa11: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:30:25,073 [29554d5d-e7b4-c61c-9735-ce37938aa47d:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5d-e7b4-c61c-9735-ce37938aa47d: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:30:25,106 [29554d5d-823b-0536-e4df-4c6cef64b3e4:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5d-823b-0536-e4df-4c6cef64b3e4: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:30:25,131 [29554d5e-099c-3acd-477e-ee4bece4dc4e:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5e-099c-3acd-477e-ee4bece4dc4e: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:30:25,184 [29554d5d-b87b-fadd-d5bb-a5d0bba03671:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5d-b87b-fadd-d5bb-a5d0bba03671: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:30:25,205 [29554d5d-97a8-c577-76b7-01bb6c5f5e48:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5d-97a8-c577-76b7-01bb6c5f5e48: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:30:25,432 [29554d5e-353a-27cf-9c93-969f9e8866da:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5e-353a-27cf-9c93-969f9e8866da: -- start query 55 in stream 0 using 
> template query55.tpl 
> 2016-01-28 16:30:25,509 [29554d5d-c872-a1be-8386-bf5500ba6726:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5d-c872-a1be-8386-bf5500ba6726: -- start query 76 in stream 0 using 
> template query76.tpl 
> 2016-01-28 16:30:25,564 [29554d5d-a2ad-bb2a-cfd3-0ec498ef4da9:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5d-a2ad-bb2a-cfd3-0ec498ef4da9: -- start query 46 in stream 0 using 
> template query46.tpl 
> 2016-01-28 16:30:25,665 [29554d5d-9ffe-96e8-7b4b-0e4214554185:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5d-9ffe-96e8-7b4b-0e4214554185: -- start query 21 in stream 0 using 
> template query21.tpl 
> 2016-01-28 16:30:25,714 [29554d5d-a051-b35c-078f-2872372aa7bb:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5d-a051-b35c-078f-2872372aa7bb: -- start query 34 in stream 0 using 
> template query34.tpl 
> 2016-01-28 16:30:25,738 [29554d5d-9dd8-3477-9ce0-ac5f3323195d:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5d-9dd8-3477-9ce0-ac5f3323195d: -- start query 68 in stream 0 using 
> template query68.tpl 
> 2016-01-28 16:30:25,861 [29554d5e-2d2e-81bb-4427-696d27f8deb5:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5e-2d2e-81bb-4427-696d27f8deb5: -- start query 74 in stream 0 using 
> template query74.tpl 
> 2016-01-28 16:30:25,910 [29554d5e-71d6-5cf5-3ec6-58d07ce81ed5:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5e-71d6-5cf5-3ec6-58d07ce81ed5: -- start query 33 in stream 0 using 
> template query33.tpl 
> 2016-01-28 16:30:26,012 [29554d5e-4fab-a069-ea09-5d1c561664fe:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5e-4fab-a069-ea09-5d1c561664fe: -- start query 50 in stream 0 using 
> template query50.tpl 
> 2016-01-28 16:30:26,047 [29554d5c-d2ce-153d-4f5b-a8c26f3d256d:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554d5c-d2ce-153d-4f5b-a8c26f3d256d: -- start query 52 in stream 0 using 
> template query52.tpl 
> 2016-01-28 16:32:43,453 [29554cd4-40a8-101a-b418-a91898d720b6:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554cd4-40a8-101a-b418-a91898d720b6: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:32:44,303 [29554cd2-8fe4-9ca4-2b3f-3d591f2144c4:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554cd2-8fe4-9ca4-2b3f-3d591f2144c4: -- start query 91 in stream 0 using 
> template query91.tpl 
> 2016-01-28 16:32:47,965 [29554ccf-db1d-c69b-7dc5-703c1a03f623:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554ccf-db1d-c69b-7dc5-703c1a03f623: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:32:49,119 [29554cce-68d0-6ec4-cbb9-192430099659:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554cce-68d0-6ec4-cbb9-192430099659: -- start query 59 in stream 0 using 
> template query59.tpl 
> 2016-01-28 16:33:01,153 [29554cc1-9ad9-902d-ea5e-f8520b77bd8a:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554cc1-9ad9-902d-ea5e-f8520b77bd8a: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:33:02,119 [29554cc1-74cc-89e8-cbb9-a0c5961d1018:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554cc1-74cc-89e8-cbb9-a0c5961d1018: -- start query 3 in stream 0 using 
> template query3.tpl 
> 2016-01-28 16:35:31,837 [29554c2b-9ebc-9509-96fd-a504657a516f:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554c2b-9ebc-9509-96fd-a504657a516f: use `dfs.tpcds_sf100_parquet_views`
> 2016-01-28 16:35:34,231 [29554c29-0e51-3d3c-dfa5-358d67233d9b:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 29554c29-0e51-3d3c-dfa5-358d67233d9b: -- start query 66 in stream 0 using 
> template query66.tpl 
> 2016-01-28 16:36:13,623 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554cce-68d0-6ec4-cbb9-192430099659:3:74.
> 2016-01-28 16:36:13,624 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:6:18.
> 2016-01-28 16:36:13,636 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:4:38.
> 2016-01-28 16:36:13,663 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554cce-68d0-6ec4-cbb9-192430099659:6:50.
> 2016-01-28 16:36:13,664 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-9dd8-3477-9ce0-ac5f3323195d:1:2.
> 2016-01-28 16:36:13,664 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-a2ad-bb2a-cfd3-0ec498ef4da9:6:30.
> 2016-01-28 16:36:13,665 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-4fab-a069-ea09-5d1c561664fe:1:10.
> 2016-01-28 16:36:13,665 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554cce-68d0-6ec4-cbb9-192430099659:6:6.
> 2016-01-28 16:36:13,665 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-9ffe-96e8-7b4b-0e4214554185:3:22.
> 2016-01-28 16:36:13,666 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554cce-68d0-6ec4-cbb9-192430099659:3:86.
> 2016-01-28 16:36:13,666 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-9ffe-96e8-7b4b-0e4214554185:1:46.
> 2016-01-28 16:36:13,666 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-4fab-a069-ea09-5d1c561664fe:3:30.
> 2016-01-28 16:36:13,666 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:7:34.
> 2016-01-28 16:36:13,667 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-c872-a1be-8386-bf5500ba6726:1:50.
> 2016-01-28 16:36:13,667 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-a051-b35c-078f-2872372aa7bb:1:10.
> 2016-01-28 16:36:13,668 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:7:10.
> 2016-01-28 16:36:13,669 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554cce-68d0-6ec4-cbb9-192430099659:3:90.
> 2016-01-28 16:36:13,670 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:22:41.
> 2016-01-28 16:36:13,670 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:6:26.
> 2016-01-28 16:36:13,670 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-4fab-a069-ea09-5d1c561664fe:3:18.
> 2016-01-28 16:36:13,671 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:20:37.
> 2016-01-28 16:36:13,671 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-c872-a1be-8386-bf5500ba6726:1:10.
> 2016-01-28 16:36:13,767 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554cce-68d0-6ec4-cbb9-192430099659:5:2.
> 2016-01-28 16:36:13,768 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-9ffe-96e8-7b4b-0e4214554185:1:14.
> 2016-01-28 16:36:13,768 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:5:54.
> 2016-01-28 16:36:13,768 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:5:86.
> 2016-01-28 16:36:13,769 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-4fab-a069-ea09-5d1c561664fe:3:62.
> 2016-01-28 16:36:13,812 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-4fab-a069-ea09-5d1c561664fe:3:82.
> 2016-01-28 16:36:13,823 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-4fab-a069-ea09-5d1c561664fe:8:29.
> 2016-01-28 16:36:13,862 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-c872-a1be-8386-bf5500ba6726:1:86.
> 2016-01-28 16:36:13,950 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-9ffe-96e8-7b4b-0e4214554185:3:6.
> 2016-01-28 16:36:13,991 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-9ffe-96e8-7b4b-0e4214554185:1:66.
> 2016-01-28 16:36:13,992 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-9dd8-3477-9ce0-ac5f3323195d:3:62.
> 2016-01-28 16:36:13,992 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554cce-68d0-6ec4-cbb9-192430099659:15:13.
> 2016-01-28 16:36:13,992 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:8:2.
> 2016-01-28 16:36:13,993 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-4fab-a069-ea09-5d1c561664fe:1:54.
> 2016-01-28 16:36:13,993 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:5:30.
> 2016-01-28 16:36:13,993 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-a2ad-bb2a-cfd3-0ec498ef4da9:6:34.
> 2016-01-28 16:36:13,993 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-9dd8-3477-9ce0-ac5f3323195d:6:38.
> 2016-01-28 16:36:13,994 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:6:34.
> 2016-01-28 16:36:14,095 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5d-a2ad-bb2a-cfd3-0ec498ef4da9:3:54.
> 2016-01-28 16:36:14,155 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554cce-68d0-6ec4-cbb9-192430099659:6:74.
> 2016-01-28 16:36:14,155 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-2d2e-81bb-4427-696d27f8deb5:6:46.
> 2016-01-28 16:36:14,155 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554cce-68d0-6ec4-cbb9-192430099659:2:2.
> 2016-01-28 16:36:14,156 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554cce-68d0-6ec4-cbb9-192430099659:3:34.
> 2016-01-28 16:36:14,156 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-4fab-a069-ea09-5d1c561664fe:0:0.
> 2016-01-28 16:36:14,157 [Curator-ServiceCache-0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Foreman atsqa4-133.qa.lab no longer 
> active.  Cancelling fragment 29554d5e-4fab-a069-ea09-5d1c561664fe:3:54.
> 2016-01-28 16:36:14,367 [Curator-ServiceCache-0] ERROR 
> o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: ForemanException: One 
> more more nodes lost connectivity during query.  Identified nodes were 
> [atsqa4-133.qa.lab:31010].
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> ForemanException: One more more nodes lost connectivity during query.  
> Identified nodes were [atsqa4-133.qa.lab:31010].
>       at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:746)
>  [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>       at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:858)
>  [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>       at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:790)
>  [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>       at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:792)
>  [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>       at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:909) 
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>       at 
> org.apache.drill.exec.work.foreman.Foreman.access$2700(Foreman.java:110) 
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>       at 
> org.apache.drill.exec.work.foreman.Foreman$StateListener.moveToState(Foreman.java:1183)
>  [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: One more more 
> nodes lost connectivity during query.  Identified nodes were 
> [atsqa4-133.qa.lab:31010].
> {code}
> Will post logs shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
