[ https://issues.apache.org/jira/browse/KUDU-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149740#comment-17149740 ]
Attila Bukor commented on KUDU-3154: ------------------------------------ Hm this is very interesting. I've seen this happen again in a test. A response is never constructed, which [would be logged|[https://github.com/apache/kudu/blob/master/java/kudu-subprocess/src/main/java/org/apache/kudu/subprocess/ranger/RangerProtocolHandler.java#L62|https://github.com/apache/kudu/blob/d23ee5d38ddc4317f431dd65df0c825c00cc968a/java/kudu-subprocess/src/main/java/org/apache/kudu/subprocess/ranger/RangerProtocolHandler.java#L62]], so I guess it is possible that it never gets evaluated, but I think it's unlikely that this happens due to the high frequency of the requests. It's configured to refresh policies every [200ms|[https://github.com/apache/kudu/blob/master/src/kudu/ranger/mini_ranger.h#L193]] and from the logs we can see that it does indeed get refreshed every 200ms + epsilon and takes ~30ms so even if there's a lock shared by the authorizer and the refresher the authorizer should have plenty of time to acquire. {code:java} 2020-06-20 07:05:49.672 [DEBUG - PolicyRefresher(serviceName=kudu)-12] (PolicyRefresher.java:489) ==> PolicyRefresher(serviceName=kudu).loadRoles() 2020-06-20 07:05:49.713 [DEBUG - PolicyRefresher(serviceName=kudu)-12] (PolicyRefresher.java:287) <== PolicyRefresher(serviceName=kudu).loadPolicy() 2020-06-20 07:05:49.872 [DEBUG - PolicyRefresher(serviceName=kudu)-12] (PolicyRefresher.java:489) ==> PolicyRefresher(serviceName=kudu).loadRoles() 2020-06-20 07:05:49.896 [DEBUG - PolicyRefresher(serviceName=kudu)-12] (PolicyRefresher.java:287) <== PolicyRefresher(serviceName=kudu).loadPolicy() 2020-06-20 07:05:50.072 [DEBUG - PolicyRefresher(serviceName=kudu)-12] (PolicyRefresher.java:489) ==> PolicyRefresher(serviceName=kudu).loadRoles() 2020-06-20 07:05:50.292 [DEBUG - PolicyRefresher(serviceName=kudu)-12] (PolicyRefresher.java:287) <== PolicyRefresher(serviceName=kudu).loadPolicy() 2020-06-20 07:05:50.473 [DEBUG - PolicyRefresher(serviceName=kudu)-12] (PolicyRefresher.java:489) ==> PolicyRefresher(serviceName=kudu).loadRoles() {code} Tbh I'm not sure how that InboundRequest is logged to be taken before it's put but I find it very weird. It was also logged the same way in the failing test that I've seen. If I run this test on my Mac it succeeds and the QueueUtil messages are in the expected/correct order: {code} 2020-07-02 01:17:19.459 [DEBUG - pool-2-thread-1] (QueueUtil.java:56) Message: org.apache.kudu.subprocess.InboundRequest@1cf099ad has been put on the queue 2020-07-02 01:17:19.459 [DEBUG - pool-3-thread-3] (QueueUtil.java:41) Message: org.apache.kudu.subprocess.InboundRequest@1cf099ad has been taken from the queue {code} I tried to search for some blocking queue bugs but couldn't find any so far. > RangerClientTestBase.TestLogging sometimes fails > ------------------------------------------------ > > Key: KUDU-3154 > URL: https://issues.apache.org/jira/browse/KUDU-3154 > Project: Kudu > Issue Type: Bug > Components: ranger, test > Affects Versions: 1.13.0 > Reporter: Alexey Serbin > Priority: Major > Attachments: ranger_client-test.txt.xz > > > The {{RangerClientTestBase.TestLogging}} scenario of the > {{ranger_client-test}} sometimes fails (all types of builds) with error > message like below: > {noformat} > src/kudu/ranger/ranger_client-test.cc:398: Failure > Failed > > Bad status: Timed out: timed out while in flight > > I0620 07:06:02.907177 1140 server.cc:247] Received an EOF from the > subprocess > I0620 07:06:02.910923 1137 server.cc:317] get failed, inbound queue shut > down: Aborted: > I0620 07:06:02.910964 1141 server.cc:380] outbound queue shut down: Aborted: > > I0620 07:06:02.910995 1138 server.cc:317] get failed, inbound queue shut > down: Aborted: > I0620 07:06:02.910984 1139 server.cc:317] get failed, inbound queue shut > down: Aborted: > {noformat} > The log is attached. -- This message was sent by Atlassian Jira (v8.3.4#803005)