[ 
https://issues.apache.org/jira/browse/KUDU-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149740#comment-17149740
 ] 

Attila Bukor commented on KUDU-3154:
------------------------------------

Hm this is very interesting. I've seen this happen again in a test. A response 
is never constructed, which [would be 
logged|[https://github.com/apache/kudu/blob/master/java/kudu-subprocess/src/main/java/org/apache/kudu/subprocess/ranger/RangerProtocolHandler.java#L62|https://github.com/apache/kudu/blob/d23ee5d38ddc4317f431dd65df0c825c00cc968a/java/kudu-subprocess/src/main/java/org/apache/kudu/subprocess/ranger/RangerProtocolHandler.java#L62]],
 so I guess it is possible that it never gets evaluated, but I think it's 
unlikely that this happens due to the high frequency of the requests.

It's configured to refresh policies every 
[200ms|[https://github.com/apache/kudu/blob/master/src/kudu/ranger/mini_ranger.h#L193]]
 and from the logs we can see that it does indeed get refreshed every 200ms + 
epsilon and takes ~30ms so even if there's a lock shared by the authorizer and 
the refresher the authorizer should have plenty of time to acquire.
{code:java}
2020-06-20 07:05:49.672 [DEBUG - PolicyRefresher(serviceName=kudu)-12] 
(PolicyRefresher.java:489) ==> PolicyRefresher(serviceName=kudu).loadRoles()
2020-06-20 07:05:49.713 [DEBUG - PolicyRefresher(serviceName=kudu)-12] 
(PolicyRefresher.java:287) <== PolicyRefresher(serviceName=kudu).loadPolicy()
2020-06-20 07:05:49.872 [DEBUG - PolicyRefresher(serviceName=kudu)-12] 
(PolicyRefresher.java:489) ==> PolicyRefresher(serviceName=kudu).loadRoles()
2020-06-20 07:05:49.896 [DEBUG - PolicyRefresher(serviceName=kudu)-12] 
(PolicyRefresher.java:287) <== PolicyRefresher(serviceName=kudu).loadPolicy()
2020-06-20 07:05:50.072 [DEBUG - PolicyRefresher(serviceName=kudu)-12] 
(PolicyRefresher.java:489) ==> PolicyRefresher(serviceName=kudu).loadRoles()
2020-06-20 07:05:50.292 [DEBUG - PolicyRefresher(serviceName=kudu)-12] 
(PolicyRefresher.java:287) <== PolicyRefresher(serviceName=kudu).loadPolicy()
2020-06-20 07:05:50.473 [DEBUG - PolicyRefresher(serviceName=kudu)-12] 
(PolicyRefresher.java:489) ==> PolicyRefresher(serviceName=kudu).loadRoles()
{code}

Tbh I'm not sure how that InboundRequest is logged to be taken before it's put 
but I find it very weird. It was also logged the same way in the failing test 
that I've seen.

If I run this test on my Mac it succeeds and the QueueUtil messages are in the 
expected/correct order:

{code}
2020-07-02 01:17:19.459 [DEBUG - pool-2-thread-1] (QueueUtil.java:56) Message: 
org.apache.kudu.subprocess.InboundRequest@1cf099ad has been put on the queue
2020-07-02 01:17:19.459 [DEBUG - pool-3-thread-3] (QueueUtil.java:41) Message: 
org.apache.kudu.subprocess.InboundRequest@1cf099ad has been taken from the queue
{code}

I tried to search for some blocking queue bugs but couldn't find any so far.

> RangerClientTestBase.TestLogging sometimes fails
> ------------------------------------------------
>
>                 Key: KUDU-3154
>                 URL: https://issues.apache.org/jira/browse/KUDU-3154
>             Project: Kudu
>          Issue Type: Bug
>          Components: ranger, test
>    Affects Versions: 1.13.0
>            Reporter: Alexey Serbin
>            Priority: Major
>         Attachments: ranger_client-test.txt.xz
>
>
> The {{RangerClientTestBase.TestLogging}} scenario of the 
> {{ranger_client-test}} sometimes fails (all types of builds) with error 
> message like below:
> {noformat}
> src/kudu/ranger/ranger_client-test.cc:398: Failure
> Failed                                                                        
>   
> Bad status: Timed out: timed out while in flight                              
>   
> I0620 07:06:02.907177  1140 server.cc:247] Received an EOF from the 
> subprocess  
> I0620 07:06:02.910923  1137 server.cc:317] get failed, inbound queue shut 
> down: Aborted:
> I0620 07:06:02.910964  1141 server.cc:380] outbound queue shut down: Aborted: 
>   
> I0620 07:06:02.910995  1138 server.cc:317] get failed, inbound queue shut 
> down: Aborted:
> I0620 07:06:02.910984  1139 server.cc:317] get failed, inbound queue shut 
> down: Aborted:
> {noformat}
> The log is attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to