zlhgo edited a comment on issue #6184:
URL: https://github.com/apache/apisix/issues/6184#issuecomment-1022833791


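   Yesterday I switched most of the traffic back to `kubernetes/ingress-nginx`, but this morning `/usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value)` still occurred several times, and some routes again sent requests to the wrong backend address.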
   
   Alert log:
   ```
   At least 3 events occurred between 2022-01-27 10:23 CST and 2022-01-27 10:28 CST
   
   instance.keyword:
   stable: 5
   inner: 1
   
   message.keyword:
   2022/01/27 02:28:18 [error] 34#34: 27527237 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value): 1
   2022/01/27 02:28:20 [error] 34#34: 27531572 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value): 1
   2022/01/27 02:28:22 [error] 36#36: 27408336 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value): 1
   2022/01/27 02:28:23 [error] 36#36: 27408064 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value): 1
   2022/01/27 02:28:23 [error] 36#36: 27408537 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value): 1
   
   @timestamp: 2022-01-27T02:28:20.029Z
   _id: k2JcmX4BJEYEGnF4xPKm
   _index: ingress_apisix-gateway_app-log-2022.01.27
   _type: _doc
   instance: stable
   message: 2022/01/27 02:28:20 [error] 34#34: 27531572 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value)
   num_hits: 6
   num_matches: 2
   ```
   
   
   Ingress access log for the last hour, queried from Alibaba Cloud SLS:
   ```sql
   * | select proxy_upstream_name, min(time) as min_time, max(time) as max_time, count(*) as count where upstream_addr = '172.16.90.251:8080' and _container_name_='apisix-gateway' group by proxy_upstream_name
   ```
   
   Query result:

   | proxy\_upstream\_name               | min\_time            | max\_time            | count  |
   | ----------------------------------- | -------------------- | -------------------- | ------ |
   | sentry\_stable-sentry-relay\_3000   | 27/Jan/2022:02:28:21 | 27/Jan/2022:02:28:23 | 2      |
   | h5\_h5-api\_80                      | 27/Jan/2022:02:28:19 | 27/Jan/2022:02:28:19 | 1      |
   | api\_adv-api-online\_8080           | 27/Jan/2022:02:27:03 | 27/Jan/2022:03:27:08 | 111394 |
   | bigdata\_sa-api-online\_80          | 27/Jan/2022:02:28:22 | 27/Jan/2022:02:28:22 | 1      |
   
   The access log shows that, within the same time window, requests for different routes did indeed reach the same backend, `upstream_addr = '172.16.90.251:8080'`.
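
For anyone who wants to run the same check outside SLS, here is a minimal sketch of the grouping logic. It assumes the access log has already been parsed into dicts carrying the `proxy_upstream_name` and `upstream_addr` fields used in the query above. The function names and the sample records are hypothetical and only for illustration; in particular, the port on the second address is a guess and not taken from the real logs.

```python
from collections import defaultdict


def upstreams_by_addr(records):
    """Map each upstream_addr to the set of upstream names that reached it."""
    seen = defaultdict(set)
    for rec in records:
        seen[rec["upstream_addr"]].add(rec["proxy_upstream_name"])
    return seen


def misrouted_addrs(records):
    """Return only the addresses that were hit under more than one upstream name."""
    return {
        addr: sorted(names)
        for addr, names in upstreams_by_addr(records).items()
        if len(names) > 1
    }


# Hypothetical sample records, shaped like the SLS fields queried above.
sample = [
    {"proxy_upstream_name": "api_adv-api-online_8080",
     "upstream_addr": "172.16.90.251:8080"},
    {"proxy_upstream_name": "sentry_stable-sentry-relay_3000",
     "upstream_addr": "172.16.90.251:8080"},
    {"proxy_upstream_name": "sentry_stable-sentry-relay_3000",
     "upstream_addr": "172.16.87.161:3000"},  # port assumed for illustration
]

print(misrouted_addrs(sample))
```

An address shared by several upstream names is not always wrong on its own, so the result is only a list of candidates to compare against the actual pod IPs.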
   
   One more piece of evidence: the stable-sentry-relay service has not been restarted for more than 80 days, and none of its pod IPs is 172.16.90.251.
   ```
   → kubectl --context prod-1 -n sentry get pods -o wide | grep relay
   stable-sentry-relay-56bf48f677-h5fqw   1/1   Running   0   82d   172.16.87.161    cn-beijing.172.16.24.167   <none>   <none>
   stable-sentry-relay-56bf48f677-hzvtn   1/1   Running   0   82d   172.16.108.163   cn-beijing.172.16.78.67    <none>   <none>
   ```
   
   I have now disabled my custom traffic-splitting plugin, enabled info-level logging, and restarted apisix-gateway.
   
   ----------------
   
   Just as I was about to submit this reply, the problem occurred again. Please wait while I collect and submit the error_log.
   
   @leslie-tsang @tzssangglass @starsz could you please keep an eye on this? Thanks.

