zlhgo edited a comment on issue #6184: URL: https://github.com/apache/apisix/issues/6184#issuecomment-1022833791
Yesterday I switched most of the traffic back to `kubernetes/ingress-nginx`, but this morning the error `/usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value)` still occurred several times, and some routes were again proxied to the wrong backend address.

Alert log:

```
At least 3 events occurred between 2022-01-27 10:23 CST and 2022-01-27 10:28 CST

instance.keyword:
stable: 5
inner: 1

message.keyword:
2022/01/27 02:28:18 [error] 34#34: 27527237 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value): 1
2022/01/27 02:28:20 [error] 34#34: 27531572 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value): 1
2022/01/27 02:28:22 [error] 36#36: 27408336 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value): 1
2022/01/27 02:28:23 [error] 36#36: 27408064 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value): 1
2022/01/27 02:28:23 [error] 36#36: 27408537 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value): 1

@timestamp: 2022-01-27T02:28:20.029Z
_id: k2JcmX4BJEYEGnF4xPKm
_index: ingress_apisix-gateway_app-log-2022.01.27
_type: _doc
instance: stable
message: 2022/01/27 02:28:20 [error] 34#34: 27531572 failed to run balancer_by_lua: /usr/local/apisix/apisix/balancer.lua:187: attempt to index local 'up_conf' (a nil value)
num_hits: 6
num_matches: 2
```

I queried the ingress access log for the last hour in Alibaba Cloud SLS:

```sql
* | select proxy_upstream_name,
           min(time) as min_time,
           max(time) as max_time,
           count(*) as count
    where upstream_addr = '172.16.90.251:8080'
      and _container_name_ = 'apisix-gateway'
    group by proxy_upstream_name
```

Query result:

| proxy\_upstream\_name | min\_time | max\_time | count |
| ------------------------------------ | -------------------- | -------------------- | ------ |
| sentry\_stable-sentry-relay\_3000 | 27/Jan/2022:02:28:21 | 27/Jan/2022:02:28:23 | 2 |
| h5\_h5-api\_80 | 27/Jan/2022:02:28:19 | 27/Jan/2022:02:28:19 | 1 |
| api\_adv-api-online\_8080 | 27/Jan/2022:02:27:03 | 27/Jan/2022:03:27:08 | 111394 |
| bigdata\_sa-api-online\_80 | 27/Jan/2022:02:28:22 | 27/Jan/2022:02:28:22 | 1 |

The access log shows that, within the same time window, requests for different routes were indeed sent to the same backend, `upstream_addr = '172.16.90.251:8080'`.

One more piece of evidence: the stable-sentry-relay service has not been restarted for more than 80 days, and none of its pod IPs is 172.16.90.251.

```
→ kubectl --context prod-1 -n sentry get pods -o wide | grep relay
stable-sentry-relay-56bf48f677-h5fqw   1/1   Running   0   82d   172.16.87.161    cn-beijing.172.16.24.167   <none>   <none>
stable-sentry-relay-56bf48f677-hzvtn   1/1   Running   0   82d   172.16.108.163   cn-beijing.172.16.78.67    <none>   <none>
```

I have now disabled my custom traffic-splitting plugin, enabled info-level logging, and restarted apisix-gateway.

----------------

Just as I was about to submit this reply, the problem occurred again. Please wait a moment while I collect and post the error_log.

@leslie-tsang @tzssangglass @starsz, could you please take a look? Thanks.
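The cross-check done with the SLS query above can also be run offline on exported access-log records. The following is only a minimal Python sketch: the function name `find_shared_upstreams`, the record layout (dicts with `upstream_addr` and `proxy_upstream_name` keys), and the `10.0.0.1:80` control record are assumptions made for illustration, not part of APISIX or SLS. It reports every upstream address that received traffic for more than one upstream name, which is the symptom seen here.

```python
from collections import defaultdict


def find_shared_upstreams(records):
    """Return {upstream_addr: sorted upstream names} for every address
    that served more than one proxy_upstream_name.

    `records` is an iterable of dicts with the keys "upstream_addr" and
    "proxy_upstream_name" (an assumed export format, not an SLS API).
    """
    names_by_addr = defaultdict(set)
    for rec in records:
        names_by_addr[rec["upstream_addr"]].add(rec["proxy_upstream_name"])
    # An address seen under a single name is normal; keep only the rest.
    return {
        addr: sorted(names)
        for addr, names in names_by_addr.items()
        if len(names) > 1
    }


# Illustrative sample: the upstream names and the suspicious address come
# from the query result above; 10.0.0.1:80 is a made-up control record.
SAMPLE_RECORDS = [
    {"upstream_addr": "172.16.90.251:8080", "proxy_upstream_name": "sentry_stable-sentry-relay_3000"},
    {"upstream_addr": "172.16.90.251:8080", "proxy_upstream_name": "h5_h5-api_80"},
    {"upstream_addr": "172.16.90.251:8080", "proxy_upstream_name": "api_adv-api-online_8080"},
    {"upstream_addr": "172.16.90.251:8080", "proxy_upstream_name": "bigdata_sa-api-online_80"},
    {"upstream_addr": "10.0.0.1:80", "proxy_upstream_name": "h5_h5-api_80"},
]

if __name__ == "__main__":
    for addr, names in find_shared_upstreams(SAMPLE_RECORDS).items():
        print(addr, names)
```

Grouping by address rather than by upstream name means no suspicious IP has to be known in advance. Note that an address legitimately shared by several Kubernetes services would also be flagged, so any hit still needs to be checked against the actual pod IPs.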