Background

1. Problem to be solved

APISIX uses etcd as its configuration center. etcd has exposed its interface over gRPC since v3. However, because no project in the OpenResty ecosystem supports gRPC, APISIX can only call etcd's HTTP interface.

The HTTP interface of etcd is provided through gRPC-gateway. Essentially, etcd runs an HTTP-to-gRPC proxy on the server side, and external HTTP requests are converted into gRPC requests. In practice, we have found some problems in the interaction between the HTTP API and the gRPC API. Having a gRPC-gateway does not mean that HTTP access is perfectly supported; there are subtle differences between the two.

2. The benefits of solving this problem

2.1 By interacting with etcd via gRPC, APISIX gets first-class support from etcd.

2.2 Since gRPC multiplexes all streams over a single TCP connection, we can also greatly reduce the number of connections.

2.3 The connections can be made more secure by using gRPC directly.

Solution

As the gRPC ability requires the grpc-client-nginx-module, we can't enable gRPC support when APISIX runs on vanilla OpenResty.

We interact with etcd in the scenes below:

Check and init in the cli mode
We can ship a grpcurl with APISIX, although this will make packaging harder.

Read/write keys in the Admin API
We need to write the gRPC version of core.etcd.set/delete/get/...

Sync conf during the startup
We need to make it possible to send a unary call in init_worker_by_lua, just as we have done in the HTTP way (replacing cosocket with luasocket). We can do this by introducing a separate blocking implementation for unary calls in the grpc-client-nginx-module.

Sync conf when running APISIX
There are two kinds of sync: full sync and incremental sync. For full sync, we will reimplement the readdir API in gRPC. For incremental sync, the current implementation is based on HTTP long polling, so each watch operation starts a new request and has its own timeout.
When we switch to gRPC, we will use a single connection at a time, so we need to provide a longer default timeout for gRPC. We also need to reimplement the one-shot watchdir API with the recv call in gRPC's server stream.
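
To make the full-sync readdir discussed above more concrete: a directory read in etcd v3 is usually expressed as one Range request over a key prefix, where range_end is the prefix with its last byte incremented. The sketch below is not APISIX code. It is a minimal Python illustration with hypothetical helper names (prefix_range_end, build_readdir_request, to_gateway_json) and an illustrative /apisix/routes/ prefix. It also shows one of the differences between the two access paths: a gRPC client sends keys as raw bytes, while the HTTP gateway expects them base64-encoded inside JSON.

```python
import base64
import json


def prefix_range_end(prefix: bytes) -> bytes:
    """Return the smallest key that sorts after every key starting with prefix."""
    end = bytearray(prefix)
    # Walk backwards to the last byte that can still be incremented.
    for i in range(len(end) - 1, -1, -1):
        if end[i] < 0xFF:
            end[i] += 1
            # Drop everything after the incremented byte.
            return bytes(end[: i + 1])
    # Empty prefix or all bytes are 0xff: etcd treats a range_end of "\0"
    # as "every key greater than or equal to key".
    return b"\x00"


def build_readdir_request(prefix: str) -> dict:
    """Fields of a Range request covering every key under prefix (raw bytes, as gRPC sends them)."""
    key = prefix.encode("utf-8")
    return {"key": key, "range_end": prefix_range_end(key)}


def to_gateway_json(request: dict) -> str:
    """The same request as the HTTP gateway expects it: base64 strings inside JSON."""
    return json.dumps(
        {name: base64.b64encode(value).decode("ascii") for name, value in request.items()}
    )


if __name__ == "__main__":
    req = build_readdir_request("/apisix/routes/")  # illustrative prefix
    print(req)                   # raw bytes for a gRPC Range call
    print(to_gateway_json(req))  # JSON body for the HTTP gateway's range endpoint
```

With a gRPC client, the two byte strings would go directly into the Range request; only the HTTP path needs the extra encoding step and the matching decode on every response.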