Currently, we do service discovery on the DP (data plane) side. Each
APISIX instance talks to the service discovery system directly and
asks it for new nodes.

This service discovery framework has been in use for over a year, and
a number of issues have been identified:
1. Each APISIX instance needs to pull data from inside the service
discovery system, which complicates the network topology.
2. The data from each service discovery system has to be stored, as a
list of services, on every APISIX worker.
3. The service discovery configuration has to be set up separately on
each APISIX instance. For example, if you want to change a password,
you need to modify the configuration file and then publish it to each
APISIX instance.

That's why we designed the v2 service discovery framework. This
version is similar to the previous one, with two differences:
1. Service discovery is moved to the CP (control plane) side. We will
introduce a new component to do this.
2. When the nodes change, the updated data is written to etcd and then
synchronized to each APISIX instance. Each instance sees the latest
node information and no longer needs to do service discovery itself.
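
As a rough sketch of the "write only if there are changes" behaviour
in point 2, assuming the new component remembers the last node list it
wrote for each upstream: the NodeSyncer class, the put callback and
the key below are hypothetical names for illustration, and a plain
dict stands in for etcd.

```python
from typing import Callable, Dict

Nodes = Dict[str, int]  # "host:port" -> weight


class NodeSyncer:
    """Issues a write only when the discovered nodes differ from the last write."""

    def __init__(self, put: Callable[[str, Nodes], None]) -> None:
        self._put = put  # hypothetical stand-in for an etcd write
        self._last: Dict[str, Nodes] = {}

    def sync(self, key: str, nodes: Nodes) -> bool:
        """Return True if a write was issued, False if nothing changed."""
        if self._last.get(key) == nodes:
            return False
        self._put(key, dict(nodes))
        self._last[key] = dict(nodes)
        return True


store: Dict[str, Nodes] = {}  # in-memory stand-in for etcd
syncer = NodeSyncer(store.__setitem__)
syncer.sync("/apisix/upstreams/1", {"1.2.3.4:80": 1})  # new data: written
syncer.sync("/apisix/upstreams/1", {"1.2.3.4:80": 1})  # unchanged: skipped
```

Skipping unchanged writes keeps the number of etcd updates, and
therefore the sync traffic to each instance, proportional to the
number of real node changes.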

We don't need to make any modifications to APISIX itself. APISIX
writes upstream data like this:

```
    "upstream": {
        "type": "roundrobin",
        "service_name": "APISIX-NACOS",
        "discovery_type": "nacos",
        "discovery_args": {
          "namespace_id": "test_ns"
        }
    }
```

Then a separate component rewrites this data to:

```
    "upstream": {
        "nodes": {
            "1.2.3.4:80":1
        },
        "type": "roundrobin",
        "_service_name": "APISIX-NACOS",
        "_discovery_type": "nacos",
        "discovery_args": {
          "namespace_id": "test_ns"
        }
    }
```
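
The rewrite shown in the two examples above could be sketched roughly
as follows. This is only an illustration of the field renaming, not
the component's actual code: rewrite_upstream is a hypothetical name,
and the node list is assumed to come from the discovery backend.

```python
from typing import Any, Dict


def rewrite_upstream(upstream: Dict[str, Any], nodes: Dict[str, int]) -> Dict[str, Any]:
    """Return a copy of the upstream with discovery fields renamed and nodes filled in."""
    out = dict(upstream)
    # Prefix the discovery fields with "_" as in the example above;
    # "discovery_args" is left unchanged, matching the example.
    for field in ("service_name", "discovery_type"):
        if field in out:
            out["_" + field] = out.pop(field)
    # Fill in (or refresh) the concrete nodes.
    out["nodes"] = dict(nodes)
    return out


original = {
    "type": "roundrobin",
    "service_name": "APISIX-NACOS",
    "discovery_type": "nacos",
    "discovery_args": {"namespace_id": "test_ns"},
}
rewritten = rewrite_upstream(original, {"1.2.3.4:80": 1})
```

Calling it again on an already rewritten upstream only replaces the
"nodes" field, so the same function can serve for later updates.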

The component will watch upstreams that have "discovery_type" or
"_discovery_type" set, and make sure their nodes are up to date.

Service discovery used to be local to each instance, but now it is
centralized. New nodes are discovered at the center and pushed down to
the instances.

This solves the problems above:
1. The network topology becomes simpler.
2. The total data volume becomes smaller.
3. The configuration is easier to manage.

But it also has some drawbacks:
1. If the component is not available, the instances will not receive
the latest node information.
2. The pressure on etcd will increase.
