Oxidaner commented on code in PR #3340:
URL: https://github.com/apache/dubbo-go/pull/3340#discussion_r3331850881


##########
.agents/skills/debug/SKILL.md:
##########
@@ -0,0 +1,196 @@
+---
+name: dubbo-go-debugging
+description: Structured diagnosis for dubbo-go v3 runtime errors. Use when the user reports an error, pastes logs, mentions timeout/panic/connection refused/no provider/serialization mismatch, or asks why their dubbo-go service isn't working.
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Debugging dubbo-go
+
+When the user shares an error or log, match it to a pattern below. If unclear, ask for:
+
+1. Full error message or stack trace
+2. Registry type (Nacos / ZooKeeper / etcd / Polaris / direct)
+3. Protocol (Triple / Dubbo / gRPC / JSONRPC)
+4. Whether provider and consumer are separate processes
+
+## Quick Triage
+
+| Symptom in logs | Jump to |
+|---|---|
+| `no provider available`, `no route`, `Should has at least one way to know` | [No provider / no route](#no-provider--no-route) |
+| `dial tcp ... connection refused`, `i/o timeout` on connect | [Connection refused / dial error](#connection-refused--dial-error) |
+| `hessian: failed to decode`, `protobuf: cannot parse invalid wire-format data` | [Serialization / decode error](#serialization--decode-error) |
+| `context deadline exceeded`, `invoke timeout` | [Timeout](#timeout) |
+| `filter not found`, panic inside a filter | [Filter not found / filter panic](#filter-not-found--filter-panic) |
+| Provider exits seconds after start | [Provider starts but immediately exits](#provider-starts-but-immediately-exits) |
+| `404` at `/dubbo/openapi/...`, empty spec | [OpenAPI 404 / empty spec](#openapi-404--empty-spec) |
+| `AttachHTTPHandler` returns error | [AttachHTTPHandler errors](#attachhttphandler-errors) |
+| Pods drain slowly, in-flight requests truncated on shutdown | [Graceful shutdown](#graceful-shutdown) |
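
The triage table above is essentially a substring lookup, which can be sketched in a few lines of Go. This is only an illustration using the standard library: the `rule` type and `triage` function are hypothetical names, not part of dubbo-go, and only a subset of the table rows is encoded.

```go
package main

import (
	"fmt"
	"strings"
)

// rule pairs log substrings with the section that covers them.
// Hypothetical helper type, not a dubbo-go API.
type rule struct {
	needles []string
	section string
}

// rules mirrors a subset of the Quick Triage table. Order matters because
// the first matching rule wins.
var rules = []rule{
	{[]string{"no provider available", "no route", "Should has at least one way to know"}, "No provider / no route"},
	{[]string{"connection refused", "i/o timeout"}, "Connection refused / dial error"},
	{[]string{"hessian: failed to decode", "cannot parse invalid wire-format data"}, "Serialization / decode error"},
	{[]string{"context deadline exceeded", "invoke timeout"}, "Timeout"},
	{[]string{"filter not found"}, "Filter not found / filter panic"},
}

// triage returns the section name for the first rule whose substring
// appears in the log line, or "" when nothing matches.
func triage(line string) string {
	for _, r := range rules {
		for _, n := range r.needles {
			if strings.Contains(line, n) {
				return r.section
			}
		}
	}
	return ""
}

func main() {
	fmt.Println(triage("dial tcp 10.0.0.5:20000: connect: connection refused"))
}
```

An empty result corresponds to the "if unclear, ask for" path: no pattern matched, so more context is needed from the user.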
+
+## No provider / no route
+
+**Cause**: Consumer cannot find a provider in the registry.
+
+Checklist:
+- [ ] Provider started successfully? Look for `dubbo server started` and `A provider service ... was registered successfully` in provider logs.
+- [ ] Same registry address on both sides (`dubbo.WithRegistry(registry.WithAddress(...))` / YAML `dubbo.registries.xxx.address`)?
+- [ ] Same application name on both sides (`dubbo.WithName(...)`)? v3 defaults to **application-level** discovery: the registry stores the app name, not the interface FQN.
+- [ ] Same `interface` name passed to `pb.RegisterXxxHandler` and `pb.NewXxxService`?
+- [ ] Same protocol on both sides (both `tri` or both `dubbo`)?
+- [ ] Provider visible in the registry?
+
+```bash
+# Nacos — application-level discovery, query by app name
+curl "http://127.0.0.1:8848/nacos/v1/ns/instance/list?serviceName=<your-app-name>"
+
+# ZooKeeper
+zkCli.sh ls /services
+
+# etcd
+etcdctl get --prefix /services
+```
+
+If the consumer cannot map an interface to an application, supply the mapping as a hint with `client.WithProvidedBy("<app-name>")`.
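
When the application name contains characters that need escaping, the Nacos query shown in the shell snippet above can be built with Go's `net/url` instead of being typed by hand. A minimal sketch, assuming the same v1 instance-list path as the `curl` example; `instanceListURL` is a hypothetical helper, and the base address and app name are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
)

// instanceListURL builds the Nacos v1 instance-list query for an
// application name, escaping the name as a query parameter.
// Hypothetical helper, not part of dubbo-go or the Nacos SDK.
func instanceListURL(base, app string) string {
	q := url.Values{}
	q.Set("serviceName", app)
	return base + "/nacos/v1/ns/instance/list?" + q.Encode()
}

func main() {
	// Assumed local Nacos address and an illustrative application name.
	fmt.Println(instanceListURL("http://127.0.0.1:8848", "order-service"))
}
```

The resulting URL can be passed to `curl` or `http.Get`. If the named application has no registered instances in the response, the provider-side checklist items apply.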
+
+## Connection refused / dial error
+
+**Cause**: Network or port misconfiguration.
+
+Checklist:
+- [ ] Provider port (`protocol.WithPort(...)`) matches what the consumer is dialing?
+- [ ] Firewall / Docker network allows the port?
+- [ ] Inside Docker: using container/service name instead of `localhost`?
+- [ ] Provider actually listening? `lsof -i :20000` on the provider host.
+
+## Serialization / decode error
+
+```
+hessian: failed to decode
+protobuf: cannot parse invalid wire-format data
+```
+
+**Cause**: Provider and consumer are using different serialization formats.
+
+Checklist:
+- [ ] Both sides on the same protocol (both `tri` or both `dubbo`)?
+- [ ] Protobuf: same `.proto` compiled on both sides? Same `go_package`?
+- [ ] Hessian2: POJO `JavaClassName()` matches the Java class FQN exactly? `RegisterPOJO` called in an `init()` that actually runs?
+
+For Hessian2 specifics, see `dubbo-go-java-interop`.

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

