This is an automated email from the ASF dual-hosted git repository. jin pushed a commit to branch docs in repository https://gitbox.apache.org/repos/asf/incubator-hugegraph-computer.git
commit 5a6a5fe7374f831603ffdbe6efd76132b8a34da1 Author: imbajin <[email protected]> AuthorDate: Thu Jan 29 15:39:19 2026 +0800 docs: add AGENTS.md and update docs & gitignore Add AI-assistant guidance files (AGENTS.md) at repository root and under vermeer, and expand documentation across the project: significantly update top-level README.md, computer/README.md, and vermeer/README.md with architecture, quick-starts, build/test instructions, and examples. Also update CI badge link in README and add AI-assistant-specific ignore patterns to .gitignore and vermeer/.gitignore to avoid tracking assistant artifacts. --- .gitignore | 9 + AGENTS.md | 237 +++++++++++++++++++++++ README.md | 234 ++++++++++++++++++++--- computer/README.md | 523 ++++++++++++++++++++++++++++++++++++++++++++++++++- vermeer/.gitignore | 10 + vermeer/AGENTS.md | 207 +++++++++++++++++++++ vermeer/README.md | 536 +++++++++++++++++++++++++++++++++++++++++++++++------ 7 files changed, 1664 insertions(+), 92 deletions(-) diff --git a/.gitignore b/.gitignore index e909d27e..99283ffa 100644 --- a/.gitignore +++ b/.gitignore @@ -55,6 +55,15 @@ build/ *.log *.pyc +# AI assistant specific files (we only maintain AGENTS.md) +CLAUDE.md +GEMINI.md +CURSOR.md +COPILOT.md +.cursorrules +.cursor/ +.github/copilot-instructions.md + # maven ignore apache-hugegraph-*-incubating-*/ diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 00000000..b3afd40e --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,237 @@ +# AGENTS.md + +This file provides guidance to AI coding assistants when working with code in this repository. + +## Repository Overview + +This is the Apache HugeGraph-Computer repository containing two distinct graph computing systems: + +1. **computer** (Java/Maven): A distributed BSP/Pregel-style graph processing framework that runs on Kubernetes or YARN +2. **vermeer** (Go): A high-performance in-memory graph computing platform with master-worker architecture + +Both integrate with HugeGraph for graph data input/output. 
## Build & Test Commands

### Computer (Java)

**Prerequisites:**
- JDK 11 for building/running
- JDK 8 for HDFS dependencies
- Maven 3.5+
- For K8s module: run `mvn clean install` first to generate CRD classes under computer-k8s

**Build:**
```bash
cd computer
mvn clean compile -Dmaven.javadoc.skip=true
```

**Tests:**
```bash
# Unit tests
mvn test -P unit-test

# Integration tests
mvn test -P integrate-test
```

**Run single test:**
```bash
# Run specific test class
mvn test -P unit-test -Dtest=ClassName

# Run specific test method
mvn test -P unit-test -Dtest=ClassName#methodName
```

**License check:**
```bash
mvn apache-rat:check
```

**Package:**
```bash
mvn clean package -DskipTests
```

### Vermeer (Go)

**Prerequisites:**
- Go 1.23+
- `curl` and `unzip` (for downloading binary dependencies)

**First-time setup:**
```bash
cd vermeer
make init  # Downloads supervisord and protoc binaries, installs Go deps
```

**Build:**
```bash
make                    # Build for current platform
make build-linux-amd64
make build-linux-arm64
```

**Development build with hot-reload UI:**
```bash
go build -tags=dev
```

**Clean:**
```bash
make clean      # Remove built binaries and generated assets
make clean-all  # Also remove downloaded tools
```

**Run:**
```bash
# Using binary directly
./vermeer --env=master
./vermeer --env=worker

# Using script (configure in vermeer.sh)
./vermeer.sh start master
./vermeer.sh start worker
```

**Regenerate protobuf (if proto files changed):**
```bash
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
tools/protoc/osxm1/protoc *.proto --go-grpc_out=. --go_out=.
```

## Architecture

### Computer (Java) - BSP/Pregel Framework

**Module Structure:**
- `computer-api`: Public interfaces for graph processing (Computation, Vertex, Edge, Aggregator, Combiner, GraphFactory)
- `computer-core`: Runtime implementation (WorkerService, MasterService, messaging, BSP coordination, managers)
- `computer-algorithm`: Built-in algorithms (PageRank, LPA, WCC, SSSP, TriangleCount, etc.)
- `computer-driver`: Job submission and driver-side coordination
- `computer-k8s`: Kubernetes deployment integration
- `computer-yarn`: YARN deployment integration
- `computer-k8s-operator`: Kubernetes operator for job management
- `computer-dist`: Distribution packaging
- `computer-test`: Integration and unit tests

**Key Design Patterns:**

1. **API/Implementation Separation**: Algorithms depend only on `computer-api` interfaces; `computer-core` provides runtime implementation. Algorithms are dynamically loaded via config.

2. **Manager Pattern**: `WorkerService` composes multiple managers (MessageSendManager, MessageRecvManager, WorkerAggrManager, DataServerManager, SortManagers, SnapshotManager, etc.) with lifecycle hooks: `initAll()`, `beforeSuperstep()`, `afterSuperstep()`, `closeAll()`.

3. **BSP Coordination**: Explicit barrier synchronization via etcd (EtcdBspClient). Each superstep follows:
   - `workerStepPrepareDone` → `waitMasterStepPrepareDone`
   - Local compute (vertices process messages)
   - `workerStepComputeDone` → `waitMasterStepComputeDone`
   - Aggregators/snapshots
   - `workerStepDone` → `waitMasterStepDone` (master returns SuperstepStat)

4.
**Computation Contract**: Algorithms implement `Computation<M extends Value>`: + - `compute0(context, vertex)`: Initialize at superstep 0 + - `compute(context, vertex, messages)`: Process messages in subsequent supersteps + - Access to aggregators, combiners, and message sending via `ComputationContext` + +**Important Files:** +- Algorithm contract: `computer/computer-api/src/main/java/org/apache/hugegraph/computer/core/worker/Computation.java` +- Runtime orchestration: `computer/computer-core/src/main/java/org/apache/hugegraph/computer/core/worker/WorkerService.java` +- BSP coordination: `computer/computer-core/src/main/java/org/apache/hugegraph/computer/core/bsp/Bsp4Worker.java` +- Example algorithm: `computer/computer-algorithm/src/main/java/org/apache/hugegraph/computer/algorithm/centrality/pagerank/PageRank.java` + +### Vermeer (Go) - In-Memory Computing Engine + +**Directory Structure:** +- `algorithms/`: Go algorithm implementations (pagerank.go, sssp.go, louvain.go, etc.) +- `apps/`: + - `bsp/`: BSP coordination helpers + - `graphio/`: HugeGraph I/O adapters (reads via gRPC to store/pd, writes via HTTP REST) + - `master/`: Master scheduling, HTTP endpoints, worker management + - `compute/`: Worker-side compute logic + - `protos/`: Generated protobuf/gRPC definitions + - `common/`: Utilities, logging, metrics +- `client/`: Client libraries +- `tools/`: Binary dependencies (supervisord, protoc) +- `ui/`: Web UI assets + +**Key Patterns:** + +1. **Maker/Registry Pattern**: Graph loaders/writers register themselves via init() (e.g., `LoadMakers[LoadTypeHugegraph] = &HugegraphMaker{}`). Master selects loader by type. + +2. **HugeGraph Integration**: + - `hugegraph.go` implements HugegraphMaker, HugegraphLoader, HugegraphWriter + - Queries PD via gRPC for partition metadata + - Streams vertex/edge data via gRPC from store (ScanPartition) + - Writes results back via HugeGraph HTTP REST API + +3. **Master-Worker**: Master schedules LoadPartition tasks to workers, manages worker lifecycle via WorkerManager/WorkerClient, exposes HTTP admin endpoints. + +**Important Files:** +- HugeGraph integration: `vermeer/apps/graphio/hugegraph.go` +- Master scheduling: `vermeer/apps/master/tasks/tasks.go` +- Worker management: `vermeer/apps/master/workers/workers.go` +- HTTP endpoints: `vermeer/apps/master/services/http_master.go` + +## Integration with HugeGraph + +**Computer (Java):** +- `WorkerInputManager` reads vertices/edges from HugeGraph via `GraphFactory` abstraction +- Graph data is partitioned and distributed to workers via input splits + +**Vermeer (Go):** +- Directly queries HugeGraph PD (metadata service) for partition information +- Uses gRPC to stream graph data from HugeGraph store +- Writes computed results back via HugeGraph HTTP REST API (adds properties to vertices) + +## Development Workflow + +**Adding a New Algorithm (Computer):** +1. Create class in `computer-algorithm` implementing `Computation<MessageType>` +2. Implement `compute0()` for initialization and `compute()` for message processing +3. Use `context.sendMessage()` or `context.sendMessageToAllEdges()` for message passing +4. Register aggregators in `beforeSuperstep()`, read/write in `compute()` +5. 
Configure algorithm class name in job config + +**K8s-Operator Development:** +- CRD classes are auto-generated; run `mvn clean install` in `computer-k8s-operator` first +- Generated classes appear in `computer-k8s/target/generated-sources/` +- CRD generation script: `computer-k8s-operator/crd-generate/Makefile` + +**Vermeer Asset Updates:** +- Web UI assets must be regenerated after changes: `cd asset && go generate` +- Or use `make generate-assets` from vermeer root +- For dev mode with hot-reload: `go build -tags=dev` + +## Testing Notes + +**Computer:** +- Integration tests require etcd, HDFS, HugeGraph, and Kubernetes (see `.github/workflows/computer-ci.yml`) +- Test environment setup scripts in `computer-dist/src/assembly/travis/` +- Unit tests run in isolation without external dependencies + +**Vermeer:** +- Test scripts in `vermeer/test/` +- Configuration files in `vermeer/config/` (master.ini, worker.ini templates) + +## CI/CD + +CI pipeline (`.github/workflows/computer-ci.yml`) runs: +1. License check (Apache RAT) +2. Setup HDFS (Hadoop 3.3.2) +3. Setup Minikube/Kubernetes +4. Load test data into HugeGraph +5. Compile with Java 11 +6. Run integration tests (`-P integrate-test`) +7. Run unit tests (`-P unit-test`) +8. Upload coverage to Codecov + +## Important Notes + +- **Computer K8s module**: Must run `mvn clean install` before editing to generate CRD classes +- **Java version**: Build requires JDK 11; HDFS dependencies require JDK 8 +- **Vermeer binary deps**: First-time builds need `make init` to download supervisord/protoc +- **BSP coordination**: Computer uses etcd for barrier synchronization (configure via `BSP_ETCD_URL`) +- **Memory management**: Both systems auto-manage memory by spilling to disk when needed diff --git a/README.md b/README.md index f557fc63..eee4c05e 100644 --- a/README.md +++ b/README.md @@ -1,50 +1,234 @@ # Apache HugeGraph-Computer [](https://www.apache.org/licenses/LICENSE-2.0.html) -[](https://github.com/apache/hugegraph-computer/actions/workflows/ci.yml) +[](https://github.com/apache/hugegraph-computer/actions/workflows/computer-ci.yml) [](https://codecov.io/gh/apache/incubator-hugegraph-computer) [](https://hub.docker.com/repository/docker/hugegraph/hugegraph-computer) [](https://deepwiki.com/apache/hugegraph-computer) -The [hugegraph-computer](./computer/README.md) is a distributed graph processing system for hugegraph. 
-(Also, the in-memory computing engine(vermeer) is on the way 🚧)

Apache HugeGraph-Computer is a comprehensive graph computing solution providing two complementary systems for different deployment scenarios:

- **[Vermeer](./vermeer/README.md)** (Go): High-performance in-memory computing engine for single-machine deployments
- **[Computer](./computer/README.md)** (Java): Distributed BSP/Pregel framework for large-scale cluster computing

## Quick Comparison

| Feature | Vermeer (Go) | Computer (Java) |
|---------|--------------|-----------------|
| **Best for** | Single machine, quick start | Large-scale distributed computing |
| **Deployment** | Single binary | Kubernetes or YARN cluster |
| **Memory model** | In-memory first | Auto spill to disk |
| **Setup time** | Minutes | Hours (requires K8s/YARN) |
| **Algorithms** | 20+ algorithms | 45+ algorithms |
| **Architecture** | Master-Worker | BSP (Bulk Synchronous Parallel) |
| **API** | REST + gRPC | Java API |
| **Web UI** | Built-in dashboard | N/A |
| **Data sources** | HugeGraph, CSV, HDFS | HugeGraph, HDFS |

## Architecture Overview

```mermaid
graph TB
    subgraph HugeGraph-Computer
        subgraph Vermeer["Vermeer (Go) - In-Memory Engine"]
            VM[Master :6688] --> VW1[Worker 1 :6789]
            VM --> VW2[Worker 2 :6789]
            VM --> VW3[Worker N :6789]
        end
        subgraph Computer["Computer (Java) - Distributed BSP"]
            CM[Master Service] --> CW1[Worker Pod 1]
            CM --> CW2[Worker Pod 2]
            CM --> CW3[Worker Pod N]
        end
    end

    HG[(HugeGraph Server)] <--> Vermeer
    HG <--> Computer

    style Vermeer fill:#e1f5fe
    style Computer fill:#fff3e0
```

## HugeGraph Ecosystem Integration

```
┌─────────────────────────────────────────────────────────────┐
│                     HugeGraph Ecosystem                     │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────────┐    │
│  │   Hubble    │   │  Toolchain  │   │  HugeGraph-AI   │    │
│  │    (UI)     │   │   (Tools)   │   │   (LLM/RAG)     │    │
│  └──────┬──────┘   └──────┬──────┘   └────────┬────────┘    │
│         │                 │                   │             │
│         └─────────────────┼───────────────────┘             │
│                           │                                 │
│                   ┌───────▼───────┐                         │
│                   │   HugeGraph   │                         │
│                   │    Server     │                         │
│                   └───────┬───────┘                         │
│                           │                                 │
│         ┌─────────────────┼───────────────────┐             │
│         │                 │                   │             │
│  ┌──────▼──────┐   ┌──────▼──────┐   ┌────────▼───────┐     │
│  │   Vermeer   │   │  Computer   │   │     Store      │     │
│  │  (Memory)   │   │  (BSP/K8s)  │   │      (PD)      │     │
│  └─────────────┘   └─────────────┘   └────────────────┘     │
└─────────────────────────────────────────────────────────────┘
```

-## Learn More
## Getting Started with Vermeer (Recommended)

-The [project homepage](https://hugegraph.apache.org/docs/quickstart/hugegraph-computer/) contains more information about hugegraph-computer.
For quick start and single-machine deployments, we recommend **Vermeer**:

-And here are links of other repositories:
### Docker Quick Start

-1. [hugegraph](https://github.com/apache/hugegraph) (graph's core component - Graph server + PD + Store)
-2. [hugegraph-toolchain](https://github.com/apache/hugegraph-toolchain) (graph tools **[loader](https://github.com/apache/incubator-hugegraph-toolchain/tree/master/hugegraph-loader)/[dashboard](https://github.com/apache/incubator-hugegraph-toolchain/tree/master/hugegraph-hubble)/[tool](https://github.com/apache/incubator-hugegraph-toolchain/tree/master/hugegraph-tools)/[client](https://github.com/apache/incubator-hugegraph-toolchain/tree/master/hugegraph-client)**)
-3. [hugegraph-ai](https://github.com/apache/incubator-hugegraph-ai) (integrated **Graph AI/LLM/KG** system)
-4.
[hugegraph-website](https://github.com/apache/hugegraph-doc) (**doc & website** code) +```bash +# Pull the image +docker pull hugegraph/vermeer:latest +# Run with docker-compose +docker-compose up -d +``` -## Note +### Binary Quick Start -- If some classes under computer-k8s cannot be found, you need to execute `mvn clean install` in advance to generate corresponding classes. +```bash +# Download and extract (example for Linux AMD64) +wget https://github.com/apache/hugegraph-computer/releases/download/vX.X.X/vermeer-linux-amd64.tar.gz +tar -xzf vermeer-linux-amd64.tar.gz +cd vermeer + +# Run master and worker +./vermeer --env=master & +./vermeer --env=worker & +``` + +See the **[Vermeer README](./vermeer/README.md)** for detailed configuration and usage. + +## Getting Started with Computer (Distributed) + +For large-scale distributed graph processing across clusters: + +### Prerequisites + +- JDK 11 or later +- Maven 3.5+ +- Kubernetes cluster or YARN cluster + +### Build from Source + +```bash +cd computer +mvn clean package -DskipTests +``` + +### Deploy on Kubernetes + +```bash +# Configure your K8s cluster in computer-k8s module +# Submit a graph computing job +java -jar computer-driver.jar --config job-config.properties +``` + +See the **[Computer README](./computer/README.md)** for detailed deployment and development guide. + +## Supported Algorithms + +### Common Algorithms (Both Systems) + +| Category | Algorithms | +|----------|-----------| +| **Centrality** | PageRank, Personalized PageRank, Betweenness Centrality, Closeness Centrality, Degree Centrality | +| **Community Detection** | Louvain, LPA (Label Propagation), SLPA, WCC (Weakly Connected Components) | +| **Path Finding** | SSSP (Single Source Shortest Path), BFS (Breadth-First Search) | +| **Graph Structure** | Triangle Count, K-Core, Clustering Coefficient, Cycle Detection | +| **Similarity** | Jaccard Similarity | + +### Vermeer-Specific Features + +- In-memory optimized implementations +- Weighted Louvain variant +- REST API for algorithm execution + +### Computer-Specific Algorithms + +- Count Triangle (distributed implementation) +- Rings detection +- ClusteringCoefficient variations +- Custom algorithm development framework + +See individual README files for complete algorithm lists and usage examples. 
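To see how the two systems are driven in practice, here is a hedged sketch of launching PageRank on each, reusing commands that appear in the system READMEs (the graph name, ports, and properties file are illustrative placeholders):

```bash
# Vermeer: trigger PageRank over the REST API (assumes a graph already loaded as "my_graph")
curl -X POST http://localhost:6688/api/v1/compute \
  -H "Content-Type: application/json" \
  -d '{"graph_name": "my_graph", "algorithm": "pagerank", "params": {"max_iterations": 20}}'

# Computer: submit PageRank as a batch job via the driver
# (job-config.properties sets algorithm.class and cluster resources)
java -jar computer-driver.jar --config job-config.properties
```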
+ +## Performance Characteristics + +### Vermeer (In-Memory) + +- **Throughput**: Optimized for fast iteration on medium-sized graphs (millions of vertices/edges) +- **Latency**: Sub-second query response via REST API +- **Memory**: Requires graph to fit in total worker memory +- **Scalability**: Horizontal scaling by adding worker nodes + +### Computer (Distributed BSP) + +- **Throughput**: Handles billions of vertices/edges via distributed processing +- **Latency**: Batch-oriented with superstep barriers +- **Memory**: Auto spill to disk when memory is insufficient +- **Scalability**: Elastic scaling on K8s with pod autoscaling + +## Use Cases + +### When to Use Vermeer + +- Quick prototyping and experimentation +- Interactive graph analytics with Web UI +- Medium-scale graphs (up to hundreds of millions of edges) +- Single-machine or small cluster deployments +- REST API integration requirements + +### When to Use Computer + +- Large-scale batch processing (billions of vertices) +- Existing Kubernetes or YARN infrastructure +- Custom algorithm development with Java +- Memory-constrained environments (auto spill to disk) +- Integration with Hadoop ecosystem + +## Documentation + +- [Project Homepage](https://hugegraph.apache.org/docs/quickstart/hugegraph-computer/) +- [Vermeer Documentation](./vermeer/README.md) +- [Computer Documentation](./computer/README.md) +- [HugeGraph Documentation](https://hugegraph.apache.org/docs/) + +## Related Projects + +1. [hugegraph](https://github.com/apache/hugegraph) - Graph database core (Server + PD + Store) +2. [hugegraph-toolchain](https://github.com/apache/hugegraph-toolchain) - Graph tools (Loader/Hubble/Tools/Client) +3. [hugegraph-ai](https://github.com/apache/incubator-hugegraph-ai) - Graph AI/LLM/Knowledge Graph system +4. [hugegraph-website](https://github.com/apache/hugegraph-doc) - Documentation and website ## Contributing -- Welcome to contribute to HugeGraph, please see [How to Contribute](https://hugegraph.apache.org/docs/contribution-guidelines/contribute/) for more information. -- Note: It's recommended to use [GitHub Desktop](https://desktop.github.com/) to greatly simplify the PR and commit process. -- Thank you to all the people who already contributed to HugeGraph! +Welcome to contribute to HugeGraph-Computer! Please see: -[](https://github.com/apache/incubator-hugegraph-computer/graphs/contributors) +- [How to Contribute](https://hugegraph.apache.org/docs/contribution-guidelines/contribute/) for guidelines +- [GitHub Issues](https://github.com/apache/hugegraph-computer/issues) for bug reports and feature requests -## License +We recommend using [GitHub Desktop](https://desktop.github.com/) to simplify the PR process. + +Thank you to all contributors! -hugegraph-computer is licensed under [Apache 2.0](https://github.com/apache/incubator-hugegraph-computer/blob/master/LICENSE) License. +[](https://github.com/apache/incubator-hugegraph-computer/graphs/contributors) -### Contact Us +## License ---- +HugeGraph-Computer is licensed under [Apache 2.0 License](https://github.com/apache/incubator-hugegraph-computer/blob/master/LICENSE). 
- - [GitHub Issues](https://github.com/apache/incubator-hugegraph-computer/issues): Feedback on usage issues and functional requirements (quick response)
- Feedback Email: [dev@hugegraph.apache.org](mailto:dev@hugegraph.apache.org) ([subscriber](https://hugegraph.apache.org/docs/contribution-guidelines/subscribe/) only)
- Slack: [join the ASF HugeGraph channel](https://the-asf.slack.com/archives/C059UU2FJ23)
- WeChat public account: Apache HugeGraph, welcome to scan this QR code to follow us.
- <img src="https://github.com/apache/hugegraph-doc/blob/master/assets/images/wechat.png?raw=true" alt="QR png" width="350"/>

## Contact Us

- **GitHub Issues**: [Report bugs or request features](https://github.com/apache/incubator-hugegraph-computer/issues)
- **Email**: [dev@hugegraph.apache.org](mailto:dev@hugegraph.apache.org) ([subscribe first](https://hugegraph.apache.org/docs/contribution-guidelines/subscribe/))
- **Slack**: [Join ASF HugeGraph channel](https://the-asf.slack.com/archives/C059UU2FJ23)
- **WeChat**: Scan QR code to follow Apache HugeGraph official account

<img src="https://github.com/apache/hugegraph-doc/blob/master/assets/images/wechat.png?raw=true" alt="WeChat QR Code" width="350"/>

diff --git a/computer/README.md b/computer/README.md
index 7368f088..173c576e 100644
--- a/computer/README.md
+++ b/computer/README.md
@@ -1,12 +1,519 @@
-# Apache HugeGraph-Computer
# Apache HugeGraph-Computer (Java)

-The hugegraph-computer is a distributed graph processing system for hugegraph. It is an implementation of [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf). It runs on Kubernetes or YARN framework.
HugeGraph-Computer is a distributed graph processing framework implementing the [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf) model (BSP - Bulk Synchronous Parallel). It runs on Kubernetes or YARN clusters and integrates with HugeGraph for graph input/output.

## Features

-- Support distributed MPP graph computing, and integrates with HugeGraph as graph input/output storage.
-- Based on BSP (Bulk Synchronous Parallel) model, an algorithm performs computing through multiple parallel iterations, every iteration is a superstep.
-- Auto memory management. The framework will never be OOM(Out of Memory) since it will split some data to disk if it doesn't have enough memory to hold all the data.
-- The part of edges or the messages of supernode can be in memory, so you will never lose it.
-- You can load the data from HDFS or HugeGraph, output the results to HDFS or HugeGraph, or adapt any other systems manually as needed.
-- Easy to develop a new algorithm. You need to focus on a vertex only processing just like as in a single server, without worrying about message transfer and memory/storage management.
- **Distributed MPP Computing**: Massively parallel graph processing across cluster nodes
- **BSP Model**: Algorithm execution through iterative supersteps with global synchronization
- **Auto Memory Management**: Automatic spill to disk when memory is insufficient - never OOM
- **Flexible Data Sources**: Load from HugeGraph or HDFS, output to HugeGraph or HDFS
- **Easy Algorithm Development**: Focus on single-vertex logic without worrying about distribution
- **Production-Ready**: Battle-tested on billion-scale graphs with Kubernetes integration

## Architecture

### Module Structure

```
┌──────────────────────────────────────────────────────────────┐
│                      HugeGraph-Computer                      │
├──────────────────────────────────────────────────────────────┤
│  ┌───────────────────────────────────────────────────────┐   │
│  │                    computer-driver                    │   │
│  │             (Job Submission & Coordination)           │   │
│  └───────────────────────────┬───────────────────────────┘   │
│                              │                               │
│  ┌───────────────────────────▼───────────────────────────┐   │
│  │             Deployment Layer (choose one)             │   │
│  │      ┌──────────────┐        ┌───────────────┐        │   │
│  │      │ computer-k8s │        │ computer-yarn │        │   │
│  │      └──────────────┘        └───────────────┘        │   │
│  └───────────────────────────┬───────────────────────────┘   │
│                              │                               │
│  ┌───────────────────────────▼───────────────────────────┐   │
│  │                     computer-core                     │   │
│  │          (WorkerService, MasterService, BSP)          │   │
│  │  ┌─────────────────────────────────────────────────┐  │   │
│  │  │  Managers: Message, Aggregation, Snapshot...    │  │   │
│  │  └─────────────────────────────────────────────────┘  │   │
│  └───────────────────────────┬───────────────────────────┘   │
│                              │                               │
│  ┌───────────────────────────▼───────────────────────────┐   │
│  │                  computer-algorithm                   │   │
│  │      (PageRank, LPA, WCC, SSSP, TriangleCount...)     │   │
│  └───────────────────────────┬───────────────────────────┘   │
│                              │                               │
│  ┌───────────────────────────┴───────────────────────────┐   │
│  │                     computer-api                      │   │
│  │     (Computation, Vertex, Edge, Aggregator, Value)    │   │
│  └───────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘
```

### Module Descriptions

| Module | Description |
|--------|-------------|
| **computer-api** | Public interfaces for algorithm development (`Computation`, `Vertex`, `Edge`, `Aggregator`, `Combiner`) |
| **computer-core** | Runtime implementation (WorkerService, MasterService, messaging, BSP coordination, memory management) |
| **computer-algorithm** | Built-in graph algorithms (45+ implementations) |
| **computer-driver** | Job submission and driver-side coordination |
| **computer-k8s** | Kubernetes deployment integration |
| **computer-yarn** | YARN deployment integration |
| **computer-k8s-operator** | Kubernetes operator for job lifecycle management |
| **computer-dist** | Distribution packaging and assembly |
| **computer-test** | Integration tests and unit tests |

## Prerequisites

- **JDK 11** or later (for building and running)
- **Maven 3.5+** for building
- **Kubernetes cluster** or **YARN cluster** for deployment
- **etcd** for BSP coordination (configured via `BSP_ETCD_URL`)

**Note**: For K8s-operator module development, run `mvn clean install` in `computer-k8s-operator` first to generate CRD classes.
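The etcd prerequisite is easy to miss when running Computer outside CI. One hedged way to stand up a throwaway single-node etcd locally is the public Bitnami image (the image name and env flag are general Docker ecosystem knowledge, not part of this repo):

```bash
# Disposable single-node etcd for local experiments (no auth)
docker run -d --name etcd-bsp -p 2379:2379 \
  -e ALLOW_NONE_AUTHENTICATION=yes \
  bitnami/etcd:latest

# Then point the job at it via the BSP config, e.g.:
# bsp.etcd.url=http://127.0.0.1:2379
```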
+ +## Quick Start + +### Build from Source + +```bash +cd computer + +# Compile (skip javadoc for faster builds) +mvn clean compile -Dmaven.javadoc.skip=true + +# Package (skip tests for faster packaging) +mvn clean package -DskipTests +``` + +### Run Tests + +```bash +# Unit tests +mvn test -P unit-test + +# Integration tests (requires etcd, K8s, HugeGraph) +mvn test -P integrate-test + +# Run specific test class +mvn test -P unit-test -Dtest=ClassName + +# Run specific test method +mvn test -P unit-test -Dtest=ClassName#methodName +``` + +### License Check + +```bash +mvn apache-rat:check +``` + +### Deploy on Kubernetes + +#### 1. Configure Job + +Create `job-config.properties`: + +```properties +# Algorithm class +algorithm.class=org.apache.hugegraph.computer.algorithm.centrality.pagerank.PageRank + +# HugeGraph connection +hugegraph.url=http://hugegraph-server:8080 +hugegraph.graph=hugegraph + +# K8s configuration +k8s.namespace=default +k8s.image=hugegraph/hugegraph-computer:latest +k8s.master.cpu=2 +k8s.master.memory=4Gi +k8s.worker.replicas=3 +k8s.worker.cpu=4 +k8s.worker.memory=8Gi + +# BSP coordination (etcd) +bsp.etcd.url=http://etcd-cluster:2379 + +# Algorithm parameters (PageRank example) +pagerank.damping_factor=0.85 +pagerank.max_iterations=20 +pagerank.convergence_tolerance=0.0001 +``` + +#### 2. Submit Job + +```bash +java -jar computer-driver/target/computer-driver-${VERSION}.jar \ + --config job-config.properties +``` + +#### 3. Monitor Job + +```bash +# Check pod status +kubectl get pods -n default + +# View master logs +kubectl logs hugegraph-computer-master-xxx -n default + +# View worker logs +kubectl logs hugegraph-computer-worker-0 -n default +``` + +### Deploy on YARN + +**Note**: YARN deployment support is under development. Use Kubernetes for production deployments. 
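After the job finishes, results are written back to HugeGraph as vertex properties (see the Configuration Reference below). A hedged spot-check via HugeGraph Server's standard REST API, with the graph name and endpoint taken from general HugeGraph usage rather than this repo:

```bash
# List a few vertices and verify the output property (e.g. pagerank_value) is present
curl "http://hugegraph-server:8080/apis/graphs/hugegraph/graph/vertices?limit=3"
```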
## Available Algorithms

### Centrality Algorithms

| Algorithm | Class | Description |
|-----------|-------|-------------|
| PageRank | `algorithm.centrality.pagerank.PageRank` | Standard PageRank |
| Personalized PageRank | `algorithm.centrality.pagerank.PersonalizedPageRank` | Source-specific PageRank |
| Betweenness Centrality | `algorithm.centrality.betweenness.BetweennessCentrality` | Shortest-path-based centrality |
| Closeness Centrality | `algorithm.centrality.closeness.ClosenessCentrality` | Average distance centrality |
| Degree Centrality | `algorithm.centrality.degree.DegreeCentrality` | In/out degree counting |

### Community Detection

| Algorithm | Class | Description |
|-----------|-------|-------------|
| LPA | `algorithm.community.lpa.Lpa` | Label Propagation Algorithm |
| WCC | `algorithm.community.wcc.Wcc` | Weakly Connected Components |
| Louvain | `algorithm.community.louvain.Louvain` | Modularity-based community detection |
| K-Core | `algorithm.community.kcore.KCore` | K-core decomposition |

### Path Finding

| Algorithm | Class | Description |
|-----------|-------|-------------|
| SSSP | `algorithm.path.sssp.Sssp` | Single Source Shortest Path |
| BFS | `algorithm.traversal.bfs.Bfs` | Breadth-First Search |
| Rings | `algorithm.path.rings.Rings` | Cycle/ring detection |

### Graph Structure

| Algorithm | Class | Description |
|-----------|-------|-------------|
| Triangle Count | `algorithm.trianglecount.TriangleCount` | Count triangles |
| Clustering Coefficient | `algorithm.clusteringcoefficient.ClusteringCoefficient` | Local clustering measure |

**Full algorithm list**: See `computer-algorithm/src/main/java/org/apache/hugegraph/computer/algorithm/`

## Developing Custom Algorithms

### Algorithm Contract

Algorithms implement the `Computation` interface from `computer-api`:

```java
package org.apache.hugegraph.computer.core.worker;

public interface Computation<M extends Value> {

    /**
     * Initialization at superstep 0
     */
    void compute0(ComputationContext context, Vertex vertex);

    /**
     * Message processing in subsequent supersteps
     */
    void compute(ComputationContext context, Vertex vertex, Iterator<M> messages);
}
```

### Example: Simple PageRank

```java
package org.apache.hugegraph.computer.algorithm.centrality.pagerank;

import java.util.Iterator;

import org.apache.hugegraph.computer.core.config.Config;
import org.apache.hugegraph.computer.core.graph.value.DoubleValue;
import org.apache.hugegraph.computer.core.graph.vertex.Vertex;
import org.apache.hugegraph.computer.core.worker.Computation;
import org.apache.hugegraph.computer.core.worker.ComputationContext;

public class PageRank implements Computation<DoubleValue> {

    public static final String OPTION_ALPHA = "pagerank.alpha";
    public static final String OPTION_MAX_ITERATIONS = "pagerank.max_iterations";

    private double alpha;
    private int maxIterations;

    @Override
    public void init(Config config) {
        this.alpha = config.getDouble(OPTION_ALPHA, 0.85);
        this.maxIterations = config.getInt(OPTION_MAX_ITERATIONS, 20);
    }

    @Override
    public void compute0(ComputationContext context, Vertex vertex) {
        // Initialize: set initial PR value
        vertex.value(new DoubleValue(1.0));

        // Send PR to neighbors
        int edgeCount = vertex.numEdges();
        if (edgeCount > 0) {
            double contribution = 1.0 / edgeCount;
            context.sendMessageToAllEdges(vertex, new DoubleValue(contribution));
        }
    }

    @Override
    public void compute(ComputationContext context, Vertex vertex, Iterator<DoubleValue> messages) {
        // Sum incoming PR contributions
        double sum = 0.0;
        while (messages.hasNext()) {
            sum += messages.next().value();
        }

        // Calculate new PR value
        double newPR = (1.0 - alpha) + alpha * sum;
        vertex.value(new DoubleValue(newPR));

        // Send to neighbors if not converged
        if (context.superstep() < maxIterations) {
            int edgeCount = vertex.numEdges();
            if (edgeCount > 0) {
                double contribution = newPR / edgeCount;
                context.sendMessageToAllEdges(vertex, new DoubleValue(contribution));
            }
        } else {
            vertex.inactivate();
        }
    }
}
```

### Key Concepts

#### 1. Supersteps

- **Superstep 0**: Initialization via `compute0()`
- **Superstep 1+**: Message processing via `compute()`
- **Barrier Synchronization**: All workers complete superstep N before starting N+1

#### 2. Message Passing

```java
// Send to specific vertex
context.sendMessage(targetId, new DoubleValue(1.0));

// Send to all outgoing edges
context.sendMessageToAllEdges(vertex, new DoubleValue(1.0));
```

#### 3. Aggregators

Global state shared across all workers:

```java
// Register aggregator in compute0()
context.registerAggregator("sum", new DoubleValue(0.0), SumAggregator.class);

// Write to aggregator
context.aggregateValue("sum", new DoubleValue(vertex.value()));

// Read aggregator value (available in next superstep)
DoubleValue total = context.aggregatedValue("sum");
```

#### 4. Combiners

Reduce message volume by combining messages at sender:

```java
public class SumCombiner implements Combiner<DoubleValue> {
    @Override
    public void combine(DoubleValue v1, DoubleValue v2, DoubleValue result) {
        result.value(v1.value() + v2.value());
    }
}
```

### Algorithm Development Workflow

1. **Implement `Computation` interface** in `computer-algorithm`
2. **Add configuration options** with `OPTION_*` constants
3. **Implement `compute0()` for initialization**
4. **Implement `compute()` for message processing**
5. **Configure in job properties**:
   ```properties
   algorithm.class=com.example.MyAlgorithm
   myalgorithm.param1=value1
   ```
6. **Build and test**:
   ```bash
   mvn clean package -DskipTests
   ```

## BSP Coordination

HugeGraph-Computer uses etcd for BSP barrier synchronization:

### BSP Lifecycle (per superstep)

1. **Worker Prepare**: `workerStepPrepareDone` → `waitMasterStepPrepareDone`
2. **Compute Phase**: Workers process vertices and messages locally
3. **Worker Compute Done**: `workerStepComputeDone` → `waitMasterStepComputeDone`
4. **Aggregation**: Aggregators combine global state
5. **Worker Step Done**: `workerStepDone` → `waitMasterStepDone` (master returns `SuperstepStat`)

### Manager Pattern

`WorkerService` composes multiple managers with lifecycle hooks:

- `MessageSendManager`: Outgoing message buffering and sending
- `MessageRecvManager`: Incoming message receiving and sorting
- `WorkerAggrManager`: Aggregator value collection
- `DataServerManager`: Inter-worker data transfer
- `SortManagers`: Message and edge sorting
- `SnapshotManager`: Checkpoint creation

All managers implement:

- `initAll()`: Initialize before first superstep
- `beforeSuperstep()`: Prepare for superstep
- `afterSuperstep()`: Cleanup after superstep
- `closeAll()`: Shutdown cleanup
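To make the lifecycle concrete, here is a hypothetical manager skeleton. The real `Manager` interface in `computer-core` has more hooks and context parameters, so treat this as a shape sketch of the hooks listed above, not the project's actual API:

```java
// Hypothetical sketch only: a manager that times each superstep.
// Method names mirror the lifecycle hooks listed above; the real
// interface in computer-core passes config/context arguments.
public class TimingManager {

    private long superstepStart;

    public void initAll() {
        // One-time setup before superstep 0 (open files, buffers, etc.)
    }

    public void beforeSuperstep(int superstep) {
        this.superstepStart = System.currentTimeMillis();
    }

    public void afterSuperstep(int superstep) {
        long costMs = System.currentTimeMillis() - this.superstepStart;
        System.out.printf("superstep %d finished in %d ms%n", superstep, costMs);
    }

    public void closeAll() {
        // Release resources after the last superstep
    }
}
```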
## Configuration Reference

### Job Configuration

```properties
# === Algorithm ===
algorithm.class=<fully qualified class name>
algorithm.message_class=<message value class>
algorithm.result_class=<result value class>

# === HugeGraph Input ===
hugegraph.url=http://localhost:8080
hugegraph.graph=hugegraph
hugegraph.input.vertex_label=person
hugegraph.input.edge_label=knows
hugegraph.input.filter=<gremlin filter>

# === HugeGraph Output ===
hugegraph.output.vertex_property=pagerank_value
hugegraph.output.edge_property=<property name>

# === HDFS Input ===
input.hdfs.path=/graph/input
input.hdfs.format=json

# === HDFS Output ===
output.hdfs.path=/graph/output
output.hdfs.format=json

# === Worker Resources ===
worker.count=3
worker.memory=8Gi
worker.cpu=4
worker.thread_count=<cpu cores>

# === BSP Coordination ===
bsp.etcd.url=http://etcd:2379
bsp.max_superstep=100
bsp.log_interval=10

# === Memory Management ===
worker.data.dirs=/data1,/data2
worker.write_buffer_size=134217728
worker.max_spill_size=1073741824
```

## Memory Management

Computer auto-manages memory to prevent OOM:

1. **In-Memory Buffering**: Vertices, edges, messages buffered in memory
2. **Spill Threshold**: When memory usage exceeds threshold, spill to disk
3. **Disk Storage**: Configurable data directories (`worker.data.dirs`)
4. **Automatic Cleanup**: Spilled data cleaned after superstep completion

**Best Practice**: Allocate worker memory ≥ 2x graph size for optimal performance.

## Troubleshooting

### K8s CRD Classes Not Found

```bash
# Generate CRD classes first
cd computer-k8s-operator
mvn clean install
```

Generated classes appear in `computer-k8s/target/generated-sources/`.
+ +### etcd Connection Errors + +- Verify `bsp.etcd.url` is reachable from all pods +- Check etcd cluster health: `etcdctl endpoint health` +- Ensure firewall allows port 2379 + +### Out of Memory Errors + +- Increase `worker.memory` in job config +- Reduce `worker.write_buffer_size` to trigger earlier spilling +- Increase `worker.count` to distribute graph across more workers + +### Slow Convergence + +- Check algorithm parameters (e.g., `pagerank.convergence_tolerance`) +- Monitor superstep logs for progress +- Consider using combiners to reduce message volume + +## Important Files + +| File | Description | +|------|-------------| +| `computer-api/.../Computation.java` | Algorithm interface contract (computer/computer-api/src/main/java/org/apache/hugegraph/computer/core/worker/Computation.java:25) | +| `computer-core/.../WorkerService.java` | Worker runtime orchestration (computer/computer-core/src/main/java/org/apache/hugegraph/computer/core/worker/WorkerService.java:1) | +| `computer-core/.../Bsp4Worker.java` | BSP coordination logic (computer/computer-core/src/main/java/org/apache/hugegraph/computer/core/bsp/Bsp4Worker.java:1) | +| `computer-algorithm/.../PageRank.java` | Example algorithm implementation (computer/computer-algorithm/src/main/java/org/apache/hugegraph/computer/algorithm/centrality/pagerank/PageRank.java:1) | + +## Testing + +### CI/CD Pipeline + +The CI pipeline (`.github/workflows/computer-ci.yml`) runs: + +1. License check (Apache RAT) +2. Setup HDFS (Hadoop 3.3.2) +3. Setup Minikube/Kubernetes +4. Load test data into HugeGraph +5. Compile with JDK 11 +6. Run integration tests (`-P integrate-test`) +7. Run unit tests (`-P unit-test`) +8. Upload coverage to Codecov + +### Local Testing + +```bash +# Setup test environment (etcd, HDFS, K8s) +cd computer-dist/src/assembly/travis +./start-etcd.sh +./start-hdfs.sh +./start-minikube.sh + +# Run tests +cd ../../../../ +mvn test -P integrate-test +``` + +## Links + +- [Project Homepage](https://hugegraph.apache.org/docs/quickstart/hugegraph-computer/) +- [Main README](../README.md) +- [Vermeer (Go) README](../vermeer/README.md) +- [GitHub Issues](https://github.com/apache/hugegraph-computer/issues) + +## Contributing + +See the main [Contributing Guide](../README.md#contributing) for how to contribute. + +## License + +HugeGraph-Computer is licensed under [Apache 2.0 License](https://github.com/apache/incubator-hugegraph-computer/blob/master/LICENSE). diff --git a/vermeer/.gitignore b/vermeer/.gitignore index b9d714e5..8fb775a2 100644 --- a/vermeer/.gitignore +++ b/vermeer/.gitignore @@ -98,3 +98,13 @@ tools/protoc/*/include/ # Generated files (should be generated via go generate) # ###################### asset/assets_vfsdata.go + +# AI assistant specific files (we only maintain AGENTS.md) # +###################### +CLAUDE.md +GEMINI.md +CURSOR.md +COPILOT.md +.cursorrules +.cursor/ +.github/copilot-instructions.md diff --git a/vermeer/AGENTS.md b/vermeer/AGENTS.md new file mode 100644 index 00000000..2fb2df04 --- /dev/null +++ b/vermeer/AGENTS.md @@ -0,0 +1,207 @@ +# AGENTS.md + +This file provides guidance to AI coding assistants when working with code in this repository. + +## Repository Overview + +Vermeer is a high-performance in-memory graph computing platform written in Go. It features a single-binary deployment model with master-worker architecture, supporting 20+ graph algorithms and seamless HugeGraph integration. 
## Build & Test Commands

**Prerequisites:**
- Go 1.23+
- `curl` and `unzip` (for downloading binary dependencies)

**First-time setup:**
```bash
make init  # Downloads supervisord and protoc binaries, installs Go deps
```

**Build:**
```bash
make                    # Build for current platform
make build-linux-amd64
make build-linux-arm64
```

**Development build with hot-reload UI:**
```bash
go build -tags=dev
```

**Clean:**
```bash
make clean      # Remove built binaries and generated assets
make clean-all  # Also remove downloaded tools
```

**Run:**
```bash
# Using binary directly
./vermeer --env=master
./vermeer --env=worker

# Using script (configure in vermeer.sh)
./vermeer.sh start master
./vermeer.sh start worker
```

**Tests:**
```bash
# Run with build tag vermeer_test
go test -tags=vermeer_test -v

# Specific test modes
go test -tags=vermeer_test -v -mode=algorithms
go test -tags=vermeer_test -v -mode=function
go test -tags=vermeer_test -v -mode=scheduler
```

**Regenerate protobuf (if proto files changed):**
```bash
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
tools/protoc/osxm1/protoc *.proto --go-grpc_out=. --go_out=.
```

## Architecture

### Directory Structure

```
vermeer/
├── main.go              # Single binary entry point
├── algorithms/          # Algorithm implementations
│   ├── algorithms.go    # AlgorithmMaker registry
│   ├── pagerank.go
│   ├── louvain.go
│   └── ...
├── apps/
│   ├── master/          # Master service
│   │   ├── services/    # HTTP handlers
│   │   ├── workers/     # Worker management (WorkerManager, WorkerClient)
│   │   ├── tasks/       # Task scheduling
│   │   └── graphs/      # Graph metadata management
│   ├── worker/          # Worker service entry
│   ├── compute/         # Worker-side compute logic
│   │   ├── api.go       # Algorithm interface definition
│   │   ├── task.go      # Compute task execution
│   │   └── ...
│   ├── graphio/         # Graph I/O (HugeGraph, CSV, HDFS)
│   │   └── hugegraph.go # HugeGraph integration
│   ├── protos/          # gRPC definitions
│   ├── common/          # Utilities, logging, metrics
│   ├── structure/       # Graph data structures
│   ├── storage/         # Persistence layer
│   └── bsp/             # BSP coordination helpers
├── config/              # Configuration templates
├── tools/               # Binary dependencies (supervisord, protoc)
└── ui/                  # Web dashboard
```

### Key Design Patterns

**1. Maker/Registry Pattern**

Graph loaders and writers register themselves via `init()`:

```go
func init() {
    LoadMakers[LoadTypeHugegraph] = &HugegraphMaker{}
}
```

Master selects loader by type from the registry. Algorithms follow the same pattern in `algorithms/algorithms.go`.

**2. Master-Worker Architecture**

- **Master**: Schedules LoadPartition tasks to workers, manages worker lifecycle via WorkerManager/WorkerClient, exposes HTTP endpoints for graph/task management
- **Worker**: Executes compute tasks, reports status back to master via gRPC
- Communication: Master uses gRPC clients to workers (apps/master/workers/); workers connect to master on startup

**3. HugeGraph Integration**

Implementation in `apps/graphio/hugegraph.go`:

1. **Metadata Query**: Queries HugeGraph PD (metadata service) via gRPC for partition information
2. **Data Loading**: Streams vertices/edges from HugeGraph Store via gRPC (`ScanPartition`)
3.
**Result Writing**: Writes computed results back via HugeGraph HTTP REST API (adds vertex properties) + +The loader queries PD first (`QueryPartitions`), then creates LoadPartition tasks for each partition, which workers execute by calling `ScanPartition` on store nodes. + +**4. Algorithm Interface** + +Algorithms implement the interface defined in `apps/compute/api.go`. Each algorithm must register itself in `algorithms/algorithms.go` by appending to the `Algorithms` slice. + +**5. Single Binary Entry Point** + +`main.go` loads config from `config/{env}.ini`, then starts either master or worker based on `run_mode` parameter. The `--env` flag specifies which config file to use (e.g., `--env=master` loads `config/master.ini`). + +## Important Files + +- Entry point: `main.go` +- Algorithm interface: `apps/compute/api.go` +- Algorithm registry: `algorithms/algorithms.go` +- HugeGraph integration: `apps/graphio/hugegraph.go` +- Master scheduling: `apps/master/tasks/tasks.go` +- Worker management: `apps/master/workers/workers.go` +- HTTP endpoints: `apps/master/services/http_master.go` + +## Development Workflow + +**Adding a New Algorithm:** + +1. Create file in `algorithms/` implementing the interface from `apps/compute/api.go` +2. Register in `algorithms/algorithms.go` by appending to `Algorithms` slice +3. Implement required methods: `Init()`, `Compute()`, `Aggregate()`, `Terminate()` +4. Rebuild: `make` + +**Modifying Web UI:** + +1. Edit files in `ui/` +2. Regenerate assets: `cd asset && go generate` +3. Or use dev build: `go build -tags=dev` (hot-reload enabled) + +**Modifying Protobuf Definitions:** + +1. Edit `.proto` files in `apps/protos/` +2. Regenerate Go code using protoc (adjust path for platform): + ```bash + tools/protoc/osxm1/protoc apps/protos/*.proto --go-grpc_out=. --go_out=. + ``` + +## Configuration + +**Master (`config/master.ini`):** +- `http_peer`: Master HTTP listen address (default: 0.0.0.0:6688) +- `grpc_peer`: Master gRPC listen address (default: 0.0.0.0:6689) +- `run_mode`: Must be "master" +- `task_parallel_num`: Number of parallel tasks + +**Worker (`config/worker.ini`):** +- `http_peer`: Worker HTTP listen address (default: 0.0.0.0:6788) +- `grpc_peer`: Worker gRPC listen address (default: 0.0.0.0:6789) +- `master_peer`: Master gRPC address to connect (must match master's `grpc_peer`) +- `run_mode`: Must be "worker" + +## Memory Management + +Vermeer uses an in-memory-first approach. Graphs are distributed across workers and stored in memory. Ensure total worker memory exceeds graph size by 2-3x for algorithm workspace. + +## Testing Notes + +Tests require the build tag `vermeer_test`: + +```bash +go test -tags=vermeer_test -v +``` + +Test modes (set via `-mode` flag): +- `algorithms`: Algorithm correctness tests +- `function`: Functional integration tests +- `scheduler`: Scheduler behavior tests + +Test configuration via flags: +- `-master`: Master HTTP address +- `-worker01/02/03`: Worker HTTP addresses +- `-auth`: Authentication type diff --git a/vermeer/README.md b/vermeer/README.md index 2d9e8207..f1e71ab5 100644 --- a/vermeer/README.md +++ b/vermeer/README.md @@ -1,125 +1,543 @@ -# Vermeer Graph Compute Engine +# Vermeer - High-Performance In-Memory Graph Computing + +Vermeer is a high-performance in-memory graph computing platform with a single-binary deployment model. It provides 20+ graph algorithms, custom algorithm extensions, and seamless integration with HugeGraph. 
## Key Features

- **Single Binary Deployment**: Zero external dependencies, run anywhere
- **In-Memory Performance**: Optimized for fast iteration on medium to large graphs
- **Master-Worker Architecture**: Horizontal scalability by adding worker nodes
- **REST API + gRPC**: Easy integration with existing systems
- **Web UI Dashboard**: Built-in monitoring and job management
- **Multi-Source Support**: HugeGraph, local CSV, HDFS
- **20+ Graph Algorithms**: Production-ready implementations

## Architecture

```mermaid
graph TB
    subgraph Client["Client Layer"]
        API[REST API Client]
        UI[Web UI Dashboard]
    end

    subgraph Master["Master Node"]
        HTTP[HTTP Server :6688]
        GRPC_M[gRPC Server :6689]
        GM[Graph Manager]
        TM[Task Manager]
        WM[Worker Manager]
        SCH[Scheduler]
    end

    subgraph Workers["Worker Nodes"]
        W1[Worker 1 :6789]
        W2[Worker 2 :6789]
        W3[Worker N :6789]
    end

    subgraph DataSources["Data Sources"]
        HG[(HugeGraph)]
        CSV[Local CSV]
        HDFS[HDFS]
    end

    API --> HTTP
    UI --> HTTP
    HTTP --> GM
    HTTP --> TM
    GRPC_M <--> W1
    GRPC_M <--> W2
    GRPC_M <--> W3

    W1 <--> HG
    W2 <--> HG
    W3 <--> HG
    W1 <--> CSV
    W1 <--> HDFS

    style Master fill:#e1f5fe
    style Workers fill:#fff3e0
    style DataSources fill:#f1f8e9
```

### Directory Structure

-## Introduction
-Vermeer is a high-performance distributed graph computing platform based on memory, supporting more than 15 graph algorithms, custom algorithm extensions, and custom data source access.

```
vermeer/
├── main.go              # Single binary entry point
├── Makefile             # Build automation
├── algorithms/          # 20+ algorithm implementations
│   ├── pagerank.go
│   ├── louvain.go
│   ├── sssp.go
│   └── ...
├── apps/
│   ├── master/          # Master service
│   │   ├── services/    # HTTP handlers
│   │   ├── workers/     # Worker management
│   │   └── tasks/       # Task scheduling
│   ├── compute/         # Worker-side compute logic
│   ├── graphio/         # Graph I/O (HugeGraph, CSV, HDFS)
│   │   └── hugegraph.go # HugeGraph integration
│   ├── protos/          # gRPC definitions
│   └── common/          # Utilities, logging, metrics
├── config/              # Configuration templates
│   ├── master.ini
│   └── worker.ini
├── tools/               # Binary dependencies (supervisord, protoc)
└── ui/                  # Web dashboard
```

-## Run with Docker
## Quick Start

### Option 1: Docker (Recommended)

Pull the image:

```bash
docker pull hugegraph/vermeer:latest
```

-Create local configuration files, for example, `~/master.ini` and `~/worker.ini`.
Create local configuration files `~/master.ini` and `~/worker.ini` (see [Configuration](#configuration) section).

-Run with Docker. The `--env` flag specifies the file name.
Run with Docker:

-master: docker run -v ~/:/go/bin/config hugegraph/vermeer --env=master
-worker: docker run -v ~/:/go/bin/config hugegraph/vermeer --env=worker

```bash
# Master node
docker run -v ~/:/go/bin/config hugegraph/vermeer --env=master

# Worker node
docker run -v ~/:/go/bin/config hugegraph/vermeer --env=worker
```

-We've also provided a `docker-compose` file.
Once you've created `~/master.ini` and `~/worker.ini`, and updated the `master_peer` in `worker.ini` to `172.20.0.10:6689`, you can run it using the following command: +#### Docker Compose -``` +Update `master_peer` in `~/worker.ini` to `172.20.0.10:6689`, then: + +```bash docker-compose up -d ``` -## Start +### Option 2: Binary Download +```bash +# Download binary (replace version and platform) +wget https://github.com/apache/hugegraph-computer/releases/download/vX.X.X/vermeer-linux-amd64.tar.gz +tar -xzf vermeer-linux-amd64.tar.gz +cd vermeer + +# Run master and worker +./vermeer --env=master & +./vermeer --env=worker & ``` -master: ./vermeer --env=master -worker: ./vermeer --env=worker01 -``` -The parameter env specifies the name of the configuration file in the useconfig folder. -``` +The `--env` parameter specifies the configuration file name in the `config/` folder (e.g., `master.ini`, `worker.ini`). + +#### Using the Shell Script + +Configure parameters in `vermeer.sh`, then: + +```bash ./vermeer.sh start master ./vermeer.sh start worker ``` -Configuration items are specified in vermeer.sh -## supervisord -Can be used with supervisord to start and stop services, automatically start applications, rotate logs, and more; for the configuration file, refer to config/supervisor.conf; -Configuration file reference config/supervisor.conf +### Option 3: Build from Source -```` -# run as daemon -./supervisord -c supervisor.conf -d -```` +#### Prerequisites -## Build from Source +- Go 1.23 or later +- `curl` and `unzip` utilities (for downloading dependencies) +- Internet connection (for first-time setup) -### Requirements -* Go 1.23 or later -* `curl` and `unzip` utilities (for downloading dependencies) -* Internet connection (for first-time setup) +#### Build Steps -### Quick Start - -**Recommended**: Use Makefile for building: +**Recommended**: Use Makefile: ```bash -# First time setup (downloads binary dependencies) +# First-time setup (downloads supervisord and protoc binaries) make init -# Build vermeer +# Build for current platform make + +# Or build for specific platform +make build-linux-amd64 +make build-linux-arm64 ``` -**Alternative**: Use the build script: +**Alternative**: Use build script: ```bash -# For AMD64 -./build.sh amd64 +# Auto-detect platform +./build.sh -# For ARM64 +# Or specify architecture +./build.sh amd64 ./build.sh arm64 ``` -# The script will: -# - Auto-detect your OS and architecture if no parameter is provided -# - Download required tools if not present -# - Generate assets and build the binary -# - Exit with error message if any step fails +#### Development Build -### Build Targets +For development with hot-reload of web UI: ```bash -make build # Build for current platform -make build-linux-amd64 # Build for Linux AMD64 -make build-linux-arm64 # Build for Linux ARM64 -make clean # Clean generated files -make help # Show all available targets +go build -tags=dev ``` -### Development Build +#### Clean Build Artifacts -For development with hot-reload of web UI: +```bash +make clean # Remove binaries and generated assets +make clean-all # Also remove downloaded tools (supervisord, protoc) +``` + +## Configuration + +### Master Configuration (`master.ini`) + +```ini +[master] +# Master server listen address +listen_addr = :6688 + +# Master gRPC server address +grpc_addr = :6689 + +# Worker heartbeat timeout (seconds) +worker_timeout = 30 + +# Task execution timeout (seconds) +task_timeout = 3600 + +[hugegraph] +# HugeGraph PD address for metadata +pd_peers = 
127.0.0.1:8686 + +# HugeGraph HTTP endpoint for result writing +server = http://127.0.0.1:8080 + +# Graph space name +graph = hugegraph +``` + +### Worker Configuration (`worker.ini`) + +```ini +[worker] +# Worker listen address +listen_addr = :6789 + +# Master gRPC address to connect +master_peer = 127.0.0.1:6689 + +# Worker ID (unique) +worker_id = worker01 + +# Number of compute threads +compute_threads = 4 + +# Memory limit (GB) +memory_limit = 8 + +[storage] +# Local disk path for spilling +data_path = ./data +``` + +## Available Algorithms + +| Algorithm | Category | Description | +|-----------|----------|-------------| +| **PageRank** | Centrality | Measures vertex importance via link structure | +| **Personalized PageRank** | Centrality | PageRank from specific source vertices | +| **Betweenness Centrality** | Centrality | Measures vertex importance via shortest paths | +| **Closeness Centrality** | Centrality | Measures average distance to all other vertices | +| **Degree Centrality** | Centrality | Simple in/out degree calculation | +| **Louvain** | Community Detection | Modularity-based community detection | +| **Louvain (Weighted)** | Community Detection | Weighted variant for edge-weighted graphs | +| **LPA** | Community Detection | Label Propagation Algorithm | +| **SLPA** | Community Detection | Speaker-Listener Label Propagation | +| **WCC** | Community Detection | Weakly Connected Components | +| **SCC** | Community Detection | Strongly Connected Components | +| **SSSP** | Path Finding | Single Source Shortest Path (Dijkstra) | +| **Triangle Count** | Graph Structure | Counts triangles in the graph | +| **K-Core** | Graph Structure | Finds k-core subgraphs | +| **K-Out** | Graph Structure | K-degree filtering | +| **Clustering Coefficient** | Graph Structure | Measures local clustering | +| **Cycle Detection** | Graph Structure | Detects cycles in directed graphs | +| **Jaccard Similarity** | Similarity | Computes neighbor-based similarity | +| **Depth (BFS)** | Traversal | Breadth-First Search depth assignment | + +## API Overview + +Vermeer exposes a REST API on port `6688` (configurable in `master.ini`). + +### Key Endpoints + +| Endpoint | Method | Description | +|----------|--------|-------------| +| `/api/v1/graphs` | POST | Load graph from data source | +| `/api/v1/graphs/{graph_id}` | GET | Get graph metadata | +| `/api/v1/graphs/{graph_id}` | DELETE | Unload graph from memory | +| `/api/v1/compute` | POST | Execute algorithm on loaded graph | +| `/api/v1/tasks/{task_id}` | GET | Get task status and results | +| `/api/v1/workers` | GET | List connected workers | +| `/ui/` | GET | Web UI dashboard | + +### Example: Run PageRank ```bash -go build -tags=dev +# 1. Load graph from HugeGraph +curl -X POST http://localhost:6688/api/v1/graphs \ + -H "Content-Type: application/json" \ + -d '{ + "graph_name": "my_graph", + "load_type": "hugegraph", + "hugegraph": { + "pd_peers": ["127.0.0.1:8686"], + "graph_name": "hugegraph" + } + }' + +# 2. Run PageRank +curl -X POST http://localhost:6688/api/v1/compute \ + -H "Content-Type: application/json" \ + -d '{ + "graph_name": "my_graph", + "algorithm": "pagerank", + "params": { + "max_iterations": 20, + "damping_factor": 0.85 + }, + "output": { + "type": "hugegraph", + "property_name": "pagerank_value" + } + }' + +# 3. 
Check task status
curl http://localhost:6688/api/v1/tasks/{task_id}
```

### OLAP vs OLTP Modes

- **OLAP Mode**: Load entire graph into memory, run multiple algorithms
- **OLTP Mode**: Query-driven, load subgraphs on demand (planned feature)

## Data Sources

### HugeGraph Integration

Vermeer integrates with HugeGraph via:

1. **Metadata Query**: Queries HugeGraph PD (metadata service) via gRPC for partition information
2. **Data Loading**: Streams vertices/edges from HugeGraph Store via gRPC (`ScanPartition`)
3. **Result Writing**: Writes computed results back via HugeGraph REST API (adds vertex properties)

Configuration in graph load request:

```json
{
  "load_type": "hugegraph",
  "hugegraph": {
    "pd_peers": ["127.0.0.1:8686"],
    "graph_name": "hugegraph",
    "vertex_label": "person",
    "edge_label": "knows"
  }
}
```

### Local CSV Files

Load graphs from local CSV files:

```json
{
  "load_type": "csv",
  "csv": {
    "vertex_file": "/path/to/vertices.csv",
    "edge_file": "/path/to/edges.csv",
    "delimiter": ","
  }
}
```

### HDFS

Load from Hadoop Distributed File System:

```json
{
  "load_type": "hdfs",
  "hdfs": {
    "namenode": "hdfs://namenode:9000",
    "vertex_path": "/graph/vertices",
    "edge_path": "/graph/edges"
  }
}
```

## Developing Custom Algorithms

Custom algorithms implement the algorithm interface defined in `apps/compute/api.go` and register themselves in `algorithms/algorithms.go`:

```go
type Algorithm interface {
    // Initialize the algorithm
    Init(params map[string]interface{}) error

    // Compute one iteration for a vertex
    Compute(vertex *Vertex, messages []Message) (halt bool, outMessages []Message)

    // Aggregate global state (optional)
    Aggregate() interface{}

    // Check termination condition
    Terminate(iteration int) bool
}
```

### Example: Simple Degree Count

```go
package algorithms

type DegreeCount struct {
    maxIter int
}

func (dc *DegreeCount) Init(params map[string]interface{}) error {
    dc.maxIter = params["max_iterations"].(int)
    return nil
}

func (dc *DegreeCount) Compute(vertex *Vertex, messages []Message) (bool, []Message) {
    // Store degree as vertex value
    vertex.SetValue(float64(len(vertex.OutEdges)))

    // Halt after first iteration
    return true, nil
}

func (dc *DegreeCount) Aggregate() interface{} {
    // No global state to aggregate for a per-vertex degree count
    return nil
}

func (dc *DegreeCount) Terminate(iteration int) bool {
    return iteration >= dc.maxIter
}
```

Register the algorithm in `algorithms/algorithms.go`:

```go
func init() {
    RegisterAlgorithm("degree_count", &DegreeCount{})
}
```

## Memory Management

Vermeer uses an in-memory-first approach:

1. **Graph Loading**: Vertices and edges are distributed across workers and stored in memory
2. **Automatic Partitioning**: Master assigns partitions to workers based on capacity
3. **Memory Monitoring**: Workers report memory usage to master
4. **Graceful Degradation**: If memory is insufficient, algorithms may fail (disk spilling not yet implemented)

**Best Practice**: Ensure total worker memory exceeds graph size by 2-3x for algorithm workspace.
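A rough sizing illustration; the per-element byte costs here are ballpark assumptions, not measured Vermeer figures:

```
 10M vertices x ~100 bytes ≈ 1 GB
100M edges    x ~40 bytes  ≈ 4 GB
in-memory graph            ≈ 5 GB
x 2-3 workspace factor     → plan for 10-15 GB of total worker memory
```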
## Supervisord Integration

Run Vermeer as a daemon with automatic restarts and log rotation:

```bash
# Configuration in config/supervisor.conf
./tools/supervisord -c config/supervisor.conf -d
```

Sample supervisor configuration:

```ini
[program:vermeer-master]
command=/path/to/vermeer --env=master
autostart=true
autorestart=true
stdout_logfile=/var/log/vermeer-master.log
```

## Protobuf Development

If you modify `.proto` files, regenerate Go code:

```bash
# Install protobuf Go plugins
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest

# Generate (adjust protoc path for your platform)
tools/protoc/osxm1/protoc *.proto --go-grpc_out=. --go_out=.
```

## Performance Tuning

### Worker Configuration

- **compute_threads**: Set to number of CPU cores for CPU-bound algorithms
- **memory_limit**: Set to 70-80% of available RAM
- **partition_count**: Increase for better parallelism (default: auto-calculated)

### Master Configuration

- **worker_timeout**: Increase for slow networks or heavily loaded workers
- **task_timeout**: Increase for long-running algorithms (e.g., Louvain on large graphs)

### Algorithm-Specific

- **PageRank**: Use `damping_factor=0.85`, `tolerance=0.0001` for faster convergence
- **Louvain**: Enable `weighted=true` only if edge weights are meaningful
- **SSSP**: Provide source vertex ID for single-source queries

## Monitoring

Access the Web UI dashboard at `http://master-ip:6688/ui/` for:

- Worker status and resource usage
- Active and completed tasks
- Graph metadata and statistics
- Real-time logs

## Troubleshooting

### Workers Not Connecting

- Verify `master_peer` in `worker.ini` matches master's gRPC address
- Check firewall rules for port `6689` (gRPC)
- Ensure master is running before starting workers

### Out of Memory Errors

- Reduce graph size or increase worker memory
- Distribute graph across more workers
- Use algorithms with lower memory footprint (e.g., degree centrality vs. betweenness)

### Slow Algorithm Execution

- Increase `compute_threads` in worker config
- Check network latency between master and workers
- Profile algorithm with built-in metrics (access via API)

## Links

- [Project Homepage](https://hugegraph.apache.org/docs/quickstart/hugegraph-computer/)
- [Main README](../README.md)
- [Computer (Java) README](../computer/README.md)
- [GitHub Issues](https://github.com/apache/hugegraph-computer/issues)
- [Docker Hub](https://hub.docker.com/r/hugegraph/vermeer)

## Contributing

See the main [Contributing Guide](../README.md#contributing) for how to contribute to Vermeer.

## License

Vermeer is part of Apache HugeGraph-Computer, licensed under [Apache 2.0 License](https://github.com/apache/incubator-hugegraph-computer/blob/master/LICENSE).
