dosubot[bot] commented on issue #205:
URL:
https://github.com/apache/incubator-hugegraph-doc/issues/205#issuecomment-2242834150
To manage and store the results of graph computations in HugeGraph,
especially for large datasets, you can configure HugeGraph-Computer to write
results either back to the graph database or to files on HDFS:
### Writing Results Back to HugeGraph
You can configure HugeGraph to write computation results back to the graph
database. Here are the relevant configuration options:
- **Configuration Options:**
  - `hugegraph.name`: The name of the graph to load data from and write
    results back to.
  - `hugegraph.url`: The URL of the HugeGraph server to load data from and
    write results back to.
  - `output.result_write_type`: The write type used when outputting results
    to HugeGraph. Allowed values are `OLAP_COMMON`, `OLAP_SECONDARY`, and
    `OLAP_RANGE`.
- **Example Configuration:**
```yaml
hugegraph.name: hugegraph
hugegraph.url: http://127.0.0.1:8080
output.result_write_type: OLAP_COMMON
```
### Writing Results to a File (HDFS)
Alternatively, you can write the results to files on HDFS. Here are the
relevant configuration options:
- **Configuration Options:**
  - `output.hdfs_path_prefix`: The HDFS directory where output results are
    written.
  - `output.hdfs_url`: The URL of the HDFS instance used for output.
  - `output.hdfs_user`: The HDFS user used for output.
  - `output.hdfs_delimiter`: The delimiter used in the HDFS output.
  - `output.hdfs_merge_partitions`: Whether to merge the output files of
    multiple partitions.
  - `output.hdfs_replication`: The HDFS replication factor for output files.
- **Example Configuration:**
```yaml
output.hdfs_path_prefix: /hugegraph-computer/results
output.hdfs_url: hdfs://127.0.0.1:9000
output.hdfs_user: hadoop
output.hdfs_delimiter: ","
output.hdfs_merge_partitions: true
output.hdfs_replication: 3
```
### Best Practices
To handle large datasets efficiently, consider tuning the following output
options:
- **Batch Size and Threads:**
  - `output.batch_size`: The batch size used for output.
  - `output.batch_threads`: The number of threads used for batch output.
- **Retry Mechanism:**
  - `output.retry_interval`: The interval between retries when output fails.
  - `output.retry_times`: The number of retries when output fails.
- **Thread Pool Management:**
  - `output.single_threads`: The number of threads used for single output.
  - `output.thread_pool_shutdown_timeout`: The timeout, in seconds, for
    shutting down the output thread pool.
- **Including Properties:**
  - `output.with_adjacent_edges`: Whether to output the adjacent edges of
    each vertex.
  - `output.with_edge_properties`: Whether to output edge properties.
  - `output.with_vertex_properties`: Whether to output vertex properties.
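As a rough sketch, the tuning options above could be combined in the same
style as the earlier examples. Every value below is an illustrative
assumption, not a documented default or a tested recommendation, and should
be adjusted to your data volume and cluster:
```yaml
# Illustrative values only (assumptions, not defaults or recommendations).
output.batch_size: 500
output.batch_threads: 2
# Retry behavior when an output write fails; the unit of the interval is
# not stated in this answer, so check the configuration reference.
output.retry_times: 3
output.retry_interval: 10
output.single_threads: 1
# Timeout in seconds for shutting down the output thread pool.
output.thread_pool_shutdown_timeout: 60
# Leaving these off keeps the output smaller for large graphs.
output.with_adjacent_edges: false
output.with_vertex_properties: false
output.with_edge_properties: false
```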
Together, these options control where results are written and how the output
stage batches, retries, and parallelizes its writes. For more advanced
settings and optimizations, refer to the detailed configuration options in the
HugeGraph documentation
<sup>[[1]](https://github.com/apache/incubator-hugegraph-doc/blob/master/content/en/docs/config/config-computer.md)</sup><sup>[[2]](https://github.com/apache/incubator-hugegraph-doc/blob/master/content/cn/docs/config/config-computer.md)</sup><sup>[[3]](https://github.com/apache/incubator-hugegraph-doc/blob/master/content/en/docs/quickstart/hugegraph-computer.md)</sup>.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.