This is an automated email from the ASF dual-hosted git repository.
xiaokang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-graphar.git
The following commit(s) were added to refs/heads/main by this push:
new f680bb8b fix(docs): correct punctuation and formatting in README files (#905)
f680bb8b is described below
commit f680bb8bfd453f5417ebdbb858d10c5d3560e06b
Author: Jason Yao <[email protected]>
AuthorDate: Tue Mar 10 20:51:23 2026 +0800
fix(docs): correct punctuation and formatting in README files (#905)
---
README-zh-cn.md | 4 ++--
README.md | 22 +++++++++++-----------
2 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/README-zh-cn.md b/README-zh-cn.md
index 59472d97..14605dbe 100644
--- a/README-zh-cn.md
+++ b/README-zh-cn.md
@@ -216,9 +216,9 @@ GraphAr Java 库是通过绑定到 C++ 库(当前版本为v0.12.0)创建的
### Java 库
> [!NOTE]
-> Java 库正在开发中.
+> Java 库正在开发中。
-java 库将由纯java开发,他将会包括下面的模块:
+Java 库将由纯 Java 开发,它将会包括下面的模块:
- **[Java-Info](./maven-projects/info)**:负责从yaml文件中解析Graphinfo(schema)
- **Java-io-XXX**:负责从不同存储格式读取图形数据(待实现)
- **Java-Api-XXX**:为图形操作提供高级API(待实现)
diff --git a/README.md b/README.md
index e8b88321..e180f767 100644
--- a/README.md
+++ b/README.md
@@ -257,21 +257,21 @@ Here we show statistics of datasets with hundreds of millions of vertices from [
width="700" alt="storage consumption" />
Two baseline approaches are
-considered: 1) “plain”, which employs plain encoding for the
-source and destination columns, and 2) “plain + offset”, which
-extends the “plain” method by sorting edges and adding an
-offset column to mark each vertex’s starting edge position.
+considered: 1) "plain", which employs plain encoding for the
+source and destination columns, and 2) "plain + offset", which
+extends the "plain" method by sorting edges and adding an
+offset column to mark each vertex's starting edge position.
The result
is a notable storage advantage: on average, GraphAr requires
-only 27.3% of the storage needed by the baseline “plain +
-offset”, which is due to delta encoding.
+only 27.3% of the storage needed by the baseline "plain +
+offset", which is due to delta encoding.
### I/O speed
<img src="docs/images/benchmark_IO_time.png" class="align-center"
width="700" alt="I/O time" />
In (a) indicate that GraphAr significantly
-outperforms the baseline (CSV), achieving an average speedup of 4.9×. In Figure (b), the immutable (“Imm”) and mutable (“Mut”) variants are two native in-memory storage of GraphScope. It demonstrates that although the querying time with GraphAr exceeds that of the in-memory storages, attributable to intrinsic I/O overhead, it significantly surpasses the process of loading and then
+outperforms the baseline (CSV), achieving an average speedup of 4.9×. In Figure (b), the immutable ("Imm") and mutable ("Mut") variants are two native in-memory storage of GraphScope. It demonstrates that although the querying time with GraphAr exceeds that of the in-memory storages, attributable to intrinsic I/O overhead, it significantly surpasses the process of loading and then
executing the query, by 2.4× and 2.5×, respectively. This indicates that GraphAr is a viable option for executing infrequent queries.
@@ -280,7 +280,7 @@ executing the query, by 2.4× and 2.5×, respectively. This indicates that Graph
width="700" alt="Neighbor retrival" />
We query vertices with the largest
-degree in selected graphs, maintaining edges in CSR-like or CSC-like formats depending on the degree type. GraphAr significantly outperforms the baselines, achieving an average speedup of 4452× over the “plain” method, 3.05× over “plain + offset”, and 1.23× over “delta + offset”. -->
+degree in selected graphs, maintaining edges in CSR-like or CSC-like formats depending on the degree type. GraphAr significantly outperforms the baselines, achieving an average speedup of 4452× over the "plain" method, 3.05× over "plain + offset", and 1.23× over "delta + offset". -->
### Label Filtering
<img src="docs/images/benchmark_label_simple_filter.png" class="align-center"
width="700" alt="Simple condition filtering" />
@@ -288,7 +288,7 @@ width="700" alt="Simple condition filtering" />
**Performance of simple condition filtering.**
For each graph, we perform experiments where we consider
each label individually as the target label for filtering.
-GraphAr consistently outperforms the baselines. On average, it achieves a speedup of 14.8× over the “string” method, 8.9× over the “binary (plain)” method, and 7.4× over the “binary (RLE)” method.
+GraphAr consistently outperforms the baselines. On average, it achieves a speedup of 14.8× over the "string" method, 8.9× over the "binary (plain)" method, and 7.4× over the "binary (RLE)" method.
<img src="docs/images/benchmark_label_complex_filter.png" class="align-center"
width="700" alt="Complex condition filtering" />
@@ -296,7 +296,7 @@ width="700" alt="Complex condition filtering" />
**Performance of complex condition filtering.**
For each graph,
we combine two labels by AND or OR as the filtering condition.
-The merge-based decoding method yields the largest gain, where “binary (RLE) + merge” outperforms the “binary (RLE)” method by up to 60.5×.
+The merge-based decoding method yields the largest gain, where "binary (RLE) + merge" outperforms the "binary (RLE)" method by up to 60.5×.
<!-- ### Query efficiency
<table>
<caption style="text-align: center;">Query Execution Times (in seconds)</caption>
@@ -386,7 +386,7 @@ The merge-based decoding method yields the largest gain, where “binary (RLE) +
</tbody>
</table>
<p><strong>Notes: <a href="https://github.com/apache/pinot" target="_blank">Pinot (P)</a>, <a href="https://github.com/neo4j/neo4j" target="_blank">Neo4j (N)</a>, <a href="https://arrow.apache.org/docs/cpp/streaming_execution.html" target="_blank">Acero (A)</a>, and GraphAr (G).
-“OM” denotes failed execution due to out-of-memory errors.
+"OM" denotes failed execution due to out-of-memory errors.
While both Pinot and Neo4j are widely-used, they are not natively designed for data lakes and require an Extract-Transform-Load (ETL) process for integration. The three representative queries includes neighbor retrieval and label filtering, reference to <a href="https://github.com/ldbc/ldbc_snb_bi" target="_blank">LDBC SNB Business Intelligence</a> and <a href="https://github.com/ldbc/ldbc_snb_interactive_v1_impls" target="_blank">LDBC SNB Interactive v1 </a> workload implementations.
</strong></p>