This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hugegraph-doc.git
The following commit(s) were added to refs/heads/asf-site by this push:
new d9e27d75 doc: update hugegraph-loader.md (#303)
d9e27d75 is described below
commit d9e27d75f3e93d7a9ae8640296e18ac4b7ed3a02
Author: imbajin <[email protected]>
AuthorDate: Fri Dec 15 11:24:44 2023 +0000
doc: update hugegraph-loader.md (#303)
* Update content/cn/docs/quickstart/hugegraph-loader.md
---------
Co-authored-by: imbajin <[email protected]>
23d0df1e9d2bfab82a389394273fde8a503ab449
---
cn/docs/_print/index.html | 4 ++--
cn/docs/index.xml | 3 ++-
cn/docs/quickstart/_print/index.html | 4 ++--
cn/docs/quickstart/hugegraph-loader/index.html | 10 +++++-----
cn/docs/quickstart/index.xml | 3 ++-
cn/sitemap.xml | 2 +-
docs/_print/index.html | 4 ++--
docs/index.xml | 3 ++-
docs/quickstart/_print/index.html | 4 ++--
docs/quickstart/hugegraph-loader/index.html | 10 +++++-----
docs/quickstart/index.xml | 3 ++-
en/sitemap.xml | 2 +-
sitemap.xml | 2 +-
13 files changed, 29 insertions(+), 25 deletions(-)
diff --git a/cn/docs/_print/index.html b/cn/docs/_print/index.html
index b034927a..57d554bc 100644
--- a/cn/docs/_print/index.html
+++ b/cn/docs/_print/index.html
@@ -321,7 +321,7 @@ HugeGraph支持多用户并行操作,用户可输入Gremlin查询语句,并
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"knows"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=color: [...]
</span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic>// 创建 created 边类型,这类边是从 person 指向
software 的
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"created"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=colo [...]
-</span></span></code></pre></div><blockquote><p>关于 schema 的详细说明请参考 <a
href=/docs/clients/hugegraph-client>hugegraph-client</a>
中对应部分。</p></blockquote><h4 id=32-准备数据>3.2 准备数据</h4><p>目前 HugeGraph-Loader
支持的数据源包括:</p><ul><li>本地磁盘文件或目录</li><li>HDFS 文件或目录</li><li>部分关系型数据库</li></ul><h5
id=321-数据源结构>3.2.1 数据源结构</h5><h6 id=3211-本地磁盘文件或目录>3.2.1.1
本地磁盘文件或目录</h6><p>用户可以指定本地磁盘文件作为数据源,如果数据分散在多个文件中,也支持以某个目录作为数据源,但暂时不支持以多个目录作为数据源。</p><p>比如:我的数据分散在多个文件中,part-0、part-1
… part-n,要想执行导入,必须保证它们是放在一个目录下的 [...]
+</span></span></code></pre></div><blockquote><p>关于 schema 的详细说明请参考 <a
href=/docs/clients/hugegraph-client>hugegraph-client</a>
中对应部分。</p></blockquote><h4 id=32-准备数据>3.2 准备数据</h4><p>目前 HugeGraph-Loader
支持的数据源包括:</p><ul><li>本地磁盘文件或目录</li><li>HDFS
文件或目录</li><li>部分关系型数据库</li><li>Kafka topic</li></ul><h5 id=321-数据源结构>3.2.1
数据源结构</h5><h6 id=3211-本地磁盘文件或目录>3.2.1.1
本地磁盘文件或目录</h6><p>用户可以指定本地磁盘文件作为数据源,如果数据分散在多个文件中,也支持以某个目录作为数据源,但暂时不支持以多个目录作为数据源。</p><p>比如:我的数据分散在多个文件中,part-0、part-1
… part-n,要想 [...]
</span></span><span
style=display:flex><span>1|lop|java|328|ISBN978-7-107-18618-5
</span></span><span
style=display:flex><span>2|ripple|java|199|ISBN978-7-100-13678-5
</span></span></code></pre></div><p>CSV 是分隔符为逗号<code>,</code>的 TEXT
文件,当列值本身包含逗号时,该列值需要用双引号包起来,如:</p><div class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-fallback data-lang=fallback><span
style=display:flex><span>marko,29,Beijing
@@ -613,7 +613,7 @@ HugeGraph支持多用户并行操作,用户可输入Gremlin查询语句,并
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>]</span>
</span></span><span style=display:flex><span><span
style=color:#000;font-weight:700>}</span>
</span></span></code></pre></div></details><br><p>映射文件 1.0 版本是以顶点和边为中心,设置输入源;而
2.0 版本是以输入源为中心,设置顶点和边映射。有些输入源(比如一个文件)既能生成顶点,也能生成边,如果用 1.0 版的格式写,就需要在 vertex 和
edge 映射块中各写一次 input 块,这两次的 input 块是完全一样的;而 2.0 版本只需要写一次 input。所以 2.0 版相比于 1.0
版,能省掉一些 input 的重复书写。</p><p>在 hugegraph-loader-{version} 的 bin 目录下,有一个脚本工具
<code>mapping-convert.sh</code> 能直接将 1.0 版本的映射文件转换为 2.0 版本的,使用方式如下:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><co
[...]
-</span></span></code></pre></div><p>会在 struct.json 的同级目录下生成一个
struct-v2.json。</p><h5 id=332-输入源>3.3.2
输入源</h5><p>输入源目前分为三类:FILE、HDFS、JDBC,由<code>type</code>节点区分,我们称为本地文件输入源、HDFS
输入源和 JDBC 输入源,下面分别介绍。</p><h6 id=3321-本地文件输入源>3.3.2.1 本地文件输入源</h6><ul><li>id:
输入源的 id,该字段用于支持一些内部功能,非必填(未填时会自动生成),强烈建议写上,对于调试大有裨益;</li><li>skip: 是否跳过该输入源,由于
JSON 文件无法添加注释,如果某次导入时不想导入某个输入源,但又不想删除该输入源的配置,则可以设置为 true 将其跳过,默认为
false,非必填;</li><li>input: 输入源映射块,复合结构<ul><li>type: 输入源类型,必须填 file 或
FILE;</li><li>path: 本地文件 [...]
+</span></span></code></pre></div><p>会在 struct.json 的同级目录下生成一个
struct-v2.json。</p><h5 id=332-输入源>3.3.2
输入源</h5><p>输入源目前分为四类:FILE、HDFS、JDBC、KAFKA,由<code>type</code>节点区分,我们称为本地文件输入源、HDFS
输入源、JDBC 输入源和 KAFKA 输入源,下面分别介绍。</p><h6 id=3321-本地文件输入源>3.3.2.1
本地文件输入源</h6><ul><li>id: 输入源的
id,该字段用于支持一些内部功能,非必填(未填时会自动生成),强烈建议写上,对于调试大有裨益;</li><li>skip: 是否跳过该输入源,由于 JSON
文件无法添加注释,如果某次导入时不想导入某个输入源,但又不想删除该输入源的配置,则可以设置为 true 将其跳过,默认为
false,非必填;</li><li>input: 输入源映射块,复合结构<ul><li>type: 输入源类型,必须填 file 或 FILE;</l
[...]
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000;font-weight:700>{</span>
</span></span><span style=display:flex><span> <span
style=color:#204a87;font-weight:700>"vertices"</span><span
style=color:#000;font-weight:700>:</span> <span
style=color:#000;font-weight:700>[</span>
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>{</span>
diff --git a/cn/docs/index.xml b/cn/docs/index.xml
index 1540db95..3c9481e6 100644
--- a/cn/docs/index.xml
+++ b/cn/docs/index.xml
@@ -5079,6 +5079,7 @@ HugeGraph目前采用EdgeCut的分区方案。</p>
<li>本地磁盘文件或目录</li>
<li>HDFS 文件或目录</li>
<li>部分关系型数据库</li>
+<li>Kafka topic</li>
</ul>
<h5 id="321-数据源结构">3.2.1 数据源结构</h5>
<h6 id="3211-本地磁盘文件或目录">3.2.1.1 本地磁盘文件或目录</h6>
@@ -5441,7 +5442,7 @@ HugeGraph目前采用EdgeCut的分区方案。</p>
<div class="highlight"><pre tabindex="0"
style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code
class="language-bash" data-lang="bash"><span
style="display:flex;"><span>bin/mapping-convert.sh struct.json
</span></span></code></pre></div><p>会在 struct.json
的同级目录下生成一个 struct-v2.json。</p>
<h5 id="332-输入源">3.3.2 输入源</h5>
-<p>输入源目前分为三类:FILE、HDFS、JDBC,由<code>type</code>节点区分,我们称为本地文件输入源、HDFS
输入源和 JDBC 输入源,下面分别介绍。</p>
+<p>输入源目前分为四类:FILE、HDFS、JDBC、KAFKA,由<code>type</code>节点区分,我们称为本地文件输入源、HDFS
输入源、JDBC 输入源和 KAFKA 输入源,下面分别介绍。</p>
<h6 id="3321-本地文件输入源">3.3.2.1 本地文件输入源</h6>
<ul>
<li>id: 输入源的 id,该字段用于支持一些内部功能,非必填(未填时会自动生成),强烈建议写上,对于调试大有裨益;</li>
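
The hunks above add "Kafka topic" as a fourth data source and KAFKA as a fourth input-source category alongside FILE, HDFS and JDBC, distinguished by the `type` node. By analogy with the other source blocks documented on this page, a KAFKA input in a mapping file might look roughly like the sketch below; only the `type` value is confirmed by this change, and every other key and value (topic and broker fields, their names) is an assumption to be checked against the loader reference:

```json
{
  "id": "kafka-source-1",
  "skip": false,
  "input": {
    "type": "KAFKA",
    "topic": "hugegraph-vertices",
    "bootstrap.server": "localhost:9092"
  }
}
```

`id` and `skip` are taken to carry the same meaning here as for the other input sources described in this page.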
diff --git a/cn/docs/quickstart/_print/index.html
b/cn/docs/quickstart/_print/index.html
index d62363bf..06436434 100644
--- a/cn/docs/quickstart/_print/index.html
+++ b/cn/docs/quickstart/_print/index.html
@@ -315,7 +315,7 @@
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"knows"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=color: [...]
</span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic>// 创建 created 边类型,这类边是从 person 指向
software 的
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"created"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=colo [...]
-</span></span></code></pre></div><blockquote><p>关于 schema 的详细说明请参考 <a
href=/docs/clients/hugegraph-client>hugegraph-client</a>
中对应部分。</p></blockquote><h4 id=32-准备数据>3.2 准备数据</h4><p>目前 HugeGraph-Loader
支持的数据源包括:</p><ul><li>本地磁盘文件或目录</li><li>HDFS 文件或目录</li><li>部分关系型数据库</li></ul><h5
id=321-数据源结构>3.2.1 数据源结构</h5><h6 id=3211-本地磁盘文件或目录>3.2.1.1
本地磁盘文件或目录</h6><p>用户可以指定本地磁盘文件作为数据源,如果数据分散在多个文件中,也支持以某个目录作为数据源,但暂时不支持以多个目录作为数据源。</p><p>比如:我的数据分散在多个文件中,part-0、part-1
… part-n,要想执行导入,必须保证它们是放在一个目录下的 [...]
+</span></span></code></pre></div><blockquote><p>关于 schema 的详细说明请参考 <a
href=/docs/clients/hugegraph-client>hugegraph-client</a>
中对应部分。</p></blockquote><h4 id=32-准备数据>3.2 准备数据</h4><p>目前 HugeGraph-Loader
支持的数据源包括:</p><ul><li>本地磁盘文件或目录</li><li>HDFS
文件或目录</li><li>部分关系型数据库</li><li>Kafka topic</li></ul><h5 id=321-数据源结构>3.2.1
数据源结构</h5><h6 id=3211-本地磁盘文件或目录>3.2.1.1
本地磁盘文件或目录</h6><p>用户可以指定本地磁盘文件作为数据源,如果数据分散在多个文件中,也支持以某个目录作为数据源,但暂时不支持以多个目录作为数据源。</p><p>比如:我的数据分散在多个文件中,part-0、part-1
… part-n,要想 [...]
</span></span><span
style=display:flex><span>1|lop|java|328|ISBN978-7-107-18618-5
</span></span><span
style=display:flex><span>2|ripple|java|199|ISBN978-7-100-13678-5
</span></span></code></pre></div><p>CSV 是分隔符为逗号<code>,</code>的 TEXT
文件,当列值本身包含逗号时,该列值需要用双引号包起来,如:</p><div class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-fallback data-lang=fallback><span
style=display:flex><span>marko,29,Beijing
@@ -607,7 +607,7 @@
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>]</span>
</span></span><span style=display:flex><span><span
style=color:#000;font-weight:700>}</span>
</span></span></code></pre></div></details><br><p>映射文件 1.0 版本是以顶点和边为中心,设置输入源;而
2.0 版本是以输入源为中心,设置顶点和边映射。有些输入源(比如一个文件)既能生成顶点,也能生成边,如果用 1.0 版的格式写,就需要在 vertex 和
edge 映射块中各写一次 input 块,这两次的 input 块是完全一样的;而 2.0 版本只需要写一次 input。所以 2.0 版相比于 1.0
版,能省掉一些 input 的重复书写。</p><p>在 hugegraph-loader-{version} 的 bin 目录下,有一个脚本工具
<code>mapping-convert.sh</code> 能直接将 1.0 版本的映射文件转换为 2.0 版本的,使用方式如下:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><co
[...]
-</span></span></code></pre></div><p>会在 struct.json 的同级目录下生成一个
struct-v2.json。</p><h5 id=332-输入源>3.3.2
输入源</h5><p>输入源目前分为三类:FILE、HDFS、JDBC,由<code>type</code>节点区分,我们称为本地文件输入源、HDFS
输入源和 JDBC 输入源,下面分别介绍。</p><h6 id=3321-本地文件输入源>3.3.2.1 本地文件输入源</h6><ul><li>id:
输入源的 id,该字段用于支持一些内部功能,非必填(未填时会自动生成),强烈建议写上,对于调试大有裨益;</li><li>skip: 是否跳过该输入源,由于
JSON 文件无法添加注释,如果某次导入时不想导入某个输入源,但又不想删除该输入源的配置,则可以设置为 true 将其跳过,默认为
false,非必填;</li><li>input: 输入源映射块,复合结构<ul><li>type: 输入源类型,必须填 file 或
FILE;</li><li>path: 本地文件 [...]
+</span></span></code></pre></div><p>会在 struct.json 的同级目录下生成一个
struct-v2.json。</p><h5 id=332-输入源>3.3.2
输入源</h5><p>输入源目前分为四类:FILE、HDFS、JDBC、KAFKA,由<code>type</code>节点区分,我们称为本地文件输入源、HDFS
输入源、JDBC 输入源和 KAFKA 输入源,下面分别介绍。</p><h6 id=3321-本地文件输入源>3.3.2.1
本地文件输入源</h6><ul><li>id: 输入源的
id,该字段用于支持一些内部功能,非必填(未填时会自动生成),强烈建议写上,对于调试大有裨益;</li><li>skip: 是否跳过该输入源,由于 JSON
文件无法添加注释,如果某次导入时不想导入某个输入源,但又不想删除该输入源的配置,则可以设置为 true 将其跳过,默认为
false,非必填;</li><li>input: 输入源映射块,复合结构<ul><li>type: 输入源类型,必须填 file 或 FILE;</l
[...]
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000;font-weight:700>{</span>
</span></span><span style=display:flex><span> <span
style=color:#204a87;font-weight:700>"vertices"</span><span
style=color:#000;font-weight:700>:</span> <span
style=color:#000;font-weight:700>[</span>
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>{</span>
diff --git a/cn/docs/quickstart/hugegraph-loader/index.html
b/cn/docs/quickstart/hugegraph-loader/index.html
index 8a008525..23c69fa6 100644
--- a/cn/docs/quickstart/hugegraph-loader/index.html
+++ b/cn/docs/quickstart/hugegraph-loader/index.html
@@ -9,14 +9,14 @@ HugeGraph-Loader 是 HugeGraph 的数据导入组件,能够将多种数据源
注意:使用 HugeGraph-Loader 需要依赖 HugeGraph Server 服务,下载和启动 Server 请参考
HugeGraph-Server Quick Start
2 获取 HugeGraph-Loader 有两种方式可以获取 HugeGraph-Loader:
使用 Docker 镜像 (推荐) 下载已编译的压缩包 克隆源码编译安装 2.1 使用 Docker 镜像 我们可以使用 docker run -itd
--name loader hugegraph/loader部署 loader 服务。对于需要加载的数据,则可以通过挂载 -v
/path/to/data/file:/loader/file 或者docker cp的方式将文件复制到 loader 容器内部。
-或者使用 docker-compose 启动 loader, 启动命令为 docker-compose up -d, 样例的
docker-compose."><meta property="og:type" content="article"><meta
property="og:url" content="/cn/docs/quickstart/hugegraph-loader/"><meta
property="article:section" content="docs"><meta
property="article:modified_time" content="2023-11-20T21:13:54+08:00"><meta
property="og:site_name" content="HugeGraph"><meta itemprop=name
content="HugeGraph-Loader Quick Start"><meta itemprop=description content="1
HugeGraph-Loader 概述 HugeGra [...]
+或者使用 docker-compose 启动 loader, 启动命令为 docker-compose up -d, 样例的
docker-compose."><meta property="og:type" content="article"><meta
property="og:url" content="/cn/docs/quickstart/hugegraph-loader/"><meta
property="article:section" content="docs"><meta
property="article:modified_time" content="2023-12-15T19:24:11+08:00"><meta
property="og:site_name" content="HugeGraph"><meta itemprop=name
content="HugeGraph-Loader Quick Start"><meta itemprop=description content="1
HugeGraph-Loader 概述 HugeGra [...]
目前支持的数据源包括:
本地磁盘文件或目录,支持 TEXT、CSV 和 JSON 格式的文件,支持压缩文件 HDFS 文件或目录,支持压缩文件 主流关系型数据库,如
MySQL、PostgreSQL、Oracle、SQL Server 本地磁盘文件和 HDFS 文件支持断点续传。
后面会具体说明。
注意:使用 HugeGraph-Loader 需要依赖 HugeGraph Server 服务,下载和启动 Server 请参考
HugeGraph-Server Quick Start
2 获取 HugeGraph-Loader 有两种方式可以获取 HugeGraph-Loader:
使用 Docker 镜像 (推荐) 下载已编译的压缩包 克隆源码编译安装 2.1 使用 Docker 镜像 我们可以使用 docker run -itd
--name loader hugegraph/loader部署 loader 服务。对于需要加载的数据,则可以通过挂载 -v
/path/to/data/file:/loader/file 或者docker cp的方式将文件复制到 loader 容器内部。
-或者使用 docker-compose 启动 loader, 启动命令为 docker-compose up -d, 样例的
docker-compose."><meta itemprop=dateModified
content="2023-11-20T21:13:54+08:00"><meta itemprop=wordCount
content="2284"><meta itemprop=keywords content><meta name=twitter:card
content="summary"><meta name=twitter:title content="HugeGraph-Loader Quick
Start"><meta name=twitter:description content="1 HugeGraph-Loader 概述
HugeGraph-Loader 是 HugeGraph 的数据导入组件,能够将多种数据源的数据转化为图的顶点和边并批量导入到图数据库中。
+或者使用 docker-compose 启动 loader, 启动命令为 docker-compose up -d, 样例的
docker-compose."><meta itemprop=dateModified
content="2023-12-15T19:24:11+08:00"><meta itemprop=wordCount
content="2287"><meta itemprop=keywords content><meta name=twitter:card
content="summary"><meta name=twitter:title content="HugeGraph-Loader Quick
Start"><meta name=twitter:description content="1 HugeGraph-Loader 概述
HugeGraph-Loader 是 HugeGraph 的数据导入组件,能够将多种数据源的数据转化为图的顶点和边并批量导入到图数据库中。
目前支持的数据源包括:
本地磁盘文件或目录,支持 TEXT、CSV 和 JSON 格式的文件,支持压缩文件 HDFS 文件或目录,支持压缩文件 主流关系型数据库,如
MySQL、PostgreSQL、Oracle、SQL Server 本地磁盘文件和 HDFS 文件支持断点续传。
后面会具体说明。
@@ -78,7 +78,7 @@ HugeGraph-Loader 是 HugeGraph 的数据导入组件,能够将多种数据源
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"knows"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=color: [...]
</span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic>// 创建 created 边类型,这类边是从 person 指向
software 的
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"created"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=colo [...]
-</span></span></code></pre></div><blockquote><p>关于 schema 的详细说明请参考 <a
href=/docs/clients/hugegraph-client>hugegraph-client</a>
中对应部分。</p></blockquote><h4 id=32-准备数据>3.2 准备数据</h4><p>目前 HugeGraph-Loader
支持的数据源包括:</p><ul><li>本地磁盘文件或目录</li><li>HDFS 文件或目录</li><li>部分关系型数据库</li></ul><h5
id=321-数据源结构>3.2.1 数据源结构</h5><h6 id=3211-本地磁盘文件或目录>3.2.1.1
本地磁盘文件或目录</h6><p>用户可以指定本地磁盘文件作为数据源,如果数据分散在多个文件中,也支持以某个目录作为数据源,但暂时不支持以多个目录作为数据源。</p><p>比如:我的数据分散在多个文件中,part-0、part-1
… part-n,要想执行导入,必须保证它们是放在一个目录下的 [...]
+</span></span></code></pre></div><blockquote><p>关于 schema 的详细说明请参考 <a
href=/docs/clients/hugegraph-client>hugegraph-client</a>
中对应部分。</p></blockquote><h4 id=32-准备数据>3.2 准备数据</h4><p>目前 HugeGraph-Loader
支持的数据源包括:</p><ul><li>本地磁盘文件或目录</li><li>HDFS
文件或目录</li><li>部分关系型数据库</li><li>Kafka topic</li></ul><h5 id=321-数据源结构>3.2.1
数据源结构</h5><h6 id=3211-本地磁盘文件或目录>3.2.1.1
本地磁盘文件或目录</h6><p>用户可以指定本地磁盘文件作为数据源,如果数据分散在多个文件中,也支持以某个目录作为数据源,但暂时不支持以多个目录作为数据源。</p><p>比如:我的数据分散在多个文件中,part-0、part-1
… part-n,要想 [...]
</span></span><span
style=display:flex><span>1|lop|java|328|ISBN978-7-107-18618-5
</span></span><span
style=display:flex><span>2|ripple|java|199|ISBN978-7-100-13678-5
</span></span></code></pre></div><p>CSV 是分隔符为逗号<code>,</code>的 TEXT
文件,当列值本身包含逗号时,该列值需要用双引号包起来,如:</p><div class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-fallback data-lang=fallback><span
style=display:flex><span>marko,29,Beijing
@@ -370,7 +370,7 @@ HugeGraph-Loader 是 HugeGraph 的数据导入组件,能够将多种数据源
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>]</span>
</span></span><span style=display:flex><span><span
style=color:#000;font-weight:700>}</span>
</span></span></code></pre></div></details><br><p>映射文件 1.0 版本是以顶点和边为中心,设置输入源;而
2.0 版本是以输入源为中心,设置顶点和边映射。有些输入源(比如一个文件)既能生成顶点,也能生成边,如果用 1.0 版的格式写,就需要在 vertex 和
edge 映射块中各写一次 input 块,这两次的 input 块是完全一样的;而 2.0 版本只需要写一次 input。所以 2.0 版相比于 1.0
版,能省掉一些 input 的重复书写。</p><p>在 hugegraph-loader-{version} 的 bin 目录下,有一个脚本工具
<code>mapping-convert.sh</code> 能直接将 1.0 版本的映射文件转换为 2.0 版本的,使用方式如下:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><co
[...]
-</span></span></code></pre></div><p>会在 struct.json 的同级目录下生成一个
struct-v2.json。</p><h5 id=332-输入源>3.3.2
输入源</h5><p>输入源目前分为三类:FILE、HDFS、JDBC,由<code>type</code>节点区分,我们称为本地文件输入源、HDFS
输入源和 JDBC 输入源,下面分别介绍。</p><h6 id=3321-本地文件输入源>3.3.2.1 本地文件输入源</h6><ul><li>id:
输入源的 id,该字段用于支持一些内部功能,非必填(未填时会自动生成),强烈建议写上,对于调试大有裨益;</li><li>skip: 是否跳过该输入源,由于
JSON 文件无法添加注释,如果某次导入时不想导入某个输入源,但又不想删除该输入源的配置,则可以设置为 true 将其跳过,默认为
false,非必填;</li><li>input: 输入源映射块,复合结构<ul><li>type: 输入源类型,必须填 file 或
FILE;</li><li>path: 本地文件 [...]
+</span></span></code></pre></div><p>会在 struct.json 的同级目录下生成一个
struct-v2.json。</p><h5 id=332-输入源>3.3.2
输入源</h5><p>输入源目前分为四类:FILE、HDFS、JDBC、KAFKA,由<code>type</code>节点区分,我们称为本地文件输入源、HDFS
输入源、JDBC 输入源和 KAFKA 输入源,下面分别介绍。</p><h6 id=3321-本地文件输入源>3.3.2.1
本地文件输入源</h6><ul><li>id: 输入源的
id,该字段用于支持一些内部功能,非必填(未填时会自动生成),强烈建议写上,对于调试大有裨益;</li><li>skip: 是否跳过该输入源,由于 JSON
文件无法添加注释,如果某次导入时不想导入某个输入源,但又不想删除该输入源的配置,则可以设置为 true 将其跳过,默认为
false,非必填;</li><li>input: 输入源映射块,复合结构<ul><li>type: 输入源类型,必须填 file 或 FILE;</l
[...]
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000;font-weight:700>{</span>
</span></span><span style=display:flex><span> <span
style=color:#204a87;font-weight:700>"vertices"</span><span
style=color:#000;font-weight:700>:</span> <span
style=color:#000;font-weight:700>[</span>
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>{</span>
@@ -567,7 +567,7 @@ HugeGraph Toolchain
版本:toolchain-1.0.0</p></blockquote><p><code>spark-load
</span></span></span><span style=display:flex><span><span
style=color:#4e9a06></span>--deploy-mode cluster --name spark-hugegraph-loader
--file ./hugegraph.json <span style=color:#4e9a06>\
</span></span></span><span style=display:flex><span><span
style=color:#4e9a06></span>--username admin --token admin --host xx.xx.xx.xx
--port <span style=color:#0000cf;font-weight:700>8093</span> <span
style=color:#4e9a06>\
</span></span></span><span style=display:flex><span><span
style=color:#4e9a06></span>--graph graph-test --num-executors <span
style=color:#0000cf;font-weight:700>6</span> --executor-cores <span
style=color:#0000cf;font-weight:700>16</span> --executor-memory 15g
-</span></span></code></pre></div><style>.feedback--answer{display:inline-block}.feedback--answer-no{margin-left:1em}.feedback--response{display:none;margin-top:1em}.feedback--response__visible{display:block}</style><script>const
yesButton=document.querySelector(".feedback--answer-yes"),noButton=document.querySelector(".feedback--answer-no"),yesResponse=document.querySelector(".feedback--response-yes"),noResponse=document.querySelector(".feedback--response-no"),disableButtons=()=>{yesButt
[...]
+</span></span></code></pre></div><style>.feedback--answer{display:inline-block}.feedback--answer-no{margin-left:1em}.feedback--response{display:none;margin-top:1em}.feedback--response__visible{display:block}</style><script>const
yesButton=document.querySelector(".feedback--answer-yes"),noButton=document.querySelector(".feedback--answer-no"),yesResponse=document.querySelector(".feedback--response-yes"),noResponse=document.querySelector(".feedback--response-no"),disableButtons=()=>{yesButt
[...]
<script
src=https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.min.js
integrity="sha512-UR25UO94eTnCVwjbXozyeVd6ZqpaAE9naiEUBK/A+QDbfSTQFhPGj5lOR6d8tsgbBk84Ggb5A3EkjsOgPRPcKA=="
crossorigin=anonymous></script>
<script src=/js/tabpane-persist.js></script>
<script
src=/js/main.min.aa9f4c5dae6a98b2c46277f4c56f1673a2b000d1756ce4ffae93784cab25e6d5.js
integrity="sha256-qp9MXa5qmLLEYnf0xW8Wc6KwANF1bOT/rpN4TKsl5tU="
crossorigin=anonymous></script>
diff --git a/cn/docs/quickstart/index.xml b/cn/docs/quickstart/index.xml
index 35c037ae..e2ac253c 100644
--- a/cn/docs/quickstart/index.xml
+++ b/cn/docs/quickstart/index.xml
@@ -561,6 +561,7 @@
<li>本地磁盘文件或目录</li>
<li>HDFS 文件或目录</li>
<li>部分关系型数据库</li>
+<li>Kafka topic</li>
</ul>
<h5 id="321-数据源结构">3.2.1 数据源结构</h5>
<h6 id="3211-本地磁盘文件或目录">3.2.1.1 本地磁盘文件或目录</h6>
@@ -923,7 +924,7 @@
<div class="highlight"><pre tabindex="0"
style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code
class="language-bash" data-lang="bash"><span
style="display:flex;"><span>bin/mapping-convert.sh struct.json
</span></span></code></pre></div><p>会在 struct.json
的同级目录下生成一个 struct-v2.json。</p>
<h5 id="332-输入源">3.3.2 输入源</h5>
-<p>输入源目前分为三类:FILE、HDFS、JDBC,由<code>type</code>节点区分,我们称为本地文件输入源、HDFS
输入源和 JDBC 输入源,下面分别介绍。</p>
+<p>输入源目前分为四类:FILE、HDFS、JDBC、KAFKA,由<code>type</code>节点区分,我们称为本地文件输入源、HDFS
输入源、JDBC 输入源和 KAFKA 输入源,下面分别介绍。</p>
<h6 id="3321-本地文件输入源">3.3.2.1 本地文件输入源</h6>
<ul>
<li>id: 输入源的 id,该字段用于支持一些内部功能,非必填(未填时会自动生成),强烈建议写上,对于调试大有裨益;</li>
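
The local-file bullets diffed in this file (`id`, `skip`, and the `input` block with `type` and `path`) can be pulled together into a minimal sketch of a FILE input source; the `id` and `path` values are hypothetical placeholders, not taken from the docs:

```json
{
  "id": "file-source-1",
  "skip": false,
  "input": {
    "type": "FILE",
    "path": "data/part-0"
  }
}
```

Per the diffed text, `id` is optional (auto-generated when omitted) but strongly recommended for debugging, and setting `skip` to true disables a source for one import run without deleting its configuration.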
diff --git a/cn/sitemap.xml b/cn/sitemap.xml
index a8238efe..6d34e900 100644
--- a/cn/sitemap.xml
+++ b/cn/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/cn/docs/guides/architectural/</loc><lastmod>2023-06-25T21:06:07+08:00</lastmod><xhtml:link
rel="alternate" hreflang="en" href="/docs/guides/architectural/"/><xhtml:link
rel="alternate" hreflang="cn"
href="/cn/docs/guides/architectural/"/></url><url><loc>/cn/docs/config/config-guide/</loc><lastmod>2023-11-01T21:52:52+08:00
[...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/cn/docs/guides/architectural/</loc><lastmod>2023-06-25T21:06:07+08:00</lastmod><xhtml:link
rel="alternate" hreflang="en" href="/docs/guides/architectural/"/><xhtml:link
rel="alternate" hreflang="cn"
href="/cn/docs/guides/architectural/"/></url><url><loc>/cn/docs/config/config-guide/</loc><lastmod>2023-11-01T21:52:52+08:00
[...]
\ No newline at end of file
diff --git a/docs/_print/index.html b/docs/_print/index.html
index a132f50a..e8d9f765 100644
--- a/docs/_print/index.html
+++ b/docs/_print/index.html
@@ -335,7 +335,7 @@ Visit the <a
href=https://www.oracle.com/database/technologies/appdev/jdbc-drive
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"knows"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=color: [...]
</span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic>// Create the created edge type, which
points from person to software
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"created"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=colo [...]
-</span></span></code></pre></div><blockquote><p>Please refer to the
corresponding section in <a
href=/docs/clients/hugegraph-client>hugegraph-client</a> for the detailed
description of the schema.</p></blockquote><h4 id=32-prepare-data>3.2 Prepare
data</h4><p>The data sources currently supported by HugeGraph-Loader
include:</p><ul><li>local disk file or directory</li><li>HDFS file or
directory</li><li>Partial relational database</li></ul><h5
id=321-data-source-structure>3.2.1 Data source [...]
+</span></span></code></pre></div><blockquote><p>Please refer to the
corresponding section in <a
href=/docs/clients/hugegraph-client>hugegraph-client</a> for the detailed
description of the schema.</p></blockquote><h4 id=32-prepare-data>3.2 Prepare
data</h4><p>The data sources currently supported by HugeGraph-Loader
include:</p><ul><li>local disk file or directory</li><li>HDFS file or
directory</li><li>Partial relational database</li><li>Kafka topic</li></ul><h5
id=321-data-source-structu [...]
</span></span><span
style=display:flex><span>1|lop|java|328|ISBN978-7-107-18618-5
</span></span><span
style=display:flex><span>2|ripple|java|199|ISBN978-7-100-13678-5
</span></span></code></pre></div><p>CSV is a TEXT file with commas
<code>,</code> as delimiters. When a column value itself contains a comma, the
column value needs to be enclosed in double quotes, for example:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-fallback data-lang=fallback><span
style=display:flex><span>marko,29,Beijing
@@ -627,7 +627,7 @@ Visit the <a
href=https://www.oracle.com/database/technologies/appdev/jdbc-drive
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>]</span>
</span></span><span style=display:flex><span><span
style=color:#000;font-weight:700>}</span>
</span></span></code></pre></div></details><br><p>The 1.0 version of the
mapping file is centered on the vertex and edge, and sets the input source;
while the 2.0 version is centered on the input source, and sets the vertex and
edge mapping. Some input sources (such as a file) can generate both vertices
and edges. If you write in the 1.0 format, you need to write an input block in
each of the vertex and edge mapping blocks. The two input blocks are exactly
the same ; and the 2.0 version [...]
-</span></span></code></pre></div><p>A struct-v2.json will be generated in the
same directory as struct.json.</p><h5 id=332-input-source>3.3.2 Input
Source</h5><p>Input sources are currently divided into three categories: FILE,
HDFS, and JDBC, which are distinguished by the <code>type</code> node. We call
them local file input sources, HDFS input sources, and JDBC input sources,
which are described below.</p><h6 id=3321-local-file-input-source>3.3.2.1 Local
file input source</h6><ul><li>i [...]
+</span></span></code></pre></div><p>A struct-v2.json will be generated in the
same directory as struct.json.</p><h5 id=332-input-source>3.3.2 Input
Source</h5><p>Input sources are currently divided into four categories: FILE,
HDFS, JDBC and KAFKA, which are distinguished by the <code>type</code> node. We
call them local file input sources, HDFS input sources, JDBC input sources, and
KAFKA input sources, which are described below.</p><h6
id=3321-local-file-input-source>3.3.2.1 Local file [...]
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000;font-weight:700>{</span>
</span></span><span style=display:flex><span> <span
style=color:#204a87;font-weight:700>"vertices"</span><span
style=color:#000;font-weight:700>:</span> <span
style=color:#000;font-weight:700>[</span>
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>{</span>
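
The paragraph diffed above contrasts the vertex/edge-centric 1.0 mapping format, which repeats an identical input block under both the vertex and the edge mappings, with the input-centric 2.0 format, which states each input once. A rough sketch of the 2.0 shape, where the key names beyond `input`, `vertices`, and `edges` are illustrative assumptions rather than quotes from the docs:

```json
{
  "version": "2.0",
  "structs": [
    {
      "input": { "type": "FILE", "path": "data/example.csv" },
      "vertices": [ { "label": "person" } ],
      "edges": [ { "label": "knows" } ]
    }
  ]
}
```

This single-`input` layout is the duplication-saving structure that `bin/mapping-convert.sh struct.json` produces as `struct-v2.json`.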
diff --git a/docs/index.xml b/docs/index.xml
index 615111b9..caad685a 100644
--- a/docs/index.xml
+++ b/docs/index.xml
@@ -5068,6 +5068,7 @@ Visit the <a
href="https://www.oracle.com/database/technologies/appdev/jdbc-d
<li>local disk file or directory</li>
<li>HDFS file or directory</li>
<li>Partial relational database</li>
+<li>Kafka topic</li>
</ul>
<h5 id="321-data-source-structure">3.2.1 Data source structure</h5>
<h6 id="3211-local-disk-file-or-directory">3.2.1.1 Local disk file or
directory</h6>
@@ -5430,7 +5431,7 @@ Visit the <a
href="https://www.oracle.com/database/technologies/appdev/jdbc-d
<div class="highlight"><pre tabindex="0"
style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code
class="language-bash" data-lang="bash"><span
style="display:flex;"><span>bin/mapping-convert.sh struct.json
</span></span></code></pre></div><p>A struct-v2.json will be
generated in the same directory as struct.json.</p>
<h5 id="332-input-source">3.3.2 Input Source</h5>
-<p>Input sources are currently divided into three categories: FILE, HDFS,
and JDBC, which are distinguished by the <code>type</code> node. We call
them local file input sources, HDFS input sources, and JDBC input sources,
which are described below.</p>
+<p>Input sources are currently divided into four categories: FILE, HDFS,
JDBC and KAFKA, which are distinguished by the <code>type</code> node. We
call them local file input sources, HDFS input sources, JDBC input sources, and
KAFKA input sources, which are described below.</p>
<h6 id="3321-local-file-input-source">3.3.2.1 Local file input
source</h6>
<ul>
<li>id: The id of the input source. This field is used to support some
internal functions. It is not required (it will be automatically generated if
it is not filled in). It is strongly recommended to write it, which is very
helpful for debugging;</li>
diff --git a/docs/quickstart/_print/index.html
b/docs/quickstart/_print/index.html
index 14e138a3..626ffdee 100644
--- a/docs/quickstart/_print/index.html
+++ b/docs/quickstart/_print/index.html
@@ -330,7 +330,7 @@ Visit the <a
href=https://www.oracle.com/database/technologies/appdev/jdbc-drive
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"knows"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=color: [...]
</span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic>// Create the created edge type, which
points from person to software
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"created"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=colo [...]
-</span></span></code></pre></div><blockquote><p>Please refer to the
corresponding section in <a
href=/docs/clients/hugegraph-client>hugegraph-client</a> for the detailed
description of the schema.</p></blockquote><h4 id=32-prepare-data>3.2 Prepare
data</h4><p>The data sources currently supported by HugeGraph-Loader
include:</p><ul><li>local disk file or directory</li><li>HDFS file or
directory</li><li>Partial relational database</li></ul><h5
id=321-data-source-structure>3.2.1 Data source [...]
+</span></span></code></pre></div><blockquote><p>Please refer to the
corresponding section in <a
href=/docs/clients/hugegraph-client>hugegraph-client</a> for the detailed
description of the schema.</p></blockquote><h4 id=32-prepare-data>3.2 Prepare
data</h4><p>The data sources currently supported by HugeGraph-Loader
include:</p><ul><li>local disk file or directory</li><li>HDFS file or
directory</li><li>Partial relational database</li><li>Kafka topic</li></ul><h5
id=321-data-source-structu [...]
</span></span><span
style=display:flex><span>1|lop|java|328|ISBN978-7-107-18618-5
</span></span><span
style=display:flex><span>2|ripple|java|199|ISBN978-7-100-13678-5
</span></span></code></pre></div><p>CSV is a TEXT file with commas
<code>,</code> as delimiters. When a column value itself contains a comma, the
column value needs to be enclosed in double quotes, for example:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-fallback data-lang=fallback><span
style=display:flex><span>marko,29,Beijing
@@ -622,7 +622,7 @@ Visit the <a
href=https://www.oracle.com/database/technologies/appdev/jdbc-drive
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>]</span>
</span></span><span style=display:flex><span><span
style=color:#000;font-weight:700>}</span>
</span></span></code></pre></div></details><br><p>The 1.0 version of the
mapping file is centered on the vertex and edge, and sets the input source;
while the 2.0 version is centered on the input source, and sets the vertex and
edge mapping. Some input sources (such as a file) can generate both vertices
and edges. If you write in the 1.0 format, you need to write an input block in
each of the vertex and edge mapping blocks. The two input blocks are exactly
the same ; and the 2.0 version [...]
-</span></span></code></pre></div><p>A struct-v2.json will be generated in the
same directory as struct.json.</p><h5 id=332-input-source>3.3.2 Input
Source</h5><p>Input sources are currently divided into three categories: FILE,
HDFS, and JDBC, which are distinguished by the <code>type</code> node. We call
them local file input sources, HDFS input sources, and JDBC input sources,
which are described below.</p><h6 id=3321-local-file-input-source>3.3.2.1 Local
file input source</h6><ul><li>i [...]
+</span></span></code></pre></div><p>A struct-v2.json will be generated in the
same directory as struct.json.</p><h5 id=332-input-source>3.3.2 Input
Source</h5><p>Input sources are currently divided into four categories: FILE,
HDFS, JDBC and KAFKA, which are distinguished by the <code>type</code> node. We
call them local file input sources, HDFS input sources, JDBC input sources, and
KAFKA input sources, which are described below.</p><h6
id=3321-local-file-input-source>3.3.2.1 Local file [...]
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000;font-weight:700>{</span>
</span></span><span style=display:flex><span> <span
style=color:#204a87;font-weight:700>"vertices"</span><span
style=color:#000;font-weight:700>:</span> <span
style=color:#000;font-weight:700>[</span>
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>{</span>
diff --git a/docs/quickstart/hugegraph-loader/index.html
b/docs/quickstart/hugegraph-loader/index.html
index 0f46fcf0..a0a65251 100644
--- a/docs/quickstart/hugegraph-loader/index.html
+++ b/docs/quickstart/hugegraph-loader/index.html
@@ -1,9 +1,9 @@
<!doctype html><html lang=en class=no-js><head><meta charset=utf-8><meta
name=viewport
content="width=device-width,initial-scale=1,shrink-to-fit=no"><meta
name=generator content="Hugo 0.102.3"><meta name=robots content="index,
follow"><link rel="shortcut icon" href=/favicons/favicon.ico><link
rel=apple-touch-icon href=/favicons/apple-touch-icon-180x180.png
sizes=180x180><link rel=icon type=image/png href=/favicons/favicon-16x16.png
sizes=16x16><link rel=icon type=image/png href=/favicons [...]
HugeGraph-Loader is the data import component of HugeGraph, which can convert
data from various data sources into graph …"><meta property="og:title"
content="HugeGraph-Loader Quick Start"><meta property="og:description"
content="1 HugeGraph-Loader Overview HugeGraph-Loader is the data import
component of HugeGraph, which can convert data from various data sources into
graph vertices and edges and import them into the graph database in batches.
Currently supported data sources include:
-Local disk file or directory, supports TEXT, CSV and JSON format files,
supports compressed files HDFS file or directory, supports compressed files
Mainstream relational databases, such as MySQL, PostgreSQL, Oracle, SQL Server
Local disk files and HDFS files support resumable uploads."><meta
property="og:type" content="article"><meta property="og:url"
content="/docs/quickstart/hugegraph-loader/"><meta property="article:section"
content="docs"><meta property="article:modified_time" conten [...]
+Local disk file or directory, supports TEXT, CSV and JSON format files,
supports compressed files HDFS file or directory, supports compressed files
Mainstream relational databases, such as MySQL, PostgreSQL, Oracle, SQL Server
Local disk files and HDFS files support resumable uploads."><meta
property="og:type" content="article"><meta property="og:url"
content="/docs/quickstart/hugegraph-loader/"><meta property="article:section"
content="docs"><meta property="article:modified_time" conten [...]
Currently supported data sources include:
-Local disk file or directory, supports TEXT, CSV and JSON format files,
supports compressed files HDFS file or directory, supports compressed files
Mainstream relational databases, such as MySQL, PostgreSQL, Oracle, SQL Server
Local disk files and HDFS files support resumable uploads."><meta
itemprop=dateModified content="2023-11-20T21:13:54+08:00"><meta
itemprop=wordCount content="6199"><meta itemprop=keywords content><meta
name=twitter:card content="summary"><meta name=twitter:title co [...]
+Local disk file or directory, supports TEXT, CSV and JSON format files,
supports compressed files HDFS file or directory, supports compressed files
Mainstream relational databases, such as MySQL, PostgreSQL, Oracle, SQL Server
Local disk files and HDFS files support resumable uploads."><meta
itemprop=dateModified content="2023-12-15T19:24:11+08:00"><meta
itemprop=wordCount content="6205"><meta itemprop=keywords content><meta
name=twitter:card content="summary"><meta name=twitter:title co [...]
Currently supported data sources include:
Local disk file or directory, supports TEXT, CSV and JSON format files,
supports compressed files HDFS file or directory, supports compressed files
Mainstream relational databases, such as MySQL, PostgreSQL, Oracle, SQL Server
Local disk files and HDFS files support resumable uploads."><link rel=preload
href=/scss/main.min.1764bdd1b00b15c82ea08e6a847f47114a8787b9770c047a8c6082457466ce2b.css
as=style><link
href=/scss/main.min.1764bdd1b00b15c82ea08e6a847f47114a8787b9770c047a8c6082457466ce2
[...]
<link rel=stylesheet href=/css/prism.css><script
type=application/javascript>var
doNotTrack=!1;doNotTrack||(window.ga=window.ga||function(){(ga.q=ga.q||[]).push(arguments)},ga.l=+new
Date,ga("create","UA-00000000-0","auto"),ga("send","pageview"))</script><script
async src=https://www.google-analytics.com/analytics.js></script></head><body
class=td-page><header><nav class="js-navbar-scroll navbar navbar-expand
navbar-dark flex-column flex-md-row td-navbar"><a class=navbar-brand href=/><sp
[...]
@@ -60,7 +60,7 @@ Visit the <a
href=https://www.oracle.com/database/technologies/appdev/jdbc-drive
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"knows"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=color: [...]
</span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic>// Create the created edge type, which
points from person to software
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000>schema</span><span
style=color:#ce5c00;font-weight:700>.</span><span
style=color:#c4a000>edgeLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span
style=color:#4e9a06>"created"</span><span
style=color:#ce5c00;font-weight:700>).</span><span
style=color:#c4a000>sourceLabel</span><span
style=color:#ce5c00;font-weight:700>(</span><span style=colo [...]
-</span></span></code></pre></div><blockquote><p>Please refer to the
corresponding section in <a
href=/docs/clients/hugegraph-client>hugegraph-client</a> for the detailed
description of the schema.</p></blockquote><h4 id=32-prepare-data>3.2 Prepare
data</h4><p>The data sources currently supported by HugeGraph-Loader
include:</p><ul><li>local disk file or directory</li><li>HDFS file or
directory</li><li>Partial relational database</li></ul><h5
id=321-data-source-structure>3.2.1 Data source [...]
+</span></span></code></pre></div><blockquote><p>Please refer to the
corresponding section in <a
href=/docs/clients/hugegraph-client>hugegraph-client</a> for the detailed
description of the schema.</p></blockquote><h4 id=32-prepare-data>3.2 Prepare
data</h4><p>The data sources currently supported by HugeGraph-Loader
include:</p><ul><li>local disk file or directory</li><li>HDFS file or
directory</li><li>Partial relational database</li><li>Kafka topic</li></ul><h5
id=321-data-source-structu [...]
</span></span><span
style=display:flex><span>1|lop|java|328|ISBN978-7-107-18618-5
</span></span><span
style=display:flex><span>2|ripple|java|199|ISBN978-7-100-13678-5
</span></span></code></pre></div><p>CSV is a TEXT file with commas
<code>,</code> as delimiters. When a column value itself contains a comma, the
column value needs to be enclosed in double quotes, for example:</p><div
class=highlight><pre tabindex=0
style=background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-fallback data-lang=fallback><span
style=display:flex><span>marko,29,Beijing
@@ -352,7 +352,7 @@ Visit the <a
href=https://www.oracle.com/database/technologies/appdev/jdbc-drive
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>]</span>
</span></span><span style=display:flex><span><span
style=color:#000;font-weight:700>}</span>
</span></span></code></pre></div></details><br><p>The 1.0 version of the
mapping file is centered on the vertex and edge, and sets the input source;
while the 2.0 version is centered on the input source, and sets the vertex and
edge mapping. Some input sources (such as a file) can generate both vertices
and edges. If you write in the 1.0 format, you need to write an input block in
each of the vertex and edge mapping blocks. The two input blocks are exactly
the same ; and the 2.0 version [...]
-</span></span></code></pre></div><p>A struct-v2.json will be generated in the
same directory as struct.json.</p><h5 id=332-input-source>3.3.2 Input
Source</h5><p>Input sources are currently divided into three categories: FILE,
HDFS, and JDBC, which are distinguished by the <code>type</code> node. We call
them local file input sources, HDFS input sources, and JDBC input sources,
which are described below.</p><h6 id=3321-local-file-input-source>3.3.2.1 Local
file input source</h6><ul><li>i [...]
+</span></span></code></pre></div><p>A struct-v2.json will be generated in the
same directory as struct.json.</p><h5 id=332-input-source>3.3.2 Input
Source</h5><p>Input sources are currently divided into four categories: FILE,
HDFS, JDBC and KAFKA, which are distinguished by the <code>type</code> node. We
call them local file input sources, HDFS input sources, JDBC input sources, and
KAFKA input sources, which are described below.</p><h6
id=3321-local-file-input-source>3.3.2.1 Local file [...]
</span></span></span><span style=display:flex><span><span
style=color:#8f5902;font-style:italic></span><span
style=color:#000;font-weight:700>{</span>
</span></span><span style=display:flex><span> <span
style=color:#204a87;font-weight:700>"vertices"</span><span
style=color:#000;font-weight:700>:</span> <span
style=color:#000;font-weight:700>[</span>
</span></span><span style=display:flex><span> <span
style=color:#000;font-weight:700>{</span>
@@ -549,7 +549,7 @@ And there is no need to guarantee the order between the two
parameters.</p><ul><
</span></span></span><span style=display:flex><span><span
style=color:#4e9a06></span>--deploy-mode cluster --name spark-hugegraph-loader
--file ./hugegraph.json <span style=color:#4e9a06>\
</span></span></span><span style=display:flex><span><span
style=color:#4e9a06></span>--username admin --token admin --host xx.xx.xx.xx
--port <span style=color:#0000cf;font-weight:700>8093</span> <span
style=color:#4e9a06>\
</span></span></span><span style=display:flex><span><span
style=color:#4e9a06></span>--graph graph-test --num-executors <span
style=color:#0000cf;font-weight:700>6</span> --executor-cores <span
style=color:#0000cf;font-weight:700>16</span> --executor-memory 15g
-</span></span></code></pre></div><style>.feedback--answer{display:inline-block}.feedback--answer-no{margin-left:1em}.feedback--response{display:none;margin-top:1em}.feedback--response__visible{display:block}</style><script>const
yesButton=document.querySelector(".feedback--answer-yes"),noButton=document.querySelector(".feedback--answer-no"),yesResponse=document.querySelector(".feedback--response-yes"),noResponse=document.querySelector(".feedback--response-no"),disableButtons=()=>{yesButt
[...]
+</span></span></code></pre></div><style>.feedback--answer{display:inline-block}.feedback--answer-no{margin-left:1em}.feedback--response{display:none;margin-top:1em}.feedback--response__visible{display:block}</style><script>const
yesButton=document.querySelector(".feedback--answer-yes"),noButton=document.querySelector(".feedback--answer-no"),yesResponse=document.querySelector(".feedback--response-yes"),noResponse=document.querySelector(".feedback--response-no"),disableButtons=()=>{yesButt
[...]
<script
src=https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.min.js
integrity="sha512-UR25UO94eTnCVwjbXozyeVd6ZqpaAE9naiEUBK/A+QDbfSTQFhPGj5lOR6d8tsgbBk84Ggb5A3EkjsOgPRPcKA=="
crossorigin=anonymous></script>
<script src=/js/tabpane-persist.js></script>
<script
src=/js/main.min.aa9f4c5dae6a98b2c46277f4c56f1673a2b000d1756ce4ffae93784cab25e6d5.js
integrity="sha256-qp9MXa5qmLLEYnf0xW8Wc6KwANF1bOT/rpN4TKsl5tU="
crossorigin=anonymous></script>
diff --git a/docs/quickstart/index.xml b/docs/quickstart/index.xml
index 03e8f6b6..8c029f97 100644
--- a/docs/quickstart/index.xml
+++ b/docs/quickstart/index.xml
@@ -579,6 +579,7 @@ Visit the <a
href="https://www.oracle.com/database/technologies/appdev/jdbc-d
<li>local disk file or directory</li>
<li>HDFS file or directory</li>
<li>Partial relational database</li>
+<li>Kafka topic</li>
</ul>
<h5 id="321-data-source-structure">3.2.1 Data source structure</h5>
<h6 id="3211-local-disk-file-or-directory">3.2.1.1 Local disk file or
directory</h6>
@@ -941,7 +942,7 @@ Visit the <a
href="https://www.oracle.com/database/technologies/appdev/jdbc-d
<div class="highlight"><pre tabindex="0"
style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code
class="language-bash" data-lang="bash"><span
style="display:flex;"><span>bin/mapping-convert.sh struct.json
</span></span></code></pre></div><p>A struct-v2.json will be
generated in the same directory as struct.json.</p>
<h5 id="332-input-source">3.3.2 Input Source</h5>
-<p>Input sources are currently divided into three categories: FILE, HDFS,
and JDBC, which are distinguished by the <code>type</code> node. We call
them local file input sources, HDFS input sources, and JDBC input sources,
which are described below.</p>
+<p>Input sources are currently divided into four categories: FILE, HDFS,
JDBC and KAFKA, which are distinguished by the <code>type</code> node. We
call them local file input sources, HDFS input sources, JDBC input sources, and
KAFKA input sources, which are described below.</p>
<h6 id="3321-local-file-input-source">3.3.2.1 Local file input
source</h6>
<ul>
<li>id: The id of the input source. This field is used to support some
internal functions. It is not required (it will be automatically generated if
it is not filled in). It is strongly recommended to write it, which is very
helpful for debugging;</li>
diff --git a/en/sitemap.xml b/en/sitemap.xml
index 893fa601..4ae2d5f5 100644
--- a/en/sitemap.xml
+++ b/en/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/docs/guides/architectural/</loc><lastmod>2023-06-25T21:06:07+08:00</lastmod><xhtml:link
rel="alternate" hreflang="cn"
href="/cn/docs/guides/architectural/"/><xhtml:link rel="alternate"
hreflang="en"
href="/docs/guides/architectural/"/></url><url><loc>/docs/config/config-guide/</loc><lastmod>2023-11-01T21:52:52+08:00</last
[...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/docs/guides/architectural/</loc><lastmod>2023-06-25T21:06:07+08:00</lastmod><xhtml:link
rel="alternate" hreflang="cn"
href="/cn/docs/guides/architectural/"/><xhtml:link rel="alternate"
hreflang="en"
href="/docs/guides/architectural/"/></url><url><loc>/docs/config/config-guide/</loc><lastmod>2023-11-01T21:52:52+08:00</last
[...]
\ No newline at end of file
diff --git a/sitemap.xml b/sitemap.xml
index a47f49b1..804d1808 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><sitemapindex
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"><sitemap><loc>/en/sitemap.xml</loc><lastmod>2023-11-30T14:46:43+08:00</lastmod></sitemap><sitemap><loc>/cn/sitemap.xml</loc><lastmod>2023-11-30T14:46:43+08:00</lastmod></sitemap></sitemapindex>
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><sitemapindex
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"><sitemap><loc>/en/sitemap.xml</loc><lastmod>2023-12-15T19:24:11+08:00</lastmod></sitemap><sitemap><loc>/cn/sitemap.xml</loc><lastmod>2023-12-15T19:24:11+08:00</lastmod></sitemap></sitemapindex>
\ No newline at end of file