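GitHub user wangzhewwzz added a comment to the discussion: About importing power grid topology data
It still reports errors! I have already made the changes you suggested. Taking Substation2LineSegment as an example, the output, schema, and mapping are as follows: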
>> HugeGraphLoader worked in NORMAL MODE
vertices/edges loaded this time :
08:35:15.974 [loader] ERROR org.apache.hugegraph.loader.task.ParseTaskBuilder - Parse EDGE error
org.apache.hugegraph.loader.exception.ParseException: Make sure the primary key fields [id] are not empty, or check whether the headers or field_mapping are configured correctly
.........................
count metrics
input read success : 188
input read failure : 0
vertex parse success : 81
vertex parse failure : 0
vertex insert success : 81
vertex insert failure : 0
edge parse success : 0
edge parse failure : 963
edge insert success : 0
edge insert failure : 0
--------------------------------------------------
meter metrics
total time : 0.576s
read time : 0.119s
load time : 0.457s
vertex load time : 0.084s
vertex load rate(vertices/s) : 964
edge load time : 0.373s
edge load rate(edges/s) : 0
schema.edgeLabel("Substation2LineSegment")
      .sourceLabel("Substation").targetLabel("LineSegment")
      .properties("source_label", "target_label")
      .ifNotExist().create();
{
  "label": "Substation2LineSegment",
  "source": ["from"],
  "target": ["to"],
  "input": {
    "type": "file",
    "path": "/loader/power_data/edge_connected_to.csv",
    "format": "CSV",
    "header": ["id", "from", "to", "label", "source_label", "target_label"]
  },
  "field_mapping": { "from": "from", "to": "to" },
  "value_mapping": {
    "label": { "CONNECTED_TO": "Substation2LineSegment" },
    "source_label": { "Substation": "Substation" },
    "target_label": { "LineSegment": "LineSegment" }
  }
},
{
  "label": "LineSegment2Substation",
  "source": ["to"],
  "target": ["from"],
  "input": {
    "type": "file",
    "path": "/loader/power_data/edge_connected_to.csv",
    "format": "CSV",
    "header": ["id", "from", "to", "label", "source_label", "target_label"]
  },
  "field_mapping": { "from": "from", "to": "to" },
  "value_mapping": {
    "label": { "CONNECTED_TO": "LineSegment2Substation" },
    "source_label": { "LineSegment": "LineSegment" },
    "target_label": { "Substation": "Substation" }
  }
}
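Because the ParseException asks to make sure the key fields are not empty and that the headers are configured correctly, one way to rule out the data side is to scan the CSV directly. The following is a minimal sketch in Python, not part of HugeGraph Loader: `find_empty_fields` is a hypothetical helper, the sample rows are invented for illustration, and it assumes a comma-separated file with no header row of its own (the header comes from struct.json).

```python
import csv
import io

# Column names copied from the "header" entry of the mapping above.
HEADER = ["id", "from", "to", "label", "source_label", "target_label"]


def find_empty_fields(lines, header=HEADER, required=("from", "to")):
    """Return (row_number, problem) tuples for rows with a wrong column
    count or a blank value in one of the required columns."""
    problems = []
    for row_no, row in enumerate(csv.reader(lines), start=1):
        if len(row) != len(header):
            problems.append(
                (row_no, f"expected {len(header)} columns, got {len(row)}")
            )
            continue
        record = dict(zip(header, row))
        for name in required:
            if not record[name].strip():
                problems.append((row_no, f"empty '{name}'"))
    return problems


if __name__ == "__main__":
    # Invented sample data; replace with open(path, newline="") for a real file.
    sample = io.StringIO(
        "e1,sub_1,seg_1,CONNECTED_TO,Substation,LineSegment\n"
        "e2,,seg_2,CONNECTED_TO,Substation,LineSegment\n"
        "e3,sub_3,seg_3,CONNECTED_TO\n"
    )
    for row_no, problem in find_empty_fields(sample):
        print(f"row {row_no}: {problem}")
```

If this reports nothing for the real file, the data itself is probably not the cause, and the header or field_mapping configuration is the more likely place to look.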
Original message
From: dosubot[bot] ***@***.***>
Sent: December 9, 2025, 16:28
To: apache/incubator-hugegraph ***@***.***>
Cc: wangzhewwzz ***@***.***>, Mention ***@***.***>
Subject: Re: [apache/incubator-hugegraph] About importing power grid topology data (Discussion #2917)
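The problem in your configuration is now clear: the id field of the edge data was mistakenly imported as an edge property, but the edge label in schema.groovy does not declare an id property, so HugeGraphLoader reports the "Invalid property 'id' for edge label" error (reference link).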
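How to fix it:
1. In struct.json, field_mapping must contain only from/to; do not include id:
   "field_mapping": { "from": "from", "to": "to" }
   This way the loader uses only the from/to fields to look up vertices and does not import the id field as an edge property.
2. In schema.groovy, the edgeLabel .properties() call should list only the properties that are actually needed, for example:
   schema.edgeLabel("LineSegment2LineSegment")
   .sourceLabel("LineSegment") .targetLabel("LineSegment")
   .properties("source_label", "target_label")
   .ifNotExist().create();
   Do not add "id".
3. The id column in edge_connected_to.csv can stay. The loader will not import it as an edge property, and no error is raised as long as field_mapping does not mention id.
4. value_mapping must be a flat structure; the way you have written it now is fine.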
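Your current field_mapping is written as:
"field_mapping": { "from": "id", "to": "id" }
This imports the CSV id column as an edge property, which causes the error. The correct form is:
"field_mapping": { "from": "from", "to": "to" }
This way the loader uses only the from/to fields to look up vertices and does not import the id field as an edge property.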
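Summary:
- In the edges section of struct.json, field_mapping contains only from/to, not id.
- In schema.groovy, the edgeLabel .properties() call lists only "source_label", "target_label"; do not add "id".
- The id column in edge_connected_to.csv can stay, but the loader will not import it as an edge property.
- value_mapping only needs to be a flat structure.
With this configuration, the edges import normally without errors and all relationships can be imported. References: official FAQ; notes on the source-code mechanism.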
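The checklist in this reply can be turned into a small consistency check on struct.json. This is a minimal sketch in Python under stated assumptions: `check_edge_mapping` is a hypothetical helper, not a HugeGraph Loader API, and it only mirrors the advice given in this reply (names must appear in the header, and id should not appear in field_mapping) without verifying that advice against the loader itself.

```python
import json


def check_edge_mapping(edge):
    """Return a list of warnings for one edge entry of struct.json."""
    header = edge.get("input", {}).get("header", [])
    mapping = edge.get("field_mapping", {})
    warnings = []
    # Every source/target field should be a column named in the header.
    for name in edge.get("source", []) + edge.get("target", []):
        if name not in header:
            warnings.append(f"source/target field '{name}' is not in header")
    # Every field_mapping key should be a column named in the header.
    for column in mapping:
        if column not in header:
            warnings.append(f"field_mapping key '{column}' is not in header")
    # Rule taken from this reply only: keep id out of field_mapping.
    if "id" in mapping or "id" in mapping.values():
        warnings.append("field_mapping mentions 'id'")
    return warnings


if __name__ == "__main__":
    # Fragment modelled on the mapping quoted in this thread.
    edge = json.loads("""
    {
      "label": "Substation2LineSegment",
      "source": ["from"],
      "target": ["to"],
      "input": {"header": ["id", "from", "to", "label",
                           "source_label", "target_label"]},
      "field_mapping": {"from": "id", "to": "id"}
    }
    """)
    for warning in check_edge_mapping(edge):
        print(f"{edge['label']}: {warning}")
```

An empty result means only that the mapping is internally consistent with these rules; it does not guarantee that the load will succeed.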
GitHub link:
https://github.com/apache/incubator-hugegraph/discussions/2917#discussioncomment-15206028