GitHub user wangzhewwzz added a comment to the discussion: About importing power grid topology data
Caused by: java.lang.IllegalArgumentException: Make sure the primary key fields [id] are not empty, or check whether the headers or field_mapping are configured correctly
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:164) ~[guava-30.0-jre.jar:?]
    at org.apache.hugegraph.util.E.checkArgument(E.java:52) ~[hugegraph-common-1.5.0.jar:1.5.0]
    at org.apache.hugegraph.loader.builder.ElementBuilder$VertexPkKVPairs.extractFromEdge(ElementBuilder.java:682) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    at org.apache.hugegraph.loader.builder.EdgeBuilder.build(EdgeBuilder.java:88) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    at org.apache.hugegraph.loader.task.ParseTaskBuilder.lambda$buildTask$0(ParseTaskBuilder.java:103) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    ... 8 more
01:17:47.105 [loader] ERROR org.apache.hugegraph.loader.task.ParseTaskBuilder - Parse EDGE error
org.apache.hugegraph.loader.exception.ParseException: Make sure the primary key fields [id] are not empty, or check whether the headers or field_mapping are configured correctly
    at org.apache.hugegraph.loader.task.ParseTaskBuilder.lambda$buildTask$0(ParseTaskBuilder.java:124) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    at org.apache.hugegraph.loader.task.ParseTaskBuilder$ParseTask.get(ParseTaskBuilder.java:163) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    at org.apache.hugegraph.loader.HugeGraphLoader.executeParseTask(HugeGraphLoader.java:816) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    at org.apache.hugegraph.loader.HugeGraphLoader.loadStruct(HugeGraphLoader.java:789) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    at org.apache.hugegraph.loader.HugeGraphLoader.lambda$asyncLoadStruct$14(HugeGraphLoader.java:734) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    at java.util.concurrent.CompletableFuture$AsyncRun.run(Unknown Source) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
    at java.lang.Thread.run(Unknown Source) ~[?:?]
Caused by: java.lang.IllegalArgumentException: Make sure the primary key fields [id] are not empty, or check whether the headers or field_mapping are configured correctly
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:164) ~[guava-30.0-jre.jar:?]
    at org.apache.hugegraph.util.E.checkArgument(E.java:52) ~[hugegraph-common-1.5.0.jar:1.5.0]
    at org.apache.hugegraph.loader.builder.ElementBuilder$VertexPkKVPairs.extractFromEdge(ElementBuilder.java:682) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    at org.apache.hugegraph.loader.builder.EdgeBuilder.build(EdgeBuilder.java:88) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    at org.apache.hugegraph.loader.task.ParseTaskBuilder.lambda$buildTask$0(ParseTaskBuilder.java:103) ~[hugegraph-loader-1.7.0.jar:1.7.0]
    ... 8 more
01:17:47.107 [loader] ERROR org.apache.hugegraph.loader.util.Printer - More than 1 parse error, stop parsing and waiting all insert tasks stopped
More than 1 parse error, stop parsing and waiting all insert tasks stopped
81/0
--------------------------------------------------
count metrics
input read success : 188
input read failure : 0
vertex parse success : 81
vertex parse failure : 0
vertex insert success : 81
vertex insert failure : 0
edge parse success : 0
edge parse failure : 963
edge insert success : 0
edge insert failure : 0
--------------------------------------------------
meter metrics
total time : 0.601s
read time : 0.116s
load time : 0.485s
vertex load time : 0.095s
vertex load rate(vertices/s) : 852
edge load time : 0.39s
edge load rate(edges/s) : 0
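The counts above are consistent with one reading of the failure, offered here as an assumption rather than something the log states: if the 188 input lines are the 81 vertex rows plus the rows of the edge CSV, and every edge row is tried against each of the 9 edge mappings in struct.json, the edge failure count follows directly. A minimal sketch of that arithmetic:

```python
# Figures copied from the count metrics above.
input_read_success = 188
vertex_parse_success = 81
edge_parse_failure = 963

# Assumption: the remaining input lines are the rows of the edge CSV.
edge_rows = input_read_success - vertex_parse_success

# Number of entries under "edges" in struct.json.
edge_mappings = 9

# If every edge row fails under every mapping, this is the failure total.
expected_failures = edge_rows * edge_mappings
print(edge_rows, expected_failures)
```

If that reading is right, no edge row parsed under any mapping, which points at a problem shared by all nine mappings rather than at individual bad rows.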
The struct.json script is as follows:
{
"vertices": [
{
"label": "Substation",
"input": {
"type": "file",
"path":
"/loader/power_data/vertex_substation-1.csv",
"format": "CSV",
"header": ["id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation", "belongFeeder"],
"charset": "UTF-8"
},
"null_values": ["NULL", "null", ""]
},
{
"label": "LineSegment",
"input": {
"type": "file",
"path":
"/loader/power_data/vertex_linesegment-1.csv",
"format": "CSV",
"header": ["id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation", "belongFeeder"],
"charset": "UTF-8"
},
"null_values": ["NULL", "null", ""]
},
{
"label": "LineSwitch",
"input": {
"type": "file",
"path":
"/loader/power_data/vertex_lineswitch-1.csv",
"format": "CSV",
"header": ["id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation", "belongFeeder"],
"charset": "UTF-8"
},
"null_values": ["NULL", "null", ""]
},
{
"label": "StationHouse",
"input": {
"type": "file",
"path":
"/loader/power_data/vertex_stationhouse-1.csv",
"format": "CSV",
"header": ["id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation", "belongFeeder"],
"charset": "UTF-8"
},
"null_values": ["NULL", "null", ""]
}
],
"edges": [
{
"label": "Substation2LineSegment",
"source": ["from"],
"target": ["to"],
"input": {
"type": "file",
"path": "/loader/power_data/edge_connected_to.csv",
"format": "CSV",
"header": ["id", "from", "to", "label",
"source_label", "target_label"]
},
"field_mapping": { "from": "from", "to": "to" },
"value_mapping": {
"label": {"CONNECTED_TO": "Substation2LineSegment"},
"source_label": { "Substation": "Substation" },
"target_label": { "LineSegment": "LineSegment" }
}
},
{
"label": "LineSegment2Substation",
"source": ["to"],
"target": ["from"],
"input": {
"type": "file",
"path": "/loader/power_data/edge_connected_to.csv",
"format": "CSV",
"header": ["id", "from", "to", "label",
"source_label", "target_label"]
},
"field_mapping": { "from": "from", "to": "to" },
"value_mapping": {
"label": { "CONNECTED_TO":
"LineSegment2Substation" },
"source_label": { "LineSegment":
"LineSegment" },
"target_label": { "Substation": "Substation" }
}
},
{
"label": "LineSegment2StationHouse",
"source": ["from"],
"target": ["to"],
"input": {
"type": "file",
"path": "/loader/power_data/edge_connected_to.csv",
"format": "CSV",
"header": ["id", "from", "to", "label",
"source_label", "target_label"]
},
"field_mapping": { "from": "from", "to": "to" },
"value_mapping": {
"label": { "CONNECTED_TO":
"LineSegment2StationHouse" },
"source_label": { "LineSegment": "LineSegment" },
"target_label": { "StationHouse": "StationHouse" }
}
},
{
"label": "StationHouse2LineSegment",
"source": ["to"],
"target": ["from"],
"input": {
"type": "file",
"path": "/loader/power_data/edge_connected_to.csv",
"format": "CSV",
"header": ["id", "from", "to", "label",
"source_label", "target_label"]
},
"field_mapping": { "from": "from", "to": "to" },
"value_mapping": {
"label": { "CONNECTED_TO":
"StationHouse2LineSegment" },
"source_label": { "StationHouse": "StationHouse" },
"target_label": { "LineSegment": "LineSegment" }
}
},
{
"label": "StationHouse2LineSwitch",
"source": ["from"],
"target": ["to"],
"input": {
"type": "file",
"path": "/loader/power_data/edge_connected_to.csv",
"format": "CSV",
"header": ["id", "from", "to", "label",
"source_label", "target_label"]
},
"field_mapping": { "from": "from", "to": "to" },
"value_mapping": {
"label": { "CONNECTED_TO":
"StationHouse2LineSwitch"},
"source_label": { "StationHouse": "StationHouse" },
"target_label": { "LineSwitch": "LineSwitch" }
}
},
{
"label": "LineSwitch2StationHouse",
"source": ["to"],
"target": ["from"],
"input": {
"type": "file",
"path": "/loader/power_data/edge_connected_to.csv",
"format": "CSV",
"header": ["id", "from", "to", "label",
"source_label", "target_label"]
},
"field_mapping": { "from": "from", "to": "to" },
"value_mapping": {
"label": { "CONNECTED_TO":
"LineSwitch2StationHouse" },
"source_label": { "LineSwitch": "LineSwitch" },
"target_label": { "StationHouse": "StationHouse" }
}
},
{
"label": "LineSegment2LineSegment",
"source": ["from"],
"target": ["to"],
"input": {
"type": "file",
"path": "/loader/power_data/edge_connected_to.csv",
"format": "CSV",
"header": ["id", "from", "to", "label",
"source_label", "target_label"]
},
"field_mapping": { "from": "from", "to": "to" },
"value_mapping": {
"label": { "CONNECTED_TO":
"LineSegment2LineSegment" },
"source_label": { "LineSegment": "LineSegment" },
"target_label": { "LineSegment": "LineSegment" }
}
},
{
"label": "LineSegment2LineSwitch",
"source": ["to"],
"target": ["from"],
"input": {
"type": "file",
"path": "/loader/power_data/edge_connected_to.csv",
"format": "CSV",
"header": ["id", "from", "to", "label",
"source_label", "target_label"]
},
"field_mapping": { "from": "from", "to": "to" },
"value_mapping": {
"label": { "CONNECTED_TO": "LineSegment2LineSwitch"
},
"source_label": { "LineSegment": "LineSegment" },
"target_label": { "LineSwitch": "LineSwitch" }
}
},
{
"label": "LineSwitch2LineSegment",
"source": ["from"],
"target": ["to"],
"input": {
"type": "file",
"path": "/loader/power_data/edge_connected_to.csv",
"format": "CSV",
"header": ["id", "from", "to", "label",
"source_label", "target_label"]
},
"field_mapping": { "from": "from", "to": "to" },
"value_mapping": {
"label": { "CONNECTED_TO": "LineSwitch2LineSegment"
},
"source_label": { "LineSwitch": "LineSwitch" },
"target_label": { "LineSegment": "LineSegment" }
}
}
]
}
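The exception names the primary key fields [id], and every vertex label in the schema uses primaryKeys("id"), while each edge mapping above maps "from" to "from" and "to" to "to". One plausible reading, an assumption about loader behavior that this thread does not confirm, is that the columns listed in source and target must be mapped to the vertex primary-key property names, for example "field_mapping": {"from": "id", "to": "id"}. The sketch below flags endpoint columns whose mapped name is not a primary key; the helper name unmapped_endpoint_fields is hypothetical.

```python
import json


def unmapped_endpoint_fields(struct, primary_keys):
    """Return (edge label, side, column, mapped name) for endpoint columns
    whose mapped name is not one of the vertex primary keys."""
    problems = []
    for edge in struct.get("edges", []):
        mapping = edge.get("field_mapping", {})
        for side in ("source", "target"):
            for column in edge.get(side, []):
                # A column without a mapping entry keeps its own name.
                mapped = mapping.get(column, column)
                if mapped not in primary_keys:
                    problems.append((edge["label"], side, column, mapped))
    return problems


# Trimmed-down copy of the first edge mapping above.
struct = json.loads("""
{"edges": [{"label": "Substation2LineSegment",
            "source": ["from"], "target": ["to"],
            "field_mapping": {"from": "from", "to": "to"}}]}
""")

for problem in unmapped_endpoint_fields(struct, primary_keys={"id"}):
    print(problem)
```

With the mapping changed to {"from": "id", "to": "id"}, the same check returns an empty list.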
schema.groovy is as follows:
// Property key definitions
schema.propertyKey("id").asText().ifNotExist().create();
schema.propertyKey("name").asText().ifNotExist().create();
schema.propertyKey("objectType").asText().ifNotExist().create();
schema.propertyKey("objectHandle").asText().ifNotExist().create();
schema.propertyKey("deviceType").asText().ifNotExist().create();
schema.propertyKey("internalEndpointNo").asInt().ifNotExist().create();
schema.propertyKey("terminalNo").asInt().ifNotExist().create();
schema.propertyKey("usage").asText().ifNotExist().create();
schema.propertyKey("powerFlowDirection").asInt().ifNotExist().create();
schema.propertyKey("belongSubstation").asText().ifNotExist().create();
schema.propertyKey("belongFeeder").asText().ifNotExist().create();
schema.propertyKey("label").asText().ifNotExist().create();
schema.propertyKey("source_label").asText().ifNotExist().create();
schema.propertyKey("target_label").asText().ifNotExist().create();
// Vertex labels
schema.vertexLabel("Substation").properties("id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation",
"belongFeeder").primaryKeys("id").ifNotExist().create();
schema.vertexLabel("LineSegment").properties("id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation",
"belongFeeder").primaryKeys("id").ifNotExist().create();
schema.vertexLabel("LineSwitch").properties("id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation",
"belongFeeder").primaryKeys("id").ifNotExist().create();
schema.vertexLabel("StationHouse").properties("id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation",
"belongFeeder").primaryKeys("id").ifNotExist().create();
// Edge labels (one edge label per pair of vertex types, for easier extension and bidirectional links)
schema.edgeLabel("Substation2LineSegment").sourceLabel("Substation").targetLabel("LineSegment").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("LineSegment2StationHouse").sourceLabel("LineSegment").targetLabel("StationHouse").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("LineSegment2LineSegment").sourceLabel("LineSegment").targetLabel("LineSegment").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("StationHouse2LineSwitch").sourceLabel("StationHouse").targetLabel("LineSwitch").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("LineSegment2LineSwitch").sourceLabel("LineSegment").targetLabel("LineSwitch").properties("source_label",
"target_label").ifNotExist().create();
// ...add more as needed
// Reverse edges (define reverse edge labels if bidirectional links are needed)
schema.edgeLabel("LineSegment2Substation").sourceLabel("LineSegment").targetLabel("Substation").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("StationHouse2LineSegment").sourceLabel("StationHouse").targetLabel("LineSegment").properties("source_label",
"target_label").ifNotExist().create();
//schema.edgeLabel("LineSegment2LineSegment").sourceLabel("LineSegment").targetLabel("LineSegment").properties().ifNotExist().create();
schema.edgeLabel("LineSwitch2StationHouse").sourceLabel("LineSwitch").targetLabel("StationHouse").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("LineSwitch2LineSegment").sourceLabel("LineSwitch").targetLabel("LineSegment").properties("source_label",
"target_label").ifNotExist().create();
Please check the errors one by one and correct them.
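One check that can be done mechanically is whether every edge label used in struct.json is actually created in schema.groovy, ignoring commented-out lines. The sketch below uses a simple regular expression and is not a Groovy parser; the function names are hypothetical, and the schema text is a two-statement excerpt chosen to show the commented-out case.

```python
import re


def edge_labels_in_schema(groovy_text):
    """Collect edge label names from schema.edgeLabel("...") calls,
    skipping lines that start with a // comment."""
    labels = set()
    for line in groovy_text.splitlines():
        if line.lstrip().startswith("//"):
            continue
        labels.update(re.findall(r'schema\.edgeLabel\("([^"]+)"\)', line))
    return labels


def missing_edge_labels(struct_labels, groovy_text):
    """Return struct.json edge labels that the schema text never creates."""
    return sorted(set(struct_labels) - edge_labels_in_schema(groovy_text))


# Excerpt only: one active definition and one commented-out definition.
schema_text = '''
schema.edgeLabel("Substation2LineSegment").sourceLabel("Substation").targetLabel("LineSegment").properties("source_label",
"target_label").ifNotExist().create();
//schema.edgeLabel("LineSegment2LineSegment").sourceLabel("LineSegment").targetLabel("LineSegment").properties().ifNotExist().create();
'''

print(missing_edge_labels(
    ["Substation2LineSegment", "LineSegment2LineSegment"], schema_text))
```

In the excerpt the commented-out label is reported as missing. In the complete schema.groovy and struct.json above, all nine edge labels used by struct.json have an active definition, so a missing edge label is unlikely to be the cause of the parse errors.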
Original message
From: dosubot[bot] ***@***.***>
Sent: December 9, 2025, 16:04
To: apache/incubator-hugegraph ***@***.***>
Cc: wangzhewwzz ***@***.***>, Mention ***@***.***>
Subject: Re: [apache/incubator-hugegraph] About importing power grid topology data (Discussion #2917)
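Your configuration is one step away from importing successfully: the .properties() of every edge label (edgeLabel) must include all the properties you actually import; otherwise the loader reports an error such as "Invalid property 'source_label' for edge label" [Ref 1][Ref 2].
The edge label in your schema.groovy currently reads: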
schema.edgeLabel("Substation2LineSegment").sourceLabel("Substation").targetLabel("LineSegment").properties().ifNotExist().create();
It should be changed to:
schema.edgeLabel("Substation2LineSegment")
.sourceLabel("Substation") .targetLabel("LineSegment")
.properties("source_label", "target_label")
.ifNotExist().create();
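Every edge label that uses source_label or target_label must be written this way (do not add id unless you really want to store id as an edge property). Your struct.json and value_mapping are written correctly, and field_mapping only needs from/to.
Standard form, for example:
schema.groovy (using LineSegment2LineSegment as an example):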
schema.edgeLabel("LineSegment2LineSegment")
.sourceLabel("LineSegment") .targetLabel("LineSegment")
.properties("source_label", "target_label")
.ifNotExist().create();
struct.json (using LineSegment2LineSegment as an example):
{
  "label": "LineSegment2LineSegment",
  "source": ["from"],
  "target": ["to"],
  "input": {
    "type": "file",
    "path": "/loader/power_data/edge_connected_to.csv",
    "format": "CSV",
    "header": ["id", "from", "to", "label", "source_label", "target_label"]
  },
  "field_mapping": { "from": "id", "to": "id" },
  "value_mapping": {
    "label": { "CONNECTED_TO": "LineSegment2LineSegment" },
    "source_label": { "LineSegment": "LineSegment" },
    "target_label": { "LineSegment": "LineSegment" }
  }
}
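Notes:
- The edgeLabel .properties() should list only "source_label" and "target_label"; do not add "id".
- In struct.json, field_mapping should only contain from/to; do not write id.
- value_mapping must be a flat structure.
- The source_label/target_label values in edge_connected_to.csv must match the vertex labels exactly (case-sensitive).
With this configuration, the edges import without errors and all relationships are loaded [Ref 3][Ref 4].
If you still see a specific error, paste the relevant pieces of struct.json/schema.groovy/CSV and I can check them item by item.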
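Because all nine edge mappings read the same edge_connected_to.csv, it can help to see which (source_label, target_label) pairs the file actually contains before deciding how to route rows to edge labels. The sketch below assumes the file has no header row and uses the column order from the "header" setting; the sample rows are made up, since the real CSV content is not shown in this thread. It also rests on the assumption that value_mapping rewrites values and does not filter rows, in which case splitting the file per pair is one possible workaround.

```python
import csv
import io
from collections import defaultdict

# Column order taken from the "header" setting in struct.json.
HEADER = ["id", "from", "to", "label", "source_label", "target_label"]


def group_edges_by_label_pair(csv_text):
    """Group headerless edge rows by their (source_label, target_label) pair."""
    groups = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=HEADER)
    for row in reader:
        groups[(row["source_label"], row["target_label"])].append(row)
    return dict(groups)


# Made-up sample rows for illustration only.
sample = (
    "e1,s1,l1,CONNECTED_TO,Substation,LineSegment\n"
    "e2,l1,h1,CONNECTED_TO,LineSegment,StationHouse\n"
    "e3,s2,l2,CONNECTED_TO,Substation,LineSegment\n"
)

for pair, rows in group_edges_by_label_pair(sample).items():
    print(pair, len(rows))
```

Each group could then be written to its own file and referenced from the matching edge mapping, so that a row is only parsed against the edge label whose source and target vertex labels it really has.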
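References:
1. Official documentation: edge property declaration and import mechanism
2. Source code mechanism notes
3. Official schema example
4. Community case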
GitHub link:
https://github.com/apache/incubator-hugegraph/discussions/2917#discussioncomment-15214226