My understanding is that the label ID was made part of the vertex and edge
IDs because of a design decision: a vertex or edge is only allowed to have
1 label.

Since a vertex or edge could only have 1 label, it could be stored in a
table named after that label. The idea was that separating vertices and
edges by label would speed up access to them. Storing the label ID in the
vertex or edge ID makes it easy to find the label's table first, and then
the actual entry within it.
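
As a minimal sketch of that layout (not AGE's actual code): judging from
the 0x0000ffffffffffff mask mentioned in the quoted message and the sample
IDs in the query output in this message, the label ID appears to sit in the
bits above the low 48, with the per-label sequence value in the low 48
bits. The names split_graph_id and make_graph_id are hypothetical, made up
for illustration only.

```python
# Assumed layout, inferred from the mask and the sample IDs in this thread:
# high bits = label ID, low 48 bits = per-label sequence value.
ENTRY_ID_BITS = 48                  # assumption: width of the entry part
ENTRY_ID_MASK = 0x0000FFFFFFFFFFFF  # mask from the quoted message


def split_graph_id(graph_id):
    """Return (label_id, entry_id) for a vertex or edge ID."""
    return graph_id >> ENTRY_ID_BITS, graph_id & ENTRY_ID_MASK


def make_graph_id(label_id, entry_id):
    """Combine a label ID and a sequence value into one ID."""
    return (label_id << ENTRY_ID_BITS) | (entry_id & ENTRY_ID_MASK)


if __name__ == "__main__":
    # IDs taken from the graph1.zero and graph1.knows output in this message.
    samples = [
        ("zero vertex", 844424930131969),
        ("knows edge", 10133099161583617),
        ("knows start_id", 281474976710657),
        ("knows end_id", 281474976710658),
    ]
    for name, gid in samples:
        label_id, entry_id = split_graph_id(gid)
        print(f"{name}: label_id={label_id} entry_id={entry_id}")
```

Under that assumption, the zero vertex decodes to label 3 and the knows
edge to label 36, which matches the id column in ag_label.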

However, I'm not sure that this necessarily increases performance, as
going between multiple tables may be just as inefficient as keeping all the
entries in one. And everything comes at a cost. The cost here is that it
becomes more difficult to support multiple labels or to use the maximum
number of graph components.
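
To put rough numbers on that last point, here is a back-of-the-envelope
sketch. It assumes the ID is a 64-bit value split into a 48-bit entry part
(from the mask in the quoted message) and a label part made of the
remaining 16 bits; the 16 is my assumption, it is simply what is left over.

```python
ID_BITS = 64                             # assumption: IDs are 64-bit values
ENTRY_ID_BITS = 48                       # from the mask 0x0000ffffffffffff
LABEL_ID_BITS = ID_BITS - ENTRY_ID_BITS  # assumption: the remaining bits

# Sequences start at 1, so the all-zero value is not counted.
max_entries_per_label = 2**ENTRY_ID_BITS - 1
max_labels = 2**LABEL_ID_BITS - 1

if __name__ == "__main__":
    print("max entries per label:", max_entries_per_label)
    print("max labels:", max_labels)
```

So, under these assumptions, any one label is capped at roughly 2.8e14
entries, even if every other label is nearly empty.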

Hopefully, this was helpful.

john

Below are some queries showing this structure, for reference:

psql-11.5-5432-pgsql=# select * from ag_label;
       name       | graph | id | kind |        relation         |        seq_name
------------------+-------+----+------+-------------------------+-------------------------
 _ag_label_vertex | 16943 |  1 | v    | graph1._ag_label_vertex | _ag_label_vertex_id_seq
 _ag_label_edge   | 16943 |  2 | e    | graph1._ag_label_edge   | _ag_label_edge_id_seq
 zero             | 16943 |  3 | v    | graph1.zero             | zero_id_seq
 knows            | 16943 | 36 | e    | graph1.knows            | knows_id_seq
(4 rows)

psql-11.5-5432-pgsql=# select * from graph1.zero;
       id        |                 properties
-----------------+---------------------------------------------
 844424930131969 | {"name": "Zero", "value": 2.71828182845905}
(1 row)

psql-11.5-5432-pgsql=# select * from graph1.knows;
        id         |    start_id     |     end_id      | properties
-------------------+-----------------+-----------------+------------
 10133099161583617 | 281474976710657 | 281474976710658 | {}
(1 row)

psql-11.5-5432-pgsql=# SELECT * from cypher('graph1', $$ MATCH (n:zero) RETURN n $$) as (result agtype);
                                                   result
-------------------------------------------------------------------------------------------------------------
 {"id": 844424930131969, "label": "zero", "properties": {"name": "Zero", "value": 2.71828182845905}}::vertex
(1 row)

psql-11.5-5432-pgsql=# SELECT * from cypher('graph1', $$ MATCH ()-[e:knows]->() RETURN e $$) as (result agtype);
                                                           result
-----------------------------------------------------------------------------------------------------------------------------
 {"id": 10133099161583617, "label": "knows", "end_id": 281474976710658, "start_id": 281474976710657, "properties": {}}::edge
(1 row)

psql-11.5-5432-pgsql=#

On Wed, Apr 12, 2023 at 1:45 PM Panagiotis Foliadis <pfolia...@hotmail.com>
wrote:

> Hey all,
> After digging around in the source code I've discovered that the id's for
> vertices and edges are generated
> through the same procedure (by applying a mask on the label id of the
> different vertices and edges). That leads
> to the fact that the id's generated are solely based on the id of the
> label that they would be inserted, and also
> that the potential id's are huge numbers (since the mask applied is
> 0x0000ffffffffffff). Is there a reason for this?
> Why can't the sequence be used and generated id's starting from 1, without
> applying the mask, for each vertex and edge?
>
