Hi all,

I'd like to propose expanding label support in TinkerPop 4.0 to allow multiple 
mutable labels on vertices and edges.

Background:

When TinkerPop was first developed, property graph implementations rarely 
supported anything other than a single immutable label. Since then, multi-label 
support has become more common, and mutable labels and no-label options are 
increasingly supported across graph databases. The GQL standard defines both 
vertices and edges as having zero or more labels.

Supporting multi-label and mutable labels fits relatively well into Gremlin 
syntax and semantics. The notion of "no label" is more nuanced. It makes sense 
for analytics use cases where algorithms don't care about element 
classification, but less so for transactional cases where labels serve as 
schema anchors. Rather than dictating one behavior, providers would be free to 
configure the extent to which they wish to support multi-label, if at all.

Proposal:

The goal is to make multi-label opt-in for providers, with configuration over 
which label cardinalities and element types to support. Providers that wish to 
remain single-label can do so without breaking changes.

For TinkerPop's reference implementation, I propose:

- Vertices: 0..N label support (with 0..1, 1..1 and 1..N as configurable 
options)
- Edges: 0..1 label support initially, with the foundation in place for N 
labels later
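
To make the cardinality options concrete, here is a minimal sketch of how a 
provider might model them. The LabelCardinality enum and its allows() method 
are hypothetical names used only for illustration. They are not part of 
TinkerPop or of the planned PR.

```java
// Hypothetical model of the label cardinalities listed above.
// Names are illustrative assumptions, not TinkerPop API.
enum LabelCardinality {
    ZERO_OR_ONE(0, 1),                  // 0..1
    EXACTLY_ONE(1, 1),                  // 1..1 (today's single-label behavior)
    ZERO_OR_MORE(0, Integer.MAX_VALUE), // 0..N
    ONE_OR_MORE(1, Integer.MAX_VALUE);  // 1..N

    private final int min;
    private final int max;

    LabelCardinality(int min, int max) {
        this.min = min;
        this.max = max;
    }

    // True if an element holding labelCount labels satisfies this cardinality.
    boolean allows(int labelCount) {
        return labelCount >= min && labelCount <= max;
    }
}
```

A provider could check a mutation against its configured value before applying 
it, for example rejecting the removal of the last label under 1..N.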

Key structural changes:
- Element.label() is deprecated in favor of Set<String> labels(). The label() 
method returns the first label for backward compatibility. The default labels() 
implementation in Element delegates to Collections.singleton(label()), so 
existing providers work without changes.
- Serialization uses List<String> for all label fields in GraphBinary V4 and 
GraphSON V4. The wire format is already list-based, so this change populates 
the list fully rather than always writing a singleton.
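
The backward-compatibility point can be sketched roughly as follows. 
SketchElement is a hypothetical stand-in for Element, not the real interface, 
and both vertex classes are illustrative only. The multi-label class assumes at 
least one label, since what label() should return for an element with no labels 
is not settled in this proposal.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for the proposed Element contract.
interface SketchElement {
    String label();

    // Default described above: wrap the single label in a one-element set.
    default Set<String> labels() {
        return Collections.singleton(label());
    }
}

// An existing single-label provider: implements only label(), unchanged.
final class SingleLabelVertex implements SketchElement {
    private final String label;

    SingleLabelVertex(String label) {
        this.label = label;
    }

    @Override
    public String label() {
        return label;
    }
}

// A multi-label provider: overrides labels(); label() returns the first label.
final class MultiLabelVertex implements SketchElement {
    // LinkedHashSet keeps insertion order so "first label" is well defined.
    private final Set<String> labels = new LinkedHashSet<>();

    MultiLabelVertex(String first, String... rest) {
        labels.add(first);
        labels.addAll(Arrays.asList(rest));
    }

    @Override
    public String label() {
        return labels.iterator().next();
    }

    @Override
    public Set<String> labels() {
        return Collections.unmodifiableSet(labels);
    }
}
```

Here new SingleLabelVertex("person").labels() yields a set containing only 
"person" without the provider writing any new code, while 
new MultiLabelVertex("person", "employee").label() still returns "person" for 
callers that have not migrated.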

New steps: 
- addLabel(String, String...), dropLabel(String, String...) and dropLabels() 
for mutating labels, plus labels() for streaming all labels.
- A with('multilabel') configuration for valueMap()/elementMap(). Single-label 
output remains the default, and multi-label output is returned only when 
configured.

While multiple labels add some complexity to TinkerPop's model, this opens the 
door for providers who want to expand their database models and moves toward 
interoperability with GQL's label semantics.

I plan to draft a PR soon with a design proposal and initial implementation. 
The goal is to include some level of multi-label support for the upcoming beta 
release, setting a good foundation for 4.0.0 GA feedback.

Please share any thoughts, concerns, or questions in this thread.

Thank you, 

Yang
