A heads-up on the IO/serialization impact of this change, and how I'm proposing 
we handle it for now.

Because Tree no longer extends HashMap, its serialized form changes across the 
IO formats:

  - Gryo: previously Tree was serialized implicitly via Kryo's MapSerializer 
(it picked that up by virtue of being a Map). With Tree as a plain type, that 
no longer applies, so I've
  added an explicit Tree serializer. The on-wire type and framing in Gryo are 
therefore different from before that the old Gryo-encoded Tree data will not be 
readable by the new serializer.
  - GraphSON V1-3 and V4: the JSON structure is unchanged; only the internal 
mechanism for reading/writing the entries was adjusted to use the new 
tree-shaped API. No format change
  there.

I want to be upfront that this is a breaking change for Gryo IO. My proposal is 
to land it now to unblock the Tree work and revisit the serialization story 
before the TP4 release rather than try to preserve byte compatibility with the 
3.x Gryo Tree format, which isn't a goal across a major version anyway.

When we revisit for TP4, a few things are worth putting on the table:

  - Gryo: TP4 likely warrants a major version bump of Kryo itself, which would 
completely break the old Gryo binary format regardless of this Tree change. 
Given that, it may make sense to drop GryoV1-3 entirely and replace it with a 
different/refreshed format rather than carry the legacy registrations forward.
  - GraphSON: similarly, we may want to remove GraphSON V1-3 and standardize on 
V4, with a file converter to migrate existing GraphSON V1-3 documents to V4 so 
users have a clean upgrade path.

Thanks,
Guian

________________________________
From: Guian Gumpac <[email protected]>
Sent: May 29, 2026 3:24 PM
To: Cole Greer via dev <[email protected]>
Subject: [DISCUSS] Make Tree no longer extend HashMap

Hi all,

I propose changing Tree in the server to no longer extend HashMap as it could 
have functions that don't align with Tree behaviour.

Problem
=======

Tree<T> extends HashMap<T, Tree<T>>, and as a result several Tree operations 
behave like Map operations rather than tree operations users expect:

  * tree.size() returns top-level entries, not total nodes
  * tree.containsKey(x) only checks immediate children
  * tree.isLeaf() throws NoSuchElementException on an empty tree
  * getObjectsAtDepth(0) returns an empty list rather than the roots


Proposed change
===============

Replace Tree<T> extends HashMap<T, Tree<T>> with a final class that contains a 
HashMap and exposes only tree-shaped methods:

    public final class Tree<T> implements Serializable {
        private final Map<T, Tree<T>> children = new HashMap<>();

        public Set<T>             rootNodes();
        public Tree<T>            childAt(T key);
        public boolean            isLeaf();
        public int                nodeCount();
        public Optional<Tree<T>>  findSubtree(T key);
        public String             prettyPrint();
        @Override public boolean  equals(Object other);
    }

Tree no longer is-a Map. It has-a Map. All public methods are tree-shaped, with 
names that mean what users expect:

  * nodeCount() for total count
  * hasChild / contains for top-level vs recursive search
  * childAt / findSubtree for direct vs recursive lookup
  * getObjectsAtDepth(d) is 0-based
  * equals/hashCode are structural recursive


Breaking changes
===============

Compile-time break for Java callers that treat Tree as a Map (assignment to 
Map<?,?>, instanceof Map/HashMap checks, calls to Map-only methods).

A few Gremlin pipeline patterns that work today as side effects of Tree being a 
Map like select(keys), count(local), unfold() no longer compose. Tree() becomes 
a terminal-style step, like subgraph() where users process the result 
client-side after .next(), or reshape the upstream traversal.

Performance
===========
Basically, unchanged. Memory per node (~48 B), child-lookup cost (O(1)), merge 
cost (O(distinct keys)), iteration order, andwire-format encoding are all the 
same as today. The change is purely in the public API surface.


Future extensions
=================

Additive, non-breaking follow-ups, landing in a separate PR as demand justifies:

  * findAllSubtrees(T), subtreeAtPath(List<T>), streamLevelOrder()
  * map, filter, prune
  * intersect / difference / merge
  * implements Iterable<T> (for-each over node values)

Thanks,
Guian

Reply via email to