Another change that I propose is fixing how count(local) behaves.

Problem:
When Tree extended HashMap, count(local) hit the Map branch in CountLocalStep 
and returned the number of root entries. Now that Tree isn't a Map, it no 
longer matches any of the step's type checks (Collection/Map/Path) and falls 
through to the catch-all, which treats the whole tree as a single object and 
returns 1. That's not a meaningful count of anything, so the current behavior 
is effectively a bug.


Proposal:
Bind count(local) on a Tree to nodeCount() (total nodes in the tree).
The rationale is consistency with what count(local) already does: it counts the 
contents of the local object in hand. For a Path it counts the path's objects, 
e.g.

  g.V(1).out().out().path().count(local)   // => 3  (marko, josh, ripple) per 
path

By the same principle, the contents of a Tree are its nodes, so count(local) 
returning the total node count is the natural, consistent meaning and it gives 
users a useful number instead of 1.

Impact:
This is an observable semantic change to tree().count(local) (and the 
tree('a')...count(local) side-effect step).

Please let me know if you have any thoughts on this.

Thanks,
Guian



________________________________
From: Guian Gumpac via dev <[email protected]>
Sent: June 4, 2026 4:21 PM
To: Cole Greer via dev <[email protected]>
Cc: Guian Gumpac <[email protected]>
Subject: Re: [DISCUSS] Make Tree no longer extend HashMap

A heads-up on the IO/serialization impact of this change, and how I'm proposing 
we handle it for now.

Because Tree no longer extends HashMap, its serialized form changes across the 
IO formats:

  - Gryo: previously Tree was serialized implicitly via Kryo's MapSerializer 
(it picked that up by virtue of being a Map). With Tree as a plain type, that 
no longer applies, so I've
  added an explicit Tree serializer. The on-wire type and framing in Gryo are 
therefore different from before that the old Gryo-encoded Tree data will not be 
readable by the new serializer.
  - GraphSON V1-3 and V4: the JSON structure is unchanged; only the internal 
mechanism for reading/writing the entries was adjusted to use the new 
tree-shaped API. No format change
  there.

I want to be upfront that this is a breaking change for Gryo IO. My proposal is 
to land it now to unblock the Tree work and revisit the serialization story 
before the TP4 release rather than try to preserve byte compatibility with the 
3.x Gryo Tree format, which isn't a goal across a major version anyway.

When we revisit for TP4, a few things are worth putting on the table:

  - Gryo: TP4 likely warrants a major version bump of Kryo itself, which would 
completely break the old Gryo binary format regardless of this Tree change. 
Given that, it may make sense to drop GryoV1-3 entirely and replace it with a 
different/refreshed format rather than carry the legacy registrations forward.
  - GraphSON: similarly, we may want to remove GraphSON V1-3 and standardize on 
V4, with a file converter to migrate existing GraphSON V1-3 documents to V4 so 
users have a clean upgrade path.

Thanks,
Guian

________________________________
From: Guian Gumpac <[email protected]>
Sent: May 29, 2026 3:24 PM
To: Cole Greer via dev <[email protected]>
Subject: [DISCUSS] Make Tree no longer extend HashMap

Hi all,

I propose changing Tree in the server to no longer extend HashMap as it could 
have functions that don't align with Tree behaviour.

Problem
=======

Tree<T> extends HashMap<T, Tree<T>>, and as a result several Tree operations 
behave like Map operations rather than tree operations users expect:

  * tree.size() returns top-level entries, not total nodes
  * tree.containsKey(x) only checks immediate children
  * tree.isLeaf() throws NoSuchElementException on an empty tree
  * getObjectsAtDepth(0) returns an empty list rather than the roots


Proposed change
===============

Replace Tree<T> extends HashMap<T, Tree<T>> with a final class that contains a 
HashMap and exposes only tree-shaped methods:

    public final class Tree<T> implements Serializable {
        private final Map<T, Tree<T>> children = new HashMap<>();

        public Set<T>             rootNodes();
        public Tree<T>            childAt(T key);
        public boolean            isLeaf();
        public int                nodeCount();
        public Optional<Tree<T>>  findSubtree(T key);
        public String             prettyPrint();
        @Override public boolean  equals(Object other);
    }

Tree no longer is-a Map. It has-a Map. All public methods are tree-shaped, with 
names that mean what users expect:

  * nodeCount() for total count
  * hasChild / contains for top-level vs recursive search
  * childAt / findSubtree for direct vs recursive lookup
  * getObjectsAtDepth(d) is 0-based
  * equals/hashCode are structural recursive


Breaking changes
===============

Compile-time break for Java callers that treat Tree as a Map (assignment to 
Map<?,?>, instanceof Map/HashMap checks, calls to Map-only methods).

A few Gremlin pipeline patterns that work today as side effects of Tree being a 
Map like select(keys), count(local), unfold() no longer compose. Tree() becomes 
a terminal-style step, like subgraph() where users process the result 
client-side after .next(), or reshape the upstream traversal.

Performance
===========
Basically, unchanged. Memory per node (~48 B), child-lookup cost (O(1)), merge 
cost (O(distinct keys)), iteration order, andwire-format encoding are all the 
same as today. The change is purely in the public API surface.


Future extensions
=================

Additive, non-breaking follow-ups, landing in a separate PR as demand justifies:

  * findAllSubtrees(T), subtreeAtPath(List<T>), streamLevelOrder()
  * map, filter, prune
  * intersect / difference / merge
  * implements Iterable<T> (for-each over node values)

Thanks,
Guian

Reply via email to