Github user spmallette commented on a diff in the pull request: https://github.com/apache/tinkerpop/pull/892#discussion_r202651403 --- Diff: gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/io/graphml/GraphMLWriter.java --- @@ -242,7 +257,31 @@ private void writeTypes(final Map<String, String> identifiedVertexKeyTypes, private void writeEdges(final XMLStreamWriter writer, final Graph graph) throws XMLStreamException { if (normalize) { final List<Edge> edges = IteratorUtils.list(graph.edges()); - Collections.sort(edges, Comparators.ELEMENT_COMPARATOR); + Collections.sort(edges, new Comparator<Edge>() { --- End diff -- Your algorithm for sorting leaves me with some concerns. It assumes that a "string of numbers" will fit into the integer space. I suppose that you could use `Long` instead, but what if the "string of numbers" is `39098403984093284023984032840329840389430984` - you'd still blow the space. Maybe I'm not following the code properly but if you had keys like this: ```text gremlin> x = ["a100","100a","1a1"] ==>a100 ==>100a ==>1a1 ``` and you sorted them with existing sorting you get: ```text gremlin> Collections.sort(x, Comparator.comparing({ it.toString() }, String.CASE_INSENSITIVE_ORDER)) gremlin> x ==>100a ==>1a1 ==>a100 ``` with your approach I get: ```text gremlin> extractInt = { s -> ......1> num = s.replaceAll("\\D", ""); ......2> return num.isEmpty() ? 0 : Integer.parseInt(num); ......3> } gremlin> x = ["a100","100a","1a1"] ==>a100 ==>100a ==>1a1 gremlin> x.sort{a,b -> ......1> o1fullString = a.toString(); ......2> o2fullString = b.toString(); ......3> ......3> o1StringPart = o1fullString.replaceAll("\\d", ""); ......4> o2StringPart = o2fullString.replaceAll("\\d", ""); ......5> ......5> if(o1StringPart.equalsIgnoreCase(o2StringPart)) { ......6> return extractInt(o1fullString) - extractInt(o2fullString); ......7> } ......8> return o1fullString.compareTo(o2fullString); ......9> } ==>1a1 ==>a100 ==>100a ``` Is that what you expected to have happen? With IDs you really can't assume too much because TinkerPop is not ID aware. The underlying graph implementation is the only thing that really knows anything about what the type of format of the ID is.
---