[
https://issues.apache.org/jira/browse/TINKERPOP-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marko A. Rodriguez updated TINKERPOP-1343:
------------------------------------------
Fix Version/s: 3.3.0
> A more efficient StarGraph serialization representation.
> --------------------------------------------------------
>
> Key: TINKERPOP-1343
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1343
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.2.0-incubating
> Reporter: Marko A. Rodriguez
> Labels: breaking
> Fix For: 3.3.0
>
>
> {{StarGraph}} is used by the Hadoop {{GraphComputers}} and represents a
> vertex, its properties, its incident edges, and their properties. In essence,
> it is one "row of an adjacency list."
> Here are some ideas on how to make the next version of the serialization
> format more efficient.
> 1. For all Element ids, we currently use {{kryo.readClassAndObject(...)}}.
> This is wasteful because the class has to be written with every id. It would
> be better if the {{StarGraph}} had metadata like {{vertexIdClass}},
> {{vertexPropertyIdClass}}, and {{edgeIdClass}}. Every vertex would then
> serialize three classes, but the benefit is that every id class is known up
> front and we can use {{kryo.readObject(..., xxxIdClass)}}.
> 2. Edges and VertexProperties are written out as {{[ edgeLabel[ edge[ id,
> otherVertexId]\*]\*}} and {{[ propertyKey[ vertexProperty[
> id,propertyValue]\*]\*}}, respectively. This ensures we don't write so many
> strings as all edges/vertex properties are grouped by label. However, we do
> NOT do this for edge properties or vertex property properties. We simply
> write out the {{Map<Object,Map<String,Object>>}} which is
> {{Map<EdgeId,Map<PropertyKey,PropertyValue>>}}. Since we have to choose
> between grouping by edgeId or by propertyKey, we should keep it as it is, but
> create a "meta map" that allows us to represent all property keys in a compact
> space, e.g. {{int}}. Thus, {{Map<EdgeId,Map<PropertyKeyIntegerId,PropertyValue>>}}
> where we also have a {{Map<PropertyKeyIntegerId,String>}} that is serialized
> with the {{StarGraph}}.
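A minimal sketch of the "meta map" from idea 2, using only the JDK and no Kryo. The class PropertyKeyDictionary and its methods intern, metaMap, encode, and decode are hypothetical names chosen for illustration and are not part of TinkerPop. The sketch only shows the shape of the data: each distinct property key string is stored once, and the per-edge maps carry small ints instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of TinkerPop: maps each distinct property key
// to a small int so that the key string is stored once per StarGraph.
final class PropertyKeyDictionary {
    private final Map<String, Integer> keyToId = new HashMap<>();
    private final List<String> idToKey = new ArrayList<>();

    // Returns the existing int id for a key, or assigns the next free one.
    int intern(String key) {
        Integer id = keyToId.get(key);
        if (id == null) {
            id = idToKey.size();
            keyToId.put(key, id);
            idToKey.add(key);
        }
        return id;
    }

    // The Map<PropertyKeyIntegerId,String> that would travel with the StarGraph.
    Map<Integer, String> metaMap() {
        Map<Integer, String> meta = new LinkedHashMap<>();
        for (int i = 0; i < idToKey.size(); i++) {
            meta.put(i, idToKey.get(i));
        }
        return meta;
    }

    // Map<EdgeId,Map<PropertyKey,PropertyValue>> ->
    // Map<EdgeId,Map<PropertyKeyIntegerId,PropertyValue>>
    Map<Object, Map<Integer, Object>> encode(Map<Object, Map<String, Object>> edgeProperties) {
        Map<Object, Map<Integer, Object>> out = new LinkedHashMap<>();
        for (Map.Entry<Object, Map<String, Object>> edge : edgeProperties.entrySet()) {
            Map<Integer, Object> compact = new LinkedHashMap<>();
            for (Map.Entry<String, Object> property : edge.getValue().entrySet()) {
                compact.put(intern(property.getKey()), property.getValue());
            }
            out.put(edge.getKey(), compact);
        }
        return out;
    }

    // Inverse of encode, given the meta map read back from the stream.
    static Map<Object, Map<String, Object>> decode(Map<Object, Map<Integer, Object>> encoded,
                                                   Map<Integer, String> meta) {
        Map<Object, Map<String, Object>> out = new LinkedHashMap<>();
        for (Map.Entry<Object, Map<Integer, Object>> edge : encoded.entrySet()) {
            Map<String, Object> expanded = new LinkedHashMap<>();
            for (Map.Entry<Integer, Object> property : edge.getValue().entrySet()) {
                expanded.put(meta.get(property.getKey()), property.getValue());
            }
            out.put(edge.getKey(), expanded);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("weight", 0.5);
        first.put("since", 2010);
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("weight", 1.0);
        Map<Object, Map<String, Object>> edgeProperties = new LinkedHashMap<>();
        edgeProperties.put(7L, first);
        edgeProperties.put(8L, second);

        PropertyKeyDictionary dictionary = new PropertyKeyDictionary();
        System.out.println(dictionary.encode(edgeProperties));
        System.out.println(dictionary.metaMap());
    }
}
```

The grouping stays by edge id, as the description suggests. Only the key representation changes, so the saving grows with the number of edges that share the same property keys.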
> There are a few other tickets around optimizing {{StarGraph}} here:
> https://issues.apache.org/jira/browse/TINKERPOP-1128 (making {{GraphFilters}}
> more efficient)
> https://issues.apache.org/jira/browse/TINKERPOP-1122 (pointless bits and
> {{StarGraph}} should never auto-generate IDs as the ID space is distributed).
> https://issues.apache.org/jira/browse/TINKERPOP-1287 (related to heap usage
> and clock cycles -- not serialization).
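Idea 1 above (declare the id class once instead of tagging every id) can be illustrated without Kryo. The sketch below uses java.io.DataOutputStream as a stand-in and assumes all ids are Long. The class IdClassHeaderSketch and its methods are hypothetical. Kryo's actual wire format differs (registered classes, for example, are written as compact ids rather than names), so the byte counts here only illustrate the trade-off and are not a measurement of StarGraph.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not TinkerPop or Kryo code.
final class IdClassHeaderSketch {

    // Per-id tagging: the type name travels with every id, in the spirit of
    // writing "class and object" for each element.
    static byte[] writeTaggedIds(List<Long> ids) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(ids.size());
            for (Long id : ids) {
                out.writeUTF(id.getClass().getName());
                out.writeLong(id);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Header form: the id class is written once (the "edgeIdClass" metadata),
    // then the ids follow with no per-id type information.
    static byte[] writeWithHeader(List<Long> ids) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(Long.class.getName());
            out.writeInt(ids.size());
            for (Long id : ids) {
                out.writeLong(id);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the header form back; the declared class tells the reader how to
    // decode every id that follows.
    static List<Long> readWithHeader(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String idClass = in.readUTF();
            if (!Long.class.getName().equals(idClass)) {
                throw new IllegalStateException("This sketch only handles Long ids, got " + idClass);
            }
            int count = in.readInt();
            List<Long> ids = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                ids.add(in.readLong());
            }
            return ids;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(1L, 2L, 3L);
        System.out.println("tagged bytes: " + writeTaggedIds(ids).length);
        System.out.println("header bytes: " + writeWithHeader(ids).length);
        System.out.println("round trip:   " + readWithHeader(writeWithHeader(ids)));
    }
}
```

The header costs a fixed amount per vertex, which matches the trade-off noted in the description: three classes are written per StarGraph, but nothing per id.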