[ 
https://issues.apache.org/jira/browse/TINKERPOP-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Mallette closed TINKERPOP-1346.
---------------------------------------
    Resolution: Won't Do

Adding a reference to this DISCUSS thread:

https://lists.apache.org/thread.html/rc68d0bf3d6530f14d328fc5f2d5ec141a7e50aac67b2920743612526%40%3Cdev.tinkerpop.apache.org%3E

which sets aside the idea of doing Gryo 4.0, since we no longer use Gryo 
for network serialization. This issue is arguably about a different type of 
usage, but to avoid confusion I'm going to close it for now. 

> Gryo 4.0
> --------
>
>                 Key: TINKERPOP-1346
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1346
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: io, structure
>    Affects Versions: 3.2.0-incubating
>            Reporter: Marko A. Rodriguez
>            Priority: Major
>              Labels: breaking
>
> *Reference*
> Right now, to send a {{ReferenceEdge}} message, we serialize the form as:
> {code:java}
> KryoClassInteger[ReferenceEdge] + KryoClassObject[Edge ID] + 
> KryoClassInteger[ReferenceVertex] + KryoClassObject[Vertex ID] + 
> KryoClassInteger[ReferenceVertex] + KryoClassObject[Vertex ID]
> {code}
> Assuming {{Long}} Element ids, the math says:
> {code:java}
> 48 bytes = 4 bytes + (4 bytes + 8 bytes [long]) + 4 bytes + (4 bytes + 8 
> bytes [long]) + 4 bytes + (4 bytes + 8 bytes [long])
> {code}
> We could get this smaller by not relying on Kryo's {{FieldSerializer}}.
> {code:java}
> KryoClassInteger[ReferenceEdge] + KryoClassInteger[VertexIDClass] + 
> KryoClassObject[Edge ID] + KryoObject[Vertex ID] + KryoObject[Vertex ID]
> {code}
> The math says:
> {code:java}
> 36 bytes = 4 bytes + 4 bytes + (4 bytes + 8 bytes [long]) + 8 bytes [long] + 
> 8 bytes [long]
> {code}
> Similar techniques would apply to {{ReferenceVertexProperty}} and 
> {{ReferenceProperty}}.
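
The byte arithmetic quoted above can be reproduced with a small sketch. It
assumes, as the issue does, that every class tag costs a fixed 4 bytes and
every id is an 8-byte long. The class name, the tag constants, and the use of
ByteBuffer in place of Kryo are all hypothetical stand-ins; real Kryo output
may differ, since it can use variable-length encodings.

```java
import java.nio.ByteBuffer;

class ReferenceEdgeSizeSketch {
    // Hypothetical class tags, written as fixed 4-byte ints to match the issue's accounting.
    static final int REFERENCE_EDGE = 1;
    static final int REFERENCE_VERTEX = 2;
    static final int LONG_CLASS = 3;

    // Current layout: each element and each id carries its own class tag.
    static int currentSize(long edgeId, long outVertexId, long inVertexId) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.putInt(REFERENCE_EDGE);
        buf.putInt(LONG_CLASS).putLong(edgeId);
        buf.putInt(REFERENCE_VERTEX);
        buf.putInt(LONG_CLASS).putLong(outVertexId);
        buf.putInt(REFERENCE_VERTEX);
        buf.putInt(LONG_CLASS).putLong(inVertexId);
        return buf.position();
    }

    // Proposed layout: the vertex id class is written once, then both vertex ids follow untagged.
    static int proposedSize(long edgeId, long outVertexId, long inVertexId) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.putInt(REFERENCE_EDGE);
        buf.putInt(LONG_CLASS);                  // vertex id class, written once
        buf.putInt(LONG_CLASS).putLong(edgeId);  // edge id is still class + object
        buf.putLong(outVertexId);
        buf.putLong(inVertexId);
        return buf.position();
    }

    public static void main(String[] args) {
        System.out.println(currentSize(1L, 2L, 3L));   // prints 48
        System.out.println(proposedSize(1L, 2L, 3L));  // prints 36
    }
}
```

The 12-byte saving comes from dropping the two ReferenceVertex tags and the
two per-vertex id class tags, at the cost of one shared vertex id class tag.
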
> *StarGraph*
> Right now we serialize first the vertex, then its edges, then its properties. 
> We should do vertex, properties, edges. Why? If we know that the vertex is to 
> be filtered (which is an analysis of its label/id/properties), then we can 
> skip over analyzing its edges. Right now, we may do all this work 
> deserializing edges only to realize that the GraphFilter says that the vertex 
> is filtered. Dah, pointless clock cycles – especially when edge sets can be 
> massive.
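
A minimal sketch of the reordering argued for above, not the actual StarGraph
serializer. It assumes a toy record with a long vertex id, a single long
property, and a list of neighbor ids standing in for edges. The byte-length
prefix on the edge section is an assumption added here so the reader can skip
the edges without decoding them; the issue itself does not specify how the
skip would be done. All names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.function.LongPredicate;

class StarVertexOrderSketch {
    // Counts how many edges were actually decoded, to make the saving visible.
    static int edgesDecoded = 0;

    // Write order: vertex id, property, then the edge section with a length prefix.
    static ByteBuffer write(long vertexId, long property, long[] neighborIds) {
        ByteBuffer buf = ByteBuffer.allocate(8 + 8 + 4 + 8 * neighborIds.length);
        buf.putLong(vertexId);
        buf.putLong(property);
        buf.putInt(8 * neighborIds.length); // byte length of the edge section (assumption)
        for (long neighborId : neighborIds) {
            buf.putLong(neighborId);
        }
        buf.flip();
        return buf;
    }

    // Returns the neighbor ids, or null if the filter rejects the vertex.
    static long[] read(ByteBuffer buf, LongPredicate keepByProperty) {
        buf.getLong();                        // vertex id (unused in this sketch)
        long property = buf.getLong();
        int edgeBytes = buf.getInt();
        if (!keepByProperty.test(property)) {
            // Vertex is filtered: jump over the edge section without decoding it.
            buf.position(buf.position() + edgeBytes);
            return null;
        }
        long[] neighborIds = new long[edgeBytes / 8];
        for (int i = 0; i < neighborIds.length; i++) {
            neighborIds[i] = buf.getLong();
            edgesDecoded++;
        }
        return neighborIds;
    }

    public static void main(String[] args) {
        ByteBuffer buf = write(1L, 29L, new long[] {2L, 3L, 4L});
        long[] result = read(buf, property -> property > 30);
        System.out.println(result == null);   // prints true
        System.out.println(edgesDecoded);     // prints 0
    }
}
```

With the edges written before the properties, the reader would have to decode
all of them before it could evaluate the filter.
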
> {{StarGraph}} is used by the Hadoop {{GraphComputers}} and represents a 
> vertex, its properties, its incident edges, and their properties. In essence, 
> one "row of an adjacency list."
> Here are some ideas on how to make the next version of the serialization 
> format more efficient.
> 1. For all Element ids, we currently use {{kryo.readClassAndObject(...)}}. 
> This is bad because we have to write the class with each id. It would be 
> better if the {{StarGraph}} had metadata like {{vertexIdClass}}, 
> {{vertexPropertyIdClass}}, and {{edgeIdClass}}. Now for every vertex we are 
> serializing three classes, but the benefit is that every id class is now known 
> and we can use {{kryo.readObject(..., xxxIdClass)}}.
> 2. Edges and VertexProperties are written out as {{[ edgeLabel[ edge[ id, 
> otherVertexId]*]*}} and {{[ propertyKey[ vertexProperty[ 
> id,propertyValue]*]*}}, respectively. This ensures we don't write so many 
> strings as all edges/vertex properties are grouped by label. However, we do 
> NOT do this for edge properties nor vertex property properties. We simply 
> write out the {{Map<Object,Map<String,Object>>}} which is 
> {{Map<EdgeId,Map<PropertyKey,PropertyValue>>}}. Since we have to choose 
> between grouping by edgeId or by propertyKey, we should keep it as it is, but 
> create a "meta map" that allows us to represent all property keys in a, e.g., 
> {{int}} space. Thus, {{Map<EdgeId,Map<PropertyKeyIntegerId,PropertyValue>>}} 
> where we also have a {{Map<PropertyKeyIntegerId,String>}} that is serialized 
> with the {{StarGraph}}.
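
The "meta map" from item 2 can be sketched as a simple key table. This is an
illustration under assumptions, not TinkerPop code: the class and method names
are hypothetical, and plain java.util maps stand in for whatever structures
StarGraph would actually serialize. Each distinct property key string is
stored once in the table, and the per-edge maps carry only int ids.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class PropertyKeyTableSketch {
    private final Map<String, Integer> idsByKey = new HashMap<>();
    // The Map<PropertyKeyIntegerId,String> that would be serialized once per StarGraph.
    final Map<Integer, String> keysById = new LinkedHashMap<>();

    // Assigns the next int id to a key the first time it is seen.
    private int intern(String key) {
        Integer id = idsByKey.get(key);
        if (id == null) {
            id = idsByKey.size();
            idsByKey.put(key, id);
            keysById.put(id, key);
        }
        return id;
    }

    // Map<EdgeId,Map<PropertyKey,PropertyValue>> to Map<EdgeId,Map<PropertyKeyIntegerId,PropertyValue>>.
    Map<Object, Map<Integer, Object>> encode(Map<Object, Map<String, Object>> edgeProperties) {
        Map<Object, Map<Integer, Object>> encoded = new LinkedHashMap<>();
        for (Map.Entry<Object, Map<String, Object>> edge : edgeProperties.entrySet()) {
            Map<Integer, Object> byIntKey = new LinkedHashMap<>();
            for (Map.Entry<String, Object> property : edge.getValue().entrySet()) {
                byIntKey.put(intern(property.getKey()), property.getValue());
            }
            encoded.put(edge.getKey(), byIntKey);
        }
        return encoded;
    }

    // Reverses encode() using the key table.
    Map<Object, Map<String, Object>> decode(Map<Object, Map<Integer, Object>> encoded) {
        Map<Object, Map<String, Object>> decoded = new LinkedHashMap<>();
        for (Map.Entry<Object, Map<Integer, Object>> edge : encoded.entrySet()) {
            Map<String, Object> byStringKey = new LinkedHashMap<>();
            for (Map.Entry<Integer, Object> property : edge.getValue().entrySet()) {
                byStringKey.put(keysById.get(property.getKey()), property.getValue());
            }
            decoded.put(edge.getKey(), byStringKey);
        }
        return decoded;
    }
}
```

The grouping by edge id stays as it is; only the repeated key strings are
replaced, so the saving grows with the number of edges that share keys.
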
> StarGraph also has a Long identifier. This makes no sense, because each 
> StarGraph in the full Graph will then have similar ids! Moreover, what is 
> referencing what when the adjacent vertices are just arbitrary long ids? We 
> should require that StarGraph be provided with ids for vertices (and perhaps 
> edges). That way we ensure no inconsistencies and we save 64 bits.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
