Repository: incubator-s2graph
Updated Branches:
  refs/heads/master e674a2550 -> da2209dfc

fix broken markdown on


Branch: refs/heads/master
Commit: 13dab26c1ab736e805ffefb7e6397c7a9f495faa
Parents: e674a25
Author: DO YUNG YOON <>
Authored: Wed May 2 12:59:33 2018 +0900
Committer: DO YUNG YOON <>
Committed: Wed May 2 12:59:33 2018 +0900

---------------------------------------------------------------------- | 882 +++++++++++++++++++++++++++++++++------------------------
 1 file changed, 506 insertions(+), 376 deletions(-)
diff --git a/ b/
index ba7e81c..93338eb 100644
--- a/
+++ b/
@@ -85,379 +85,509 @@ The ~~`loader` and `spark`~~ projects are deprecated by 
 Note that, the OLAP-style workloads are under development and we are planning 
to provide documentations in the upcoming releases.  
-Your First Graph  
-Once the S2Graph server has been set up, you can now start to send HTTP 
queries to the server to create a graph and pour some data in it. This tutorial 
goes over a simple toy problem to get a sense of how S2Graph's API looks like. 
[`bin/`](bin/ contains the example code below.  
-The toy problem is to create a timeline feature for a simple social media, 
like a simplified version of Facebook's timeline:stuck_out_tongue_winking_eye:. 
Using simple S2Graph queries it is possible to keep track of each user's 
friends and their posts.  
-1. First, we need a name for the new service.  
-  The following POST query will create a service named "KakaoFavorites".  
- ``` curl -XPOST localhost:9000/graphs/createService -H 'Content-Type: 
Application/json' -d ' {"serviceName": "KakaoFavorites", "compressionAlgorithm" 
: "gz"} ' ```  
-  To make sure the service is created correctly, check out the following.  
- ``` curl -XGET localhost:9000/graphs/getService/KakaoFavorites ```  
-2. Next, we will need some friends.  
-  In S2Graph, relationships are organized as labels. Create a label called 
`friends` using the following `createLabel` API call:  
- ``` curl -XPOST localhost:9000/graphs/createLabel -H 'Content-Type: 
Application/json' -d ' { "label": "friends", "srcServiceName": 
"KakaoFavorites", "srcColumnName": "userName", "srcColumnType": "string", 
"tgtServiceName": "KakaoFavorites", "tgtColumnName": "userName", 
"tgtColumnType": "string", "isDirected": "false", "indices": [], "props": [], 
"consistencyLevel": "strong" } ' ```  
-  Check if the label has been created correctly:+  
- ``` curl -XGET localhost:9000/graphs/getLabel/friends ```  
-  Now that the label `friends` is ready, we can store the friendship data. 
Entries of a label are called edges, and you can add edges with `edges/insert` 
- ``` curl -XPOST localhost:9000/graphs/edges/insert -H 'Content-Type: 
Application/json' -d ' [ {"from":"Elmo","to":"Big 
Monster","to":"Oscar","label":"friends","props":{},"timestamp":1444360152482} ] 
' ```  
-  Query friends of Elmo with `getEdges` API:  
- ``` curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: 
Application/json' -d ' { "srcVertices": [{"serviceName": "KakaoFavorites", 
"columnName": "userName", "id":"Elmo"}], "steps": [ {"step": [{"label": 
"friends", "direction": "out", "offset": 0, "limit": 10}]} ] } ' ```  
-  Now query friends of Cookie Monster:  
- ``` curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: 
Application/json' -d ' { "srcVertices": [{"serviceName": "KakaoFavorites", 
"columnName": "userName", "id":"Cookie Monster"}], "steps": [ {"step": 
[{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]} ] } ' ``` 
-3. Users of Kakao Favorites will be able to post URLs of their favorite 
-  We will need a new label ```post``` for this data:  
- ``` curl -XPOST localhost:9000/graphs/createLabel -H 'Content-Type: 
Application/json' -d ' { "label": "post", "srcServiceName": "KakaoFavorites", 
"srcColumnName": "userName", "srcColumnType": "string", "tgtServiceName": 
"KakaoFavorites", "tgtColumnName": "url", "tgtColumnType": "string", 
"isDirected": "true", "indices": [], "props": [], "consistencyLevel": "strong" 
} ' ```  
-  Now, insert some posts of the users:  
- ``` curl -XPOST localhost:9000/graphs/edges/insert -H 'Content-Type: 
Application/json' -d ' [ {"from":"Big 
 ] ' ```  
-  Query posts of Big Bird:  
- ``` curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: 
Application/json' -d ' { "srcVertices": [{"serviceName": "KakaoFavorites", 
"columnName": "userName", "id":"Big Bird"}], "steps": [ {"step": [{"label": 
"post", "direction": "out", "offset": 0, "limit": 10}]} ] } ' ```  
-4. So far, we have designed a label schema for the labels `friends` and 
`post`, and stored some edges to them.+  
-  This should be enough for creating the timeline feature! The following 
two-step query will return the URLs for Elmo's timeline, which are the posts of 
Elmo's friends:  
- ``` curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: 
Application/json' -d ' { "srcVertices": [{"serviceName": "KakaoFavorites", 
"columnName": "userName", "id":"Elmo"}], "steps": [ {"step": [{"label": 
"friends", "direction": "out", "offset": 0, "limit": 10}]}, {"step": [{"label": 
"post", "direction": "out", "offset": 0, "limit": 10}]} ] } ' ```  
-  Also try Cookie Monster's timeline:  
- ``` curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: 
Application/json' -d ' { "srcVertices": [{"serviceName": "KakaoFavorites", 
"columnName": "userName", "id":"Cookie Monster"}], "steps": [ {"step": 
[{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]}, {"step": 
[{"label": "post", "direction": "out", "offset": 0, "limit": 10}]} ] } ' ```  
-The example above is by no means a full blown social network timeline, but it 
gives you an idea of how to represent, store and query graph data with 
-TinkerPop Support  
-Since version 0.2.0-incubating, S2Graph integrate natively with `Apache 
TinkerPop 3.2.5`.  
-S2Graph passes `Apache TinkerPop`'s `StructureStandardSuite` and 
`ProcessStandardSuite` test suites.  
-### Graph Features **not** implemented.  
-- Computer  
-- Transactions  
-- ThreadedTransactions  
-### Vertex Features **not** implemented.  
-- MultiProperties  
-- MetaProperties  
-- UuidIds  
-- AnyIds   
-- NumericIds  
-- StringIds  
-### Edge Features **not** implemented.  
-- UuidIds  
-- AnyIds   
-- NumericIds  
-- StringIds  
-### Vertex property features **not** implemented.  
-- UuidIds  
-- AnyIds   
-- NumericIds  
-- StringIds  
-- MapValues  
-- MixedListValues  
-- BooleanArrayValues  
-- ByteArrayValues  
-- DoubleArrayValues  
-- FloatArrayValues  
-- IntegerArrayValues    
-- StringArrayValues  
-- LongArrayValues  
-- SerializableValues  
-- UniformListValues  
-### Edge property feature **not** implemented.  
-- MapValues  
-- MixedListValues  
-- BooleanArrayValues  
-- ByteArrayValues  
-- DoubleArrayValues  
-- FloatArrayValues  
-- IntegerArrayValues    
-- StringArrayValues  
-- LongArrayValues  
-- SerializableValues  
-- UniformListValues  
->NOTE: This is an ongoing task.  
-## Getting Started  
-### Maven coordinates  
- <groupId>org.apache.s2graph</groupId> <artifactId>s2core_2.11</artifactId> 
-### Start  
-S2Graph is a singleton that can be shared among multiple threads. You 
instantiate S2Graph using the standard TinkerPop static constructors.  
-- Graph g = Configuration configuration)  
-Some important properties for configuration.  
-#### HBase for data storage. ```  
-#### RDBMS for meta storage.  
-### Gremlin Console  
-#### 1. install plugin  
-On gremlin console, it is possible to install s2graph as follow.  
-:install org.apache.s2graph s2graph-gremlin 0.2.0  
-:plugin use tinkerpop.s2graph  
-Example run.  
-shonui-MacBook-Pro:apache-tinkerpop-gremlin-console-3.2.5 shon$ bin/ 
- \,,,/ (o o)-----oOOo-(3)-oOOo-----  
-plugin activated: tinkerpop.server  
-plugin activated: tinkerpop.utilities  
-plugin activated: tinkerpop.tinkergraph  
-gremlin> :install org.apache.s2graph s2graph-gremlin 0.2.0  
-==>Loaded: [org.apache.s2graph, s2graph-gremlin, 0.2.0] - restart the console 
to use [tinkerpop.s2graph]  
-gremlin> :plugin use tinkerpop.s2graph  
-==>tinkerpop.s2graph activated  
-gremlin> :plugin list  
-Once `s2graph-gremlin` plugin is acvive, then following example will generate 
tinkerpop's modern graph in s2graph.  
-Taken from 
-![Modern Graph from 
-### tp3 modern graph(simple).  
-conf = new BaseConfiguration()  
-graph =  
-// init system default schema  
-// init extra schema for tp3 modern graph.  
-// load modern graph into current graph instance.  
-// traversal  
-t = graph.traversal()  
-// show all vertices in this graph.  
-// show all edges in this graph.  
-// add two vertices.  
-shon = graph.addVertex(, 10, T.label, "person", "name", "shon", "age", 35) 
-s2graph = graph.addVertex(, 11, T.label, "software", "name", "s2graph", 
"lang", "scala")  
-// add one edge between two vertices.  
-created = shon.addEdge("created", s2graph, "_timestamp", 10, "weight", 0.1)  
-// check if new edge is available through traversal  
-t.V().has("name", "shon").out()  
-// shutdown  
-Note that simple version used default schema for `Service`, `Column`, `Label` 
for compatibility.  
-Please checkout advanced example below to understand what data model is 
available on S2Graph.  
-### tp3 modern graph(advanced).  
-It is possible to separate multiple namespaces into logical spaces.  
-S2Graph achieve this by following data model. details can be found on  
-1. Service: the top level abstraction   
-A convenient logical grouping of related entities  
-Similar to the database abstraction that most relational databases support.    
-2. Column: belongs to a service.  
-A set of homogeneous vertices such as users, news articles or tags.   
-Every vertex has a user-provided unique ID that allows the efficient lookup.   
-A service typically contains multiple columns.  
-3. Label: schema for edge  
-A set of homogeneous edges such as friendships, views, or clicks.   
-Relation between two columns as well as a recursive association within one 
-The two columns connected with a label may not necessarily be in the same 
service, allowing us to store and query data that spans over multiple services. 
-Instead of convert user provided Id into internal unique numeric Id, S2Graph 
simply composite service and column metadata with user provided Id to guarantee 
global unique Id.  
-Following is simple example to exploit these data model in s2graph.            
-// init graph  
-graph = BaseConfiguration())  
-// 0. import necessary methods for schema management.  
-import static org.apache.s2graph.core.Management.*  
-// 1. initialize dbsession for management which store schema into RDBMS.  
-session = graph.dbSession()  
-// 2. properties for new service "s2graph".  
-serviceName = "s2graph"  
-cluster = "localhost"  
-hTableName = "s2graph"  
-preSplitSize = 0  
-hTableTTL = -1  
-compressionAlgorithm = "gz"  
-// 3. actual creation of s2graph service.  
-// details can be found on  
-service =, cluster, hTableName, 
preSplitSize, hTableTTL, compressionAlgorithm)  
-// 4. properties for user vertex schema belongs to s2graph service.  
-columnName = "user"  
-columnType = "integer"  
-// each property consist of (name: String, defaultValue: String, dataType: 
-// defailts can be found on  
-props = [newProp("name", "-", "string"), newProp("age", "-1", "integer")]  
-schemaVersion = "v3"  
-user =, columnName, 
columnType, props, schemaVersion)  
-// 2.1 (optional) global vertex index."global_vertex_index", ["name", 
-// 3. create VertexId  
-// create S2Graph's VertexId class.  
-v1Id = graph.newVertexId(serviceName, columnName, 20)  
-v2Id = graph.newVertexId(serviceName, columnName, 30)  
-shon = graph.addVertex(, v1Id, "name", "shon", "age", 35)  
-dun = graph.addVertex(, v2Id, "name", "dun", "age", 36)  
-// 4. friends label  
-labelName = "friend_"  
-srcColumn = user  
-tgtColumn = user  
-isDirected = true  
-indices = []  
-props = [newProp("since", "-", "string")]  
-consistencyLevel = "strong"  
-hTableName = "s2graph"  
-hTableTTL = -1  
-options = null  
-friend =, srcColumn, tgtColumn,  
- isDirected, serviceName, indices, props, consistencyLevel, hTableName, 
hTableTTL, schemaVersion, compressionAlgorithm, options)  
-shon.addEdge(labelName, dun, "since", "2017-01-01")  
-t = graph.traversal()  
-println "All Edges"  
-println t.E().toList()  
-println "All Vertices"  
-println t.V().toList()  
-println "Specific Edge"  
-println t.V().has("name", "shon").out().toList()  
-## Architecture  
-physical data storage is closed related to data 
-in HBase storage, Vertex is stored in `v` column family, and Edge is stored in 
`e` column family.  
-each `Service`/`Label` can have it's own dedicated HBase Table.  
-How Edge/Vertex is actually stored in `KeyValue` in HBase is described in 
-## Indexes  
-will be updated.  
-## Cache  
-will be updated.  
-## Gremlin S2Graph has full support for gremlin. However gremlin’s fine 
grained graphy nature results in very high latency  
-Provider suppose to provide `ProviderOptimization` to improve latency of 
traversal, and followings are currently available optimizations.  
->NOTE: This is an ongoing task  
-#### 1. `S2GraphStep`  
-1. translate multiple `has` step into lucene query and find out 
vertexId/edgeId can be found from index provider, lucene.  
-2. if vertexId/edgeId can be found, then change full scan into point lookup 
using list of vertexId/edgeId.  
-for examples, following traversal need full scan on storage if there is no 
index provider.  
-g.V().has("name", "steamshon").out()  
-g.V().has("name", "steamshon").has("age", P.eq(30).or(P.between(20, 30)))  
-once following global vertex index is created, then `S2GraphStep` translate 
above traversal into lucene query, then get list of vertexId/edgeId which 
switch full scan to points lookup.  
-```"global_vertex_index", ["name", 
-#### [The Official Website](  
-#### [S2Graph API 
-#### Mailing Lists  
is for usage questions and announcements.  
this email to subscribe)  
 this email to unsubscribe)  
is for people who want to contribute to S2Graph.  
this email to subscribe)  
 this email to unsubscribe)  
\ No newline at end of file
+Your First Graph
+Once the S2Graph server has been set up, you can start sending HTTP 
queries to the server to create a graph and pour some data into it. This tutorial 
goes over a simple toy problem to give a sense of what S2Graph's API looks like. 
[`bin/`](bin/ contains the example code below.
+The toy problem is to create a timeline feature for a simple social media, 
like a simplified version of Facebook's timeline:stuck_out_tongue_winking_eye:. 
Using simple S2Graph queries it is possible to keep track of each user's 
friends and their posts.
+1. First, we need a name for the new service.
+  The following POST query will create a service named "KakaoFavorites".
+  ```
+  curl -XPOST localhost:9000/graphs/createService -H 'Content-Type: 
Application/json' -d '
+  {"serviceName": "KakaoFavorites", "compressionAlgorithm" : "gz"}
+  '
+  ```
+  To make sure the service is created correctly, check out the following.
+  ```
+  curl -XGET localhost:9000/graphs/getService/KakaoFavorites
+  ```
+2. Next, we will need some friends.
+  In S2Graph, relationships are organized as labels. Create a label called 
`friends` using the following `createLabel` API call:
+  ```
+  curl -XPOST localhost:9000/graphs/createLabel -H 'Content-Type: 
Application/json' -d '
+  {
+    "label": "friends",
+    "srcServiceName": "KakaoFavorites",
+    "srcColumnName": "userName",
+    "srcColumnType": "string",
+    "tgtServiceName": "KakaoFavorites",
+    "tgtColumnName": "userName",
+    "tgtColumnType": "string",
+    "isDirected": "false",
+    "indices": [],
+    "props": [],
+    "consistencyLevel": "strong"
+  }
+  '
+  ```
+  Check if the label has been created correctly:
+  ```
+  curl -XGET localhost:9000/graphs/getLabel/friends
+  ```
+  Now that the label `friends` is ready, we can store the friendship data. 
Entries of a label are called edges, and you can add edges with `edges/insert` 
+  ```
+  curl -XPOST localhost:9000/graphs/edges/insert -H 'Content-Type: 
Application/json' -d '
+  [
+    {"from":"Elmo","to":"Big 
+    {"from":"Cookie 
+    {"from":"Cookie 
+    {"from":"Cookie 
+  ]
+  '
+  ```
+  Query friends of Elmo with `getEdges` API:
+  ```
+  curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: 
Application/json' -d '
+  {
+    "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": 
"userName", "id":"Elmo"}],
+    "steps": [
+      {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 
+    ]
+  }
+  '
+  ```
+  Now query friends of Cookie Monster:
+  ```
+  curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: 
Application/json' -d '
+  {
+    "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": 
"userName", "id":"Cookie Monster"}],
+    "steps": [
+      {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 
+    ]
+  }
+  '
+  ```
+3. Users of Kakao Favorites will be able to post URLs of their favorite 
+  We will need a new label ```post``` for this data:
+  ```
+  curl -XPOST localhost:9000/graphs/createLabel -H 'Content-Type: 
Application/json' -d '
+  {
+    "label": "post",
+    "srcServiceName": "KakaoFavorites",
+    "srcColumnName": "userName",
+    "srcColumnType": "string",
+    "tgtServiceName": "KakaoFavorites",
+    "tgtColumnName": "url",
+    "tgtColumnType": "string",
+    "isDirected": "true",
+    "indices": [],
+    "props": [],
+    "consistencyLevel": "strong"
+  }
+  '
+  ```
+  Now, insert some posts of the users:
+  ```
+  curl -XPOST localhost:9000/graphs/edges/insert -H 'Content-Type: 
Application/json' -d '
+  [
+    {"from":"Big 
+    {"from":"Big 
+  ]
+  '
+  ```
+  Query posts of Big Bird:
+  ```
+  curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: 
Application/json' -d '
+  {
+    "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": 
"userName", "id":"Big Bird"}],
+    "steps": [
+      {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 
+    ]
+  }
+  '
+  ```
+4. So far, we have designed a label schema for the labels `friends` and 
`post`, and stored some edges to them.
+  This should be enough for creating the timeline feature! The following 
two-step query will return the URLs for Elmo's timeline, which are the posts of 
Elmo's friends:
+  ```
+  curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: 
Application/json' -d '
+  {
+    "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": 
"userName", "id":"Elmo"}],
+    "steps": [
+      {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 
+      {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 
+    ]
+  }
+  '
+  ```
+  Also try Cookie Monster's timeline:
+  ```
+  curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: 
Application/json' -d '
+  {
+    "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": 
"userName", "id":"Cookie Monster"}],
+    "steps": [
+      {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 
+      {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 
+    ]
+  }
+  '
+  ```
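The two-step timeline query above can also be assembled programmatically. The following is a minimal Python sketch and not part of the S2Graph repository: the helper names `vertex`, `step`, and `timeline_query` are hypothetical, and only the JSON keys and the `limit` of 10 are taken from the curl examples in this diff. It only builds the request body and does not contact a server.

```python
import json


def vertex(service_name, column_name, vertex_id):
    """Describe one start vertex in the shape used by the getEdges examples."""
    return {"serviceName": service_name, "columnName": column_name, "id": vertex_id}


def step(label, direction="out", offset=0, limit=10):
    """One traversal step that follows a single label."""
    return {"step": [{"label": label, "direction": direction,
                      "offset": offset, "limit": limit}]}


def timeline_query(user_name):
    """Two-step query: the user's friends first, then those friends' posts."""
    return {
        "srcVertices": [vertex("KakaoFavorites", "userName", user_name)],
        "steps": [step("friends"), step("post")],
    }


if __name__ == "__main__":
    # Print the JSON body; it could be passed to curl with -d.
    print(json.dumps(timeline_query("Elmo"), indent=2))
```

Keeping each step as its own function makes it easy to try other label sequences, assuming the server accepts them in the same shape as the examples.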
+The example above is by no means a full-blown social network timeline, but it 
gives you an idea of how to represent, store and query graph data with S2Graph.
+TinkerPop Support
+Since version 0.2.0-incubating, S2Graph integrates natively with `Apache 
TinkerPop 3.2.5`.
+S2Graph passes `Apache TinkerPop`'s `StructureStandardSuite` and 
`ProcessStandardSuite` test suites.
+### Graph Features **not** implemented.
+- Computer
+- Transactions
+- ThreadedTransactions
+### Vertex Features **not** implemented.
+- MultiProperties
+- MetaProperties
+- UuidIds
+- AnyIds 
+- NumericIds
+- StringIds
+### Edge Features **not** implemented.
+- UuidIds
+- AnyIds 
+- NumericIds
+- StringIds
+### Vertex property features **not** implemented.
+- UuidIds
+- AnyIds 
+- NumericIds
+- StringIds
+- MapValues
+- MixedListValues
+- BooleanArrayValues
+- ByteArrayValues
+- DoubleArrayValues
+- FloatArrayValues
+- IntegerArrayValues  
+- StringArrayValues
+- LongArrayValues
+- SerializableValues
+- UniformListValues
+### Edge property feature **not** implemented.
+- MapValues
+- MixedListValues
+- BooleanArrayValues
+- ByteArrayValues
+- DoubleArrayValues
+- FloatArrayValues
+- IntegerArrayValues  
+- StringArrayValues
+- LongArrayValues
+- SerializableValues
+- UniformListValues
+>NOTE: This is an ongoing task.
+## Getting Started
+### Maven coordinates
+    <groupId>org.apache.s2graph</groupId>
+    <artifactId>s2core_2.11</artifactId>
+    <version>0.2.0</version>
+### Start
+S2Graph is a singleton that can be shared among multiple threads. You 
instantiate S2Graph using the standard TinkerPop static constructors.
+- Graph g = Configuration configuration)
+Some important properties for configuration.
+#### HBase for data storage. 
+#### RDBMS for meta storage.
+### Gremlin Console
+#### 1. install plugin
+On the Gremlin console, s2graph can be installed as follows.
+:install org.apache.s2graph s2graph-gremlin 0.2.0
+:plugin use tinkerpop.s2graph
+Example run.
+shonui-MacBook-Pro:apache-tinkerpop-gremlin-console-3.2.5 shon$ bin/
+         \,,,/
+         (o o)
+plugin activated: tinkerpop.server
+plugin activated: tinkerpop.utilities
+plugin activated: tinkerpop.tinkergraph
+gremlin> :install org.apache.s2graph s2graph-gremlin 0.2.0
+==>Loaded: [org.apache.s2graph, s2graph-gremlin, 0.2.0] - restart the console 
to use [tinkerpop.s2graph]
+gremlin> :plugin use tinkerpop.s2graph
+==>tinkerpop.s2graph activated
+gremlin> :plugin list
+Once the `s2graph-gremlin` plugin is active, the following example will generate 
TinkerPop's modern graph in s2graph.
+Taken from 
+![Modern Graph from 
+### tp3 modern graph(simple).
+conf = new BaseConfiguration()
+graph =
+// init system default schema
+// init extra schema for tp3 modern graph.
+// load modern graph into current graph instance.
+// traversal
+t = graph.traversal()
+// show all vertices in this graph.
+// show all edges in this graph.
+// add two vertices.
+shon = graph.addVertex(, 10, T.label, "person", "name", "shon", "age", 35)
+s2graph = graph.addVertex(, 11, T.label, "software", "name", "s2graph", 
"lang", "scala")
+// add one edge between two vertices.
+created = shon.addEdge("created", s2graph, "_timestamp", 10, "weight", 0.1)
+// check if new edge is available through traversal
+t.V().has("name", "shon").out()
+// shutdown
+Note that the simple version uses the default schema for `Service`, `Column`, and `Label` 
for compatibility.
+Please check out the advanced example below to understand what data model is 
available in S2Graph.
+### tp3 modern graph(advanced).
+It is possible to separate multiple namespaces into logical spaces.
+S2Graph achieves this with the following data model. Details can be found on
+1. Service: the top level abstraction 
+A convenient logical grouping of related entities
+Similar to the database abstraction that most relational databases support.    
+2. Column: belongs to a service.
+A set of homogeneous vertices such as users, news articles or tags. 
+Every vertex has a user-provided unique ID that allows the efficient lookup. 
+A service typically contains multiple columns.
+3. Label: schema for edge
+A set of homogeneous edges such as friendships, views, or clicks. 
+Relation between two columns as well as a recursive association within one 
+The two columns connected with a label may not necessarily be in the same 
service, allowing us to store and query data that spans over multiple services.
+Instead of converting a user-provided ID into an internal unique numeric ID, S2Graph 
simply combines service and column metadata with the user-provided ID to guarantee 
a globally unique ID.
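The composite-ID idea described above can be illustrated with a small Python sketch. This is not S2Graph's actual `VertexId` class: the `CompositeVertexId` type and the `"item"` column are hypothetical and exist only to show why combining service, column, and the user-provided ID avoids collisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompositeVertexId:
    """Hypothetical stand-in for a globally unique vertex id."""
    service: str
    column: str
    user_id: object  # the id supplied by the user, kept as-is


a = CompositeVertexId("s2graph", "user", 20)
b = CompositeVertexId("s2graph", "item", 20)

# Same user-provided id, different column: the two ids do not collide.
assert a != b
# Same service, column, and user-provided id: the ids are equal.
assert a == CompositeVertexId("s2graph", "user", 20)
```

Because the dataclass is frozen, instances are hashable and can be used directly as dictionary keys or set members.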
+The following is a simple example that uses this data model in s2graph.
+// init graph
+graph = BaseConfiguration())
+// 0. import necessary methods for schema management.
+import static org.apache.s2graph.core.Management.*
+// 1. initialize dbsession for management which store schema into RDBMS.
+session = graph.dbSession()
+// 2. properties for new service "s2graph".
+serviceName = "s2graph"
+cluster = "localhost"
+hTableName = "s2graph"
+preSplitSize = 0
+hTableTTL = -1
+compressionAlgorithm = "gz"
+// 3. actual creation of s2graph service.
+// details can be found on
+service =, cluster, hTableName, 
preSplitSize, hTableTTL, compressionAlgorithm)
+// 4. properties for user vertex schema belongs to s2graph service.
+columnName = "user"
+columnType = "integer"
+// each property consists of (name: String, defaultValue: String, dataType: 
+// details can be found on
+props = [newProp("name", "-", "string"), newProp("age", "-1", "integer")]
+schemaVersion = "v3"
+user =, columnName, 
columnType, props, schemaVersion)
+// 2.1 (optional) global vertex index."global_vertex_index", ["name", "age"])
+// 3. create VertexId
+// create S2Graph's VertexId class.
+v1Id = graph.newVertexId(serviceName, columnName, 20)
+v2Id = graph.newVertexId(serviceName, columnName, 30)
+shon = graph.addVertex(, v1Id, "name", "shon", "age", 35)
+dun = graph.addVertex(, v2Id, "name", "dun", "age", 36)
+// 4. friends label
+labelName = "friend_"
+srcColumn = user
+tgtColumn = user
+isDirected = true
+indices = []
+props = [newProp("since", "-", "string")]
+consistencyLevel = "strong"
+hTableName = "s2graph"
+hTableTTL = -1
+options = null
+friend =, srcColumn, tgtColumn,
+        isDirected, serviceName, indices, props, consistencyLevel,
+        hTableName, hTableTTL, schemaVersion, compressionAlgorithm, options)
+shon.addEdge(labelName, dun, "since", "2017-01-01")
+t = graph.traversal()
+println "All Edges"
+println t.E().toList()
+println "All Vertices"
+println t.V().toList()
+println "Specific Edge"
+println t.V().has("name", "shon").out().toList()
+## Architecture
+Physical data storage is closely related to data 
+In HBase storage, a Vertex is stored in the `v` column family, and an Edge is stored in the 
`e` column family.
+Each `Service`/`Label` can have its own dedicated HBase table.
+How an Edge/Vertex is actually stored as a `KeyValue` in HBase is described in 
+## Indexes
+will be updated.
+## Cache
+will be updated.
+## Gremlin 
+S2Graph has full support for Gremlin. However, Gremlin’s fine-grained graph 
nature results in very high latency.
+A provider is expected to supply `ProviderOptimization` to reduce traversal 
latency, and the following optimizations are currently available.
+>NOTE: This is an ongoing task
+#### 1. `S2GraphStep`
+1. Translate multiple `has` steps into a Lucene query and check whether the 
vertexId/edgeId can be found from the index provider, Lucene.
+2. If the vertexId/edgeId can be found, change the full scan into a point lookup 
using the list of vertexId/edgeId.
+For example, the following traversals need a full scan on storage if there is no 
index provider.
+g.V().has("name", "steamshon").out()
+g.V().has("name", "steamshon").has("age", P.eq(30).or(P.between(20, 30)))
+Once the following global vertex index is created, `S2GraphStep` translates 
the above traversals into a Lucene query and gets the list of vertexId/edgeId, which 
switches the full scan to point lookups.
+```"global_vertex_index", ["name", "age"])
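The difference between a full scan and an index-backed point lookup can be shown with a toy model. The sketch below is an assumption-laden illustration in Python: it uses a plain dictionary instead of Lucene, the vertex data is made up, and none of the function names come from S2Graph.

```python
from collections import defaultdict

# Toy vertex store: id -> properties. Names and values are made up for illustration.
VERTICES = {
    1: {"name": "steamshon", "age": 30},
    2: {"name": "dun", "age": 36},
    3: {"name": "steamshon", "age": 25},
}


def full_scan(prop, value):
    """Without an index, every vertex has to be inspected."""
    return sorted(vid for vid, props in VERTICES.items() if props.get(prop) == value)


def build_index(props):
    """Map (property, value) to the ids of the vertices that carry it."""
    index = defaultdict(set)
    for vid, vprops in VERTICES.items():
        for prop in props:
            if prop in vprops:
                index[(prop, vprops[prop])].add(vid)
    return index


def indexed_lookup(index, conditions):
    """Intersect the id sets of all conditions; only those ids need fetching."""
    id_sets = [index.get(cond, set()) for cond in conditions]
    ids = set.intersection(*id_sets) if id_sets else set()
    return sorted(ids)


index = build_index(["name", "age"])
print(full_scan("name", "steamshon"))                                # [1, 3]
print(indexed_lookup(index, [("name", "steamshon"), ("age", 30)]))   # [1]
```

The indexed path touches only the entries for the requested property values, which is the effect the `has`-to-index translation is meant to achieve; a real index provider would also handle range predicates such as `P.between`, which this sketch does not.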
+#### [The Official Website](
+#### [S2Graph API 
+#### Mailing Lists
is for usage questions and announcements.
this email to subscribe)
 this email to unsubscribe)
is for people who want to contribute to S2Graph.
this email to subscribe)
 this email to unsubscribe)
