Re: TraversalStrategies.setTraverserGeneratorFactory's removal

Marko Rodriguez Thu, 07 Apr 2016 10:55:53 -0700

Hi,

There are ImmutablePath and MutablePath. ImmutablePath is used in OLTP and is 
much more efficient in terms of space & time than MutablePath. You can create 
either as you please; you just do:


Path path = XXXPath.make();
path = path.extend(…);
path = path.extend(…);
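
The snippet above reassigns the result of extend(…) on every line. A minimal sketch of why that pattern suits an immutable path, in plain Java: ToyPath and all of its members are hypothetical names invented for illustration and are not TinkerPop's ImmutablePath, whose real signatures should be checked against the API docs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy type for illustration only; not part of TinkerPop.
final class ToyPath {
    // Shared sentinel that marks the start of every path.
    private static final ToyPath EMPTY =
            new ToyPath(null, null, Collections.<String>emptySet());

    private final ToyPath previous;   // the shorter path; shared, never copied
    private final Object object;      // the object visited at this position
    private final Set<String> labels; // the labels attached to this position

    private ToyPath(ToyPath previous, Object object, Set<String> labels) {
        this.previous = previous;
        this.object = object;
        this.labels = labels;
    }

    static ToyPath make() {
        return EMPTY;
    }

    // Returns a NEW path one element longer; the receiver is left unchanged.
    ToyPath extend(Object object, String... labels) {
        return new ToyPath(this, object, new LinkedHashSet<>(Arrays.asList(labels)));
    }

    // All visited objects, oldest first.
    List<Object> objects() {
        List<Object> out = new ArrayList<>();
        for (ToyPath p = this; p != EMPTY; p = p.previous) {
            out.add(p.object);
        }
        Collections.reverse(out);
        return out;
    }

    // The most recently visited object carrying the given label.
    Object get(String label) {
        for (ToyPath p = this; p != EMPTY; p = p.previous) {
            if (p.labels.contains(label)) {
                return p.object;
            }
        }
        throw new IllegalArgumentException("no such label: " + label);
    }
}
```

With this toy type, ToyPath.make().extend("v1", "a").extend("v2", "b").get("a") returns "v1", and the shorter path stays untouched, so many traversers can share a common prefix without copying it.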

However, I don't recommend you work at that level. What I would recommend you 
do is this:

public class MyBigSelectLabelStep<S,E> {

  Traverser traverser = this.starts.next();
  Map<String,Object> result = doLowLevelProviderSpecificStuff(traverser);
  traverser = traverser.split(result.get("a"), EmptyStep.instance()); // simulate GraphStep
  traverser.addLabels("a");
  traverser = traverser.split(result.get("b"), EmptyStep.instance()); // simulate VertexStep
  traverser.addLabels("b");
  traverser = traverser.split(result, EmptyStep.instance()); // simulate SelectStep
  return traverser;

}

This handles all the low-level mechanisms of generating a traverser that looks 
like it went through multiple steps even though it only went through one. This 
is hand-typed from memory of the API so please be aware that I might have 
gotten an argument wrong or something. Also, out() is a FlatMapStep and thus 
processes an iterator of results. You will have to be smart about flattening 
that iterator; see FlatMapStep's implementation for how this is typically 
handled in TinkerPop.
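
One common way to flatten such an iterator is to hold on to the iterator for the current input and refill it from the upstream only once it runs dry. The plain-Java sketch below shows that idea. FlatteningIterator and its members are hypothetical names made up for this illustration; this is not TinkerPop's FlatMapStep code, which remains the reference for how it is actually done.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical toy type for illustration only; not part of TinkerPop.
final class FlatteningIterator<S, E> implements Iterator<E> {
    private final Iterator<S> starts;               // upstream inputs
    private final Function<S, Iterator<E>> expand;  // one input -> many results
    private Iterator<E> current = Collections.emptyIterator();

    FlatteningIterator(Iterator<S> starts, Function<S, Iterator<E>> expand) {
        this.starts = starts;
        this.expand = expand;
    }

    @Override
    public boolean hasNext() {
        // Skip over inputs that expand to nothing until a result is found
        // or the upstream is exhausted.
        while (!current.hasNext()) {
            if (!starts.hasNext()) {
                return false;
            }
            current = expand.apply(starts.next());
        }
        return true;
    }

    @Override
    public E next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

In a provider step the expand function would be where a single low-level result (for example one row) is turned into its individual results, each of which could then be wrapped in a traverser as in the step above.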

HTH,
Marko. 

http://markorodriguez.com

On Apr 7, 2016, at 10:59 AM, pieter-gmail wrote:

> Hi,
> 
> I have been working on this without using a custom traverser. It is fine
> but I do need to be able to set the path of the traverser.
> 
> We previously had a ticket for this here
> <https://issues.apache.org/jira/browse/TINKERPOP-766>. I mentioned there
> that I no longer needed the path setter, and you said that was OK, but
> it was never done.
> 
> To give some background.
> 
> g.V(a1).as("a").out().as("b").select("a", "b")
> 
> This will be compiled to one step. This means that the collapsed
> steps' label information is somewhat obfuscated and the path is
> incorrectly calculated by the traverser.
> However, the label information is not lost; I recalculate the path but
> need to set it on the traverser.
> 
> Is it still ok to make the path mutable?
> 
> Thanks
> Pieter
> 
> On 30/03/2016 23:19, Marko Rodriguez wrote:
>> Hi,
>> 
>> So Titan does something similar where it takes a row in Cassandra and turns 
>> those into Traversers. It uses FlatMapStep to do so where the iterator in 
>> FlatMapStep is a custom iterator that knows how to do data conversions.
>> 
>> Would something like that help?
>> 
>> If not and you really need your own TraverserGenerator, then you can use 
>> reflection to set it in DefaultTraversal. It's a private member now.
>> 
>> Moving forward, I would highly recommend you don't create classes so low in 
>> the stack. Graph database providers should only create (if necessary):
>> 
>>      1. Steps that extend non-final TinkerPop steps.
>>      2. TraversalStrategies that implement ProviderOptimizationStrategy.
>>      3. Classes that extend Graph, Vertex, Edge, Property, VertexProperty, 
>> and GraphComputer.
>>      4. Their own InputFormat or InputRDD if they want to have Spark/Giraph 
>> work against them.
>> 
>> Anything beyond that (I think) is starting to get into murky territory.
>> 
>> Marko.
>> 
>> http://markorodriguez.com
>> 
>> On Mar 30, 2016, at 2:46 PM, pieter-gmail wrote:
>> 
>>> Hi,
>>> 
>>> I need it to keep state. Mapping sql ResultSet's grid nature to a graph
>>> nature gets complex quickly and being able to store and manipulate the
>>> state of the traverser makes it easier. Also it was possible and the
>>> solution presented itself to me as such.
>>> 
>>> A single row, i.e. AbstractStep.processNextStart(), from a sql ResultSet
>>> might map to many traversers. This is at the heart of what makes
>>> Sqlg a worthwhile enterprise: fetching lots of data (reducing latency
>>> cost) in a denormalized manner and mapping it to a graph format.
>>> 
>>> To manage this I needed to store state and add a custom method
>>> "customSplit(...)". customSplit is similar to split() but it uses and
>>> updates the said state. The custom traverser also keeps additional state
>>> of where we are with respect to the processNextStart() (a sql row) as I
>>> need it in order to calculate the next traverser from the same row.
>>> 
>>> So if all this becomes impossible there is bound to be a different
>>> solution to the same problem but it would require quite some thinking
>>> and effort.
>>> 
>>> With a little bit of arrogance and ignorance, perhaps letting OLAP
>>> constraints leak into OLTP is not a good idea. I'd say OLTP is 99% of
>>> use-cases so whatever these serialization issues are they ought to be
>>> contained to OLAP.
>>> 
>>> Thanks
>>> Pieter
>>> 
>>> 
>>> 
>>> On 30/03/2016 21:38, Marko Rodriguez wrote:
>>>> Hello Pieter,
>>>> 
>>>>> In SqlgGraph 
>>>>> <https://github.com/pietermartin/sqlg/blob/schema/sqlg-core/src/main/java/org/umlg/sqlg/structure/SqlgGraph.java>,
>>>>> a static code block invokes
>>>>> 
>>>>> static {
>>>>>     TraversalStrategies.GlobalCache.registerStrategies(Graph.class,
>>>>>         TraversalStrategies.GlobalCache.getStrategies(Graph.class).clone()
>>>>>             .addStrategies(new SqlgVertexStepStrategy()));
>>>>>     TraversalStrategies.GlobalCache.registerStrategies(Graph.class,
>>>>>         TraversalStrategies.GlobalCache.getStrategies(Graph.class).clone()
>>>>>             .addStrategies(new SqlgGraphStepStrategy()));
>>>>>     TraversalStrategies.GlobalCache.getStrategies(Graph.class)
>>>>>         .setTraverserGeneratorFactory(new SqlgTraverserGeneratorFactory());
>>>>> }
>>>> This all looks great except the TraverserGeneratorFactory. Traverser 
>>>> classes are so low-level and so tied to serialization code in OLAP that I 
>>>> removed all notion of users being able to create traverser species. I need 
>>>> full control at that level to maneuver.
>>>> 
>>>> I really need to create a section in the docs that says stuff like:
>>>> 
>>>>    * Graph System Providers: only implement steps that extend non-final 
>>>> TinkerPop-steps (e.g. GraphStep, VertexStep, etc.).
>>>>    * Graph Language Providers: only have Traversal.steps() that can be 
>>>> represented as a composition of TinkerPop-steps.
>>>> 
>>>> When providers get too low-level, it's hard for us to maneuver, 
>>>> optimize, and move forward with designs. There are so many assumptions in 
>>>> the code that we make around Traverser instances, Step interfaces, etc. 
>>>> that if people just make new ones, then strategies, serialization, etc. 
>>>> break down.
>>>> 
>>>> The question I have is: why do you have your own Traverser implementation? 
>>>> I can't imagine a reason for a provider to need their own traverser class.
>>>> 
>>>> Thanks,
>>>> Marko.
>>>> 
>>>> http://markorodriguez.com
>> 
>
