Re: Update on TINKERPOP-1278: Gremlin Language Variants

2016-07-14 Thread Stephen Mallette
I've done a fair bit of work trying to generalize Gremlin-enabled
ScriptEngines so that we have a common way to configure any additional ones
we may add. Unfortunately, I think I've exhausted what I can do there
without introducing significant breaking change. The easiest way to explain
where I've reached an impasse is to say that to this point gremlin-groovy
has held most of the code related to "ScriptEngines" in TinkerPop going
back to 1.x and now, much of that infrastructure needs to live in
gremlin-core so that it can be easily shared with gremlin-groovy,
gremlin-python, gremlin-whatever, etc. I can't move those things easily
without breaking many many many things.

On a positive note however, I don't think that this is a big problem for
TINKERPOP-1278. I'm going to switch gears a bit and focus on the changes
required to Gremlin Server to better support GLVs. I think that with the
work I've already done and the work Marko has done we can have something
that remains fully functional but will need a lot of clean-up going
forward. So for 3.2.2 we can have gremlin-python working and the pattern
for GLV implementation in good shape with the expectation that for 3.3.0 we
will get it 100% solid all around (as with 3.3.x we will have some room to
maneuver as we expect some breaking change to occur in that line of code).

This post is sorta high-level, so if anyone is interested in more details,
please ask.





On Wed, Jul 13, 2016 at 11:24 AM, Marko Rodriguez 
wrote:

> Hi,
>
> TINKERPOP-1278 represents the evolution of Gremlin in that Gremlin can be
> (easily) embedded in any host language and can be (easily) compiled in said
> host language to a standard Bytecode representation. That Bytecode can then
> be shipped to a RemoteConnection-based server (e.g. GremlinServer) to be
> processed and have results streamed back. That is, for instance, Python
> developers can natively write Gremlin and get results back from a remote
> GremlinVM (e.g. Apache TinkerPop’s Java-based Gremlin VM).
>
> https://gist.github.com/okram/b95d4b37af625435620aa078b2746e8a
>
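To make the "embed in a host language, compile to Bytecode" idea concrete, here is a minimal Python sketch. It is not the gremlin-python implementation: the `Traversal` class and the list-of-`[step, args]` layout are assumptions made up for illustration. Each method call on the traversal is recorded rather than executed, and the recorded list is what would be shipped to a RemoteConnection.

```python
class Traversal:
    """Hypothetical helper: records each step call instead of executing it."""

    def __init__(self, bytecode=None):
        self.bytecode = list(bytecode or [])

    def __getattr__(self, step_name):
        # Only invoked for attributes that do not exist on the object,
        # which here means any Gremlin step name (V, out, path, ...).
        def add_step(*args):
            return Traversal(self.bytecode + [[step_name, list(args)]])
        return add_step


g = Traversal()
t = g.V().out("created").path()
print(t.bytecode)
# [['V', []], ['out', ['created']], ['path', []]]
```

Because nothing is evaluated on the client, the same recorded structure could in principle be produced from any host language.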
> Here is what I believe we have left to accomplish on TINKERPOP-1278. I’ve
> added people’s names next to the tasks based on who has been working in
> these areas. If others are interested in helping out, please feel free to
> jump in accordingly.
>
>   * Finalize GraphSON2.0 [Stephen, Robert, Kevin, Dylan]
>   * New RemoteConnection protocol (GraphSON2.0-based supporting Traversers
>     (bulking) and SideEffects) [Stephen]
>       - XXXServer agnostic -- GremlinServer, Neo4jServer,
>         ArangoDBServer, etc. implementable.
>   * GremlinScriptEngine infrastructure for testing. [Stephen]
>   * Write documentation on new RemoteConnection protocol (published
>     specification). [Stephen]
>   * Bytecode serialization specification in GraphSON2.0 [Marko]
>       - Gremlin-XXX agnostic (save lambdas) -- Gremlin-Java,
>         Gremlin-Groovy, Gremlin-Python.
>   * Implement new RemoteConnection protocol in Python. [Marko, Dave,
>     Leifur]
>   * Refactor Gremlin-Python test infrastructure in Stephen's new
>     GremlinScriptEngine test framework. [Marko]
>   * Write documentation on Bytecode format (published specification).
>     [Marko]
>
> NOTE: If you are unsure what some of this means, please ask and I can
> describe in more detail each of these items.
>
> As you can see, TINKERPOP-1278 is heavily dependent on GraphSON2.0 being
> defined and merged so that should take top priority. We have a nice thread
> going on this and we should try and converge on it.
>
> Finally, I was texting with Stephen and asking him how long it would
> take him to do X, Y, Z, and it seems that together we should shoot for
> August 20th to have a PR for this work ready for review/VOTE and ultimate
> merge to master.
>
> Thoughts?
> Marko.
>
> http://markorodriguez.com
>
>
>
>


Re: [DISCUSS] 3.1.x code line

2016-07-14 Thread Ted Wilmes
It does for me too, +1.

--Ted

On Wed, Jul 13, 2016 at 3:44 PM, Hadrian Zbarcea  wrote:

> +1. It does to me.
> Hadrian
>
>
> On 07/13/2016 04:29 PM, Stephen Mallette wrote:
>
>> Since we don't really follow semantic versioning for releases, I thought we
>> should discuss the 3.1.x code line. We've been steadily adding features
>> right up to our current 3.1.3 release, which we will vote on shortly. I
>> think it's pretty awesome that we've managed to maintain that older line of
>> code for as long as we have, and I think we've evolved it in a very sensible
>> way.
>>
>> I think we should continue to maintain 3.1.x after the 3.1.3 release, which
>> would mean a 3.1.4 release at some point, but we should strictly limit the
>> changes there to bug fixes and not do any "new features" on that line of
>> code (i.e. the tp31 branch). As it stands, I don't see any open "bugs" for
>> 3.1.3 in JIRA, so as of right now we wouldn't have much planned for
>> 3.1.4.
>>
>> Does that make sense for everyone?
>>
>>


ApacheCon Europe call for papers open

2016-07-14 Thread Rich Bowen
Dear Apache Enthusiast,

As you are no doubt already aware, we will be holding ApacheCon in
Seville, Spain, the week of November 14th, 2016. The call for papers
(CFP) for this event is now open, and will remain open until
September 9th.

The event is divided into two parts, each with its own CFP. The first
part of the event, called Apache Big Data, focuses on Big Data
projects and related technologies.

Website: http://events.linuxfoundation.org/events/apache-big-data-europe
CFP:
http://events.linuxfoundation.org/events/apache-big-data-europe/program/cfp

The second part, called ApacheCon Europe, focuses on the Apache
Software Foundation as a whole, covering all projects, community
issues, governance, and so on.

Website: http://events.linuxfoundation.org/events/apachecon-europe
CFP: http://events.linuxfoundation.org/events/apachecon-europe/program/cfp

ApacheCon is the official conference of the Apache Software
Foundation, and is the best place to meet members of your project and
other ASF projects, and strengthen your project's community.

If your organization is interested in sponsoring ApacheCon, contact me
at e...@apache.org. ApacheCon is a great place to find the brightest
developers in the world, and experts on a huge range of technologies.

I hope to see you in Seville!


Re: [TinkerPop] Traversal ByteCode and Translators (The TINKERPOP-1278 Saga)

2016-07-14 Thread Marko Rodriguez
Here are the ProcessStandardTestSuite examples in GraphSON Bytecode:

https://gist.github.com/okram/3c7a5f8047d10c0bfc29d55f8c229c54 


Marko.

http://markorodriguez.com



> On Jul 14, 2016, at 9:50 AM, Marko Rodriguez  wrote:
> 
> Hi,
> 
> Check this out. Yesterday, I added GraphSONSerializers/Deserializers for 
> Bytecode. This means that we are able to translate Bytecode to JSON. In order 
> to test to make sure my serializers/deserializers are good, I created a 
> Translator called GraphSONTranslator that is only used in our test suite. 
> What it does is it wraps the JavaTranslator and before sending the Bytecode 
> to the JavaTranslator, it first writes the Bytecode to GraphSON, then reads 
> it from GraphSON and then gives it to JavaTranslator. This is run against our 
> complete ProcessStandard and ProcessComputer suite. There are a few hiccups 
> (failed tests), but I know why they are failing and it has to do either with 
> Long/int-stuff and/or using Elements as arguments — e.g. g.V(g.V(1).next()). 
> What is neat though is that the serializer/deserializer code is simple and 
> just like that, Bytecode is language agnostic.
> 
>   
>   
> https://github.com/apache/tinkerpop/blob/c60e2132a6feb76ddf72073614b60dfa824193b4/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/io/graphson/GraphSONTraversalSerializers.java
>  
> 
>   
> https://github.com/apache/tinkerpop/blob/c60e2132a6feb76ddf72073614b60dfa824193b4/tinkergraph-gremlin/src/test/java/org/apache/tinkerpop/gremlin/java/translator/GraphSONTranslator.java#L69-L73
>  
> 
>   
> Neato… now we just need to have a standard GraphSON representation…. 
> GraphSON2.0 is important for this work.
> 
> Marko.
> 
> http://markorodriguez.com 
> 
> 
> 
>> On Jul 6, 2016, at 3:23 PM, Marko Rodriguez wrote:
>> 
>> Hi,
>> 
>>> Hi Marko, this is neat stuff. Do you have any designs on making the
>>> "ByteCode" construct the source of truth for Gremlin semantics?
>> 
>> Yes and no. Bytecode will have two “standard serializations” (GraphSON and 
>> Gryo). From there, any gremlin-xxx will have a “new Bytecode(json)”-style 
>> constructor to create a Bytecode object in the respective language. From 
>> there, the respective language can construct a traversal object as it sees 
>> fit (either via reflection or via script string creation). For instance:
>> 
>>  JavaTranslator uses reflection:
>>  
>> https://github.com/apache/tinkerpop/blob/88b266c9224e3d19cc28e1c4935416c1fb26e9cb/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/util/JavaTranslator.java
>>  
>> 
>>  GroovyTranslator uses script string: 
>>  
>> https://github.com/apache/tinkerpop/blob/88b266c9224e3d19cc28e1c4935416c1fb26e9cb/gremlin-groovy/src/main/java/org/apache/tinkerpop/gremlin/java/translator/GroovyTranslator.java
>>  
>> 
>> 
>> Next, the only reason you would need to use GroovyTranslator is because of a 
>> lambda. If the Bytecode you submit to GremlinServer (or any RemoteConnection 
>> implementation) doesn’t have lambdas in it, it would be more efficient to 
>> use JavaTranslator and thus, bypass the overhead of String compilation and 
>> Groovy dispatch-based meta-classes.
>> 
>>> That is to say, generating an AST is nice but I would be hesitant to
>>> build and support a non-JVM implementation of Gremlin unless the
>>> semantics of the bytecode was specified and the JVM implementation was
>>> simply the reference implementation.
>> 
>> So the semantics of the byte code are specified by the semantics of the 
>> step(arguments). 
>> 
>>  out(“created”)
>>  V()
>>  path()
>>  etc.
>> 
>>  TinkerPop’s ProcessStandardSuite and ProcessComputerSuite verify the 
>> semantics of every step and thus, can be used to verify that the Bytecode 
>> generated in another language (e.g. gremlin-python) has valid semantics. 
>> Here is how I verify Gremlin-Python generated Bytecode.
>> 
>>  
>> https://github.com/apache/tinkerpop/blob/88b266c9224e3d19cc28e1c4935416c1fb26e9cb/gremlin-python/src/test/java/org/apache/tinkerpop/gremlin/ja

Re: [TinkerPop] Traversal ByteCode and Translators (The TINKERPOP-1278 Saga)

2016-07-14 Thread Marko Rodriguez
Hi,

Check this out. Yesterday, I added GraphSONSerializers/Deserializers for 
Bytecode. This means that we are able to translate Bytecode to JSON. In order 
to test to make sure my serializers/deserializers are good, I created a 
Translator called GraphSONTranslator that is only used in our test suite. What 
it does is it wraps the JavaTranslator and before sending the Bytecode to the 
JavaTranslator, it first writes the Bytecode to GraphSON, then reads it from 
GraphSON and then gives it to JavaTranslator. This is run against our complete 
ProcessStandard and ProcessComputer suite. There are a few hiccups (failed 
tests), but I know why they are failing and it has to do either with 
Long/int-stuff and/or using Elements as arguments — e.g. g.V(g.V(1).next()). 
What is neat though is that the serializer/deserializer code is simple and just 
like that, Bytecode is language agnostic.
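For anyone who wants the shape of that test trick without reading the Java, here is a minimal Python sketch of the same round-trip idea. The `{"steps": ...}` JSON layout and the function names are assumptions for illustration only; they are not the GraphSON representation, which is still to be standardized.

```python
import json


def to_json(bytecode):
    # Assumed layout: a single "steps" key holding [step, args] pairs.
    return json.dumps({"steps": bytecode})


def from_json(text):
    return json.loads(text)["steps"]


def round_trip(bytecode):
    """Write the bytecode out and read it back before handing it on,
    so every test that uses it also exercises the serializers."""
    return from_json(to_json(bytecode))


original = [["V", [1]], ["out", ["created"]], ["values", ["name"]]]
restored = round_trip(original)
assert restored == original
```

Wrapping an existing translator with `round_trip` means the full test suite doubles as a serializer test without writing any new test cases.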



https://github.com/apache/tinkerpop/blob/c60e2132a6feb76ddf72073614b60dfa824193b4/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/io/graphson/GraphSONTraversalSerializers.java
 


https://github.com/apache/tinkerpop/blob/c60e2132a6feb76ddf72073614b60dfa824193b4/tinkergraph-gremlin/src/test/java/org/apache/tinkerpop/gremlin/java/translator/GraphSONTranslator.java#L69-L73
 


Neato… now we just need to have a standard GraphSON representation…. 
GraphSON2.0 is important for this work.

Marko.

http://markorodriguez.com



> On Jul 6, 2016, at 3:23 PM, Marko Rodriguez  wrote:
> 
> Hi,
> 
>> Hi Marko, this is neat stuff. Do you have any designs on making the
>> "ByteCode" construct the source of truth for Gremlin semantics?
> 
> Yes and no. Bytecode will have two “standard serializations” (GraphSON and 
> Gryo). From there, any gremlin-xxx will have a “new Bytecode(json)”-style 
> constructor to create a Bytecode object in the respective language. From 
> there, the respective language can construct a traversal object as it sees 
> fit (either via reflection or via script string creation). For instance:
> 
>   JavaTranslator uses reflection:
>   
> https://github.com/apache/tinkerpop/blob/88b266c9224e3d19cc28e1c4935416c1fb26e9cb/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/util/JavaTranslator.java
>  
> 
>   GroovyTranslator uses script string: 
>   
> https://github.com/apache/tinkerpop/blob/88b266c9224e3d19cc28e1c4935416c1fb26e9cb/gremlin-groovy/src/main/java/org/apache/tinkerpop/gremlin/java/translator/GroovyTranslator.java
>  
> 
> 
> Next, the only reason you would need to use GroovyTranslator is because of a 
> lambda. If the Bytecode you submit to GremlinServer (or any RemoteConnection 
> implementation) doesn’t have lambdas in it, it would be more efficient to use 
> JavaTranslator and thus, bypass the overhead of String compilation and Groovy 
> dispatch-based meta-classes.
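
The two translation strategies described above can be sketched in Python as follows. This is only an analogy: `RecordingTraversal` is a made-up stand-in for a real traversal object, and `eval` plays the role of the script engine. The reflection path looks each step up by name and calls it directly; the script path first renders a string that then has to be compiled.

```python
class RecordingTraversal:
    """Hypothetical stand-in target; it just records the calls made on it."""

    def __init__(self):
        self.calls = []

    def _record(self, name, args):
        self.calls.append((name, args))
        return self

    def V(self, *ids):
        return self._record("V", ids)

    def out(self, *labels):
        return self._record("out", labels)

    def path(self):
        return self._record("path", ())


def translate_by_reflection(bytecode, target):
    # Look up each step by name and invoke it directly.
    for name, args in bytecode:
        target = getattr(target, name)(*args)
    return target


def translate_to_script(bytecode, source="g"):
    # Render the same steps as a script string for later compilation.
    parts = [source]
    for name, args in bytecode:
        parts.append(f"{name}({', '.join(repr(a) for a in args)})")
    return ".".join(parts)


bytecode = [["V", []], ["out", ["created"]], ["path", []]]
result = translate_by_reflection(bytecode, RecordingTraversal())
script = translate_to_script(bytecode)
print(script)  # g.V().out('created').path()

# The script route needs an extra compile/evaluate step to reach the same calls.
compiled = eval(script, {"g": RecordingTraversal()})
assert compiled.calls == result.calls
```

The string route is the only one of the two that could carry an arbitrary lambda body, which is why it remains necessary despite the extra compilation cost.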
> 
>> That is to say, generating an AST is nice but I would be hesitant to
>> build and support a non-JVM implementation of Gremlin unless the
>> semantics of the bytecode was specified and the JVM implementation was
>> simply the reference implementation.
> 
> So the semantics of the byte code are specified by the semantics of the 
> step(arguments). 
> 
>   out(“created”)
>   V()
>   path()
>   etc.
> 
>  TinkerPop’s ProcessStandardSuite and ProcessComputerSuite verify the 
> semantics of every step and thus, can be used to verify that the Bytecode 
> generated in another language (e.g. gremlin-python) has valid semantics. Here 
> is how I verify Gremlin-Python generated Bytecode.
> 
>   
> https://github.com/apache/tinkerpop/blob/88b266c9224e3d19cc28e1c4935416c1fb26e9cb/gremlin-python/src/test/java/org/apache/tinkerpop/gremlin/java/translator/jython/PythonJythonTranslatorProvider.java
>  
> 
>   
> https://github.com/apache/tinkerpop/blob/88b266c9224e3d19cc28e1c4935416c1fb26e9cb/gremlin-python/src/test/java/org/apache/tinkerpop/gremlin/java/translator/jython/PythonJythonTranslatorProce

Re: [DISCUSS] New IO format for GLVs/Gremlin Server

2016-07-14 Thread gallardo.kev...@gmail.com


On 2016-07-13 13:17 (+0100), Robert Dale  wrote: 
> Marko, I agree that empty object properties should not be represented.
> I think if you saw that in an example then it was probably for
> demonstration purposes.
> 
> Kevin, can you expand on this comment:
> 
> > the format you suggest would lead to the same inconsistencies as in 
> > GraphSON 1.0.
> > Since the type is at the same level than the data itself, whether the 
> > container is an Array or an Object
> > https://github.com/apache/tinkerpop/pull/351#issuecomment-231351653
> 
> What exactly are the inconsistencies?  What is the problem in
> determining an array or object?
> This is a natural JSON array (or list): []
> This is a natural JSON object: {}
> 
> Type at the object level is a common pattern and supported feature of
> Jackson.  Also, GeoJSON would be a natural fit as it also stores
> 'type' at the object level. Titan supports GeoJSON currently.  I
> wonder if it would make sense to promote geometry to gremlin.
> 

I probably wasn't clear enough. In my first email explaining my motivation for 
improving GraphSON 1.0, one of the things I noted was that, depending on the 
enclosing element (either an Array or a Map), a type is described either as an 
element of the Array or as a key/value pair in the Map. You can see that in the 
"embedded types" example of the TinkerPop docs: 
http://tinkerpop.apache.org/docs/current/reference/#graphson-reader-writer . 

There you can see that the type "java.util.ArrayList" is a simple element of 
the enclosing array, but the "java.util.HashMap" type is a field of the 
enclosing Map, as in {"@class" : "java.util.HashMap", ...}. This does not seem 
consistent to me, and even though I know that Jackson handles it well, it seems 
we would do better to provide a consistent enclosing format that stays fixed 
whatever the enclosed data is, to make automatic type detection easier for 
parsers in other libraries/languages. Does that make sense?
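
To illustrate the difference from a parser's point of view, here is a rough Python sketch. The first reader handles the two GraphSON 1.0 embedded-type shapes described above; the second handles a single fixed envelope. The "@type" and "@value" field names and the "list"/"map" type names are placeholders I made up for the example, not a proposal for the final format.

```python
def read_embedded_v1(node):
    """GraphSON 1.0 style: where the type lives depends on the container."""
    if isinstance(node, list) and len(node) == 2 and isinstance(node[0], str):
        # Type is the first element of the array. Note this is a guess:
        # a plain two-element list starting with a string looks the same.
        return node[0], node[1]
    if isinstance(node, dict) and "@class" in node:
        # Type is a key sitting next to the data, so it must be filtered out.
        data = {k: v for k, v in node.items() if k != "@class"}
        return node["@class"], data
    return None, node


def read_wrapped(node):
    """Fixed envelope: one shape regardless of what is enclosed."""
    if isinstance(node, dict) and set(node) == {"@type", "@value"}:
        return node["@type"], node["@value"]
    return None, node


print(read_embedded_v1(["java.util.ArrayList", [1, 2]]))
print(read_embedded_v1({"@class": "java.util.HashMap", "a": 1}))
print(read_wrapped({"@type": "list", "@value": [1, 2]}))
print(read_wrapped({"@type": "map", "@value": {"a": 1}}))
```

With the fixed envelope the reader needs one check instead of one per container kind, and the data never has to be separated from the type marker.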

> We should probably start documenting a table of supported types. (If
> there is one, please provide link)
> 
> I wonder if it even makes sense to type numbers according to their
> memory model. As objects, Byte, Short, and Integer occupy the same
> space. Long isn't much more.  So in Java we're not saving much space.
> Jackson will attempt to parse in order: int, long, BigInt, BigDecimal.
> The JSON JSR uses only BigDecimal. Some non-jvm languages don't even
> have this concept.  Does anything in gremlin actually require this?
> I'm thinking that this is only going to be relevant at the domain
> model level. This way json native numbers can be used and not need
> typing.
> 
> Additionally, I think that all things that will be typed should always
> be typed. For the use cases of ingesting a saved graph from a file, it
> can probably be assumed that the top-level objects are vertices since
> the graph is vertex-centric and everything else follows naturally.
> I'm not entirely sure what is required for submitting traversals to
> gremlin server from GLV.  However, if this is used for the results
> from gremlin server then the results could start with any one of path,
> vertex, edge, property, vertex property, etc. So you'll need that type
> data there.
> 
> -- 
> Robert Dale
> 
> On Tue, Jul 12, 2016 at 8:35 AM, Marko Rodriguez  wrote:
> > Hi,
> >
> > I’m not following this PR too closely, so what I am saying might be 
> > already known/argued against/etc.
> >
> > 1. I think we should go with Robert Dale’s proposal of int32, 
> > int64, Vertex, uuid, etc. instead of Java class names.
> > 2. In Java we then have a Map for typecasting 
> > accordingly.
> > 3. This would make GraphSON 2.0 perfect for Bytecode serialization 
> > in TINKERPOP-1278.
> > 4. I think that if a Vertex, Edge, etc. doesn’t have properties, 
> > outV, etc. then don’t even have those fields in the representation.
> > 5. Most of the serialization back and forth will be ReferenceXXX 
> > elements and thus, don’t create more Maps/lists for no reason. — less 
> > chars.
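
As a rough picture of points 1 and 2 in the list above, a language-neutral type name can be resolved through a lookup table on the receiving side. The sketch below is a Python analogue, not the Java Map mentioned in point 2; the table contents and the `cast` helper are assumptions for illustration.

```python
import uuid

# Hypothetical table: language-neutral type name -> local constructor.
TYPE_CASTERS = {
    "int32": int,
    "int64": int,   # Python has one integer type, so both names map to int
    "uuid": uuid.UUID,
}


def cast(type_name, raw):
    """Turn a raw JSON value into the local type registered for type_name."""
    return TYPE_CASTERS[type_name](raw)


print(cast("int64", "42"))
print(cast("uuid", "12345678-1234-5678-1234-567812345678"))
```

Each language keeps its own table, so the wire format never needs to mention a Java class name.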
> >
> > For me, my interest in this work is all about a language-agnostic way of 
> > sending Gremlin traversal bytecode between different languages. This work 
> > is exactly what I am looking for.
> >
> > Thanks,
> > Marko.
> >
> > http://markorodriguez.com
> >
> >
> >
> >> On Jul 9, 2016, at 9:48 AM, Stephen Mallette  wrote:
> >>
> >> With all the work on GLVs and the recent work on GraphSON 2.0, I think it's
> >> important that we have a solid, efficient, programming language neutral,
> >> lossless serialization format. Right now that format is GraphSON and it
> >> works for that purpose (ever more so with 2.0). Given some discussion on
> >> the GraphSON 2.0 PR driven a bit by Robert Dale:
> >>
> >> https://github.com/apache/tinkerpop/pull/351#issuecomment-231157389
> >>
> >> I wonder if we shouldn't consider another IO format that has Gremlin
> >> Server/GLVs in 

[jira] [Created] (TINKERPOP-1375) Possible ByteBuf leak for certain transactional scenarios

2016-07-14 Thread stephen mallette (JIRA)
stephen mallette created TINKERPOP-1375:
---

             Summary: Possible ByteBuf leak for certain transactional scenarios
                 Key: TINKERPOP-1375
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1375
             Project: TinkerPop
          Issue Type: Bug
          Components: server
    Affects Versions: 3.2.0-incubating
            Reporter: stephen mallette
            Assignee: stephen mallette


Not sure how to recreate this but certain transactional scenarios in sessions 
seem to generate a standard Netty "LEAK" log message. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)