[jira] [Updated] (TINKERPOP-1685) Introduce optional feature to allow for upserts without read-before-write

2017-06-05 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated TINKERPOP-1685:

Description: 
Currently TINKERPOP-479 is being considered to do some sort of {{getOrCreate}}
functionality.  However, for some data stores such as Cassandra, this still
falls short of upserts.  As I understand it, {{getOrCreate}} still has to do a
read-before-write.  In cases where the user can guarantee that upserts are
going to be idempotent, skipping the read gives a significant performance
improvement and avoids a risk (the race condition with multi-threaded
read-before-write).  Additionally, with some data stores such as Apache
Cassandra, the natural way to update data is with an upsert.

This ticket is to consider adding an optional feature that makes {{addV}} and
{{addE}} behave as upserts.  It could be called something like
{{upsert_on_add}}.  This configuration would default to false so that it
doesn't break anyone currently relying on errors when adding the same vertex or
edge.  If enabled, however, an add would either create the vertex or edge or
modify the data on the existing one.

If overriding the existing {{addV}} and {{addE}} operations with this optional
feature is undesirable, then perhaps new operators could be added, like
{{upsertV}} and {{upsertE}} or {{putV}} and {{putE}}, which could be used to
both add and update the data.  Allowing them to insert data is important
because otherwise you are left with having to read-before-write, which incurs
the performance cost and race condition risk.  A benefit of a separate operator
is that you could mix upsert behavior and non-upsert add behavior in a single
graph.  I'm not sure there is a huge need to use both in a single graph, but it
is a difference between the two strategies.
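
To make the difference between the two write paths concrete, here is a minimal sketch in plain Java.  It is not TinkerPop code: the in-memory {{store}}, the class name and both method names are hypothetical stand-ins for a backing data store.  The first method checks and then writes in two separate steps, which is where the race lives; the second issues a single write that either creates or modifies the entry.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class UpsertSketch {
    // Hypothetical stand-in for a data store keyed by vertex id.
    static final Map<String, Map<String, Object>> store = new ConcurrentHashMap<>();

    // Read-before-write: the check and the write are two separate steps,
    // so two threads can both observe "missing" and both try to create.
    static boolean addWithReadBeforeWrite(String id, Map<String, ?> props) {
        if (store.containsKey(id)) {                        // read
            return false;                                   // caller must handle "already exists"
        }
        store.put(id, new HashMap<String, Object>(props));  // write
        return true;
    }

    // Upsert: one atomic write that creates the entry or merges the new
    // properties into the existing one. No prior read is issued by the caller.
    static void upsert(String id, Map<String, ?> props) {
        store.merge(id, new HashMap<String, Object>(props), (existing, incoming) -> {
            Map<String, Object> merged = new HashMap<>(existing);
            merged.putAll(incoming);
            return merged;
        });
    }

    public static void main(String[] args) {
        upsert("v1", Collections.singletonMap("name", "marko"));
        upsert("v1", Collections.singletonMap("name", "marko")); // repeating it changes nothing
        upsert("v1", Collections.singletonMap("age", 29));       // modifies the existing entry
        System.out.println(store.get("v1"));
    }
}
```

Repeating the same {{upsert}} call leaves the store unchanged, which is the idempotency the ticket relies on.  Whether a real provider can offer this as one write depends on the data store.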

  was:
Currently TINKERPOP-479 is being considered to do some sort of {{getOrCreate}} 
functionality.  However for some data stores such as Cassandra, this is still 
short of upserts.  As I understand it, {{getOrCreate}} still has to do a 
read-before-write.  In cases where the user can guarantee that upserts are 
going to be idempotent, there is a significant performance improvement and risk 
avoidance (race condition with multi-threaded read-before-write).  Additionally 
with some data stores such as Apache Cassandra, the natural way to update data 
is with an upsert.

This ticket is to consider adding an additional optional feature to support 
upserts by default on {{addV}} and {{addE}}.  This configuration would default 
to false so that it doesn't break anyone currently relying on errors when 
adding the same vertex or edge.  However if enabled, it would just add or 
modify data on the existing vertex or edge.

If overriding the existing {{addV}} and {{addE}} operations with this optional 
feature is undesirable, then perhaps new operators could be added like 
{{upsertV}} and {{upsertE}} or {{putV}} and {{putE}} and those could be used to 
both add and update the data.  Allowing it to insert data is important because 
otherwise you are left with having to read-before-write which incurs the 
performance cost and race condition risk.  A benefit of a separate operator is 
that you could mix upsert behavior and non-upsert add behavior in a single 
graph.  I'm not sure there is a huge need to use both in a single graph, but it 
is a difference between the two strategies.


> Introduce optional feature to allow for upserts without read-before-write
> -------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1685
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1685
>             Project: TinkerPop
>          Issue Type: Wish
>            Reporter: Jeremy Hanna

[jira] [Updated] (TINKERPOP-1685) Introduce optional feature to allow for upserts without read-before-write

2017-06-05 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated TINKERPOP-1685:

Description: 
Currently TINKERPOP-479 is being considered to do some sort of {{getOrCreate}} 
functionality.  However for some data stores such as Cassandra, this is still 
short of upserts.  As I understand it, {{getOrCreate}} still has to do a 
read-before-write.  In cases where the user can guarantee that upserts are 
going to be idempotent, there is a significant performance improvement and risk 
avoidance (race condition with multi-threaded read-before-write).  Additionally 
with some data stores such as Apache Cassandra, the natural way to update data 
is with an upsert.

This ticket is to consider adding an additional optional feature to support 
upserts by default on {{addV}} and {{addE}}.  This configuration would default 
to false so that it doesn't break anyone currently relying on errors when 
adding the same vertex or edge.  However if enabled, it would just add or 
modify data on the existing vertex or edge.

If overriding the existing {{addV}} and {{addE}} operations with this optional 
feature is undesirable, then perhaps new operators could be added like 
{{upsertV}} and {{upsertE}} or {{putV}} and {{putE}} and those could be used to 
both add and update the data.  Allowing it to insert data is important because 
otherwise you are left with having to read-before-write which incurs the 
performance cost and race condition risk.  A benefit of a separate operator is 
that you could mix upsert behavior and non-upsert add behavior in a single 
graph.  I'm not sure there is a huge need to use both in a single graph, but it 
is a difference between the two strategies.

  was:
Currently TINKERPOP-479 is being considered to do some sort of {{getOrCreate}} 
functionality.  However for some data stores such as Cassandra, this is still 
short of upserts.  As I understand it, {{getOrCreate}} still has to do a 
read-before-write.  In cases where the user can guarantee that upserts are 
going to be idempotent, there is a significant performance improvement and risk 
avoidance (race condition with multi-threaded read-before-write).  Additionally 
with some data stores such as Apache Cassandra, the natural way to update data 
is with an upsert.

This ticket is to consider adding an additional optional feature to support 
upserts by default on {{addV}} and {{addE}}.  This configuration would default 
to false so that it doesn't break anyone currently relying on errors when 
adding the same vertex or edge.  However if enabled, it would just add or 
modify data on the existing vertex or edge.

If overriding the existing {{addV}} and {{addE}} operations with this optional 
feature is undesirable, then perhaps new operators could be added like 
{{upsertV}} and {{upsertE}} or {{putV}} and {{putE}} and those could be used to 
both add and update the data.  Allowing it to insert data is important because 
otherwise you are left with having to read-before-write which incurs the 
performance cost and race condition risk.


> Introduce optional feature to allow for upserts without read-before-write
> -------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1685
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1685
>             Project: TinkerPop
>          Issue Type: Wish
>            Reporter: Jeremy Hanna

[jira] [Created] (TINKERPOP-1685) Introduce optional feature to allow for upserts without read-before-write

2017-06-05 Thread Jeremy Hanna (JIRA)
Jeremy Hanna created TINKERPOP-1685:
---------------------------------------

             Summary: Introduce optional feature to allow for upserts without read-before-write
                 Key: TINKERPOP-1685
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1685
             Project: TinkerPop
          Issue Type: Wish
            Reporter: Jeremy Hanna


Currently TINKERPOP-479 is being considered to do some sort of {{getOrCreate}} 
functionality.  However for some data stores such as Cassandra, this is still 
short of upserts.  As I understand it, {{getOrCreate}} still has to do a 
read-before-write.  In cases where the user can guarantee that upserts are 
going to be idempotent, there is a significant performance improvement and risk 
avoidance (race condition with multi-threaded read-before-write).  Additionally 
with some data stores such as Apache Cassandra, the natural way to update data 
is with an upsert.

This ticket is to consider adding an additional optional feature to support 
upserts by default on add.  This configuration would default to false so that 
it doesn't break anyone currently relying on errors when adding the same vertex 
or edge.  However if enabled, it would just add or modify data on the existing 
vertex or edge.

If overriding the existing {{addV}} and {{addE}} operations with this optional 
feature is undesirable, then perhaps new operators could be added like 
{{upsertV}} and {{upsertE}} or {{putV}} and {{putE}} and those could be used to 
both add and update the data.  Allowing it to insert data is important because 
otherwise you are left with having to read-before-write which incurs the 
performance cost and race condition risk.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (TINKERPOP-1685) Introduce optional feature to allow for upserts without read-before-write

2017-06-05 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated TINKERPOP-1685:

Description: 
Currently TINKERPOP-479 is being considered to do some sort of {{getOrCreate}} 
functionality.  However for some data stores such as Cassandra, this is still 
short of upserts.  As I understand it, {{getOrCreate}} still has to do a 
read-before-write.  In cases where the user can guarantee that upserts are 
going to be idempotent, there is a significant performance improvement and risk 
avoidance (race condition with multi-threaded read-before-write).  Additionally 
with some data stores such as Apache Cassandra, the natural way to update data 
is with an upsert.

This ticket is to consider adding an additional optional feature to support 
upserts by default on {{addV}} and {{addE}}.  This configuration would default 
to false so that it doesn't break anyone currently relying on errors when 
adding the same vertex or edge.  However if enabled, it would just add or 
modify data on the existing vertex or edge.

If overriding the existing {{addV}} and {{addE}} operations with this optional 
feature is undesirable, then perhaps new operators could be added like 
{{upsertV}} and {{upsertE}} or {{putV}} and {{putE}} and those could be used to 
both add and update the data.  Allowing it to insert data is important because 
otherwise you are left with having to read-before-write which incurs the 
performance cost and race condition risk.

  was:
Currently TINKERPOP-479 is being considered to do some sort of {{getOrCreate}} 
functionality.  However for some data stores such as Cassandra, this is still 
short of upserts.  As I understand it, {{getOrCreate}} still has to do a 
read-before-write.  In cases where the user can guarantee that upserts are 
going to be idempotent, there is a significant performance improvement and risk 
avoidance (race condition with multi-threaded read-before-write).  Additionally 
with some data stores such as Apache Cassandra, the natural way to update data 
is with an upsert.

This ticket is to consider adding an additional optional feature to support 
upserts by default on add.  This configuration would default to false so that 
it doesn't break anyone currently relying on errors when adding the same vertex 
or edge.  However if enabled, it would just add or modify data on the existing 
vertex or edge.

If overriding the existing {{addV}} and {{addE}} operations with this optional 
feature is undesirable, then perhaps new operators could be added like 
{{upsertV}} and {{upsertE}} or {{putV}} and {{putE}} and those could be used to 
both add and update the data.  Allowing it to insert data is important because 
otherwise you are left with having to read-before-write which incurs the 
performance cost and race condition risk.


> Introduce optional feature to allow for upserts without read-before-write
> -------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1685
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1685
>             Project: TinkerPop
>          Issue Type: Wish
>            Reporter: Jeremy Hanna





Re: code freeze 3.2.5

2017-06-05 Thread Stephen Mallette
I published the latest docs for 3.2.5-SNAPSHOT:

http://tinkerpop.apache.org/docs/3.2.5-SNAPSHOT/

and made another deployment to the Apache Snapshot Repo after those
TinkerFactory adjustments.

On Fri, Jun 2, 2017 at 8:39 PM, Stephen Mallette wrote:

> Just a reminder that code is frozen on the tp32 branch starting tomorrow
> (Saturday) and for the following week. We'll use this thread to discuss any
> issues or problems on 3.2.5 that are found during testing. There are no
> open pull requests and no outstanding issues that I'm aware of. I've
> published a TinkerPop 3.2.5-SNAPSHOT for providers to test against (or they
> may build themselves - whatever is more convenient).
>
> Thanks,
>
> Stephen
>


Re: Modern Graph

2017-06-05 Thread Stephen Mallette
It took a few more minor tweaks, but I think I finally have TinkerFactory
"right". Here's the final look (I hope):

https://github.com/apache/tinkerpop/blob/57e7f702e44356b72ffa7d1d11d526ac5e768812/tinkergraph-gremlin/src/main/java/org/apache/tinkerpop/gremlin/tinkergraph/structure/TinkerFactory.java

I ended up reverting createClassic() to how it was working before - the change
did some weird stuff to some of the integration tests and I wasn't sure why,
so it seemed best to keep it as it was. That method was already "mostly"
right - I'm just not sure why it didn't use a long IdManager for vertex
properties, but that's probably fine.
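
For anyone wondering why g.V(1) and g.V(1L) can behave differently, here is a minimal sketch in plain Java. It is not the TinkerGraph implementation: the class, the map-based index and the toLongId() helper are hypothetical, and the helper only stands in for the kind of coercion a long-based IdManager would apply. The point is that an Integer key and a Long key with the same numeric value are not equal, so a lookup only succeeds across both forms if ids are coerced to one type on the way in and on the way out.

```java
import java.util.HashMap;
import java.util.Map;

public class IdLookupSketch {
    // Hypothetical coercion, standing in for what a long-based id manager might do.
    static Object toLongId(Object id) {
        if (id instanceof Number) return ((Number) id).longValue();
        if (id instanceof String) return Long.parseLong((String) id);
        throw new IllegalArgumentException("Unsupported id type: " + id.getClass());
    }

    // Index that stores the id exactly as supplied (here an Integer).
    static Map<Object, String> rawIndex() {
        Map<Object, String> index = new HashMap<>();
        index.put(1, "v[1]");
        return index;
    }

    // Index that coerces every id to Long before storing it.
    static Map<Object, String> coercedIndex() {
        Map<Object, String> index = new HashMap<>();
        index.put(toLongId(1), "v[1]");
        return index;
    }

    public static void main(String[] args) {
        Map<Object, String> raw = rawIndex();
        System.out.println(raw.get(1));   // v[1]
        System.out.println(raw.get(1L));  // null, because Integer 1 is not equal to Long 1

        Map<Object, String> coerced = coercedIndex();
        System.out.println(coerced.get(toLongId(1)));   // v[1]
        System.out.println(coerced.get(toLongId(1L)));  // v[1]
    }
}
```

The coercion has to happen on both the write and the read path; applying it on only one side just moves the mismatch.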

On Fri, Jun 2, 2017 at 7:41 PM, Stephen Mallette wrote:

> This really surprised me:
>
> gremlin> graph = TinkerFactory.createModern()
> ==>tinkergraph[vertices:6 edges:6]
> gremlin> g = graph.traversal()
> ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
> gremlin> g.V(1)
> ==>v[1]
> gremlin> g.V(1L)
> gremlin>
>
> We all know why it does that - I'm just amazed that createModern() isn't
> rigged with an IdManager to handle it properly. createClassic() is just
> fine:
>
> gremlin> graph = TinkerFactory.createClassic()
> ==>tinkergraph[vertices:6 edges:6]
> gremlin> g = graph.traversal()
> ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
> gremlin> g.V(1)
> ==>v[1]
> gremlin> g.V(1L)
> ==>v[1]
>
> Going to push a fix shortly to tp31 on up (can't seem to think of a single
> reason it is set up this way - so weird). I'll then issue the code freeze
> email to follow.
>
>
>