Re: [VOTE] TinkerPop 3.4.9 Release
I found this issue while sanity testing: https://issues.apache.org/jira/browse/TINKERPOP-2489. It also exists in older releases (tested with 3.4.8), so I don't consider it a release blocker. Other than that, I ran some basic sanity checks of console -> server interaction, which looked good.

VOTE +1

--
Divij Vaidya

On Wed, Dec 9, 2020 at 9:02 AM (null) (null) wrote:

> VOTE +1
>
> Sent from my iPhone
>
> Cheers,
> Kelvin
[jira] [Created] (TINKERPOP-2489) Server doesn't start if folder has spaces
Divij Vaidya created TINKERPOP-2489:
---

Summary: Server doesn't start if folder has spaces
Key: TINKERPOP-2489
URL: https://issues.apache.org/jira/browse/TINKERPOP-2489
Project: TinkerPop
Issue Type: Bug
Components: server
Affects Versions: 3.4.8, 3.4.9
Reporter: Divij Vaidya

Repro steps:
 1. Download the server zip.
 2. Unzip the binary.
 3. Rename the unzipped folder and add a space, e.g.
{code:java}
apache-tinkerpop-gremlin-server-3.4.9 my{code}
 4. Start the server
{code:java}
./bin/gremlin-server.sh start{code}
 5. The server will fail to start (check status) with the error "Error: Could not find or load main class my.conf.log4j-server.properties"

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
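
One plausible reading of the error in step 5, offered as an assumption rather than a confirmed diagnosis: if the start script expands the install path without quotes, the shell splits it at the space, and the JVM takes the first argument that is not an option as the main class (printing it with slashes turned into dots). The sketch below imitates that splitting in plain Java. The install path, the `-Dlog4j.configuration` option and the exact command line are hypothetical stand-ins; the real script may differ.

```java
/** Illustrative only: imitates how an unquoted shell expansion splits a path on spaces. */
final class UnquotedPathSketch {

    /** Splits a command line on whitespace, as an unquoted variable expansion would. */
    static String[] wordSplit(String commandLine) {
        return commandLine.trim().split("\\s+");
    }

    /**
     * Returns the first argument after the launcher that does not start with "-".
     * Simplification: options that take a separate value (such as -cp) are not handled.
     */
    static String apparentMainClass(String[] argv) {
        for (int i = 1; i < argv.length; i++) {
            if (!argv[i].startsWith("-")) {
                return argv[i];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical install directory with a space, as in the repro steps.
        String home = "/opt/apache-tinkerpop-gremlin-server-3.4.9 my";
        // Hypothetical command line assembled without quoting the path.
        String cmd = "java -Dlog4j.configuration=" + home
                + "/conf/log4j-server.properties org.apache.tinkerpop.gremlin.server.GremlinServer";
        // The second half of the split path is what ends up looking like the main class.
        System.out.println(apparentMainClass(wordSplit(cmd)));
    }
}
```

If this is indeed the cause, quoting the variable expansions in the script would keep the path as a single argument.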
[jira] [Commented] (TINKERPOP-2487) Add steps to support basic analysis like standard deviation and percentile
[ https://issues.apache.org/jira/browse/TINKERPOP-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246736#comment-17246736 ]

Kelvin R. Lawrence commented on TINKERPOP-2487:
---

I have recently had two different users ask me if we have considered a `product` step also that would multiply all the values in the stream together. There is no easy workaround today outside of using lambdas/closures.

> Add steps to support basic analysis like standard deviation and percentile
> --
>
> Key: TINKERPOP-2487
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2487
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.4.8
> Reporter: Guo Junshi
> Priority: Minor
>
> When using tinkerpop Gremlin for real use cases, we found that some general
> analytical steps are very useful, yet not supported now. Some analytical
> steps are general enough to be part of the official gremlin package, e.g.
> steps to calculate standard deviation and percentile. The example usage might
> be:
>
> {code:java}
> gremlin> g.V().values('ages')
> ==>1
> ==>2
> ==>3
> gremlin> g.V().values('ages').stdev()
> ==>0.816
> gremlin> g.V().values('ages').fold().stdev(Scope.local)
> ==>0.816
> gremlin> g.V().values('ages').percentile(50)
> ==>2
> // one percentile, return single value
> gremlin> g.V().values('ages').percentile(0, 100)
> ==>[0: 1, 100: 3]
> // multiple percentiles, return a map{code}
> These steps are frequently used in our cases, and we think it would be great
> to support them in official versions.
[jira] [Commented] (TINKERPOP-2389) Authorization support in TinkerPop
[ https://issues.apache.org/jira/browse/TINKERPOP-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246702#comment-17246702 ]

ASF GitHub Bot commented on TINKERPOP-2389:
---

spmallette commented on pull request #1308:
URL: https://github.com/apache/tinkerpop/pull/1308#issuecomment-741922455

   Thanks for all the changes on this. I will give it another review in greater detail after 3.4.9 is officially released.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Authorization support in TinkerPop
> --
>
> Key: TINKERPOP-2389
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2389
> Project: TinkerPop
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.7
> Reporter: Shekhar Bansal
> Priority: Major
> Attachments: Screenshot 2020-06-25 at 15.15.04.png
>
>
> Use case:
> # TinkerPop supports multiple graphs using a single API, and an admin might want
> to restrict access to some of the graphs.
> # An admin might want to restrict read/write access to certain users.
>
> Proposal
> Add read/write access restrictions at graph level. We can extend it to
> executing scripts by adding execute privileges.
>
> Changes required
> Add an `authorizer` block similar to the `authentication` block in the yaml file
>
> {code:java}
> authorization: {
>   authorizer: org.apache.tinkerpop.gremlin.server.authorization.AllowAllAuthorizer,
>   authorizationHandler: org.apache.tinkerpop.gremlin.server.handler.SaslAuthorizationHandler,
>   config: {
>   }
> }{code}
>
> Authorization will be done only if authentication is enabled. Authentication
> is done on a per-session basis, while authorization will be done for each and
> every request.
> In `SaslAuthorizationHandler` or `HttpAuthorizationHandler` the query will be
> parsed and, depending on the step instructions, the query will be marked as of
> type read or write, and then privilege evaluation will be done by calling the
> `isAccessAllowed` method of `Authorizer`
> {code:java}
> public interface Authorizer {
>     /**
>      * Whether or not the authorization requires check.
>      * If false will not authorize user.
>      */
>     public boolean requireAuthorization();
>     /**
>      * Setup is called once upon system startup to initialize the {@code Authorizer}.
>      */
>     public void setup(final Map config);
>     /**
>      * A "standard" authorization implementation
>      */
>     public boolean isAccessAllowed(AuthorizationRequest authorizationRequest) throws AuthorizationException;
> }
> {code}
> Access policies can be defined in tools like `Apache Ranger`, sample policy:
> !Screenshot 2020-06-25 at 15.15.04.png|width=1017,height=548!
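
To make the proposed `Authorizer` contract more concrete, here is a minimal sketch of one possible implementation: everyone may read, and only users listed in the config may write. It is not the code from the pull request. The interface is copied locally (with a typed `Map`) so the sketch compiles on its own, and `AuthorizationRequest`, `AuthorizationException`, the `writers` config key and `ReadMostlyAuthorizer` are all hypothetical stand-ins, since the ticket does not show them.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Local copy of the interface proposed in the ticket, so this sketch is self-contained. */
interface Authorizer {
    boolean requireAuthorization();
    void setup(Map<String, Object> config);
    boolean isAccessAllowed(AuthorizationRequest request) throws AuthorizationException;
}

/** Hypothetical request shape: who is asking, for which graph, and whether the query writes. */
final class AuthorizationRequest {
    final String user;
    final String graph;
    final boolean write;

    AuthorizationRequest(String user, String graph, boolean write) {
        this.user = user;
        this.graph = graph;
        this.write = write;
    }
}

/** Hypothetical stand-in for the exception type named in the ticket. */
class AuthorizationException extends Exception {
    AuthorizationException(String message) {
        super(message);
    }
}

/** Example policy: reads are always allowed, writes only for configured users. */
final class ReadMostlyAuthorizer implements Authorizer {
    private final Set<String> writers = new HashSet<>();

    @Override
    public boolean requireAuthorization() {
        return true; // every request is checked
    }

    @Override
    public void setup(Map<String, Object> config) {
        // "writers" is an assumed config key holding a collection of user names.
        Object value = config.get("writers");
        if (value instanceof Collection) {
            for (Object name : (Collection<?>) value) {
                writers.add(String.valueOf(name));
            }
        }
    }

    @Override
    public boolean isAccessAllowed(AuthorizationRequest request) {
        return !request.write || writers.contains(request.user);
    }
}
```

A handler would call `isAccessAllowed` once per request, after classifying the query as a read or a write as the description outlines.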
[jira] [Commented] (TINKERPOP-2389) Authorization support in TinkerPop
[ https://issues.apache.org/jira/browse/TINKERPOP-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17246700#comment-17246700 ]

ASF GitHub Bot commented on TINKERPOP-2389:
---

spmallette commented on a change in pull request #1308:
URL: https://github.com/apache/tinkerpop/pull/1308#discussion_r539499923

## File path: gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/handler/HttpBasicAuthorizationHandler.java

## @@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.tinkerpop.gremlin.server.handler;
+
+import io.netty.channel.ChannelFutureListener;
+import io.netty.channel.ChannelHandler;
+import io.netty.channel.ChannelHandlerContext;
+import io.netty.channel.ChannelInboundHandlerAdapter;
+import io.netty.handler.codec.http.DefaultFullHttpResponse;
+import io.netty.handler.codec.http.FullHttpMessage;
+import io.netty.handler.codec.http.FullHttpRequest;
+import io.netty.handler.codec.http.HttpResponseStatus;
+import io.netty.util.ReferenceCountUtil;
+import org.apache.tinkerpop.gremlin.driver.Tokens;
+import org.apache.tinkerpop.gremlin.driver.message.RequestMessage;
+import org.apache.tinkerpop.gremlin.server.GremlinServer;
+import org.apache.tinkerpop.gremlin.server.auth.AuthenticatedUser;
+import org.apache.tinkerpop.gremlin.server.authz.AuthorizationException;
+import org.apache.tinkerpop.gremlin.server.authz.Authorizer;
+import org.javatuples.Quartet;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Map;
+
+import static io.netty.handler.codec.http.HttpResponseStatus.BAD_REQUEST;
+import static io.netty.handler.codec.http.HttpResponseStatus.INTERNAL_SERVER_ERROR;
+import static io.netty.handler.codec.http.HttpResponseStatus.UNAUTHORIZED;
+import static io.netty.handler.codec.http.HttpVersion.HTTP_1_1;
+
+
+/**
+ * An authorization handler for the http channel that allows the {@link Authorizer} to be plugged into it.
+ *
+ * @author Marc de Lignie
+ */
+@ChannelHandler.Sharable
+public class HttpBasicAuthorizationHandler extends ChannelInboundHandlerAdapter {
+    private static final Logger logger = LoggerFactory.getLogger(HttpBasicAuthorizationHandler.class);
+    private static final Logger auditLogger = LoggerFactory.getLogger(GremlinServer.AUDIT_LOGGER_NAME);
+
+    private AuthenticatedUser user;
+    private final Authorizer authorizer;
+
+    public HttpBasicAuthorizationHandler(Authorizer authorizer) {
+        this.authorizer = authorizer;
+    }
+
+    @Override
+    public void channelRead(final ChannelHandlerContext ctx, final Object msg) {
+        if (msg instanceof FullHttpMessage){
+            final FullHttpMessage request = (FullHttpMessage) msg;
+            try {
+                user = ctx.channel().attr(StateKey.AUTHENTICATED_USER).get();
+                if (null == user) {    // This is expected when using the AllowAllAuthenticator
+                    user = AuthenticatedUser.ANONYMOUS_USER;
+                }
+                // ToDo: move getRequestArguments to a new preceding pipeline step in the Channelizer, but @Stephen,
+                //   how about the sendAndCleanupConnection logic in HttpGremlinEndpointHandler?

Review comment:
   As they are all static methods I think you could refactor to create a small final utility class to house them - `HttpUtil` or something like that?
Re: [VOTE] TinkerPop 3.4.9 Release
VOTE +1

Sent from my iPhone

Cheers,
Kelvin

> On Dec 9, 2020, at 10:06 AM, f...@florian-hockmann.de wrote:
>
> VOTE +1
AW: [VOTE] TinkerPop 3.4.9 Release
VOTE +1

-----Original Message-----
From: Jorge Bay Gondra
Sent: Wednesday, December 9, 2020 15:23
To: dev@tinkerpop.apache.org
Subject: Re: [VOTE] TinkerPop 3.4.9 Release

VOTE +1
Re: [VOTE] TinkerPop 3.4.9 Release
VOTE +1

On Mon, Dec 7, 2020 at 8:05 PM Stephen Mallette wrote:

> Hello,
>
> We are happy to announce that TinkerPop 3.4.9 is ready for release.
>
> The release artifacts can be found at this location:
>         https://dist.apache.org/repos/dist/dev/tinkerpop/3.4.9/
>
> The source distribution is provided by:
>         apache-tinkerpop-3.4.9-src.zip
>
> Two binary distributions are provided for user convenience:
>         apache-tinkerpop-gremlin-console-3.4.9-bin.zip
>         apache-tinkerpop-gremlin-server-3.4.9-bin.zip
>
> The GPG key used to sign the release artifacts is available at:
>         https://dist.apache.org/repos/dist/dev/tinkerpop/KEYS
>
> The online docs can be found here:
>         https://tinkerpop.apache.org/docs/3.4.9/ (user docs)
>         https://tinkerpop.apache.org/docs/3.4.9/upgrade/ (upgrade docs)
>         https://tinkerpop.apache.org/javadocs/3.4.9/core/ (core javadoc)
>         https://tinkerpop.apache.org/javadocs/3.4.9/full/ (full javadoc)
>         https://tinkerpop.apache.org/dotnetdocs/3.4.9/ (.NET API docs)
>         https://tinkerpop.apache.org/jsdocs/3.4.9/ (Javascript API docs)
>
> The tag in Apache Git can be found here:
>         https://github.com/apache/tinkerpop/tree/3.4.9
>
> The release notes are available here:
>         https://github.com/apache/tinkerpop/blob/3.4.9/CHANGELOG.asciidoc
>
> The [VOTE] will be open for the next 72 hours --- closing Thursday
> (December 10, 2020) at 2pm EST.
>
> My vote is +1.
>
> Thank you very much,
>
> Stephen
>
Re: [DISCUSS] Creating pattern steps to codify best practices
Josh, thanks for your thoughts - some responses inline:

On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman wrote:

> I'll offer some thoughts. I'm seeing upsertV() as an idempotent getOrCreate
> call which always returns a vertex with the label/property values specified
> within the step. It's sort of a declarative pattern: "return this vertex to
> me, find it if you can, create it if you must."
>

I like this description - I've added it to the gist, though it's a bit at odds with Dave's previous post, so we'll consider it a temporary addition until he responds.

> On that account, I do like the simplification in 1. Repetition shouldn't be
> necessary. In an ideal world, the engine should know the primary
> identifiers (name or id) and find/create the vertex based on them. Any
> other included values will be "trued up" as well. But this may be a bridge
> too far for TinkerPop since knowing identifiers may require a specified
> schema. I'd prefer to omit the third input, but it might be necessary to
> keep it so that the second input can be for the matching use case.
>

In my most recent post on gremlin-users I think I came up with a nice way to get rid of the second Map. One Map that forms the full list of properties for upserting is easier than partitioning two Maps that essentially merge together. I imagine it's unlikely that application code will have that separation naturally, so users would have the added step of trying to separate their data into searchable vs "just data". Getting us to one Map argument will simplify APIs for us and reduce complexity for users. Here is what I'd proposed for those not following over there:

// match on name and age (or perhaps whatever the underlying graph system thinks is best?)
g.upsertV('person', [name:'marko',age:29])

// match on name only
g.upsertV('person', [name:'marko',age:29]).by('name')

// explicitly match on name and age
g.upsertV('person', [name:'marko',age:29]).
  by('name').by('age')

// match on id only
g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)

// match on whatever the by(Traversal) predicate defines
g.upsertV('person', [name:'marko',age:29]).
  by(has('name', 'marko'))

// match on id, then update age
g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
  property('age',29)

With this model, we get one Map argument that represents the complete property set to be added/updated to the graph, and the user can hint at which key they wish to match on using by(), where that sort of step modulation should be a well understood and familiar concept in Gremlin at this point.

> So that means I think 2 should always match or update the additional
> values. Again, we're specifying the expected result and letting the engine
> figure out best how to return that results and appropriately maintain
> state.
>

I again like this description, but we'll see what Dave's thoughts are since he's a bit behind on the threads at this point I think.

> I'm also presuming that anything not included as inputs to the upsertV()
> step are then to be handled by following steps. I'm hoping that is a
> sufficient approach for addressing the multi/meta property use cases
> brought up in 3.
>

yeah, it needs more thought. I spent more time thinking on this issue yesterday than I have for all the previous posts combined and I think it yielded something good in that revised syntax. It's going to take more of that kind of elbow grease to dig into these lesser use cases to make sure we aren't coding ourselves into corners.

> I do like the idea of using modulators (with(), by()) for more
> sophisticated usage and advanced use cases. Also, the streaming examples
> are quite elegant allowing for a helpful separation of data and logic.
>

cool - hope you like the revised syntax I posted then. :)

> That's my humble take. This is a very welcome addition to the language and
> I appreciate the thoughtful & collaborative approach to the design
> considerations.
>

Thanks again and please keep the thoughts coming. Lots of other interesting design discussions seem to be brewing.

>
> Josh
>
> On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette wrote:
>
> > I expanded this discussion to gremlin-users for a wider audience
> > and the thread is starting to grow:
> >
> > https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
> >
> > I guess we'll need to summarize that discussion back here now
> >
> > I did have some more thoughts to hang out there and figured that I wouldn't
> > convolute the discussion on gremlin-users with it so I will continue the
> > discussion here.
> >
> > 1, The very first couple of examples seem wrong (or at least not best
> > demonstrating the usage):
> >
> > g.upsertV('person', [name: 'marko'],
> >                     [name: 'marko', age: 29])
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, name: 'Marko'])
> >
> > should instead be:
> >
> > g.upsertV('person', [name: 'marko'],
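
The "find it if you can, create it if you must" semantics discussed in this thread can be sketched outside of Gremlin with a toy in-memory store. This is only an illustration of the single-Map idea with optional match keys (the role `by()` plays in the proposed syntax), not how TinkerPop would implement it. `UpsertSketch`, `Vertex` and the `matchKeys` parameter are hypothetical names, and treating "no keys given" as "match on every supplied property" is just one of the options floated above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy in-memory illustration of get-or-create with one property Map and optional match keys. */
final class UpsertSketch {

    /** Hypothetical vertex: a label plus a mutable property map. */
    static final class Vertex {
        final String label;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Vertex(String label) {
            this.label = label;
        }
    }

    private final List<Vertex> vertices = new ArrayList<>();

    /**
     * Finds a vertex with the label whose values for the match keys equal those in
     * {@code properties}; creates one if none is found. Either way, all supplied
     * properties are then written, so repeated calls with the same input are idempotent.
     */
    Vertex upsertV(String label, Map<String, Object> properties, String... matchKeys) {
        // Assumption: with no explicit keys, match on every supplied property.
        Collection<String> keys = matchKeys.length == 0 ? properties.keySet() : Arrays.asList(matchKeys);
        Vertex found = null;
        for (Vertex v : vertices) {
            if (v.label.equals(label) && matches(v, properties, keys)) {
                found = v;
                break;
            }
        }
        if (found == null) {
            found = new Vertex(label);
            vertices.add(found);
        }
        found.properties.putAll(properties);
        return found;
    }

    private static boolean matches(Vertex v, Map<String, Object> properties, Collection<String> keys) {
        for (String key : keys) {
            if (!v.properties.containsKey(key)
                    || !Objects.equals(v.properties.get(key), properties.get(key))) {
                return false;
            }
        }
        return true;
    }

    int size() {
        return vertices.size();
    }
}
```

Calling `upsertV("person", Map.of("name", "marko", "age", 29), "name")` twice with different ages leaves a single vertex whose age is the later value, which mirrors the "match on name only" example.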
[jira] [Updated] (TINKERPOP-2487) Add steps to support basic analysis like standard deviation and percentile
[ https://issues.apache.org/jira/browse/TINKERPOP-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Mallette updated TINKERPOP-2487:

    Affects Version/s: 3.4.8

> Add steps to support basic analysis like standard deviation and percentile
> --
>
> Key: TINKERPOP-2487
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2487
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.4.8
> Reporter: Guo Junshi
> Priority: Minor
Re: [New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)
Thanks for posting. In the math department, I think that these two steps are commonly asked for, and I think we have reached a point where the things folks are doing with Gremlin require steps of greater specificity, so this conversation is definitely expected.

We currently have two sorts of steps for operating on numbers: reducing steps like sum() and then the math() step for expressions. It's interesting what you can accomplish with those two steps - note here how Kelvin manages standard deviation without lambdas:

g.V().hasLabel('airport').
      values('runways').fold().as('runways').
      mean(local).as('mean').
      select('runways').unfold().
      math('(_-mean)^2').mean().math('sqrt(_)')

https://kelvinlawrence.net/book/Gremlin-Graph-Guide.html#stddevone

In any case, we can see that there is a fair bit of indirection there to do the work of a simple stdev() step. I've often wondered if math() could behave both in the way it does now and as a form of reducing step. In that way we could quietly add new math functions without forming new steps, as I can't help imagining that the addition of stdev() and percentile() will then be followed by: variance(), covariance(), confidence() and so on. Kelvin recently asked me about mult() for use cases that he sees from time to time.

As it stands, our math expression library exp4j:

https://www.objecthunter.net/exp4j/

is good at extensibility but isn't really formed well out of the box to handle reducing operations, because its architecture forces you to specify the number of arguments it will take up front and those arguments must be double:

https://www.objecthunter.net/exp4j/#Custom_functions

So, that would be an issue to contend with, but technical issues aside and focusing instead on the user angle, would math() that worked as follows be a good path?

gremlin> g.V().values('ages').fold().math(local, "stdev(_)")
==>0.816
gremlin> g.inject([1,2,3]).math(local, "product(_)")
==>6

And then, what distinction would there be between a math() step and first class "math steps" like sum(), min(), max(), and mean()? In other words, why would those exist if math() could already do it all? What makes a math operation "common" enough to beget its own first class representation?

Just to be clear, I'm not saying we shouldn't add stdev()/percentile() - I just want to consider all the design possibilities and talk them through. Thanks again for bringing up this conversation. I will link this thread to your JIRA for reference.

On Wed, Dec 9, 2020 at 6:40 AM js guo wrote:

> Hi team,
>
> We are using tinkerpop Gremlin in our risk detection cases. Some analytical
> calculations are used frequently, yet there is no corresponding steps in
> hand.
>
> I am thinking that some general analytical steps can be added in Gremlin.
> e.g. steps to calculate standard deviation and percentile. The example
> usage might be as follows.
>
> gremlin> g.V().values('ages')
> ==>1
> ==>2
> ==>3
> gremlin> g.V().values('ages').stdev()
> ==>0.816
> gremlin> g.V().values('ages').fold().stdev(Scope.local)
> ==>0.816
>
> gremlin> g.V().values('ages').percentile(50)
> ==>2
> // one percentile, return single value
> gremlin> g.V().values('ages').percentile(0, 100)
> ==>[0: 1, 100: 3]
> // multiple percentiles, return a map
>
>
> Sorry for not emailing earlier, I have created a JIRA ticket for this
> https://issues.apache.org/jira/browse/TINKERPOP-2487.
>
> As new steps are already used in our cases, we are glad to offer the
> implementation for review, if you think it good to add the two steps.
>
> Regards,
> Junshi
>
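
For reference, the two reducing functions floated above are easy to state in plain Java. This is a sketch of the arithmetic only, not of how math() or exp4j would host such functions. `ReducingMathSketch` and its method names are hypothetical. The 0.816 result for [1, 2, 3] implies the population form (divide by n), which is also what the mean-of-squared-deviations traversal above computes.

```java
/** Plain-Java sketch of two reducing functions over a list of numbers. */
final class ReducingMathSketch {

    /** Population standard deviation: sqrt of the mean squared deviation from the mean. */
    static double stdev(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("stdev of an empty list is undefined");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;

        double squaredDeviations = 0.0;
        for (double v : values) {
            squaredDeviations += (v - mean) * (v - mean);
        }
        return Math.sqrt(squaredDeviations / values.length);
    }

    /** Product of all values; an empty list yields the multiplicative identity 1. */
    static double product(double[] values) {
        double result = 1.0;
        for (double v : values) {
            result *= v;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] ages = {1, 2, 3};
        System.out.printf("stdev=%.3f product=%.0f%n", stdev(ages), product(ages));
    }
}
```

Both take a whole array rather than a fixed argument count, which is exactly the shape that the custom-function constraint mentioned above makes awkward.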
[New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)
Hi team,

We are using TinkerPop Gremlin in our risk detection cases. Some analytical calculations are used frequently, yet there are no corresponding steps at hand.

I am thinking that some general analytical steps could be added to Gremlin, e.g. steps to calculate standard deviation and percentile. The example usage might be as follows.

gremlin> g.V().values('ages')
==>1
==>2
==>3
gremlin> g.V().values('ages').stdev()
==>0.816
gremlin> g.V().values('ages').fold().stdev(Scope.local)
==>0.816

gremlin> g.V().values('ages').percentile(50)
==>2
// one percentile, return single value
gremlin> g.V().values('ages').percentile(0, 100)
==>[0: 1, 100: 3]
// multiple percentiles, return a map

Sorry for not emailing earlier; I have created a JIRA ticket for this: https://issues.apache.org/jira/browse/TINKERPOP-2487.

As the new steps are already used in our cases, we are glad to offer the implementation for review, if you think it good to add the two steps.

Regards,
Junshi
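
The proposal does not say which percentile definition is intended. As a minimal sketch, assuming the nearest-rank method (which reproduces the 1, 2 and 3 shown above for the 0th, 50th and 100th percentiles), the calculation could look like this in plain Java. `PercentileSketch` and its method names are hypothetical, and an interpolating definition would give different results on other inputs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Nearest-rank percentile sketch; the definition is an assumption, not taken from the proposal. */
final class PercentileSketch {

    /** Returns the value at rank ceil(p/100 * n) in sorted order, clamped to the first element. */
    static double percentile(double[] values, double p) {
        if (values.length == 0) {
            throw new IllegalArgumentException("percentile of an empty list is undefined");
        }
        if (p < 0 || p > 100) {
            throw new IllegalArgumentException("p must be between 0 and 100");
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Several percentiles at once, returned as a map in the order requested. */
    static Map<Double, Double> percentiles(double[] values, double... ps) {
        Map<Double, Double> result = new LinkedHashMap<>();
        for (double p : ps) {
            result.put(p, percentile(values, p));
        }
        return result;
    }

    public static void main(String[] args) {
        double[] ages = {1, 2, 3};
        System.out.println(percentile(ages, 50));
        System.out.println(percentiles(ages, 0, 100));
    }
}
```

The single-value and map-returning variants mirror the two call shapes in the example usage.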