Re: [New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)

2021-01-06 Thread Stephen Mallette
As there hasn't been any real objection to this direction, I've started code review on the PR: https://github.com/apache/tinkerpop/pull/1375
On Thu, Dec 24, 2020 at 1:25 AM js guo wrote:
> OK. I have updated the PR to target master branch.
> If you have time to review my code, please note the

Re: [New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)

2020-12-23 Thread js guo
OK. I have updated the PR to target master branch. If you have time to review my code, please note the `PercentileGlobalStep` implementation which holds intermediate results as instance variables to avoid extra object creation. If this works, I think we can apply similar behavior to

Re: [New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)

2020-12-21 Thread Stephen Mallette
After some more thought on this, I don't think we need to do a scoped math() step, but I think an improvement that allows both individual numbers as well as arrays of numbers as variables would suffice. Then math() could contain a full listing of all the possible math operations that can be

Re: [New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)

2020-12-17 Thread js guo
Thanks for the reply. We can definitely have a separate DISCUSS thread for the details of refactoring the reduction of number streams. As for my PR, I agree that new steps are better targeted at 3.5.0 than at minor fix versions. I will later apply the changes to the master branch and raise a new

Re: [New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)

2020-12-16 Thread Stephen Mallette
Some responses inline:
On Fri, Dec 11, 2020 at 12:53 AM js guo wrote:
> Thanks for the reply. It is a good idea to provide reducing operations
> through math() step. But from my understanding, we still need different
> reducing steps or at least different seed suppliers and reducing operators

Re: [New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)

2020-12-15 Thread js guo
Thanks. I think nPk and nCk calculations are different from the reducing barrier step candidates. For nPk and nCk type calculations, as long as `factorial` is supported by the `math` step, I think we can manage it by using `factorial`, `*` and `/`. But if `factorial` is supported in the `math` step, I

Re: [New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)

2020-12-14 Thread Kelvin Lawrence
I quite like the idea of adding some additional reducing barrier steps that make it easy to do simple statistical analysis on streams of numbers. Likely candidates are `stddev`, `variance` and `product`. It's not easy today to calculate the product of a stream in Gremlin using the `math` step.
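To illustrate what a `product` reducing step would compute, here is a minimal Python sketch outside of Gremlin. The function name `product` and the sample inputs are hypothetical and are not taken from the PR; the sketch only shows the seed-plus-binary-operator shape of such a reduction.

```python
from functools import reduce
import operator


def product(stream):
    """Multiply every number in an iterable, starting from the seed 1."""
    # The seed 1 is the multiplicative identity, so an empty stream yields 1.
    return reduce(operator.mul, stream, 1)


# The input can be any iterable, including a one-pass iterator.
result = product(iter([2, 3, 4]))
empty_result = product(iter([]))
```

Returning 1 for an empty stream is one possible convention; a real step might instead emit nothing for empty input.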

Re: [New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)

2020-12-10 Thread js guo
Thanks for the reply. It is a good idea to provide reducing operations through the math() step. But from my understanding, we still need different reducing steps, or at least different seed suppliers and reducing operators, in the back-end.
gremlin> g.V().values('age').fold().math(local, "stdev(_)")
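The point about seed suppliers and reducing operators can be sketched in plain Python. This is not TinkerPop code: the names `seed`, `accumulate` and `finish` are hypothetical, and the sketch assumes a population standard deviation computed from a running count, sum and sum of squares.

```python
from functools import reduce
import math


def seed():
    """Seed supplier: (count, sum, sum of squares) for an empty stream."""
    return (0, 0.0, 0.0)


def accumulate(acc, x):
    """Reducing operator: fold one more number into the accumulator."""
    n, total, total_sq = acc
    return (n + 1, total + x, total_sq + x * x)


def finish(acc):
    """Turn the final accumulator into a population standard deviation."""
    n, total, total_sq = acc
    if n == 0:
        raise ValueError("standard deviation of an empty stream is undefined")
    mean = total / n
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(total_sq / n - mean * mean, 0.0)
    return math.sqrt(variance)


ages = [27, 29, 32, 35]  # sample values for illustration
stdev = finish(reduce(accumulate, ages, seed()))
```

A different statistic needs a different seed and operator, which is the concern raised above. The sum-of-squares form is kept for brevity; Welford's online algorithm is a numerically more stable alternative.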

Re: [New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)

2020-12-09 Thread Stephen Mallette
Thanks for posting. In the math department, I think that these two steps are asked for commonly and I think we have reached a point where the things folks are doing with Gremlin are requiring steps of greater specificity so this conversation is definitely expected. We currently have two sorts of

[New Step Discussion] Add Steps to Support Basic Distribution Analysis (e.g. Standard Deviation and Percentile)

2020-12-09 Thread js guo
Hi team, we are using TinkerPop Gremlin in our risk detection cases. Some analytical calculations are used frequently, yet there are no corresponding steps at hand. I am thinking that some general analytical steps can be added to Gremlin, e.g. steps to calculate standard deviation and percentile.
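For reference, the two statistics mentioned here can be sketched in plain Python. This is only an illustration of the calculations, not the proposed Gremlin steps: the helper `percentile_nearest_rank` is hypothetical, the nearest-rank definition of percentile is an assumption (the PR may use a different one), and the standard deviation shown is the population form.

```python
import math
import statistics


def percentile_nearest_rank(values, p):
    """Return the p-th percentile (0 < p <= 100) by the nearest-rank method."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the smallest value covering at least p percent of the data.
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


ages = [27, 29, 32, 35]  # sample values for illustration

median_age = percentile_nearest_rank(ages, 50)
p90_age = percentile_nearest_rank(ages, 90)
population_stdev = statistics.pstdev(ages)
```

Nearest-rank always returns a value that occurs in the data; interpolating definitions can return values in between, so the choice of definition would need to be documented for any real step.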