[jira] [Commented] (IGNITE-10493) Refactor exchange stages time measurements
[ https://issues.apache.org/jira/browse/IGNITE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725820#comment-16725820 ] Pavel Kovalenko commented on IGNITE-10493: -- [~agoncharuk] Thank you for review. Merged to master. > Refactor exchange stages time measurements > -- > > Key: IGNITE-10493 > URL: https://issues.apache.org/jira/browse/IGNITE-10493 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.7 >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Major > Fix For: 2.8 > > > At the current implementation, we don't cover and measure all possible code > executions that influence on PME time. Instead of it we just measure the > hottest separate parts with the following hardcoded pattern: > {noformat} > long time = currentTime(); > ... // some code block > print ("Stage name performed in " + (currentTime() - time)); > {noformat} > This approach can be improved. Instead of declaring time variable and print > the message to log immediately we can introduce a utility class (TimesBag) > that will hold all stages and their times. The content of TimesBag can be > printed when the exchange future is done. > As exchange is a linear process that executes init stage by exchange-worker > and finish stage by one of the sys thread we can easily cover all exchange > code base by time cutoffs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-10493) Refactor exchange stages time measurements
[ https://issues.apache.org/jira/browse/IGNITE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725819#comment-16725819 ] ASF GitHub Bot commented on IGNITE-10493: - Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/5688 > Refactor exchange stages time measurements > -- > > Key: IGNITE-10493 > URL: https://issues.apache.org/jira/browse/IGNITE-10493 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.7 >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Major > Fix For: 2.8 > > > At the current implementation, we don't cover and measure all possible code > executions that influence on PME time. Instead of it we just measure the > hottest separate parts with the following hardcoded pattern: > {noformat} > long time = currentTime(); > ... // some code block > print ("Stage name performed in " + (currentTime() - time)); > {noformat} > This approach can be improved. Instead of declaring time variable and print > the message to log immediately we can introduce a utility class (TimesBag) > that will hold all stages and their times. The content of TimesBag can be > printed when the exchange future is done. > As exchange is a linear process that executes init stage by exchange-worker > and finish stage by one of the sys thread we can easily cover all exchange > code base by time cutoffs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-10493) Refactor exchange stages time measurements
[ https://issues.apache.org/jira/browse/IGNITE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725748#comment-16725748 ] Alexey Goncharuk commented on IGNITE-10493: --- [~Jokser], looks good to me! > Refactor exchange stages time measurements > -- > > Key: IGNITE-10493 > URL: https://issues.apache.org/jira/browse/IGNITE-10493 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.7 >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Major > Fix For: 2.8 > > > At the current implementation, we don't cover and measure all possible code > executions that influence on PME time. Instead of it we just measure the > hottest separate parts with the following hardcoded pattern: > {noformat} > long time = currentTime(); > ... // some code block > print ("Stage name performed in " + (currentTime() - time)); > {noformat} > This approach can be improved. Instead of declaring time variable and print > the message to log immediately we can introduce a utility class (TimesBag) > that will hold all stages and their times. The content of TimesBag can be > printed when the exchange future is done. > As exchange is a linear process that executes init stage by exchange-worker > and finish stage by one of the sys thread we can easily cover all exchange > code base by time cutoffs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-10493) Refactor exchange stages time measurements
[ https://issues.apache.org/jira/browse/IGNITE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725737#comment-16725737 ] Ignite TC Bot commented on IGNITE-10493: {panel:title=-- Run :: All: Possible Blockers|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1} {color:#d04437}Hibernate 5.3{color} [[tests 0 Exit Code |https://ci.ignite.apache.org/viewLog.html?buildId=2594882]] {panel} [TeamCity *-- Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=2594883buildTypeId=IgniteTests24Java8_RunAll] > Refactor exchange stages time measurements > -- > > Key: IGNITE-10493 > URL: https://issues.apache.org/jira/browse/IGNITE-10493 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.7 >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Major > Fix For: 2.8 > > > At the current implementation, we don't cover and measure all possible code > executions that influence on PME time. Instead of it we just measure the > hottest separate parts with the following hardcoded pattern: > {noformat} > long time = currentTime(); > ... // some code block > print ("Stage name performed in " + (currentTime() - time)); > {noformat} > This approach can be improved. Instead of declaring time variable and print > the message to log immediately we can introduce a utility class (TimesBag) > that will hold all stages and their times. The content of TimesBag can be > printed when the exchange future is done. > As exchange is a linear process that executes init stage by exchange-worker > and finish stage by one of the sys thread we can easily cover all exchange > code base by time cutoffs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-10493) Refactor exchange stages time measurements
[ https://issues.apache.org/jira/browse/IGNITE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724911#comment-16724911 ] Pavel Kovalenko commented on IGNITE-10493: -- Timings in logs looks like this: {noformat} [2018-12-19 13:33:09,942][INFO ][sys-#322%database.IgniteDbMultiNodePutGetTest0%][GridDhtPartitionsExchangeFuture] Exchange timings [startVer=AffinityTopologyVersion [topVer=3, minorTopVer=1], resVer=AffinityTopologyVersion [topVer=3, minorTopVer=1], stage="Waiting in exchange queue" (0 ms), stage="Exchange parameters initialization" (0 ms), stage="Update caches registry" (144 ms), stage="Start caches" (56 ms), stage="Affinity initialization on cache group start" (11 ms), stage="Exchange type determination" (0 ms), stage="Preloading notification" (0 ms), stage="WAL history reservation" (0 ms), stage="Wait partitions release" (0 ms), stage="After states restored callback" (220 ms), stage="Waiting for all single messages" (27 ms), stage="Affinity recalculation (crd)" (2 ms), stage="Collect update counters and create affinity messages" (0 ms), stage="Validate partitions states" (0 ms), stage="Assign partitions states" (1 ms), stage="Ideal affinity diff calculation (enforced)" (6 ms), stage="Apply update counters" (0 ms), stage="Full message preparing" (5 ms), stage="Full message sending" (12 ms), stage="State finish message sending" (8 ms), stage="Exchange done" (65 ms), stage="Total time" (557 ms), Discovery lag / Clocks discrepancy = 13 ms.] [2018-12-19 13:33:09,943][INFO ][sys-#322%database.IgniteDbMultiNodePutGetTest0%][GridDhtPartitionsExchangeFuture] Exchange longest local stages [startVer=AffinityTopologyVersion [topVer=3, minorTopVer=1], resVer=AffinityTopologyVersion [topVer=3, minorTopVer=1], stage="Affinity initialization on cache group start [grp=tiny]" (0 ms) (parent=Affinity initialization on cache group start), stage="Affinity initialization on cache group start [grp=non-primitive]" (0 ms) (parent=Affinity initialization on cache group start), stage="Affinity initialization on cache group start [grp=large]" (0 ms) (parent=Affinity initialization on cache group start), stage="Affinity centralized initialization (crd) [grp=tiny]" (0 ms) (parent=Exchange type determination), stage="Affinity centralized initialization (crd) [grp=non-primitive]" (0 ms) (parent=Exchange type determination), stage="Affinity centralized initialization (crd) [grp=large]" (0 ms) (parent=Exchange type determination), stage="Restore partition states" (0 ms) (parent=After states restored callback), stage="Affinity recalculation (partitions availability) [grp=tiny]" (0 ms) (parent=Ideal affinity diff calculation (enforced)), stage="Affinity recalculation (partitions availability) [grp=non-primitive]" (0 ms) (parent=Ideal affinity diff calculation (enforced)), stage="Affinity recalculation (partitions availability) [grp=large]" (0 ms) (parent=Ideal affinity diff calculation (enforced))] {noformat} > Refactor exchange stages time measurements > -- > > Key: IGNITE-10493 > URL: https://issues.apache.org/jira/browse/IGNITE-10493 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.7 >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Major > Fix For: 2.8 > > > At the current implementation, we don't cover and measure all possible code > executions that influence on PME time. Instead of it we just measure the > hottest separate parts with the following hardcoded pattern: > {noformat} > long time = currentTime(); > ... // some code block > print ("Stage name performed in " + (currentTime() - time)); > {noformat} > This approach can be improved. Instead of declaring time variable and print > the message to log immediately we can introduce a utility class (TimesBag) > that will hold all stages and their times. The content of TimesBag can be > printed when the exchange future is done. > As exchange is a linear process that executes init stage by exchange-worker > and finish stage by one of the sys thread we can easily cover all exchange > code base by time cutoffs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-10493) Refactor exchange stages time measurements
[ https://issues.apache.org/jira/browse/IGNITE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723137#comment-16723137 ] ASF GitHub Bot commented on IGNITE-10493: - GitHub user Jokser opened a pull request: https://github.com/apache/ignite/pull/5688 IGNITE-10493 Refactor exchange timings You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-10493 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/5688.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5688 commit 11bab06240ca9dae34d1b108f45f374feae08348 Author: Pavel Kovalenko Date: 2018-12-03T18:18:16Z IGNITE-10493 WIP Signed-off-by: Pavel Kovalenko commit f9bf25cea24fbc40fe0cb5b7326a485b2afbfee0 Author: Pavel Kovalenko Date: 2018-12-12T10:10:59Z IGNITE-10493 WIP Signed-off-by: Pavel Kovalenko commit a3ad16f63b2b11ccb829b1468339a759a776a21e Author: Pavel Kovalenko Date: 2018-12-17T16:36:53Z IGNITE-10493 Refactor exchange timings. Signed-off-by: Pavel Kovalenko > Refactor exchange stages time measurements > -- > > Key: IGNITE-10493 > URL: https://issues.apache.org/jira/browse/IGNITE-10493 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.7 >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Major > Fix For: 2.8 > > > At the current implementation, we don't cover and measure all possible code > executions that influence on PME time. Instead of it we just measure the > hottest separate parts with the following hardcoded pattern: > {noformat} > long time = currentTime(); > ... // some code block > print ("Stage name performed in " + (currentTime() - time)); > {noformat} > This approach can be improved. Instead of declaring time variable and print > the message to log immediately we can introduce a utility class (TimesBag) > that will hold all stages and their times. The content of TimesBag can be > printed when the exchange future is done. > As exchange is a linear process that executes init stage by exchange-worker > and finish stage by one of the sys thread we can easily cover all exchange > code base by time cutoffs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-10493) Refactor exchange stages time measurements
[ https://issues.apache.org/jira/browse/IGNITE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704835#comment-16704835 ] Pavel Kovalenko commented on IGNITE-10493: -- [~Mmuzaf] Hello, Maxim. Thank you for comment. Of course, it would be a very useful enhancement. But most of our current jmx metrics are continuous, while exchange timings are discrete and related to concrete topology version. I'm not able to find out at the moment how we can implement it, but if you have some good ideas - feel free to create a ticket with your thoughts. > Refactor exchange stages time measurements > -- > > Key: IGNITE-10493 > URL: https://issues.apache.org/jira/browse/IGNITE-10493 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.7 >Reporter: Pavel Kovalenko >Priority: Major > Fix For: 2.8 > > > At the current implementation, we don't cover and measure all possible code > executions that influence on PME time. Instead of it we just measure the > hottest separate parts with the following hardcoded pattern: > {noformat} > long time = currentTime(); > ... // some code block > print ("Stage name performed in " + (currentTime() - time)); > {noformat} > This approach can be improved. Instead of declaring time variable and print > the message to log immediately we can introduce a utility class (TimesBag) > that will hold all stages and their times. The content of TimesBag can be > printed when the exchange future is done. > As exchange is a linear process that executes init stage by exchange-worker > and finish stage by one of the sys thread we can easily cover all exchange > code base by time cutoffs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-10493) Refactor exchange stages time measurements
[ https://issues.apache.org/jira/browse/IGNITE-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704784#comment-16704784 ] Maxim Muzafarov commented on IGNITE-10493: -- Hello, Pavel! I think we should extend our JMX metrics (for instance, create a new Bean) and allow external tools to collect and aggregate these metrics regarding to exchange stages. WDYT? Will it be a good idea? > Refactor exchange stages time measurements > -- > > Key: IGNITE-10493 > URL: https://issues.apache.org/jira/browse/IGNITE-10493 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.7 >Reporter: Pavel Kovalenko >Priority: Major > Fix For: 2.8 > > > At the current implementation, we don't cover and measure all possible code > executions that influence on PME time. Instead of it we just measure the > hottest separate parts with the following hardcoded pattern: > {noformat} > long time = currentTime(); > ... // some code block > print ("Stage name performed in " + (currentTime() - time)); > {noformat} > This approach can be improved. Instead of declaring time variable and print > the message to log immediately we can introduce a utility class (TimesBag) > that will hold all stages and their times. The content of TimesBag can be > printed when the exchange future is done. > As exchange is a linear process that executes init stage by exchange-worker > and finish stage by one of the sys thread we can easily cover all exchange > code base by time cutoffs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)