Surya Hebbar has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/21683 )

Change subject: IMPALA-13304: Include aggregate instance-level metrics within 
experimental profile(V2)
......................................................................

IMPALA-13304: Include aggregate instance-level metrics within experimental 
profile(V2)

Currently, instance-level details of fragment events are omitted
entirely from the experimental profile, unlike the traditional
profile. This keeps the profile size from growing rapidly as the
number of instances increases.

This patch adds aggregate instance-level metrics to the experimental
profile (profile V2) without allowing the profile size to grow
rapidly.

The experimental profile can be generated instead of the traditional
profile by setting 'gen_experimental_profile' to 'true'.

When the number of instances is more than 5, only the following
aggregate metrics are included in the profile, to keep the profile size
from growing rapidly. The aggregated metrics are calculated by splitting
the event timestamps into 5 divisions, each spanning 20% of the range
between the minimum and maximum instance timestamps.
- For each division of an event's timestamps, generated from processing
    a particular plan node -
  * Maximum timestamp
  * Minimum timestamp
  * Average timestamp
  * Total no. of instances

In case the number of instances is less than or equal to 5,
all instances' event timestamps are included.
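
The division scheme described above can be sketched as a small mapping
from a timestamp to a division index. This is an illustrative sketch only,
not the code in runtime-profile.cc: the names kNumDivisions and
DivisionIndex are hypothetical, and the handling of the two edge cases
(all timestamps equal, and a timestamp equal to the maximum) is an
assumption.

```cpp
#include <cstdint>

// Hypothetical constant: the commit message fixes the number of divisions at 5.
constexpr int kNumDivisions = 5;

// Maps a timestamp to a division index in [0, kNumDivisions - 1].
// Assumes min_ts <= ts <= max_ts.
int DivisionIndex(int64_t ts, int64_t min_ts, int64_t max_ts) {
  const int64_t range = max_ts - min_ts;
  // Assumption: if every instance reports the same timestamp, use division 0.
  if (range == 0) return 0;
  // Multiply before dividing so the mapping stays in integer arithmetic.
  const int64_t idx = (ts - min_ts) * kNumDivisions / range;
  // ts == max_ts would map to kNumDivisions, so clamp it into the last division.
  return idx >= kNumDivisions ? kNumDivisions - 1 : static_cast<int>(idx);
}
```

With a minimum of 0 and a maximum of 100, timestamps 0 to 19 land in
division 0, 20 to 39 in division 1, and so on, with 100 itself clamped
into division 4.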

The aggregate metrics are calculated with minimal overhead by
assigning each timestamp to its division directly, without the need for
sorting, resulting in a time complexity of O(n) with only two passes
through the entire list of timestamps.

To further optimize performance, the aggregates are calculated
without storing each division's timestamps, using only the memory
needed for a single value per metric instead of the entire range of
values.
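
A minimal sketch of this two-pass, constant-memory aggregation follows.
It is not the patch's actual implementation: DivisionStats and
AggregateTimestamps are hypothetical names, the average is derived from
a running sum (assumed not to overflow int64_t), and reporting zeros for
an empty division is an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <vector>

constexpr int kNumDivisions = 5;  // hypothetical name; 5 divisions per the text

// One running value per metric; no per-division list of timestamps is kept.
struct DivisionStats {
  int64_t min = 0;
  int64_t max = 0;
  int64_t sum = 0;    // running sum, used only to derive the average
  int64_t count = 0;  // number of instances that fell into this division
  int64_t Avg() const { return count == 0 ? 0 : sum / count; }
};

std::array<DivisionStats, kNumDivisions> AggregateTimestamps(
    const std::vector<int64_t>& ts) {
  std::array<DivisionStats, kNumDivisions> stats{};
  if (ts.empty()) return stats;

  // Pass 1: find the overall minimum and maximum timestamps.
  const auto mm = std::minmax_element(ts.begin(), ts.end());
  const int64_t lo = *mm.first;
  const int64_t range = *mm.second - lo;

  // Pass 2: assign each timestamp to its division and update the aggregates.
  for (int64_t t : ts) {
    int64_t idx = range == 0 ? 0 : (t - lo) * kNumDivisions / range;
    if (idx >= kNumDivisions) idx = kNumDivisions - 1;  // t == maximum
    DivisionStats& d = stats[static_cast<size_t>(idx)];
    if (d.count == 0) {
      d.min = d.max = t;
    } else {
      d.min = std::min(d.min, t);
      d.max = std::max(d.max, t);
    }
    d.sum += t;
    ++d.count;
  }
  return stats;
}
```

Because each timestamp is mapped straight to its division, no sorting is
needed, and the extra memory is a fixed 5 x 4 values regardless of the
number of instances.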

To copy the calculated values efficiently without internal
reallocation on insertion, memory is preallocated for each array
of metrics using the RapidJSON library.

In the experimental JSON profile, a particular plan node's profile
has the following structure.

When no. of instances > 5 -
{
  profile_name : <PLAN_NODE_NAME>,
  num_children : <NUM_CHILDREN>,
  node_metadata : <NODE_METADATA_OBJECT>,
  event_sequences :
  [{
    events : // An example event
    [{
      label : "Open Started",
      ts_stat :
      {
        min : [ 2257887941, ...4 other divisions' minimum timestamps ],
        max : [ 3257887941, ...4 other divisions' maximum timestamps ],
        avg : [ 2757887941, ...4 other divisions' average timestamps ],
        count : [ 2, ...4 other divisions' instance counts ]
      }
    }, ...other plan node's events
    ]
  }],
  counters : <COUNTERS_OBJECT_ARRAY>,
  child_profiles : <CHILD_PROFILES>
}

When no. of instances <= 5 -
{
  profile_name : <PLAN_NODE_NAME>,
  num_children : <NUM_CHILDREN>,
  node_metadata : <NODE_METADATA_OBJECT>,
  event_sequences :
  [{
    offset : 0,
    events : // An example event
    [{
      label : "Open Started",
      ts_list : [ 2257887941, ...4 other instances' timestamps ]
    }, ...other plan node's events
    ]
  }],
  counters : <COUNTERS_OBJECT_ARRAY>,
  child_profiles : <CHILD_PROFILES>
}

Note: In the above structures, unlike a plan node's profile,
a fragment's profile does not contain the 'node_metadata' field.

Change-Id: I49e18a7a7e1288e3e674e15b6fc86aad60a08214
---
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
2 files changed, 102 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/21683/4
--
To view, visit http://gerrit.cloudera.org:8080/21683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I49e18a7a7e1288e3e674e15b6fc86aad60a08214
Gerrit-Change-Number: 21683
Gerrit-PatchSet: 4
Gerrit-Owner: Surya Hebbar <sheb...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Surya Hebbar <sheb...@cloudera.com>
