Late to the party because I was heads-down on Pipeline bugs for a lot of Friday, but this is a subject near and dear to my heart. In the past I've discussed what metrics might be interesting, since surfacing them was an explicit goal of my Bismuth work (the Pipeline Graph Analysis APIs). Some of these are things I'd wanted to make a weekend project of (including surfacing the existing workflow-cps performance metrics).
We should aim to implement metrics using the existing Metrics interface, because then they can be easily exported in a variety of ways -- I use a Graphite Metrics reporter that couples to another metric aggregator/store for the Pipeline Scalability Lab (some may know it as "Hydra"). Other *cough* proprietary systems may already consume this format of data. I would not be surprised if a StatsD reporter is pretty easy to hack together using https://github.com/ReadyTalk/metrics-statsd, and you get a lot of goodies "for free." The one catch for implementing metrics is that we want to be cautious about adding too much overhead to the execution process.

As far as specific metrics:

> distinct built-in step invocations (i.e. not counting Global Variable invocations)

This can't be measured easily from the flow graph due to the potential to create multiple block structures for one step. It COULD be added easily via a new registered StepListener API in workflow-api (implemented in workflow-cps), though. I think it's valuable.

> configured Declarative Pipelines, configured Script Pipelines

We can get all Pipelines (flavor-agnostic) by iterating over WorkflowJob items. Not sure how we'd tell Scripted vs. Declarative -- maybe by registering a Listener extension point of some sort? I see value here. I'd *also* like a breakdown of which Pipelines have been run in the last, say, week and month, by type (easy to do by looking at the most recent build). That way we know not just which were created but which are in active use.

> Pipeline executions

Rates and counts can be achieved with the existing Metrics Timer. I'd like to see that broken down by Scripted vs. Declarative as well.

> * Global Shared Pipelines configured
> * Folder-level Shared Pipelines configured

Do you mean Shared Library use?
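To make the StepListener idea concrete, here's a minimal self-contained sketch of the pattern in plain Java. Everything in it is a hypothetical stand-in: the `StepStartListener` interface substitutes for the proposed StepListener extension point in workflow-api, and the in-memory counters substitute for Dropwizard Counters registered with the Metrics plugin's shared registry.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for the proposed StepListener extension point;
// the real API would live in workflow-api and be driven by workflow-cps.
interface StepStartListener {
    void onStepStart(String functionName);
}

// Counts step invocations keyed by the step's function name. In a real
// plugin these would be Dropwizard Counters in the Metrics plugin's
// shared MetricRegistry, so any configured reporter exports them free.
public class StepInvocationCounter implements StepStartListener {
    private final ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();

    @Override
    public void onStepStart(String functionName) {
        counts.computeIfAbsent(functionName, k -> new LongAdder()).increment();
    }

    public long count(String functionName) {
        LongAdder adder = counts.get(functionName);
        return adder == null ? 0 : adder.sum();
    }

    public static void main(String[] args) {
        StepInvocationCounter counter = new StepInvocationCounter();
        for (String step : List.of("sh", "echo", "sh")) {
            counter.onStepStart(step);
        }
        System.out.println(counter.count("sh"));   // 2
        System.out.println(counter.count("echo")); // 1
    }
}
```

The overhead here is one hash lookup and one `LongAdder` bump per step start, which is the kind of cost profile we'd want given the execution-overhead caveat above.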
One metric I'd be interested in is how many shared libraries are used *per-Pipeline* -- easy to measure from the count of LoadedScripts, I believe (correct me if there's something I'm missing here, Jesse).

> Agents used per-Pipeline

I think it should be possible to do this easily via flow graph analysis, looking for WorkspaceActionImpl -- nodes and labels are available. We might want to count both total node *uses* (open/close of node blocks) and distinct nodes used. Best to trigger it as a post-build analysis using the RunTrigger -- that way it's just a quick iteration over the Pipeline.

> Runtime duration per step invocation

This is one of the MOST useful metrics, I think. I already have an implementation used in the Scalability Lab that does this on a per-flownode basis using the GraphListener (rather than per-step). It's part of a small utility plugin for metrics used in the Scalability Lab (not hosted currently, since it's not general-use). Doing it per-step is somewhat more complex -- for many steps it's trivial, but for a Retry step, for example, there's no logical way to do it because you get multiple blocks. Blocks in general are undefined: do you count the block *contents*, just the start, just the end, or start+end nodes? Also remember that Groovy logic counts against the step time with the FlowNodes. Usually that shouldn't be a huge issue unless the Groovy is complex. If that's too noisy there might be ways to insert Listeners for the Step itself (more complex, though) -- I think using the FlowNodes is good enough for now and gives us a solid first-order approximation that is useful 99% of the time.

I would also like to extend this by breaking it down into separate metrics per step type, i.e. runtime for sh, runtime for echo, runtime for node, etc. This is easier than you'd think, since you can fetch the StepDescriptor and call getFunctionName to get a unique metric key for the step.
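A sketch of the per-flownode timing approach described above: approximate each node's runtime as the gap between its timestamp and the next node's, and file the duration under the step's function name. Everything here is a self-contained stand-in -- the `Node` record substitutes for a FlowNode carrying a TimingAction timestamp as delivered by a GraphListener, and the resulting totals are what would feed per-step Dropwizard Timers (which would add percentiles and rates for free).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Per-flownode timing sketch: a node's duration is approximated as the
// gap between its start timestamp and the next node's timestamp (or the
// build end for the last node), keyed by step function name.
public class FlowNodeTimer {
    // Stand-in for a FlowNode: the function name would come from
    // StepDescriptor.getFunctionName(), the timestamp from TimingAction.
    record Node(String functionName, long startMillis) {}

    // Returns total milliseconds attributed to each step type.
    public static Map<String, Long> aggregate(List<Node> nodes, long endMillis) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (int i = 0; i < nodes.size(); i++) {
            long next = (i + 1 < nodes.size()) ? nodes.get(i + 1).startMillis() : endMillis;
            long duration = next - nodes.get(i).startMillis();
            totals.merge(nodes.get(i).functionName(), duration, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
            new Node("sh", 0),
            new Node("echo", 1200),
            new Node("sh", 1250)
        );
        Map<String, Long> totals = aggregate(nodes, 2050);
        System.out.println(totals.get("sh"));   // 1200 + 800 = 2000
        System.out.println(totals.get("echo")); // 50
    }
}
```

Note this gap-based attribution is exactly why Groovy logic between steps counts against the preceding step's time, as mentioned above -- the approximation is first-order, not exact.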
This is far more useful to us than just average step timings, because it helps spot performance regressions in the field. Other aggregates of interest: total time spent in each step type for the Pipeline, and counts of FlowNodes by step per Pipeline. This will show if we're spending (for example) a LOT of time running readFile/writeFile/dir steps due to some sudden bottleneck in the remoting interaction, and also reveal which step types are used most often. Knowing which steps are used heavily helps me know which deserve extra priority for bugfixes, features, and optimizations. It actually *sounds* far more complicated than it really is -- this would be a pretty trivial afternoon project, I think.

> Runtime duration per Pipeline

I already have an implementation, in the same plugin as above. It's exposed as a DropWizard histogram as well, so you get rates plus aggregate times with median, mean, etc.

*Other desired metrics:* I think we want FlowNodes created as a rate per unit time (I already have an implementation in the same plugin as above). If we could find a way, I'd really like a counter of how many elements of Groovy CPS logic are run and how many function calls (for off-master execution you obviously wouldn't get this data). This is useful for measuring the real complexity of users' Groovy -- even better than Liam's Cyclomatic Complexity metric, because it directly tracks runtime operations, not just code structure. I have notions of how we'd accomplish this.

On Friday, March 16, 2018 at 6:55:58 PM UTC-4, Andrew Bayer wrote:
>
> It's a normal step - what I'm talking about is counting Pipelines
> containing one or more script blocks, i.e., what percentage of total
> Declarative Pipelines use script blocks, which I think is a more useful
> metric than just how many script block invocations there are.
>
> A.
>
> On Fri, Mar 16, 2018 at 5:32 PM R.
Tyler Croy <ty...@monkeypox.org> wrote:
>
>> (replies inline)
>>
>> On Fri, 16 Mar 2018, Andrew Bayer wrote:
>>
>> > If we're going to be tracking step invocations anyway, it'd be
>> > interesting to count the number of Declarative Pipelines with a
>> > script block, maybe?
>>
>> I kind of assumed that if we were incrementing a counter on step
>> invocations that script{} would be collected already by the machinery,
>> e.g. isn't it "just" a step?
>>
>> If it's a special snowflake then I'll make sure to include it in my
>> design.
>>
>> A few more which come to mind now that I'm thinking about script:
>>
>> * Count of stages per Pipeline
>> * Count of Pipelines with the Groovy sandbox disabled
>> * Time spent in script{} blocks
>>
>> Thanks for the ideas abayer!
>>
>> Cheers
>> - R. Tyler Croy
>>
>> ------------------------------------------------------
>> Code: <https://github.com/rtyler>
>> Chatter: <https://twitter.com/agentdero>
>> xmpp: rty...@jabber.org
>>
>> % gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
>> ------------------------------------------------------
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Jenkins Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to jenkinsci-de...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/jenkinsci-dev/20180316213201.ckenekkqcbgtsuzx%40blackberry.coupleofllamas.com.
>> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/9842785f-49ee-43fc-ab61-d9e7b45dc3db%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.