[
https://issues.apache.org/jira/browse/CASSANDRA-20311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Semb Wever updated CASSANDRA-20311:
-------------------------------------------
Description:
Using CASSANDRA-20157, analyse
# what splits are problematic
# what test types are configured with too many splits, or not enough
*Analyse Process*
Download the jenkins consoleText logs we want to analyse
{noformat}
for i in $(seq $first_build $last_build) ; do
  wget -q https://ci-cassandra.apache.org/job/Cassandra-5.0/$i/consoleText
  bash -c "grep '] Time ' consoleText | awk -F']' '{print $2}' | sort -u > Cassandra-5.0_${i}_timings.txt"
  rm consoleText
done
{noformat}
(Note: if the consoleText files were stored in nightlies.a.o we could skip
this step and work directly from the webdav mount.)
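A failed download is easy to miss here: {{wget -q}} is quiet, and the redirect still creates the timings file, so a build number that could not be fetched leaves an empty file behind. A minimal sketch of a check to run before the analysis, assuming the file naming used above ({{check_timings}} is a hypothetical helper name, not part of any existing script):

```shell
# Hypothetical helper: report builds whose timings file is missing or empty.
# Usage: check_timings FIRST_BUILD LAST_BUILD [JOB_NAME]
check_timings() (
  # The subshell body "( ... )" keeps these variables local to the function.
  job=${3:-Cassandra-5.0}
  status=0
  for i in $(seq "$1" "$2") ; do
    file="${job}_${i}_timings.txt"
    if [ ! -e "$file" ] ; then
      echo "build $i: $file is missing"
      status=1
    elif [ ! -s "$file" ] ; then
      echo "build $i: $file is empty"
      status=1
    fi
  done
  # Non-zero exit status if any build was reported.
  exit "$status"
)
```

For example, {{check_timings $first_build $last_build}} prints one line per problematic build and returns non-zero if there is at least one.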
(1)
{code}
for i in $(seq $first_build $last_build) ; do
  grep " 01:0" Cassandra-5.0_${i}_timings.txt | awk '{print $3}'
done | sort | uniq -c | sort -n
{code}
Target the individual splits that have timed out (duration of 1 hour) most
often.
(2a)
For test types that have too few splits (are timing out at the 1 hour mark too
often).
{code}
for i in $(seq $first_build $last_build) ; do
  grep " 01:0" Cassandra-5.0_${i}_timings.txt | awk '{print $3}'
done | awk -F_ '{print $1}' | sort | uniq -c | sort -n
{code}
(2b)
For test types that have too many splits (are finishing faster than ten
minutes).
{code}
for i in $(seq $first_build $last_build) ; do
  grep " 00:0" Cassandra-5.0_${i}_timings.txt | awk '{print $3}'
done | awk -F_ '{print $1}' | sort | uniq -c | sort -n
{code}
This output is weighted towards test types that have more splits, which is
fine because that is where we can save the most time and improve throughput.
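The three queries above differ only in the duration prefix and in whether the split name is reduced to its test type. A minimal sketch that folds them into one function, assuming (as the commands above do) that the split name is field 3 of each timings line and the test type is the part before the first underscore ({{count_durations}} is a hypothetical helper name):

```shell
# Hypothetical helper: count timings lines whose duration starts with PREFIX.
# Usage: count_durations PREFIX MODE FILE...
#   PREFIX  e.g. "01:0" (hit the 1 hour mark) or "00:0" (under ten minutes)
#   MODE    "split" counts per split, "type" counts per test type
count_durations() (
  # The subshell body "( ... )" keeps these variables local to the function.
  prefix=$1 mode=$2
  shift 2
  # -h drops the file name prefix grep adds when given several files.
  grep -h " ${prefix}" "$@" | awk '{print $3}' |
    if [ "$mode" = type ] ; then awk -F_ '{print $1}' ; else cat ; fi |
    sort | uniq -c | sort -n
)
```

With this, (1) becomes {{count_durations 01:0 split Cassandra-5.0_*_timings.txt}}, (2a) becomes {{count_durations 01:0 type Cassandra-5.0_*_timings.txt}}, and (2b) uses {{00:0}} with {{type}}. Note that the glob picks up every timings file in the directory, not only the {{$first_build}} to {{$last_build}} range.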
was:
Using CASSANDRA-20157, analyse
# what splits are problematic
# what test types are configured with too many splits, or not enough
(1)
{code}
for i in $(seq <first_build> <second_build>) ; do
  grep " 01:0" Cassandra-5.0_${i}_timings.txt | awk '{print $3}'
done | sort | uniq -c | sort -n
{code}
Target those individual splits that have timed out (duration of 1 hour) the
most.
(2a)
For test types that have too few splits (are timing out at the 1 hour mark too
often).
{code}
for i in $(seq 352 385) ; do
  grep " 01:0" Cassandra-5.0_${i}_timings.txt | awk '{print $3}'
done | awk -F_ '{print $1}' | sort | uniq -c | sort -n
{code}
(2b)
For test types that have too many splits (are finishing faster than ten
minutes).
{code}
for i in $(seq 352 385) ; do
  grep " 00:0" Cassandra-5.0_${i}_timings.txt | awk '{print $3}'
done | awk -F_ '{print $1}' | sort | uniq -c | sort -n
{code}
This output will be weighted by those test types that have more splits (but
that's ok because that's where we can save time / improve throughput most).
> Adjust 5.0 and trunk Jenkinsfile's splits configuration
> --------------------------------------------------------
>
> Key: CASSANDRA-20311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20311
> Project: Apache Cassandra
> Issue Type: Task
> Components: CI
> Reporter: Michael Semb Wever
> Assignee: Michael Semb Wever
> Priority: Normal