[lttng-dev] Profiling LTTng tracepoint latency on different arm platforms

2023-06-20 Thread Mousa, Anas via lttng-dev
Hello,



I've recently profiled the latency of LTTng tracepoints on arm platforms,

using the following sample program:



--

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <time.h>
#include <urcu/compiler.h>      /* caa_unlikely() */

/* Tracepoint provider header for the "hello_world" provider; the actual
 * file name is not shown in this message, "hello-tp.h" is assumed. */
#include "hello-tp.h"

static inline uint64_t get_time_nsec(void)
{
    struct timespec ts;

    if (caa_unlikely(clock_gettime(CLOCK_MONOTONIC, &ts))) {
        ts.tv_sec = 0;
        ts.tv_nsec = 0;
    }
    /* Convert the timestamp to nanoseconds. */
    return ((uint64_t) ts.tv_sec * 1000000000ULL) + ts.tv_nsec;
}

int main(int argc, char *argv[])
{
    unsigned int i;
    int tp_num = 0;
    uint64_t total_time = 0;
    uint64_t now, nowz;

    if (argc > 1) {
        sscanf(argv[1], "%d", &tp_num);
    }

    for (i = 0; i < tp_num; i++) {
        now = get_time_nsec();
        lttng_ust_tracepoint(hello_world, my_first_tracepoint,
                             i, "some_str");
        nowz = get_time_nsec();
        total_time += (nowz - now);
    }

    if (tp_num) {
        printf("---Average TP time is %" PRIu64 "---\n",
               total_time / tp_num);
    }

    return 0;
}

--
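
For completeness, here is a minimal sketch of how such a benchmark can be built and traced; the file, binary and session names below are placeholders, not taken from the original setup:

    # Build the benchmark together with its tracepoint provider
    # ("hello.c" and "hello-tp.c" are assumed file names).
    gcc -O2 -o tp-bench hello.c hello-tp.c -llttng-ust -ldl

    # Record the user-space events while the benchmark runs.
    lttng create tp-bench-session
    lttng enable-event --userspace 'hello_world:my_first_tracepoint'
    lttng start
    ./tp-bench 1000000        # number of tracepoint iterations
    lttng stop
    lttng destroy tp-bench-session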






I observed a large variance in average latency across platforms when tracing a
high number (many thousands to millions) of tracepoints:



  *   [platform 1], running a Linux kernel based on Buildroot
      (4.19.273 aarch64 GNU/Linux), with the following CPU info:

      BogoMIPS        : 187.50
      Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
      CPU implementer : 0x41
      CPU architecture: 8
      CPU variant     : 0x0
      CPU part        : 0xd08
      CPU revision    : 3

      Saw an average latency of 2-3 usec.


  *   [platform 2], running a Linux kernel based on Amazon Linux
      (4.14.294-220.533.amzn2.aarch64 aarch64 GNU/Linux), with the following CPU info:

      BogoMIPS        : 243.75
      Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp
                        asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
      CPU implementer : 0x41
      CPU architecture: 8
      CPU variant     : 0x3
      CPU part        : 0xd0c
      CPU revision    : 1

      Saw an average latency of ~0.5 usec.

Are there any suggestions to root cause the high latency and potentially 
improve it on platform 1?



Thanks and best regards,

Anas.

___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Profiling LTTng tracepoint latency on different arm platforms

2023-06-20 Thread Mathieu Desnoyers via lttng-dev

On 6/20/23 06:27, Mousa, Anas via lttng-dev wrote:

Hello,




Are there any suggestions to root cause the high latency and potentially improve it on *platform 1*?

Thanks and best regards,

Anas.



I recommend using "perf" when tracing with the sample program in a loop 
to figure out the hot spots. With that information on the "fast" and 
"slow" system, we might be able to figure out what differs.


Also, comparing the kernel configurations of the two systems can help. 
Also comparing the glibc versions of the two systems would be relevant.


Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Profiling LTTng tracepoint latency on different arm platforms

2023-06-20 Thread Mathieu Desnoyers via lttng-dev

On 6/20/23 10:20, Mathieu Desnoyers via lttng-dev wrote:

On 6/20/23 06:27, Mousa, Anas via lttng-dev wrote:

Hello,




Are there any suggestions to root cause the high latency and potentially improve it on *platform 1*?

Thanks and best regards,

Anas.



I recommend using "perf" when tracing with the sample program in a loop 
to figure out the hot spots. With that information on the "fast" and 
"slow" system, we might be able to figure out what differs.


Also, comparing the kernel configurations of the two systems can help. 
Also comparing the glibc versions of the two systems would be relevant.




Also make sure you benchmark the lttng "snapshot" mode [1] to make sure 
you don't run into a situation where the disk/network I/O throughput 
cannot cope with the generated event throughput, thus causing the ring 
buffer to discard events. This would therefore "speed up" tracing from 
the application perspective because discarding an event is faster than 
writing it to a ring buffer.


Thanks,

Mathieu

[1] https://lttng.org/docs/v2.13/#doc-taking-a-snapshot
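
A minimal sketch of a snapshot-mode session for such a benchmark (session, channel and binary names are placeholders):

    # Events go to an in-memory ring buffer only; data is written out
    # on demand with "lttng snapshot record", so disk/network I/O does
    # not interfere with the benchmark itself.
    lttng create bench-snapshot --snapshot
    lttng enable-channel --userspace bench-chan
    lttng enable-event --userspace --channel bench-chan 'hello_world:*'
    lttng start
    ./tp-bench 1000000
    lttng snapshot record
    lttng stop
    lttng destroy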






--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


[lttng-dev] [RELEASE] LTTng-tools 2.13.10 - Nordicité

2023-06-20 Thread Jérémie Galarneau via lttng-dev
Hi all,

We have just released the tenth bug-fix release of LTTng-tools 2.13 -
Nordicité. This release addresses a number of issues listed below.

2023-06-20 lttng-tools 2.13.10 (National Ice Cream Soda Day)
* Fix: sessiond: incorrect use of exclusions array leads to crash
* Tests fix: test_callstack: output of addr2line incorrectly parsed
* Fix: sessiond: silent kernel notifier registration error
* Fix: sessiond: size-based notification occasionally not triggered
* Fix: adding a user space probe fails on thumb functions
* Fix: Tests that assume CPU 0 is present
* Fix: Wrong assumption about possible CPUs
* Tests: fix: parse-callback reports missing addr2line
* Fix: lttng remove-trigger -h fails
* Tests: snapshot tests complain that nothing is output
* Tests: Test snapshot maximum size correctly
* Tests: Add get_possible_cpus_count utility
* Fix: ini parser: truncation of value name
* Fix: truncated len in lttng_event_rule_user_tracepoint_serialize()
* Tests: remove leftover debug printing in test_add_trigger_cli

As always, please report any issue you may encounter to bugs.lttng.org
or on this mailing list.

Project website: https://lttng.org

Download link:
https://lttng.org/files/lttng-tools/lttng-tools-2.13.10.tar.bz2

GPG signature:
https://lttng.org/files/lttng-tools/lttng-tools-2.13.10.tar.bz2.asc
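
For reference, a typical way to verify and build a release from the links above (a sketch; it requires the LTTng release signing key in your keyring):

    wget https://lttng.org/files/lttng-tools/lttng-tools-2.13.10.tar.bz2
    wget https://lttng.org/files/lttng-tools/lttng-tools-2.13.10.tar.bz2.asc
    gpg --verify lttng-tools-2.13.10.tar.bz2.asc lttng-tools-2.13.10.tar.bz2

    # Standard autotools build.
    tar xjf lttng-tools-2.13.10.tar.bz2
    cd lttng-tools-2.13.10
    ./configure && make && sudo make install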
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


[lttng-dev] [RELEASE] LTTng-tools 2.12.14 - (Ta) Meilleure

2023-06-20 Thread Jérémie Galarneau via lttng-dev
Hi all,

We have just released the fourteenth bug-fix release of LTTng-tools 2.12 - (Ta)
Meilleure. This release addresses a number of issues listed below.

2023-06-20 lttng-tools 2.12.14 (National Ice Cream Soda Day)
* Tests fix: test_callstack: output of addr2line incorrectly parsed
* Fix: adding a user space probe fails on thumb functions
* Fix: Tests that assume CPU 0 is present
* Fix: Wrong assumption about possible CPUs
* Tests: fix: parse-callback reports missing addr2line
* Tests: snapshot tests complain that nothing is output
* Tests: Test snapshot maximum size correctly
* Tests: Add get_possible_cpus_count utility
* Fix: ini parser: truncation of value name

As always, please report any issue you may encounter to bugs.lttng.org
or on this mailing list.

Project website: https://lttng.org

Download link:
https://lttng.org/files/lttng-tools/lttng-tools-2.12.14.tar.bz2

GPG signature:
https://lttng.org/files/lttng-tools/lttng-tools-2.12.14.tar.bz2.asc
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] I'm still getting empty ust traces using tracef

2023-06-20 Thread Brian Hutchinson via lttng-dev
On Thu, May 11, 2023 at 2:14 PM Mathieu Desnoyers wrote:
>
> On 2023-05-11 14:13, Mathieu Desnoyers via lttng-dev wrote:
> > On 2023-05-11 12:36, Brian Hutchinson via lttng-dev wrote:
> >> ... more background.  I've always used ltt in the kernel so I don't
> >> have much experience with the user side of it and especially
> >> multi-threaded, multi-core so I'm probably missing some fundamental
> >> concepts that I need to understand.
> >
> > Which are the exact versions of LTTng-UST and LTTng-Tools you are using
> > now ? (2.13.N or which git commit ?)
> >
>
> Also, can you try using lttng-ust stable-2.13 branch, which includes the 
> following commit ?
>
> commit be2ca8b563bab81be15cbce7b9f52422369f79f7
> Author: Mathieu Desnoyers 
> Date:   Tue Feb 21 14:29:49 2023 -0500
>
>  Fix: Reevaluate LTTNG_UST_TRACEPOINT_DEFINE each time tracepoint.h is 
> included
>
>  Fix issues with missing symbols in use-cases where tracef.h is included
>  before defining LTTNG_UST_TRACEPOINT_DEFINE, e.g.:
>
>   #include <lttng/tracef.h>
>   #define LTTNG_UST_TRACEPOINT_DEFINE
>   #include <lttng/tracef.h>
>
>  It is caused by the fact that tracef.h includes tracepoint.h in a
>  context which has LTTNG_UST_TRACEPOINT_DEFINE undefined, and this is not
>  re-evaluated for the following includes.
>
>  Fix this by lifting the definition code in tracepoint.h outside of the
>  header include guards, and #undef the old LTTNG_UST__DEFINE_TRACEPOINT
>  before re-defining it to its new semantic. Use a new
>  _LTTNG_UST_TRACEPOINT_DEFINE_ONCE include guard within the
>  LTTNG_UST_TRACEPOINT_DEFINE defined case to ensure symbols are not
>  duplicated.
>
>  Signed-off-by: Mathieu Desnoyers 
>  Change-Id: I0ef720435003a7ca0bfcf29d7bf27866c5ff8678
>

I applied this patch and if I use "tracef" type calls in our
application that is made up of a bunch of static libs ... the UST
trace calls work.  I verified that traces that were called from
several different static libs all worked.

But as soon as I include a "tracepoint" style tracepoint (that uses
trace provider include files etc.) then doing a "lttng list -u"
returns "None" for UST events.

Is there some kind of rule that says a file can't use both tracef and
tracepoint calls?  Is there something special you have to do to use
tracef and tracepoints in same file?  Doing so appears to have broken
everything.

Thanks,

Brian
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] Profiling LTTng tracepoint latency on different arm platforms

2023-06-20 Thread Yitschak, Yehuda via lttng-dev
> On 6/20/23 10:20, Mathieu Desnoyers via lttng-dev wrote:
> > On 6/20/23 06:27, Mousa, Anas via lttng-dev wrote:
> >> Hello,
> >
> >>
> >>
> >> Are there any suggestions to root cause the high latency and potentially
> >> improve it on *platform 1*?
> >>
> >> Thanks and best regards,
> >>
> >> Anas.
> >>
> >
> > I recommend using "perf" when tracing with the sample program in a
> > loop to figure out the hot spots. With that information on the "fast"
> > and "slow" system, we might be able to figure out what differs.
> >
> > Also, comparing the kernel configurations of the two systems can help.
> > Also comparing the glibc versions of the two systems would be relevant.
> >
> 
> Also make sure you benchmark the lttng "snapshot" mode [1] to make sure
> you don't run into a situation where the disk/network I/O throughput cannot
> cope with the generated event throughput, thus causing the ring buffer to
> discard events. This would therefore "speed up" tracing from the application
> perspective because discarding an event is faster than writing it to a ring
> buffer.

You mean we should avoid the "discard" loss mode and use the "overwrite" loss mode,
since discard mode can fake fast performance?
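
For reference, a sketch of how the two event-loss modes are selected when creating a channel (the channel name is a placeholder):

    # Default "discard" mode: new events are dropped when the sub-buffers are full.
    lttng enable-channel --userspace --discard my-chan

    # "Overwrite" (flight-recorder) mode: the oldest sub-buffers are overwritten
    # instead; this is the mode used by snapshot sessions.
    lttng enable-channel --userspace --overwrite my-chan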

> 
> Thanks,
> 
> Mathieu
> 
> [1] https://lttng.org/docs/v2.13/#doc-taking-a-snapshot
> 
> > Thanks,
> >
> > Mathieu
> >
> >
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com
> 
> ___
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev