from:"Mathieu Desnoyers"

Re: [lttng-dev] rcu_cmpxchg_pointer() documentation patch

2024-07-04 Thread Mathieu Desnoyers via lttng-dev


On 2024-07-04 13:33, Ondřej Surý via lttng-dev wrote:

Hi,

looks like my git-send-email configuration is not correct and my
mailserver ate the patch, so here's one created by git-format-patch...

Nothing important, but it bit me today...


Merged into master, stable-0.14, stable-0.13, thanks!

Mathieu





Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org

___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


[lttng-dev] Common Trace Format 2 (CTF2) specification and sub-specs

2024-06-27 Thread Mathieu Desnoyers via lttng-dev


Hi,

Here is the status of the specification and sub-specifications:

- "CTF2‑SPEC‑2.0: Common Trace Format version 2"
  https://diamon.org/ctf/files/CTF2-SPEC-2.0.html

This is the official CTF 2.0 spec. The URL https://diamon.org/ctf
points to the latest version of the Common Trace Format spec.

- "CTF2‑DOCID‑2.0: CTF 2 document identifier format"
  https://diamon.org/ctf/files/CTF2-DOCID-2.0.html

The CTF 2 specification refers to it.

- "CTF2-FS-1.0: Layout of a CTF 2 trace stored on a file system"
  https://diamon.org/ctf/files/CTF2-FS-1.0.html

This document covers how Babeltrace and Trace Compass
will expect CTF2 files to be laid out on the file system, and
how Babeltrace and LTTng plan to produce them. This is an
"optional" specification: it is up to each implementation
to decide whether to abide by it.

Philippe plans to soon release a CTF2-FS-2.0 document with
pretty much the same content as version 1.0, but formatted
following the CTF2‑DOCID‑2.0 specification.

We are planning to add an index of those relevant "optional"
specifications within the CTF2 specification so they can easily
be found.

If a new storage pattern ends up being
commonly used by implementations, e.g. storing a CTF2 trace
within a binary blob in OpenTelemetry [1], we can always create
a new sub-specification similar to CTF2-FS to cover this.

The following specification files were deprecated by the
time the CTF2 specification was finalized:

- https://diamon.org/ctf/files/CTF2-BASICATTRS-1.0.html
- https://diamon.org/ctf/files/CTF2-PMETA-1.0.html

As always, feedback is welcome!

Thanks,

Mathieu

[1] https://github.com/open-telemetry/opentelemetry-specification/issues/3979

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [lttng-dev] [lttng-relayd] is there existing cases for relayd to stream over Android usb based adb?

2024-06-04 Thread Mathieu Desnoyers via lttng-dev


On 2024-06-04 11:25, Wu, Yannan wrote:

The device is a rooted android device

*On the device:*

lttng-sessiond -d --no-kernel

lttng create my-live-session --live

lttng enable-event -u 

lttng start


*On the host:*

adb reverse tcp:5343 tcp:5343

The adb reverse fails with "adb.exe: error: cannot bind to socket".


In the reverse order, if I set up adb reverse from the host first and
create the live session afterwards, lttng-relayd on the device cannot be
started.

Here is the error message:


The "reverse order" you describe is the order you need. What you are
missing is to run lttng-relayd on the host and to forward both ports
5342 *and* 5343. You will also need to either override the target URLs
for the live control and data ports to prevent sessiond from auto-spawning
a relayd, or forward the live viewer port as well through adb (5344).

Overall:

* First on the Host:

lttng-relayd
adb reverse tcp:5342 tcp:5342  # control port
adb reverse tcp:5343 tcp:5343  # data port
adb reverse tcp:5344 tcp:5344  # live viewer port

* Then on the Android Device:

lttng-sessiond -d --no-kernel
lttng create my-live-session --live --ctrl-url=tcp://localhost:5342 \
    --data-url=tcp://localhost:5343
lttng enable-event -u 
lttng start

The relayd auto-spawn needs to be prevented because the "lttng create"
command attempts to connect to the localhost relayd as a viewer
(default TCP port 5344). If you don't forward this port through adb
as well, the sessiond will always try to auto-spawn a relayd, which
conflicts with your forwarded ports on the Android device.

Technically either forwarding port 5344 or specifying control/data
URL override is sufficient to prevent the relayd auto-spawn, but
I'd recommend doing both if it is possible.

Thanks,

Mathieu



PERROR - 15:23:30.915938387 [9813/9829]: Failed to bind socket: Address already in use (in relay_socket_create() at /src/VodkaLttngTool/build/private/source/src/bin/lttng-relayd/main.c:1036)
Error: Health error occurred in relay_thread_listener
Error: A file descriptor leak has been detected: 1 tracked file descriptors are still being tracked


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [lttng-relayd] is there existing cases for relayd to stream over Android usb based adb?

2024-06-04 Thread Mathieu Desnoyers via lttng-dev


Hi Amanda,

For each of the 4 commands described below, please clarify on
which device they are executed, whether on the Android device
or on the Development device.

Please make sure to follow to the letter the commands proposed
by Kienan: in the correct order, and on the appropriate device.

Thanks,

Mathieu


On 2024-06-02 22:55, Wu, Yannan via lttng-dev wrote:

Yes. My test commands are as follows:

 1. lttng-sessiond -d --no-kernel
 2. yannanwu@ue91e96f2951b5c:~/trees/lttng_test_run$ lttng create my-user-space-live-session --live
    Live session my-user-space-live-session created.
    Traces will be output to tcp4://127.0.0.1:5342/ [data: 5343]
    Live timer interval set to 100 us
 3. After this, I can run "ps -Ax | grep lttng" and see that lttng-relayd
    has started. But once I start adb reverse, it fails because it cannot
    bind to the socket.
 4. In the other order, if I start adb reverse first and run "lttng create"
    afterwards, "lttng create" does not fail, but lttng-relayd is not
    started. Starting lttng-relayd manually also fails because it cannot
    bind to the socket.

Amanda


*From:* Kienan Stewart 
*Sent:* Friday, May 31, 2024 3:12:16 AM
*To:* Wu, Yannan; lttng-dev@lists.lttng.org
*Subject:* RE: [EXTERNAL] [lttng-dev] [lttng-relayd] is there existing 
cases for relayd to stream over Android usb based adb?




Hi Amanda,

I'd like to confirm my understanding of the situation.

Android device
    - Running lttng-sessiond with one or more configured sessions

Development device
    - Connected to the android device over usb using adb

You want the data captured on the android device to be streamed via
the usb connection rather than the other networks on the android device.

Could you expand on the commands you used to set up the tracing sessions
and relay, and where each of those commands were run?

It sounds to me like you might want to be doing something like the
following:

(Development device) Start lttng-relayd:
    - tcp://0.0.0.0:5342 and :5343 will be bound on the development device
    - tcp://127.0.0.1:5344 will be available for the live reader

(Development device) Create the reverses for the following ports: 5342
and 5343
    - At this point :5342 and :5343 should be available on the android
device and reach the relayd running on the development device

(Android device) Start lttng-sessiond
(Android device) Create session(s): `lttng create -U tcp://localhost/`
    - Using `-U/--set-url`, no relayd will be spawned on the android device
(Android device) Start session(s)

This setup should have the relayd running on the development device and writing
the traces there and/or viewing them with a live viewer. On the android
device, the UST applications (if any) will connect to the local sessiond
and consumers, which will shuttle the information over :5342 and :5343
to the developer device via the reverse sockets.

Please note that I didn't have time to test this, so there might be some
mistakes. As I requested above, clear details of the exact commands you
use for the tracing setup would be very helpful to have the clearest
understanding of what you're doing.

hope this helps,
kienan

On 5/30/24 1:53 AM, Wu, Yannan via lttng-dev wrote:

Hihi, there,

I am currently working on enabling lttng live mode over android usb adb.
Here is the situation: while debugging some network-related issues, we
don't want the trace data to be streamed over the network, because that
would put extra load on the system being profiled. We therefore chose to
connect lttng-relayd through adb port forwarding so that the data is
"forwarded" to the host.

*Here is the set up and the problem:*

for the device:  adb reverse tcp:5342 tcp:5342; adb reverse tcp:5343
tcp:5343; adb reverse tcp:5344 tcp:5344
Then starting up lttng with --live enabled.

*What is expected:*
lttng starts streaming to the localhost.
*What is seen:*
lttng-relayd fails to start because it cannot bind to the socket.

*The cause of this issue:*

Both adb reverse and lttng-relayd need to bind to the socket, so they
conflict with each other.


So what I want to ask is: for embedded system use cases, has anyone on
the team successfully streamed trace data in live mode to the host over
usb-based adb? If not, any idea or suggestion on how to proceed?

Amanda







--
Mathieu Desnoyers
EfficiOS Inc

Re: [lttng-dev] [PATCH] Fix mm_vmscan_lru_isolate tracepoint for RHEL 9.4 kernel

2024-05-22 Thread Mathieu Desnoyers via lttng-dev


On 2024-05-17 12:04, Kienan Stewart via lttng-dev wrote:

Hi Martin,

thanks for the patch.

I changed the version range slightly. The RHEL kernel 5.14.0-427.13.1 
still has the `isolate_mode` parameter in the `mm_vmscan_lru_isolate` 
tracepoint; it was only removed in 5.14.0-427.16.1.


I also forward ported the patch to the master branch.

The updated patches will be reviewed at: 
https://review.lttng.org/q/topic:%22buildfix-el9.4%22


Merged into lttng-modules master and stable-2.13, thanks!

Mathieu



thanks,
kienan

On 5/17/24 10:30 AM, Martin Hicks via lttng-dev wrote:



Red Hat has moved to using the format first found in the 6.7 kernel
for the mm_vmscan_lru_isolate tracepoint.

Signed-off-by: Martin Hicks 
---
  include/instrumentation/events/mm_vmscan.h | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/instrumentation/events/mm_vmscan.h b/include/instrumentation/events/mm_vmscan.h
index ea6f4b7..49a9eae 100644
--- a/include/instrumentation/events/mm_vmscan.h
+++ b/include/instrumentation/events/mm_vmscan.h
@@ -369,7 +369,9 @@ LTTNG_TRACEPOINT_EVENT_MAP(mm_shrink_slab_end,
  )
  #endif
-#if (LTTNG_LINUX_VERSION_CODE >= LTTNG_KERNEL_VERSION(6,7,0))
+#if (LTTNG_LINUX_VERSION_CODE >= LTTNG_KERNEL_VERSION(6,7,0) || \
+ LTTNG_RHEL_KERNEL_RANGE(5,14,0,427,0,0, 5,15,0,0,0,0))
+
  LTTNG_TRACEPOINT_EVENT(mm_vmscan_lru_isolate,
  TP_PROTO(int classzone_idx,



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Capturing snapshot on kernel panic

2024-05-16 Thread Mathieu Desnoyers via lttng-dev

he time
 > it enters.
 >
 > While this doesn't necessarily help your original question of panics, if
 > you want to snapshot before shutdown or reboot and are using systemd,
 > it's possible to leave a script or binary in a known directory so that
 > it's invoked prior to the rest of the shutdown sequence[4].
 >
 > [1]: https://lttng.org/docs/v2.13/#doc-persistent-memory-file-systems
 > [2]: https://github.com/systemd/systemd/blob/6533c14997700f74e9ea42121303fc1f5c63e62b/src/shutdown/shutdown.c
 > [3]: https://github.com/systemd/systemd/blob/main/src/shared/reboot-util.c#L77
 > [4]: https://www.systutorials.com/docs/linux/man/8-systemd-reboot/
 >
 > hope this helps,
 > kienan
 >
 >> Would you have any suggestions?
 >> Thanks for your help,
 >> Cheers
 >> Damien
 >>
 >> 
 >>
 >> # Prep output dir
 >> mkdir /application/trace/
 >> rm -rf /application/trace/*
 >>
 >> # Create session
 >> sudo lttng destroy snapshot-trace-session
 >> sudo lttng create snapshot-trace-session --snapshot --output="/application/trace/"
 >> sudo lttng enable-channel --kernel --num-subbuf=8 channelk
 >> sudo lttng enable-channel --userspace --num-subbuf=8 channelu
 >>
 >> # Configure session
 >> sudo lttng enable-event --kernel --syscall --all --channel channelk
 >> sudo lttng enable-event --kernel --tracepoint "sched*" --channel channelk
 >> sudo lttng enable-event --userspace --all --channel channelu
 >> sudo lttng add-context -u -t vtid -t procname
 >> sudo lttng remove-trigger trig_reboot
 >> sudo lttng add-trigger --name=trig_reboot \
 >>          --condition=event-rule-matches --type=kernel:syscall:entry \
 >>          --name=reboot \
 >>          --action=snapshot-session snapshot-trace-session \
 >>          --rate-policy=once-after:1
 >>
 >> # start & list info
 >> sudo lttng start
 >> sudo lttng list snapshot-trace-session
 >> sudo lttng list-triggers
 >>
 >> # test it...
 >> sudo reboot
 >>
 >> #=== reconnect and Nothing :(
 >> $ ls -alu /application/trace/
 >> drwxr-xr-x    2 u  u       4096 May 15  2024 .
 >> drwxr-xr-x   10 u  u       4096 May 15  2024 ..
 >>
 >>



--
*Damien Berget*



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


[lttng-dev] [RELEASE] LTTng-modules 2.13.13 and 2.12.17 (Linux kernel tracer)

2024-05-13 Thread Mathieu Desnoyers via lttng-dev


Hi,

This is a stable release announcement for the LTTng kernel tracer,
an out-of-tree kernel tracer for the Linux kernel.

The LTTng project provides low-overhead, correlated userspace and
kernel tracing on Linux. Its use of the Common Trace Format and a
flexible control interface allows it to fulfill various workloads.

* New in these releases:

- LTTng-modules 2.13.13:

  - Introduce support for Linux v6.9.

  - Remove unused duplicated code, add missing static to
    function definitions, and add missing includes for function
    declarations. These issues were observed when building against
    recent kernels with newer toolchains.

    We plan to adapt our CI to add jobs that report warnings
    as errors when building lttng-modules against recent kernels
    with a recent toolchain, so we can catch and fix those warnings
    earlier in the future.

- In both LTTng-modules 2.12.17 and 2.13.13:

  - Fix the incorrect get_pfnblock_flags_mask prototype, which did not
    match upstream after upstream commit 535b81e209219 (v5.9), and fix
    the prototype mismatch detection code as well. This affects the
    mm_page_alloc_extfrag event, which uses get_pageblock_migratetype().
    The kernel macro get_pageblock_migratetype was updated to pass three
    parameters to get_pfnblock_flags_mask at the same time as the kernel
    prototype was updated to expect three parameters, so it does not
    matter that the lttng-modules wrapper expects four parameters and
    passes all four to the kernel function. This issue should therefore
    not affect the runtime behavior.

  - Instrumentation updates to support EL 8.4+.

  - Instrumentation updates for RHEL kernels.

  - Instrumentation updates to the timer subsystem to adapt to
changes backported in the 4.19 stable kernels.


* Detailed change logs:

2024-05-13 (National Leprechaun Day) LTTng modules 2.13.13
* splice wrapper: Fix missing declaration
* page alloc wrapper: Fix get_pfnblock_flags_mask prototype
* lttng probe: include events-internal.h
* syscalls: Remove unused duplicated code
* statedump: Add missing events-internal.h include
* lttng-events: Add missing static
* event notifier: Add missing static
* context callstack: Add missing static
* lttng-clock: Add missing lttng/events-internal.h include
* lttng-calibrate: Add missing static and include
* lttng-bytecode: Remove dead code
* lttng-abi: Add missing static to function definitions
* ring buffer: Add missing static to function definitions
* blkdev wrapper: Fix constness warning
* Fix: timer_expire_entry changed in 4.19.312
* Fix: dev_base_lock removed in linux 6.9-rc1
* Fix: mm_compaction_migratepages changed in linux 6.9-rc1
* Fix: ASoC add component to set_bias_level events in linux 6.9-rc1
* Fix: ASoC snd_doc_dapm on linux 6.9-rc1
* Fix: build kvm probe on EL 8.4+
* Fix: support ext4_journal_start on EL 8.4+
* Fix: correct RHEL range for kmem_cache_free define

2024-05-13 (National Leprechaun Day) 2.12.17
* page alloc wrapper: Fix get_pfnblock_flags_mask prototype
* Fix: timer_expire_entry changed in 4.19.312
* Fix: build kvm probe on EL 8.4+
* Fix: support ext4_journal_start on EL 8.4+
* Fix: correct RHEL range for kmem_cache_free define

Project website: https://lttng.org
Documentation: https://lttng.org/docs
Download link: https://lttng.org/download

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [lttng-dev] [PATCH urcu] fix: handle EINTR correctly in get_cpu_mask_from_sysfs

2024-05-02 Thread Mathieu Desnoyers via lttng-dev


On 2024-05-02 10:32, Michael Jeanson wrote:

On 2024-05-02 09:54, Mathieu Desnoyers wrote:

On 2024-05-01 19:42, Benjamin Marzinski via lttng-dev wrote:

If the read() in get_cpu_mask_from_sysfs() fails with EINTR, the code is
supposed to retry, but the while loop condition has (bytes_read > 0),
which is false when read() fails with EINTR. The result is that the code
exits the loop, having only read part of the string.

Use (bytes_read != 0) in the while loop condition instead, since the
(bytes_read < 0) case is already handled in the loop.


Thanks for the fix! It is indeed the right thing to do.

I would like to integrate this fix into the librseq and libside
projects as well, but I notice that the copy in liburcu
is LGPLv2.1 whereas the copies in librseq and libside are
MIT.

Michael, should we first relicense the liburcu src/compat-smp.h
implementation to MIT so it matches the license of the copies
in librseq and libside ?


Sure, please go ahead.


For the record, we also have a copy of this code in lttng-ust,
also under MIT license. So liburcu's copy is the only outlier
there.

Thanks,

Mathieu




Thanks,

Mathieu



Signed-off-by: Benjamin Marzinski 
---
  src/compat-smp.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compat-smp.h b/src/compat-smp.h
index 31fa979..075a332 100644
--- a/src/compat-smp.h
+++ b/src/compat-smp.h
@@ -164,7 +164,7 @@ static inline int get_cpu_mask_from_sysfs(char *buf, size_t max_bytes, const cha
 
 		total_bytes_read += bytes_read;
 		assert(total_bytes_read <= max_bytes);
-	} while (max_bytes > total_bytes_read && bytes_read > 0);
+	} while (max_bytes > total_bytes_read && bytes_read != 0);
 
 	/*
 	 * Make sure the mask read is a null terminated string.





--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH urcu] fix: handle EINTR correctly in get_cpu_mask_from_sysfs

2024-05-02 Thread Mathieu Desnoyers via lttng-dev


On 2024-05-01 19:42, Benjamin Marzinski via lttng-dev wrote:

If the read() in get_cpu_mask_from_sysfs() fails with EINTR, the code is
supposed to retry, but the while loop condition has (bytes_read > 0),
which is false when read() fails with EINTR. The result is that the code
exits the loop, having only read part of the string.

Use (bytes_read != 0) in the while loop condition instead, since the
(bytes_read < 0) case is already handled in the loop.
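
As a rough illustration of the loop condition described above, here is a
minimal sketch in C. It is not the liburcu code: the helper name
read_retry_eintr() and its buffer handling are assumptions made for this
example only. The point it tries to show is that after read() fails with
EINTR the byte count is negative, so a "!= 0" test keeps the loop going
where a "> 0" test would end it early.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of liburcu): read from fd until EOF or
 * until max_bytes - 1 bytes are stored, retrying when read() is
 * interrupted by a signal. Returns the number of bytes stored, or -1 on
 * error. The result is always null-terminated on success.
 */
static ssize_t read_retry_eintr(int fd, char *buf, size_t max_bytes)
{
	size_t total_bytes_read = 0;
	ssize_t bytes_read;

	if (max_bytes == 0)
		return -1;
	do {
		bytes_read = read(fd, buf + total_bytes_read,
				max_bytes - 1 - total_bytes_read);
		if (bytes_read < 0) {
			if (errno == EINTR)
				continue;	/* bytes_read is -1 here, so "!= 0" retries. */
			return -1;		/* Any other error is fatal. */
		}
		total_bytes_read += (size_t) bytes_read;
		assert(total_bytes_read <= max_bytes - 1);
	} while (max_bytes - 1 > total_bytes_read && bytes_read != 0);

	buf[total_bytes_read] = '\0';
	return (ssize_t) total_bytes_read;
}
```

Called on a pipe or a sysfs file descriptor, this returns the whole string
(up to max_bytes - 1 bytes) even if a signal interrupts one of the read()
calls along the way.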


Thanks for the fix! It is indeed the right thing to do.

I would like to integrate this fix into the librseq and libside
projects as well, but I notice that the copy in liburcu
is LGPLv2.1 whereas the copies in librseq and libside are
MIT.

Michael, should we first relicense the liburcu src/compat-smp.h
implementation to MIT so it matches the license of the copies
in librseq and libside ?

Thanks,

Mathieu



Signed-off-by: Benjamin Marzinski 
---
  src/compat-smp.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compat-smp.h b/src/compat-smp.h
index 31fa979..075a332 100644
--- a/src/compat-smp.h
+++ b/src/compat-smp.h
@@ -164,7 +164,7 @@ static inline int get_cpu_mask_from_sysfs(char *buf, size_t max_bytes, const cha
 
 		total_bytes_read += bytes_read;
 		assert(total_bytes_read <= max_bytes);
-	} while (max_bytes > total_bytes_read && bytes_read > 0);
+	} while (max_bytes > total_bytes_read && bytes_read != 0);
 
 	/*
 	 * Make sure the mask read is a null terminated string.

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


[lttng-dev] [RELEASE] LTTng-UST 2.12.10 and 2.13.8 (Linux user-space tracer)

2024-04-19 Thread Mathieu Desnoyers via lttng-dev


LTTng-UST, the Linux Trace Toolkit Next Generation Userspace Tracer,
is a low-overhead application tracer. The library "liblttng-ust" enables
tracing of applications and libraries.

New in both 2.12.10 and 2.13.8:

* Add close_range wrapper to liblttng-ust-fd.so

GNU libc 2.34 implements a new close_range symbol which is used
by the ssh client and other applications to close all file descriptors,
including those which do not belong to the application. Override
this symbol to prevent the application from closing file descriptors
actively used by lttng-ust.

* Fix: libc wrapper: use initial-exec for malloc_nesting TLS

Use the initial-exec TLS model for the malloc_nesting nesting guard
variable to ensure that the GNU libc implementation of the TLS access
does not trigger infinite recursion by calling the memory allocator
wrapper functions, which can happen with the global-dynamic model.

This fixes a liblttng-ust-libc-wrapper.so regression on recent
Fedora distributions.

* lttng-ust(3): Fix wrong len_type for sequence

`len_type' of a sequence field must be an unsigned integer type. Some
examples provided in the man page incorrectly used a signed integer
type, which compiled correctly but caused errors while decoding.

New in 2.13.8:

* ust-tracepoint-event: Add static check of sequences length type

Add a compile-time check to validate that unsigned types are used
for the length field of sequences.
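
To give an idea of what such a compile-time check can look like, here is a
small sketch in plain C11. It is not the macro used by lttng-ust: the names
IS_UNSIGNED_TYPE and CHECK_LENGTH_TYPE are made up for this example. It
relies on the fact that -1 converted to an unsigned type wraps to a positive
value, whereas it stays negative for a signed type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: evaluates to nonzero when 'type' is an unsigned
 * integer type, because (type) -1 wraps to the maximum value of the type.
 */
#define IS_UNSIGNED_TYPE(type)	((type) -1 > (type) 0)

/* Hypothetical check: reject signed length types at compile time. */
#define CHECK_LENGTH_TYPE(type) \
	static_assert(IS_UNSIGNED_TYPE(type), \
		"sequence length type must be unsigned")

CHECK_LENGTH_TYPE(uint32_t);
CHECK_LENGTH_TYPE(size_t);
/* CHECK_LENGTH_TYPE(int32_t); would be rejected by the compiler. */
```

With a check of this kind, a signed length type is caught when the
tracepoint provider is built rather than when the trace is decoded.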

Detailed change logs:

2024-04-19 (National Garlic Day) lttng-ust 2.13.8
* Add close_range wrapper to liblttng-ust-fd.so
* ust-tracepoint-event: Add static check of sequences length type
* lttng-ust(3): Fix wrong len_type for sequence
* Fix: libc wrapper: use initial-exec for malloc_nesting TLS

2024-04-19 (National Garlic Day) lttng-ust 2.12.10
* Add close_range wrapper to liblttng-ust-fd.so
* lttng-ust(3): Fix wrong len_type for sequence
* Fix: libc wrapper: use initial-exec for malloc_nesting TLS


Project website: https://lttng.org
Documentation: https://lttng.org/docs
Download link: https://lttng.org/download

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [lttng-dev] Software Heritage archival notification for git.liburcu.org

2024-04-15 Thread Mathieu Desnoyers via lttng-dev


On 2024-04-15 10:20, Michael Jeanson via lttng-dev wrote:

On 2024-04-14 20:39, Paul Wise wrote:

On Thu, 2024-04-11 at 13:45 -0400, Michael Jeanson wrote:


I see no issues with this, thanks for the heads-up.


PS: I note that git.liburcu.org and git.lttng.org seem to have
identical contents. I wonder if SWH should be archiving just one of
them or if we should archive both just in case they get split up?


At the moment 'git.liburu.org' is just a CNAME for 'git.lttng.org', we 


Typo: git.liburcu.org

Thanks,

Mathieu


don't have plans to split them up so far.



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Compile fix for urcu-bp.c

2024-04-01 Thread Mathieu Desnoyers via lttng-dev


Hi,

There are a few things missing before I can take this patch:

- Missing commit message describing the issue,
- Missing "Signed-off-by" tag.

Thanks!

Mathieu

On 2024-03-29 10:06, Duncan Sands via lttng-dev wrote:

--- src/urcu-bp.c
+++ src/urcu-bp.c
@@ -409,7 +409,7 @@ void expand_arena(struct registry_arena *arena)
  new_chunk_size_bytes, 0);
  if (new_chunk != MAP_FAILED) {
  /* Should not have moved. */
-    assert(new_chunk == last_chunk);
+    urcu_posix_assert(new_chunk == last_chunk);
  memset((char *) last_chunk + old_chunk_size_bytes, 0,
  new_chunk_size_bytes - old_chunk_size_bytes);
  last_chunk->capacity = new_capacity;


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


[lttng-dev] [RELEASE] LTTng-modules 2.12.16 and 2.13.12 (Linux kernel tracer)

2024-03-21 Thread Mathieu Desnoyers via lttng-dev


Hi,

This is a release announcement for the currently maintained
LTTng-modules Linux kernel tracer stable branches.

* New and noteworthy in these releases:

Linux kernel v6.8 is now supported by LTTng modules 2.13.12. If you need
support for recent kernels (v5.18+), you will need to upgrade to a
recent LTTng-modules 2.13.x.

Both releases correct issues with SLE kernel version range detection.

A compilation fix for RHEL 9.3 kernel is present in v2.13.12.

Feedback is welcome!

Thanks,

Mathieu

Project website: https://lttng.org
Documentation: https://lttng.org/docs
Download link: https://lttng.org/download

Detailed change logs:

2024-03-21 (National Common Courtesy Day) LTTng modules 2.13.12
* docs: Add supported versions and fix-backport policy
* docs: Add links to project resources
* Fix: Correct minimum version in jbd2 SLE kernel range
* Fix: Handle recent SLE major version codes
* Fix: build on sles15sp4
* Compile fixes for RHEL 9.3 kernels
* Fix: ext4_discard_preallocations changed in linux 6.8.0-rc3
* Fix: btrfs_get_extent flags and compress_type changed in linux 6.8.0-rc1
* Fix: btrfs_chunk tracepoints changed in linux 6.8.0-rc1
* Fix: strlcpy removed in linux 6.8.0-rc1
* Fix: timer_start changed in linux 6.8.0-rc1
* Fix: sched_stat_runtime changed in linux 6.8.0-rc1

2024-03-21 (National Common Courtesy Day) 2.12.16
* fix: lttng-probe-kvm-x86-mmu build with linux 6.6
* docs: Add supported versions and fix-backport policy
* docs: Add links to project resources
* Fix: Correct minimum version in jbd2 SLE kernel range
* Fix: Handle recent SLE major version codes
* Fix: build on sles15sp4

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [lttng-dev] [PATCH] coredump debugging: add a tracepoint to report the coredumping

2024-02-23 Thread Mathieu Desnoyers via lttng-dev


On 2024-02-23 09:26, Steven Rostedt wrote:

On Mon, 19 Feb 2024 13:01:16 -0500
Mathieu Desnoyers  wrote:


Between "sched_process_exit" and "sched_process_free", the task can still be
observed by a trace analysis looking at sched and signal events: it's a zombie
at that stage.


Looking at the history of this tracepoint, it was added in 2008 by commit
0a16b60758433 ("tracing, sched: LTTng instrumentation - scheduler").
Hmm, LLTng? I wonder who the author was?


[ common typo: LLTng -> LTTng ;-) ]



   Author: Mathieu Desnoyers 

  :-D

Mathieu, I would say it's your call on where the tracepoint can be located.
You added it, you own it!


Wow! that's now 16 years ago :)

I've checked with Matthew Khouzam (maintainer of Trace Compass),
who cares about this tracepoint, and we have not identified any
significant impact of moving it on its model of the scheduler, other
than slightly changing its timing.

I've also checked quickly in lttng-analyses and have not found
any code that cares about its specific placement.

So I would say go ahead and move it earlier in do_exit(), it's
fine by me.

If you are interested in a bit of archeology, "sched_process_free"
originated from my ltt-experimental 0.1.99.13 kernel patch against
2.6.12-rc4-mm2 back in September 2005 (that's 19 years ago). It was
a precursor to the LTTng 0.x kernel patchset.

https://lttng.org/files/ltt-experimental/patch-2.6.12-rc4-mm2-ltt-exp-0.1.99.13.gz

Index: kernel/exit.c
===
--- a/kernel/exit.c (.../trunk/kernel/linux-2.6.12-rc4-mm2) (revision 41)
+++ b/kernel/exit.c (.../branches/mathieu/linux-2.6.12-rc4-mm2) (revision 41)
@@ -4,6 +4,7 @@
  *  Copyright (C) 1991, 1992  Linus Torvalds
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -55,6 +56,7 @@ static void __unhash_process(struct task
 	}
 
 	REMOVE_LINKS(p);
+	trace_process_free(p->pid);
 }
 
 void release_task(struct task_struct * p)
@@ -832,6 +834,8 @@ fastcall NORET_TYPE void do_exit(long co
 	}
 	exit_mm(tsk);
 
+	trace_process_exit(tsk->pid);
+
 	exit_sem(tsk);
 	__exit_files(tsk);
 	__exit_fs(tsk);

This was a significant improvement over the prior LTT which only
had the equivalent of "sched_process_exit", which caused issues
with the Linux scheduler model in LTTV due to zombie processes.

Here is where it appeared in LTT back in 1999:

http://www.opersys.com/ftp/pub/LTT/TracePackage-0.9.0.tgz

patch-ltt-2.2.13-991118

diff -urN linux/kernel/exit.c linux-2.2.13/kernel/exit.c
--- linux/kernel/exit.c Tue Oct 19 20:14:02 1999
+++ linux-2.2.13/kernel/exit.c  Sun Nov  7 23:49:17 1999
@@ -14,6 +14,8 @@
 #include 
 #endif
 
+#include 
+
 #include 
 #include 
 #include 
@@ -386,6 +388,8 @@
 	del_timer(&current->real_timer);
 	end_bh_atomic();
 
+	TRACE_PROCESS(TRACE_EV_PROCESS_EXIT, 0, 0);
+
 	lock_kernel();
 fake_volatile:
 #ifdef CONFIG_BSD_PROCESS_ACCT

And it was moved to its current location (after exit_mm()) a bit
later (2001):

http://www.opersys.com/ftp/pub/LTT/TraceToolkit-0.9.5pre2.tgz

Patches/patch-ltt-linux-2.4.5-vanilla-010909-1.10

diff -urN linux/kernel/exit.c /ext2/home/karym/kernel/linux-2.4.5/kernel/exit.c
--- linux/kernel/exit.c Fri May  4 17:44:06 2001
+++ /ext2/home/karym/kernel/linux-2.4.5/kernel/exit.c   Wed Jun 20 12:39:24 2001
@@ -14,6 +14,8 @@
 #include 
 #endif
 
+#include 

+
 #include 
 #include 
 #include 
@@ -439,6 +441,8 @@
 #endif
__exit_mm(tsk);
 
+	TRACE_PROCESS(TRACE_EV_PROCESS_EXIT, 0, 0);

+
lock_kernel();
sem_exit();
__exit_files(tsk);

So this sched_process_exit placement was actually decided
by Karim Yaghmour back in the LTT days (2001). I don't think
he will mind us moving it around some 23 years later. ;)

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] New TLS usage in libgcc_s.so.1, compatibility impact

2024-01-15 Thread Mathieu Desnoyers via lttng-dev


On 2024-01-15 14:42, Florian Weimer wrote:

* Mathieu Desnoyers:


[...]


General use of lttng should be fine, I think, only the malloc wrapper
has this problem.


The purpose of the nesting counter TLS variable in the malloc wrapper
is to catch situations like this where a global-dynamic TLS access
(or any unexpected memory access done as a side-effect from calling
libc) from within LTTng-UST instrumentation would internally attempt to
call recursively into the malloc wrapper. In that nested case, we skip
the instrumentation and call the libc function directly.

I agree with your conclusion that only this nesting counter gating variable
actually needs to be initial-exec.
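
For illustration, the nesting guard described above could be sketched as
follows. This is a minimal sketch and not the actual LTTng-UST wrapper:
wrapper_nesting, wrapped_malloc() and record_alloc_event() are hypothetical
names, and the instrumentation is simulated by a helper that allocates
memory itself in order to force a nested call.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical nesting counter. With the initial-exec TLS model the
 * variable lives at a fixed offset from the thread pointer, so accessing
 * it does not go through __tls_get_addr() and cannot allocate memory.
 */
static __thread int wrapper_nesting __attribute__((tls_model("initial-exec")));

/* Number of times the (simulated) instrumentation ran. */
static int recorded_events;

static void *wrapped_malloc(size_t size);

/* Stand-in for the tracepoint: it allocates, which re-enters the wrapper. */
static void record_alloc_event(size_t size)
{
	void *scratch = wrapped_malloc(size);

	free(scratch);
	recorded_events++;
}

/* Hypothetical malloc wrapper: only the outermost call is instrumented. */
static void *wrapped_malloc(size_t size)
{
	void *ptr;

	wrapper_nesting++;
	ptr = malloc(size);
	if (wrapper_nesting == 1)
		record_alloc_event(size);	/* nested calls skip this */
	wrapper_nesting--;
	assert(wrapper_nesting >= 0);
	return ptr;
}
```

In this sketch a single outer call to wrapped_malloc() records exactly one
event, even though the instrumentation re-enters the wrapper.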




But moving all TLS variables used by lttng-ust from global-dynamic to
initial-exec is tricky, because a prior attempt to do so introduced
regressions in use-cases where lttng-ust was dlopen'd by Java or
Python. AFAIU, in those situations the runtimes were already using most
of the extra memory pool reserved for the initial-exec variables of
dlopen'd libraries, causing dlopen of lttng-ust to fail.


Oh, right, that makes it quite difficult.  Could you link a private copy
of the libraries into the wrapper that uses initial-exec TLS?


Unfortunately not easily, because by design LTTng-UST is meant to be a
singleton per-process. Changing this would have far-reaching impacts on
interactions with the LTTng-UST tracepoint instrumentation, as well as
impacts on synchronization between the LTTng-UST agent thread and
application calling fork/clone. Also AFAIR, the LTTng session daemon
(at least until recently) does not expect multiple concurrent
registrations from a given process.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] New TLS usage in libgcc_s.so.1, compatibility impact

2024-01-15 Thread Mathieu Desnoyers via lttng-dev

ht stage of the test.)  In this
particular case, we can also paper over the test failure in glibc by not
calling free at all because the argument is a null pointer:

diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index 7b3dd9ab60..14c71cbd06 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -819,7 +819,8 @@ _dl_update_slotinfo (unsigned long int req_modid, size_t new_gen)
 dtv entry free it.  Note: this is not AS-safe.  */
  /* XXX Ideally we will at some point create a memory
 pool.  */
- free (dtv[modid].pointer.to_free);
+ if (dtv[modid].pointer.to_free != NULL)
+   free (dtv[modid].pointer.to_free);
  dtv[modid].pointer.val = TLS_DTV_UNALLOCATED;
  dtv[modid].pointer.to_free = NULL;

As the comment hints, we shouldn't be using malloc for TLS memory at all
because it is not AS-safe, but that's a long-term change.  This change
seems rather specific to this particular test case failure because it
relies on libgcc_s.so.1 never using TLS before it gets unloaded.

Regarding the libgcc_s side, I'm not sure if the TLS usage there should
be considered a real problem, although I'm a bit nervous about it.
However, the current implementation caches one page of trampolines past
the outermost nested function pointer deallocation (otherwise creating
one function pointer per thread in a loop would be really expensive).
It looks to me like that page is never freed, so if the thread exits even with
proper unwinding (e.g., on glibc with code compiled with -fexceptions),
there is a memory leak.  Integration with glibc could avoid this issue,
and also help with the longjmp problem, and fix setcontext/swapcontext,
too.

Thanks,
Florian



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


[lttng-dev] [RELEASE] LTTng-modules 2.12.15 and 2.13.11 (Linux kernel tracer)

2024-01-10 Thread Mathieu Desnoyers via lttng-dev


The LTTng modules provide Linux kernel tracing capability to the LTTng
tracer toolset.

* New and noteworthy in these releases:

Newer Linux kernels (v6.6 and v6.7) are now supported by LTTng modules
2.13.11. If you need support for recent kernels (v5.18+), you will
need to upgrade to a recent LTTng-modules 2.13.x.

The "prio" context has been fixed in 2.13.11 to eliminate a crash
triggered by calling a NULL pointer address when using the "prio"
context (lttng add-context -k -t prio). This issue was introduced
when refactoring the prio context code during the 2.13 development.
The missing initialization was re-introduced, and the use of the kernel
"task_prio()" symbol was entirely replaced by inlining a copy of this
trivial function into lttng-modules instead.
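
As a rough illustration of how small that inlined function is, here is a
minimal user-space sketch. It assumes the upstream kernel definition of
task_prio(), which to my understanding subtracts MAX_RT_PRIO (100) from
the task's prio field. struct task_struct_stub and wrapper_task_prio()
are hypothetical stand-ins, not the names used in lttng-modules.

```c
/* Value of MAX_RT_PRIO in the Linux kernel scheduler headers. */
#define MAX_RT_PRIO 100

/* Stand-in for the kernel's struct task_struct; only the field used here. */
struct task_struct_stub {
	int prio;
};

/* Hypothetical inlined copy of the task_prio() logic. */
static inline int wrapper_task_prio(const struct task_struct_stub *p)
{
	return p->prio - MAX_RT_PRIO;
}
```

With this definition, a task whose prio field is 120 reports a "prio"
context value of 20.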

The "built-in.sh" script, which can be used to add a link to lttng-modules
within a kernel source tree to build LTTng into a Linux kernel image,
has been updated to adapt to changes introduced in Linux v6.1.

A work-around to ensure that LTTng-modules works fine on CPUs and kernels
with IBT support enabled has been integrated:

When the Intel IBT feature is enabled, a CPU supporting this feature
validates that all indirect jumps/calls land on an ENDBR64 instruction.

The kernel seals functions which are not meant to be called indirectly,
which means that calling functions indirectly from their address fetched
using kallsyms or kprobes triggers a crash.

Use the MSR_IA32_S_CET CET_ENDBR_EN MSR bit to temporarily disable ENDBR
validation around indirect calls to kernel functions. Considering that
the main purpose of this feature is to prevent ROP-style attacks,
disabling the ENDBR validation temporarily around the call from a kernel
module does not affect the ROP protection.


Both 2.13.11 and 2.12.15:

- Fix an issue with importing VFS namespace for Android kernels.

- Fix build for RHEL 8.8 with linux 4.18.0-477.10.1+

- Fix a hardening OOPS during validation of immediate strings in the bytecode
  validator when CONFIG_UBSAN_BOUNDS and/or CONFIG_FORTIFY_SOURCE are
  configured. It boils down to changing 0-len arrays to flexible arrays
  to let the toolchain know about our intent.

- Add Ubuntu Kinetic kernel ranges for jbd2 instrumentation.
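
To illustrate the 0-len array to flexible array change mentioned in the
bytecode validator fix above, here is a minimal user-space sketch. The
structure and function names are hypothetical and do not match the
lttng-modules bytecode definitions. The point is only that declaring the
trailing array as "char data[]" rather than "char data[0]" tells the
toolchain that the array extends past the end of the structure, so
bounds-checking instrumentation does not treat every access as out of
range.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical operand holding an immediate string. */
struct literal_string {
	size_t len;	/* length of data[], including the trailing '\0' */
	char data[];	/* flexible array member (previously: char data[0]) */
};

/* Allocate the header and the string payload in a single block. */
static struct literal_string *literal_string_create(const char *s)
{
	size_t len = strlen(s) + 1;
	struct literal_string *lit = malloc(sizeof(*lit) + len);

	if (!lit)
		return NULL;
	lit->len = len;
	memcpy(lit->data, s, len);
	return lit;
}
```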

Project website: https://lttng.org
Documentation: https://lttng.org/docs
Download link: https://lttng.org/download

Detailed change logs:

2024-01-10 (National Houseplant Appreciation Day) LTTng modules 2.13.11
* Fix: Include linux/sched/rt.h for kernels v3.9 to v3.14
* Fix: Disable IBT around indirect function calls
* Inline implementation of task_prio()
* Fix: prio context NULL pointer exception
* Fix: MODULE_IMPORT_NS is introduced in kernel 5.4
* Android: Import VFS namespace for android common kernel
* Fix: get_file_rcu is missing in kernels < 4.1
* fix: lookup_fd_rcu replaced by lookup_fdget_rcu in linux 6.7.0-rc1
* fix: mm, vmscan signatures changed in linux 6.7.0-rc1
* fix: phys_proc_id and cpu_core_id moved in linux 6.7.0-rc1
* Fix build for RHEL 8.8 with linux 4.18.0-477.10.1+
* Fix: bytecode validator: oops during validation of immediate string
* fix: lttng-probe-kvm-x86-mmu build with linux 6.6
* fix: built-in lttng with kernel >= v6.1
* fix: ubuntu kinetic kernel range for jdb2

2024-01-10 (National Houseplant Appreciation Day) 2.12.15
* Fix: MODULE_IMPORT_NS is introduced in kernel 5.4
* Android: Import VFS namespace for android common kernel
* Fix build for RHEL 8.8 with linux 4.18.0-477.10.1+
* Fix: bytecode validator: oops during validation of immediate string
* fix: ubuntu kinetic kernel range for jdb2

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

[lttng-dev] [RELEASE] LTTng-UST 2.12.9 and 2.13.7 (Linux user-space tracer)

2024-01-10 Thread Mathieu Desnoyers via lttng-dev


LTTng-UST, the Linux Trace Toolkit Next Generation Userspace Tracer,
is a low-overhead application tracer. The library "liblttng-ust" enables
tracing of applications and libraries.

* New and noteworthy in these releases:

Specific to 2.13.7, a fix for misaligned urcu reader accesses was
introduced. It only applies to the lttng-ust 2.13 branch because
it implements its own "lttng-ust-urcu" flavor.

Also specific to 2.13.7, "sync" vs "unsync" enablers are introduced
to eliminate an O(n*m) algorithm:

Eliminate iteration over unmodified enablers when synchronizing the
enablers vs event state.

The intent is to turn an O(m*n) algorithm (m = number of enablers, n =
number of event probes) into an O(n) one when enabling many additional
events while tracing is active.
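
The idea can be sketched as follows. This is not the lttng-ust
implementation: the structures, the "synced" flag and sync_enablers() are
hypothetical, and a flag is used here where a real implementation would
more likely keep separate lists of synchronized and unsynchronized
enablers. The function returns the number of name comparisons so the cost
of each pass is visible.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enabler: a request to enable events matching a name. */
struct enabler {
	const char *event_name;
	bool synced;	/* true once matched against all event probes */
};

/* Hypothetical event probe. */
struct probe {
	const char *event_name;
	bool enabled;
};

/*
 * Match only the enablers which were added or modified since the last
 * pass. Returns the number of enabler/probe comparisons performed.
 */
static size_t sync_enablers(struct enabler *enablers, size_t nr_enablers,
		struct probe *probes, size_t nr_probes)
{
	size_t comparisons = 0;

	for (size_t i = 0; i < nr_enablers; i++) {
		if (enablers[i].synced)
			continue;	/* unmodified enabler: skip the probe scan */
		for (size_t j = 0; j < nr_probes; j++) {
			comparisons++;
			if (strcmp(enablers[i].event_name,
					probes[j].event_name) == 0)
				probes[j].enabled = true;
		}
		enablers[i].synced = true;
	}
	return comparisons;
}
```

In this sketch, adding one enabler to a set of already synchronized
enablers costs one scan over the probes, instead of one scan per existing
enabler.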

Specifically in 2.12.9, the rfork() wrapper is fixed: it was not
passing the flags arguments. This was fixed in a larger commit
in the master and stable-2.13 branches.

Both stable branches include:

- a build system fix for documentation examples with old autoconf when
  used with a relative path.

- a clang warning fix around volatile qualifier on function pointers.

- a Python agent uplift to adapt to modern python (>= 3.10).

- a fix for a possible race condition in the ustfork helper.

Enjoy!

Mathieu

Project website: https://lttng.org
Documentation: https://lttng.org/docs
Download link: https://lttng.org/download

Detailed change logs:

2024-01-10 (National Houseplant Appreciation Day) lttng-ust 2.13.7
* fix: invoke MKDIR_P before changing directories
* fix: -Wsingle-bit-bitfield-constant-conversion with clang16
* fix: clean java inner class files in examples
* Introduce sync vs unsync enablers
* Fix: misaligned urcu reader accesses
* ustfork: Fix warning about volatile qualifier
* ustfork: Fix possible race conditions
* Fix: tracepoint: Remove trailing \ at the end of macro
* fix: python agent: use stdlib distutils when setuptools is installed
* fix: python agent: install on Debian python >= 3.10
* fix: python agent: Add a dependency on generated files
* python: use setuptools with python >= 3.12

2024-01-10 (National Houseplant Appreciation Day) lttng-ust 2.12.9
* fix: invoke MKDIR_P before changing directories
* fix: clean java inner class files in examples
* ustfork: Fix warning about volatile qualifier
* ustfork: Fix possible race conditions
* Fix: FreeBSD: Pass flags arguments to rfork wrapper
* fix: python agent: use stdlib distutils when setuptools is installed
* fix: python agent: install on Debian python >= 3.10
* fix: python agent: Add a dependency on generated files
* python: use setuptools with python >= 3.12


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [lttng-dev] [PATCH lttng-modules] Android: Import VFS namespace for android common kernel

2023-12-18 Thread Mathieu Desnoyers via lttng-dev


On 2023-12-18 05:16, Lei wang via lttng-dev wrote:

The Android GKI kernel adds a limitation on fs interface usage.
The VFS namespace needs to be imported explicitly for lttng-modules
to work.



Merged into lttng-modules master and 2.13 branches, thanks!

Mathieu


Signed-off-by: Lei wang 
---
  src/wrapper/kallsyms.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/src/wrapper/kallsyms.c b/src/wrapper/kallsyms.c
index 97897c4..9398c83 100644
--- a/src/wrapper/kallsyms.c
+++ b/src/wrapper/kallsyms.c
@@ -113,3 +113,7 @@ unsigned long wrapper_kallsyms_lookup_name(const char *name)
  EXPORT_SYMBOL_GPL(wrapper_kallsyms_lookup_name);
  
  #endif

+
+#ifdef CONFIG_ANDROID
+MODULE_IMPORT_NS(VFS_internal_I_am_really_a_filesystem_and_am_NOT_a_driver);
+#endif


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] TSAN build broken on master branch

2023-09-23 Thread Mathieu Desnoyers via lttng-dev


On 9/21/23 21:21, Olivier Dion via lttng-dev wrote:

On Thu, 21 Sep 2023, Ondřej Surý via lttng-dev wrote:
[...]

It fails with:

rculfhash.c:1189:2: error: address argument to atomic operation must be a pointer to integer ('typeof (node_next)' (aka 'struct cds_lfht_node **') invalid)
 uatomic_or_mo(node_next, REMOVED_FLAG, CMM_RELEASE);
 ^~~
../include/urcu/uatomic/builtins-generic.h:123:10: note: expanded from macro 'uatomic_or_mo'
 (void) __atomic_or_fetch(cmm_cast_volatile(addr), mask, \
^ ~~~
rculfhash.c:1440:3: error: address argument to atomic operation must be a pointer to integer ('typeof (fini_bucket_next)' (aka 'struct cds_lfht_node **') invalid)
 uatomic_or(fini_bucket_next, REMOVED_FLAG);
 ^~
../include/urcu/uatomic/builtins-generic.h:130:2: note: expanded from macro 'uatomic_or'
 uatomic_or_mo(addr, mask, CMM_RELAXED)
 ^~
../include/urcu/uatomic/builtins-generic.h:123:10: note: expanded from macro 'uatomic_or_mo'
 (void) __atomic_or_fetch(cmm_cast_volatile(addr), mask, \
^ ~~~


Eh I thought we fixed that.  Clang is very strict about these things.

You can apply the following
<https://review.lttng.org/c/userspace-rcu/+/10911/1>.  That ought to fix
the issue until we merge the patch.


Fix merged into liburcu master, thanks!
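
For readers hitting the same clang diagnostic elsewhere: the compiler
refuses __atomic_or_fetch() when the address operand points to a pointer
type. The sketch below shows one possible way to give the builtin an
integer-typed address, by storing the link in a union with a uintptr_t
view. It is only an illustration under that assumption and is not the
content of the merged fix; struct node and the helper names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define REMOVED_FLAG	((uintptr_t) 1)

/* Hypothetical node: the low bit of the link doubles as a "removed" flag. */
struct node {
	union {
		struct node *ptr;	/* pointer view, used for initialization */
		uintptr_t word;		/* integer view, used for atomic operations */
	} next;
};

/* Set the flag with release ordering through the integer view. */
static void node_set_removed(struct node *n)
{
	(void) __atomic_or_fetch(&n->next.word, REMOVED_FLAG, __ATOMIC_RELEASE);
}

static int node_is_removed(struct node *n)
{
	return (__atomic_load_n(&n->next.word, __ATOMIC_ACQUIRE) & REMOVED_FLAG) != 0;
}

/* Return the successor with the flag bit masked out. */
static struct node *node_next(struct node *n)
{
	uintptr_t word = __atomic_load_n(&n->next.word, __ATOMIC_ACQUIRE);

	return (struct node *) (word & ~REMOVED_FLAG);
}
```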

Mathieu

  


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Profiling LTTng tracepoint latency on different arm platforms

2023-09-11 Thread Mathieu Desnoyers via lttng-dev


On 9/10/23 10:18, Mousa, Anas wrote:

Hey Mathieu,


Hi Anas,



We see that upon recording a tracepoint, there are multiple stages of
reserve-commit-write, where atomics and shared memory accesses take up a
big part of the recording time. We're wondering, is there a "light-mode"
of recording a tracepoint involving less logic, or a mode which can
potentially have lower latency?


I've been working on the rseq(2) system call for a few years now, and 
this is intended to help reduce the cost of lttng-ust's ring buffer 
atomics on the tracing fast-path. The road ahead there is integration of 
rseq with lttng-ust, which did not show up on our customer feature 
requirements radar yet.


In terms of logic involved in the lttng-ust tracepoints, I hope that my 
current work on "libside" will help steer away from tracepoint providers 
based on macros and generated code, replacing this by an efficient 
bytecode interpreter. This should allow me to inline many of the calls 
that are currently needed between the tracepoint probe provider and the 
lttng-ust ring buffer. Again, this is an area where I think we can have 
great speed improvements, but it did not show up on our customer's 
feature requirement radar yet.



Also, are there any recent docs to share regarding tracepoint latency?


There is a Polytechnique student who extensively analyzed this recently. 
Michel, do you have a pointer to his work ?


Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [RFC] Deprecating RCU signal flavor

2023-08-23 Thread Mathieu Desnoyers via lttng-dev


On 8/23/23 10:47, Paul E. McKenney wrote:

On Mon, Aug 21, 2023 at 11:43:32AM -0400, Mathieu Desnoyers wrote:

On 8/15/23 08:38, Mathieu Desnoyers via lttng-dev wrote:

On 8/14/23 17:05, Olivier Dion via lttng-dev wrote:


After discussing it with Mathieu, we agree on the following 3 phases for
deprecating the signal flavor:

   1) liburcu-signal will be implemented in terms of liburcu-mb. The only
   difference between the two flavors will be the public header files,
   linked symbols and library name.  Note that this adds a regression in
   terms of performance, since the implementation of liburcu-mb adds memory
   barriers on the reader side which are not present in the original
   liburcu-signal implementation.

   2) Adding the deprecated attribute to every public function exposed by
   the liburcu-signal flavor.  At this point, tests for liburcu-signal
   will also be removed from the project.  There will be no more support
   for this flavor.

   3) Removing the liburcu-signal flavor completely from the project.

Finally, here is my tentative release version for each phase:

   1) 0.15.0 [October 2023] (also TSAN support yay!)

   2) 0.15.1

   3) 0.16.0 || 1.0.0 (maybe a major bump since this is an API breaking
   change)


There is a distinction between the version number of the liburcu project
(0.14) and the ABI soname for the shared objects. We may be able to do
step (3) without going to 1.0.0 (I don't see removal of the urcu-signal
flavor a strong enough motivation for hitting 1.0.0 yet).

Technically speaking, given that we would be removing the entire
liburcu-signal.so shared object, we would not be changing _symbols_
within an existing shared object, therefore I'm not even sure we need to
bump the soname for all the other remaining shared objects.


So after merging this commit:

 Phase 1 of deprecating liburcu-signal
 The first phase of liburcu-signal deprecation consists of implementing
 it in terms of liburcu-mb. In other words, liburcu-signal is identical to
 liburcu-mb with the exception of the function symbols and public header
 files.
 This is done by:
   1) Removing the RCU_SIGNAL specific code in urcu.c
   2) Making the RCU_MB specific code also specific to RCU_SIGNAL in
   urcu.c
   3) Rewriting _urcu_signal_read_unlock_update_and_wakeup to use an
   atomic store with CMM_SEQ_CST instead of a CMM_RELAXED store with
   cmm_barrier() around it. We could keep the explicit barriers, but that
   would require adding some cmm_annotate annotations. Therefore, to be
   less intrusive in a public header file, simply use CMM_SEQ_CST
   like for the mb flavor.

I notice that an application previously built against urcu-signal with
_LGPL_SOURCE defined would have to be rebuilt, which would require a
soname bump of urcu-signal.

So considering that this phase 1 is not really a "drop in" replacement,
I favor removing the urcu-signal flavor entirely before the next release.

Thoughts ?


The replacement is liburcu-mb, correct?


After merging this "phase 1" of the removal, I noticed that we would need
to require applications built with _LGPL_SOURCE defined and using
liburcu-signal to be rebuilt, which would require a major library soname
bump, which I would prefer to avoid unless necessary.

Therefore, I went ahead and pushed additional commits in the master branch
which completely remove liburcu-signal from the tree. Therefore, the next
release of liburcu will not have the liburcu-signal header files nor its
library shared objects.



I will need to change perfbook, but that should be an easy change,
plus sys_membarrier() is widely available by now.


Users of liburcu-signal would be expected to migrate to liburcu-memb, which
relies on membarrier to achieve similar performance, but with lower-overhead
grace periods.

Thanks,

Mathieu





--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [RFC] Deprecating RCU signal flavor

2023-08-21 Thread Mathieu Desnoyers via lttng-dev


On 8/15/23 08:38, Mathieu Desnoyers via lttng-dev wrote:

On 8/14/23 17:05, Olivier Dion via lttng-dev wrote:


After discussing it with Mathieu, we agree on the following 3 phases for
deprecating the signal flavor:

  1) liburcu-signal will be implemented in terms of liburcu-mb. The only
  difference between the two flavors will be the public header files,
  linked symbols and library name.  Note that this adds a regression in
  terms of performance, since the implementation of liburcu-mb adds memory
  barriers on the reader side which are not present in the original
  liburcu-signal implementation.

  2) Adding the deprecated attribute to every public function exposed by
  the liburcu-signal flavor.  At this point, tests for liburcu-signal
  will also be removed from the project.  There will be no more support
  for this flavor.

  3) Removing the liburcu-signal flavor completely from the project.

Finally, here is my tentative release version for each phase:

  1) 0.15.0 [October 2023] (also TSAN support yay!)

  2) 0.15.1

  3) 0.16.0 || 1.0.0 (maybe a major bump since this is an API breaking
  change)


There is a distinction between the version number of the liburcu project 
(0.14) and the ABI soname for the shared objects. We may be able to do 
step (3) without going to 1.0.0 (I don't see removal of the urcu-signal 
flavor a strong enough motivation for hitting 1.0.0 yet).


Technically speaking, given that we would be removing the entire 
liburcu-signal.so shared object, we would not be changing _symbols_ 
within an existing shared object, therefore I'm not even sure we need to 
bump the soname for all the other remaining shared objects.


So after merging this commit:

Phase 1 of deprecating liburcu-signal

The first phase of liburcu-signal deprecation consists of implementing
it in terms of liburcu-mb. In other words, liburcu-signal is identical to
liburcu-mb with the exception of the function symbols and public header
files.

This is done by:

  1) Removing the RCU_SIGNAL specific code in urcu.c

  2) Making the RCU_MB specific code also specific to RCU_SIGNAL in
  urcu.c

  3) Rewriting _urcu_signal_read_unlock_update_and_wakeup to use an
  atomic store with CMM_SEQ_CST instead of a CMM_RELAXED store with
  cmm_barrier() around it. We could keep the explicit barriers, but that
  would require adding some cmm_annotate annotations. Therefore, to be
  less intrusive in a public header file, simply use CMM_SEQ_CST
  like for the mb flavor.
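
As an illustration of item 3, the two store styles can be contrasted with
the compiler's atomic builtins. This is a minimal sketch, not the liburcu
code: reader_ctr and the function names are hypothetical, and
compiler_barrier() stands in for cmm_barrier(), which is a compiler-only
barrier. The signal flavor could get away with the relaxed store because
the writer used signals to promote those compiler barriers to full memory
barriers on the reader threads; without signals, the store itself has to
carry the ordering.

```c
/* Hypothetical stand-in for the per-thread reader counter. */
static unsigned long reader_ctr;

/* Compiler-only barrier, equivalent in spirit to cmm_barrier(). */
#define compiler_barrier()	__asm__ __volatile__ ("" : : : "memory")

/* Signal-flavor style: relaxed store surrounded by compiler barriers. */
static void unlock_store_relaxed(unsigned long value)
{
	compiler_barrier();
	__atomic_store_n(&reader_ctr, value, __ATOMIC_RELAXED);
	compiler_barrier();
}

/* mb-flavor style: a single sequentially consistent store. */
static void unlock_store_seq_cst(unsigned long value)
{
	__atomic_store_n(&reader_ctr, value, __ATOMIC_SEQ_CST);
}
```

Both functions store the same value. They differ only in the ordering
guarantees given to other threads, which is also why code inlined from
the public headers under _LGPL_SOURCE changes between the two.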

I notice that an application previously built against urcu-signal with
_LGPL_SOURCE defined would have to be rebuilt, which would require a
soname bump of urcu-signal.

So considering that this phase 1 is not really a "drop in" replacement,
I favor removing the urcu-signal flavor entirely before the next release.

Thoughts ?

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [RFC] Deprecating RCU signal flavor

2023-08-15 Thread Mathieu Desnoyers via lttng-dev


On 8/14/23 17:05, Olivier Dion via lttng-dev wrote:


After discussing it with Mathieu, we agree on the following 3 phases for
deprecating the signal flavor:

  1) liburcu-signal will be implemented in terms of liburcu-mb. The only
  difference between the two flavors will be the public header files,
  linked symbols and library name.  Note that this adds a regression in
  terms of performance, since the implementation of liburcu-mb adds memory
  barriers on the reader side which are not present in the original
  liburcu-signal implementation.

  2) Adding the deprecated attribute to every public function exposed by
  the liburcu-signal flavor.  At this point, tests for liburcu-signal
  will also be removed from the project.  There will be no more support
  for this flavor.

  3) Removing the liburcu-signal flavor completely from the project.

Finally, here is my tentative release version for each phase:

  1) 0.15.0 [October 2023] (also TSAN support yay!)

  2) 0.15.1

  3) 0.16.0 || 1.0.0 (maybe a major bump since this is an API breaking
  change)


There is a distinction between the version number of the liburcu project 
(0.14) and the ABI soname for the shared objects. We may be able to do 
step (3) without going to 1.0.0 (I don't see removal of the urcu-signal 
flavor a strong enough motivation for hitting 1.0.0 yet).


Technically speaking, given that we would be removing the entire 
liburcu-signal.so shared object, we would not be changing _symbols_ 
within an existing shared object, therefore I'm not even sure we need to 
bump the soname for all the other remaining shared objects.


Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH] Fix: list lttng sub-directory in Kbuild

2023-08-10 Thread Mathieu Desnoyers via lttng-dev


On 8/10/23 06:05, Richa Bharti wrote:

From: Richa Bharti 


Hi!

Thanks for your patch. I'm adding Michael Jeanson and the lttng-dev 
mailing list in CC.


Thanks,

Mathieu



* Linux kernel >= 6.1 reads sub-directory from Kbuild
* Kernel < 6.1 reads sub-directory from Makefile

Signed-off-by: Richa Bharti 
---
  scripts/built-in.sh | 12 +++++++++++-
  1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/scripts/built-in.sh b/scripts/built-in.sh
index f0594ec..2451230 100755
--- a/scripts/built-in.sh
+++ b/scripts/built-in.sh
@@ -14,9 +14,19 @@ KERNEL_DIR="$(readlink --canonicalize-existing "$1")"
  # Symlink the lttng-modules directory in the kernel source
  ln -sf "$(pwd)" "${KERNEL_DIR}/lttng"
  
+# Get kernel version from Makefile

+version=$(grep -m 1 VERSION ${KERNEL_DIR}/Makefile | sed 's/^.*= //g')
+patchlevel=$(grep -m 1 PATCHLEVEL ${KERNEL_DIR}/Makefile | sed 's/^.*= //g')
+kernel_version=${version}.${patchlevel}
+
  # Graft ourself to the kernel build system
  echo 'source "lttng/src/Kconfig"' >> "${KERNEL_DIR}/Kconfig"
-sed -i 's#+= kernel/#+= kernel/ lttng/#' "${KERNEL_DIR}/Makefile"
+
+if awk "BEGIN {exit !(${kernel_version} >= 6.1)}"; then
+   echo 'obj-y += lttng/' >> "${KERNEL_DIR}/Kbuild"
+else
+   sed -i 's#+= kernel/#+= kernel/ lttng/#' "${KERNEL_DIR}/Makefile"
+fi
  
  echo >&2

  echo "$0: done." >&2
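
A note on the version test in this patch: the awk expression compares
"VERSION.PATCHLEVEL" as a floating-point number. That holds for the 6.1
threshold used here, but it would misorder, for example, 6.10 and 6.9 if
the threshold ever changed. Comparing the two fields as integers avoids
this. The following is a small sketch of that comparison, written in C
for illustration only; struct kver and both helpers are hypothetical and
are not part of built-in.sh.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical parsed kernel version: VERSION and PATCHLEVEL fields. */
struct kver {
	int major;
	int minor;
};

/* Parse a "VERSION.PATCHLEVEL" string such as "6.10". */
static bool kver_parse(const char *s, struct kver *v)
{
	return sscanf(s, "%d.%d", &v->major, &v->minor) == 2;
}

/* True if v is at least major.minor, comparing field by field. */
static bool kver_at_least(struct kver v, int major, int minor)
{
	if (v.major != major)
		return v.major > major;
	return v.minor >= minor;
}
```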


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Status of LTTng-scope and Lttng-analyses

2023-07-19 Thread Mathieu Desnoyers via lttng-dev


On 7/18/23 15:27, Cook, Layne via lttng-dev wrote:

Can you tell me the status of the beta projects listed on the web site?

LTTng scope
LTTng analyses

The github projects haven't had any activity for quite a while. Have
these projects been abandoned, or superseded by something else?


Hi Layne,

Thanks for your interest in those projects!

The LTTng scope beta project was an attempt at doing a significant UX
redesign of Trace Compass, starting from a use-cases/user workflow
perspective. We currently don't have the resources/funding/staff to
work on this project further, so it has not progressed for a while.

You should look at the Trace Compass and VSCode trace extension
projects instead, which have a lot more activity:

https://tracecompass.org
https://github.com/eclipse-cdt-cloud/vscode-trace-extension

The LTTng analyses beta project is a set of python scripts to analyze
LTTng traces. Our original intent with that project was that EfficiOS
would fund the work to create those analyses as prototypes in Python,
and eventually customers would fund the rather large amount of work
required to go from a prototype (slow scripts) to a production quality
project (faster C++ implementation, generic state tracking module).
Unfortunately, this never materialized, so this beta project has been
on the back burner as well.

In recent years we have focused our efforts on the Babeltrace 2
project and on CTF2 (Common Trace Format version 2).

Feel free to have a look at Trace Compass and VSCode trace extension, and
please let us know if LTTng scope and LTTng analyses fill a gap that is
not covered by those other tools.

Thanks,

Mathieu



Thanks,

LC



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Status of the RCU Red Black Tree

2023-07-12 Thread Mathieu Desnoyers via lttng-dev


On 7/12/23 14:44, Uttormark, Mike via lttng-dev wrote:
What became of the red-black tree effort?  I see it in the git repo, 10 
years old.  It never made it onto master.


What would it take to get it onto master and into a release branch?


Hi Mike,

There are a few things that are in the way of merging it into a liburcu
release, namely:

* An end user with a clearly defined use-case to allow defining a solid
  API,

* Validation that those use-cases are not better covered by some
  variation of my RCU Judy Array prototype instead, ref.:

  https://github.com/urcu/userspace-rcu/tree/urcu/rcuja-simple-int

* More testing, both within the liburcu project and in terms of use of
  the API from an application perspective,

* Funding for all that work, allowing us to prioritize this effort with
  respect to our various other projects.

Thanks for your interest in the liburcu Red-Black Tree prototype! Please
don't hesitate to reach out to EfficiOS if HPE would like to explore
supporting this project.

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Fwd: lttng issue

2023-07-12 Thread Mathieu Desnoyers via lttng-dev



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH lttng-modules 0/1] Introduce configure script to describe changes in linux kernel interface

2023-07-04 Thread Mathieu Desnoyers via lttng-dev


On 7/4/23 14:39, Roxana Nicolescu wrote:

Hi,

Thanks a lot for your feedback.

I realize I did not say the reason why I did not go for
LTTNG_UBUNTU_KERNEL_RANGE. We deliver a bunch of different
derivatives (inherited from the main kernel), each with its own
version and it's impossible to use LTTNG_UBUNTU_KERNEL_RANGE alone.
Derivatives in the same cycle don't have the same version number, so
I cannot rely on the version alone to determine when a change has
happened. For example these are some kernels we released last cycle:

- linux (main kernel): 5.19.0-46
- linux-kvm: 5.19.0-1026
- linux-lowlatency: 5.19.0-1028

As you can see, linux-kvm and linux-lowlatency versions are not the
same, and the version number of linux-lowlatency from 2 months ago
coincides with that of linux-kvm from now, but they don't match the
same base. I hope that explains it.

Initially I thought about exposing the version of the main kernel in
the kernel headers that can be later used in the module, but then I
came across openvswitch and that's how I came up with the idea of an
initial configure step.

But I totally understand if you think this is not worth it.


LTTng modules use the UTS_UBUNTU_RELEASE_ABI from the Ubuntu 
generated/utsrelease.h kernel headers to detect tracepoint 
instrumentation changes. I don't understand why many kernel flavors 
would have the same ABI number with different ABI semantics, but I guess 
that's just how things are now.


One way to solve this would be to detect the "-lowlatency" and "-kvm" 
suffixes in the string within generated/utsrelease.h UTS_RELEASE, e.g.:


#define UTS_RELEASE "5.15.0-76-lowlatency"

This could be done by LTTng modules by implementing a script similar to 
what we do for debian, fedora, rhel, and sle (see scripts/ in 
lttng-modules).


Then we could have:

* LTTNG_UBUNTU_KERNEL_RANGE for kernels where all flavors have the same
  kernel ABI,

* LTTNG_UBUNTU_GENERIC_KERNEL_RANGE for generic kernels only, for
  situations where the kernel ABI differs between flavors,

* LTTNG_UBUNTU_LOWLATENCY_KERNEL_RANGE for lowlatency kernels only, for
  situations where the kernel ABI differs between flavors,

* LTTNG_UBUNTU_KVM_KERNEL_RANGE for kvm kernels only, for situations
  where the kernel ABI differs between flavors.

It would all have been simpler if UTS_UBUNTU_RELEASE_ABI had actually
been a versioned kernel ABI without different semantics across kernel
flavors, but considering the current situation we will need to deal
with this with scripts, as we have done for other distributions.
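
To make the suffix detection above concrete, here is a minimal C sketch of the classification step. It is not the lttng-modules script (the existing distribution scripts live under scripts/ and are not written in C); the enum and function names are hypothetical, and treating any unrecognized suffix as "generic" is an assumption of this sketch.

```c
#include <string.h>

/* Hypothetical flavor classification; names are illustrative only. */
enum ubuntu_flavor {
	FLAVOR_GENERIC,
	FLAVOR_LOWLATENCY,
	FLAVOR_KVM,
};

/* Return 1 if str ends with suffix, 0 otherwise. */
static int ends_with(const char *str, const char *suffix)
{
	size_t len = strlen(str), slen = strlen(suffix);

	return len >= slen && strcmp(str + len - slen, suffix) == 0;
}

/*
 * Classify a UTS_RELEASE string such as "5.15.0-76-lowlatency".
 * Anything without a recognized suffix is treated as generic, which is
 * an assumption of this sketch.
 */
static enum ubuntu_flavor uts_release_flavor(const char *uts_release)
{
	if (ends_with(uts_release, "-lowlatency"))
		return FLAVOR_LOWLATENCY;
	if (ends_with(uts_release, "-kvm"))
		return FLAVOR_KVM;
	return FLAVOR_GENERIC;
}
```

The result of such a classification would then select which of the per-flavor range macros applies.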


Thanks,

Mathieu



All the best, Roxana

On 04/07/2023 20:07, Mathieu Desnoyers wrote:

On 7/4/23 11:35, Michael Jeanson via lttng-dev wrote:

On 2023-07-03 14:28, Roxana Nicolescu via lttng-dev wrote:

This script describes the changes in the linux kernel interface
that affect compatibility with lttng-modules.

It is introduced for a specific usecase where commit 
d87a7b4c77a9: "jbd2: use the correct print format" broke the
interface between the kernel and lttng-module. 3 variables 
changed their type to tid_t (transaction, head and tid) in

multiple function declarations. The lttng module was updated
properly to ensure backwards compatibility by using the version
of the kernel. But this change took into account only long term
supported versions. As an example, ubuntu 5.19 kernels picked
the linux kernel change from 5.15 without actually changing the
linux kernel upstream version. This means the current tooling
does not allow to fix the module for newer ubuntu 5.19
kernels.

This script is supposed to solve the problem mentioned above,
but to also make this change easier to integrate. We check the
linux kernel header (include/trace/events/jbd2.h) if the types
of tid, transaction and head variable have changed to tid_t
and define these 3 variables in 'include/generated/config.h':

TID_IS_TID_T 1
TRANSACTION_IS_TID_T 1
HEAD_IS_TID_T 1


In 'include/instrumentation/events/jbd2.h' we then check these
to define the proper type of transaction, head and tid
variables that will be later used in the function declarations
that need them.

This change is meant to remove the dependency on linux kernel
version and the outcome is a bit cleaner than before. As with
the previous implementation, this may need changes in the
future if the kernel interface changes again.


Note: This is a proposal for a simpler way of integrating linux
kernel changes in lttng-modules. The implementation is very
simple due to the fact that tid_t was introduced everywhere in
one commit in include/trace/events/jbd2.h. I would like to get
your opinion on this approach. If needed, it can be improved.

Roxana Nicolescu (1):
   Introduce configure script to describe changes in linux kernel
 interface

  README.md |   3 +-
  configure |  36 +
  include/instrumentation/events/jbd2.h | 110 ++
  3 files changed, 61 insertions(+), 88 deletio

Re: [lttng-dev] [PATCH lttng-modules 0/1] Introduce configure script to describe changes in linux kernel interface

2023-07-04 Thread Mathieu Desnoyers via lttng-dev


On 7/4/23 11:35, Michael Jeanson via lttng-dev wrote:

On 2023-07-03 14:28, Roxana Nicolescu via lttng-dev wrote:

This script describes the changes in the linux kernel interface that
affect compatibility with lttng-modules.

It is introduced for a specific usecase where commit
d87a7b4c77a9: "jbd2: use the correct print format"
broke the interface between the kernel and lttng-module. 3 variables
changed their type to tid_t (transaction, head and tid) in multiple
function declarations. The lttng module was updated properly to ensure
backwards compatibility by using the version of the kernel.
But this change took into account only long term supported versions.
As an example, ubuntu 5.19 kernels picked the linux kernel change from
5.15 without actually changing the linux kernel upstream version. This
means the current tooling does not allow to fix the module for newer
ubuntu 5.19 kernels.

This script is supposed to solve the problem mentioned above, but to
also make this change easier to integrate.
We check the linux kernel header (include/trace/events/jbd2.h) if the
types of tid, transaction and head variable have changed to tid_t and
define these 3 variables in 'include/generated/config.h':
TID_IS_TID_T 1
TRANSACTION_IS_TID_T 1
HEAD_IS_TID_T 1

In 'include/instrumentation/events/jbd2.h' we then check these to define
the proper type of transaction, head and tid variables that will be
later used in the function declarations that need them.

This change is meant to remove the dependency on linux kernel version
and the outcome is a bit cleaner than before.
As with the previous implementation, this may need changes in the future
if the kernel interface changes again.

Note:
This is a proposal for a simpler way of integrating linux kernel changes
in lttng-modules. The implementation is very simple due to the fact that
tid_t was introduced everywhere in one commit in
include/trace/events/jbd2.h.
I would like to get your opinion on this approach. If needed, it can be
improved.

Roxana Nicolescu (1):
   Introduce configure script to describe changes in linux kernel
 interface

  README.md |   3 +-
  configure |  36 +
  include/instrumentation/events/jbd2.h | 110 ++
  3 files changed, 61 insertions(+), 88 deletions(-)
  create mode 100755 configure



Hi Roxana,

While I can see advantages to a configure script approach to detect 
kernel source changes I don't think it's worth the added complexity on 
top of our current kernel version range system.


We already have an Ubuntu specific kernel range macro that supplements 
the upstream version with Ubuntu's kernel ABI number:


LTTNG_UBUNTU_KERNEL_RANGE(5,19,17,X, 6,0,0,0)

I'll let Mathieu make the final call but I think that would be the 
preferred approach.


Indeed, many of the kernel tracepoint code changes we had to deal with 
in the past 10 years would not be easy to track with configure scripts, 
so we would end up with not just one, but with a combination of two 
different mechanisms to adapt to kernel code changes.


In order to keep things maintainable long-term, I prefer that we stay 
with the version-based approach as recommended by Michael.


Thanks,

Mathieu


Regards,

Michael


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 02/11] urcu/uatomic: Use atomic builtins if configured

2023-06-29 Thread Mathieu Desnoyers via lttng-dev


On 6/29/23 13:27, Olivier Dion wrote:

On Thu, 29 Jun 2023, Olivier Dion  wrote:


   [0] https://godbolt.org/z/3nW14M3v1
   [1] https://godbolt.org/z/TcTeMeKbW


Sorry.  That was:

 [0] https://godbolt.org/z/ETcxnz4TW


Change

(volatile __typeof__(ptr))(ptr);

for:

(volatile __typeof__(*(ptr)) *)(ptr);

and:

void love_iso(int *x)
{
 __atomic_store_n(cast_volatile(), 1,
  __ATOMIC_RELAXED);
}

for

void love_iso(int *x)
{
 __atomic_store_n(cast_volatile(x), 1,
  __ATOMIC_RELAXED);
}
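
Put together, the corrected cast could look like the sketch below. The cast_volatile name comes from the godbolt example being discussed; the store_relaxed/load_relaxed wrappers are hypothetical and only there to show the macro in use. It relies on the GCC/Clang __typeof__ extension and __atomic builtins.

```c
/*
 * Cast a pointer so that the pointed-to object, not the pointer
 * itself, is volatile-qualified.
 */
#define cast_volatile(ptr) ((volatile __typeof__(*(ptr)) *)(ptr))

/* Hypothetical wrapper: relaxed store through a pointer-to-volatile. */
static void store_relaxed(int *x, int v)
{
	__atomic_store_n(cast_volatile(x), v, __ATOMIC_RELAXED);
}

/* Hypothetical wrapper: relaxed load through a pointer-to-volatile. */
static int load_relaxed(int *x)
{
	return __atomic_load_n(cast_volatile(x), __ATOMIC_RELAXED);
}
```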

Thanks,

Mathieu



 [1] https://godbolt.org/z/jMjh8YoM4
--

Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 02/11] urcu/uatomic: Use atomic builtins if configured

2023-06-29 Thread Mathieu Desnoyers via lttng-dev


On 6/29/23 13:22, Olivier Dion wrote:

On Thu, 22 Jun 2023, "Paul E. McKenney"  wrote:

On Thu, Jun 22, 2023 at 11:55:55AM -0400, Mathieu Desnoyers wrote:

On 6/21/23 19:19, Paul E. McKenney wrote:

I suggest C11 volatile atomic load/store.  Load/store fusing is permitted
for non-volatile atomic loads and stores, and such fusing can ruin your
code's entire day.  ;-)


After some testing, I got a wall of warnings:

   -Wignored-qualifiers:

 Warn if the return type of a function has a type qualifier such as
 "const".  For ISO C such a type qualifier has no effect, since the
 value returned by a function is not an lvalue.  For C++, the warning
 is only emitted for scalar types or "void".  ISO C prohibits
 qualified "void" return types on function definitions, so such
 return types always receive a warning even without this option.

Since we are using atomic builtins, for example load:

   type __atomic_load_n (type *ptr, int memorder)

If we put the qualifier volatile to TYPE, we end up with the same
qualifier on the return value, triggering a warning for each atomic
operation.

This seems to be only a problem when compiling in C++ [0] while in C it
seems the compiler is more relaxed on this [1].

Ideas to make the toolchains happy? :-)


Change:

(__typeof__(*ptr) *volatile)(ptr);

(which applies the volatile to the pointer, rather than what is pointed to)

to either:

(volatile __typeof__(*ptr) *)(ptr);

or:

(__typeof__(*ptr) volatile *)(ptr);
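
One way to check which of the casts above actually yields a pointer to volatile is a C11 _Generic selection. This is only a sketch: the macro and function names are hypothetical, and it assumes a compiler with C11 _Generic and the __typeof__ extension (the first cast may also trigger a qualifier warning, which is the point).

```c
/*
 * Evaluate to 1 if the expression has type "pointer to volatile int",
 * 0 for anything else.
 */
#define points_to_volatile_int(expr) \
	_Generic((expr), volatile int *: 1, default: 0)

/* volatile applied to the pointer: the pointed-to type stays plain int. */
static int pointer_qualified(int *ptr)
{
	return points_to_volatile_int((__typeof__(*ptr) *volatile)(ptr));
}

/* volatile applied to the pointed-to type: this is the wanted form. */
static int pointee_qualified(int *ptr)
{
	return points_to_volatile_int((volatile __typeof__(*ptr) *)(ptr));
}
```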

Thanks,

Mathieu



   [0] https://godbolt.org/z/3nW14M3v1
   [1] https://godbolt.org/z/TcTeMeKbW



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 02/11] urcu/uatomic: Use atomic builtins if configured

2023-06-22 Thread Mathieu Desnoyers via lttng-dev


On 6/22/23 15:53, Olivier Dion wrote:

On Thu, 22 Jun 2023, "Paul E. McKenney"  wrote:


I suggest C11 volatile atomic load/store.  Load/store fusing is permitted
for non-volatile atomic loads and stores, and such fusing can ruin your
code's entire day.  ;-)


Good catch.  Seems like not a problem on GCC (yet), but Clang is extremely
aggressive and seems to do store fusing on some corner cases [0].


I don't think this is an example of store fusing, but rather just that 
the compiler can eliminate stores to static variables which are 
otherwise unused, making the entire variable useless.


Thanks,

Mathieu



However, I do not find any simple reproducer of load/store fusing.  Do
you have example of such fusing, or is this a precaution?  In the
meantime, back to reading the standard to be certain :-)

  [0] https://godbolt.org/z/odKG9a75a



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 02/11] urcu/uatomic: Use atomic builtins if configured

2023-06-22 Thread Mathieu Desnoyers via lttng-dev


On 6/22/23 14:32, Paul E. McKenney wrote:

On Thu, Jun 22, 2023 at 11:55:55AM -0400, Mathieu Desnoyers wrote:

On 6/21/23 19:19, Paul E. McKenney wrote:
[...]

diff --git a/include/urcu/uatomic/builtins-generic.h 
b/include/urcu/uatomic/builtins-generic.h
new file mode 100644
index 000..8e6a9b5
--- /dev/null
+++ b/include/urcu/uatomic/builtins-generic.h
@@ -0,0 +1,85 @@
+/*
+ * urcu/uatomic/builtins-generic.h
+ *
+ * Copyright (c) 2023 Olivier Dion 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef _URCU_UATOMIC_BUILTINS_GENERIC_H
+#define _URCU_UATOMIC_BUILTINS_GENERIC_H
+
+#include 
+
+#define uatomic_set(addr, v) __atomic_store_n(addr, v, __ATOMIC_RELAXED)
+
+#define uatomic_read(addr) __atomic_load_n(addr, __ATOMIC_RELAXED)


Does this lose the volatile semantics that the old-style definitions
had?



Yes.

[...]


+++ b/include/urcu/uatomic/builtins-x86.h
@@ -0,0 +1,85 @@
+/*
+ * urcu/uatomic/builtins-x86.h
+ *
+ * Copyright (c) 2023 Olivier Dion 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef _URCU_UATOMIC_BUILTINS_X86_H
+#define _URCU_UATOMIC_BUILTINS_X86_H
+
+#include 
+
+#define uatomic_set(addr, v) __atomic_store_n(addr, v, __ATOMIC_RELAXED)
+
+#define uatomic_read(addr) __atomic_load_n(addr, __ATOMIC_RELAXED)


And same question here.


Yes, this opens interesting questions:

* what semantic do we want for uatomic_read/set ?

* what semantic do we want for CMM_LOAD_SHARED/CMM_STORE_SHARED ?

* do we want to allow load/store-shared to work on variables larger than a
word ? (e.g. on a uint64_t on a 32-bit architecture, or on a structure)

* what are the guarantees of a volatile type ?

* what are the guarantees of a load/store relaxed in C11 ?

Does the delta between volatile and C11 relaxed guarantees matter ?

Is there an advantage to use C11 load/store relaxed over volatile ? Should
we combine both C11 load/store _and_ volatile ? Should we use
atomic_signal_fence instead ?


I suggest C11 volatile atomic load/store.  Load/store fusing is permitted
for non-volatile atomic loads and stores, and such fusing can ruin your
code's entire day.  ;-)


I'm OK with erring towards a safer approach, but just out of curiosity, 
do you have examples of compilers doing load or store fusion on C11 or 
C++11 relaxed atomics, or is it out of caution due to lack of explicit 
guarantees in the standards ?


Does this lack of guarantee about fusion also apply to other MO such as 
acquire, release and seq.cst. ?


Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 02/11] urcu/uatomic: Use atomic builtins if configured

2023-06-22 Thread Mathieu Desnoyers via lttng-dev


On 6/21/23 19:19, Paul E. McKenney wrote:
[...]

diff --git a/include/urcu/uatomic/builtins-generic.h 
b/include/urcu/uatomic/builtins-generic.h
new file mode 100644
index 000..8e6a9b5
--- /dev/null
+++ b/include/urcu/uatomic/builtins-generic.h
@@ -0,0 +1,85 @@
+/*
+ * urcu/uatomic/builtins-generic.h
+ *
+ * Copyright (c) 2023 Olivier Dion 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef _URCU_UATOMIC_BUILTINS_GENERIC_H
+#define _URCU_UATOMIC_BUILTINS_GENERIC_H
+
+#include 
+
+#define uatomic_set(addr, v) __atomic_store_n(addr, v, __ATOMIC_RELAXED)
+
+#define uatomic_read(addr) __atomic_load_n(addr, __ATOMIC_RELAXED)


Does this lose the volatile semantics that the old-style definitions
had?



Yes.

[...]


+++ b/include/urcu/uatomic/builtins-x86.h
@@ -0,0 +1,85 @@
+/*
+ * urcu/uatomic/builtins-x86.h
+ *
+ * Copyright (c) 2023 Olivier Dion 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef _URCU_UATOMIC_BUILTINS_X86_H
+#define _URCU_UATOMIC_BUILTINS_X86_H
+
+#include 
+
+#define uatomic_set(addr, v) __atomic_store_n(addr, v, __ATOMIC_RELAXED)
+
+#define uatomic_read(addr) __atomic_load_n(addr, __ATOMIC_RELAXED)


And same question here.


Yes, this opens interesting questions:

* what semantic do we want for uatomic_read/set ?

* what semantic do we want for CMM_LOAD_SHARED/CMM_STORE_SHARED ?

* do we want to allow load/store-shared to work on variables larger than 
a word ? (e.g. on a uint64_t on a 32-bit architecture, or on a structure)


* what are the guarantees of a volatile type ?

* what are the guarantees of a load/store relaxed in C11 ?

Does the delta between volatile and C11 relaxed guarantees matter ?

Is there an advantage to use C11 load/store relaxed over volatile ? 
Should we combine both C11 load/store _and_ volatile ? Should we use 
atomic_signal_fence instead ?


Thanks,

Mathieu



Thanx, Paul


+


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH] Avoid calling caa_container_of on NULL pointer in cds_lfhash macros

2023-06-22 Thread Mathieu Desnoyers via lttng-dev


On 6/22/23 06:45, Ondřej Surý via lttng-dev wrote:

(Sorry, I missed closing brackets in both macros, so resending fixed patch...)

The cds_lfht_for_each_entry and cds_lfht_for_each_entry_duplicate macros
would call caa_container_of() macro on NULL pointer.  This is not a
problem under normal circumstances as the check in the for loop fails
and the loop-statement is not called with invalid (pos) value.

However AddressSanitizer doesn't like that and complains about this:

 runtime error: applying non-zero offset 18446744073709551056 to null pointer

Move the cds_lfht_iter_get_node(iter) != NULL from the cond-expression
of the for loop into both init-clause and iteration-expression as
conditional operator and check for (pos) value in the cond-expression
instead.


I've taken the liberty to reimplement this with a new helper "cds_lfht_entry".

Can you review and try the following commits please ?

https://review.lttng.org/c/userspace-rcu/+/10445
  compiler.h: Introduce caa_unqual_scalar_typeof
https://review.lttng.org/c/userspace-rcu/+/10446
  Avoid calling caa_container_of on NULL pointer in cds_lfht macros
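
For readers who do not want to open the reviews, the general idea of a NULL-safe entry helper can be sketched as follows. This is not the code from the commits above: every demo_* name is hypothetical, demo_container_of is a local stand-in for caa_container_of(), and the sketch relies on the GCC/Clang statement-expression and __typeof__ extensions.

```c
#include <stddef.h>

/* Local stand-in for caa_container_of(), written out for this sketch. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Hypothetical NULL-safe variant: only apply the member offset when the
 * node pointer is non-NULL, so no arithmetic is done on a null pointer.
 */
#define demo_entry(ptr, type, member)					\
	__extension__ ({						\
		__typeof__(ptr) entry_ptr_ = (ptr);			\
		entry_ptr_ ?						\
			demo_container_of(entry_ptr_, type, member) :	\
			NULL;						\
	})

struct demo_node {
	int dummy;
};

struct demo_item {
	int value;
	struct demo_node node;
};

/* Map a node pointer back to its enclosing item, or NULL. */
static struct demo_item *item_from_node(struct demo_node *node)
{
	return demo_entry(node, struct demo_item, node);
}
```

With such a helper, the for-each macros can assign (pos) unconditionally and test pos != NULL in the loop condition.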

Thanks!

Mathieu



Signed-off-by: Ondřej Surý 
---
  include/urcu/rculfhash.h | 20 ++--
  1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/urcu/rculfhash.h b/include/urcu/rculfhash.h
index fbd33cc..64cc18f 100644
--- a/include/urcu/rculfhash.h
+++ b/include/urcu/rculfhash.h
@@ -546,22 +546,22 @@ void cds_lfht_resize(struct cds_lfht *ht, unsigned long new_size);

 #define cds_lfht_for_each_entry(ht, iter, pos, member) \
     for (cds_lfht_first(ht, iter), \
-        pos = caa_container_of(cds_lfht_iter_get_node(iter), \
-                __typeof__(*(pos)), member); \
-        cds_lfht_iter_get_node(iter) != NULL; \
+        pos = (cds_lfht_iter_get_node(iter) != NULL ? caa_container_of(cds_lfht_iter_get_node(iter), \
+                __typeof__(*(pos)), member) : NULL); \
+        pos != NULL; \
         cds_lfht_next(ht, iter), \
-        pos = caa_container_of(cds_lfht_iter_get_node(iter), \
-                __typeof__(*(pos)), member))
+        pos = (cds_lfht_iter_get_node(iter) != NULL ? caa_container_of(cds_lfht_iter_get_node(iter), \
+                __typeof__(*(pos)), member) : NULL))

 #define cds_lfht_for_each_entry_duplicate(ht, hash, match, key, \
                 iter, pos, member) \
     for (cds_lfht_lookup(ht, hash, match, key, iter), \
-        pos = caa_container_of(cds_lfht_iter_get_node(iter), \
-                __typeof__(*(pos)), member); \
-        cds_lfht_iter_get_node(iter) != NULL; \
+        pos = (cds_lfht_iter_get_node(iter) != NULL ? caa_container_of(cds_lfht_iter_get_node(iter), \
+                __typeof__(*(pos)), member) : NULL); \
+        pos != NULL; \
         cds_lfht_next_duplicate(ht, match, key, iter), \
-        pos = caa_container_of(cds_lfht_iter_get_node(iter), \
-                __typeof__(*(pos)), member))
+        pos = (cds_lfht_iter_get_node(iter) != NULL ? caa_container_of(cds_lfht_iter_get_node(iter), \
+                __typeof__(*(pos)), member) : NULL))

 #ifdef __cplusplus
 }


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 04/11] urcu/arch/generic: Use atomic builtins if configured

2023-06-21 Thread Mathieu Desnoyers via lttng-dev


On 6/21/23 20:53, Olivier Dion wrote:

On Wed, 21 Jun 2023, "Paul E. McKenney"  wrote:

On Mon, May 15, 2023 at 04:17:11PM -0400, Olivier Dion wrote:

  #ifndef cmm_mb
  #define cmm_mb()__sync_synchronize()


Just out of curiosity, why not also implement cmm_mb() in terms of
__atomic_thread_fence(__ATOMIC_SEQ_CST)?  (Or is that a later patch?)


IIRC, Mathieu and I agree that the definition of a thread fence -- acts
as a synchronization fence between threads -- is too weak for what we
want here.  For example, with I/O devices.

Although __sync_synchronize() is probably an alias for a SEQ_CST thread
fence, its definition -- issues a full memory barrier -- is stronger.

We do not want to rely on this assumption (alias) and prefer to rely on
the documented definition instead.



We should document this rationale with a new comment near the #define,
in case anyone mistakenly decides to use a thread fence there to make it
similar to the rest of the code in the future.
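
Something along these lines might do; the comment wording is only a suggestion, and the demo_* names below are hypothetical and just show the macro in use.

```c
#ifndef cmm_mb
/*
 * Deliberately not __atomic_thread_fence(__ATOMIC_SEQ_CST): a thread
 * fence is only documented as a synchronization fence between threads,
 * whereas __sync_synchronize() is documented as issuing a full memory
 * barrier, which is the stronger guarantee wanted here (e.g. with I/O
 * devices). Do not change this to match the thread fences used
 * elsewhere.
 */
#define cmm_mb()	__sync_synchronize()
#endif

/* Hypothetical example: publish data, then a flag, with a full barrier. */
static volatile int demo_data, demo_ready;

static void demo_publish(int v)
{
	demo_data = v;
	cmm_mb();
	demo_ready = 1;
}
```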

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] I'm still getting empty ust traces using tracef

2023-06-21 Thread Mathieu Desnoyers via lttng-dev


On 6/20/23 18:02, Brian Hutchinson wrote:

On Thu, May 11, 2023 at 2:14 PM Mathieu Desnoyers
 wrote:


On 2023-05-11 14:13, Mathieu Desnoyers via lttng-dev wrote:

On 2023-05-11 12:36, Brian Hutchinson via lttng-dev wrote:

... more background.  I've always used ltt in the kernel so I don't
have much experience with the user side of it and especially
multi-threaded, multi-core so I'm probably missing some fundamental
concepts that I need to understand.


Which are the exact versions of LTTng-UST and LTTng-Tools you are using
now ? (2.13.N or which git commit ?)



Also, can you try using lttng-ust stable-2.13 branch, which includes the 
following commit ?

commit be2ca8b563bab81be15cbce7b9f52422369f79f7
Author: Mathieu Desnoyers 
Date:   Tue Feb 21 14:29:49 2023 -0500

  Fix: Reevaluate LTTNG_UST_TRACEPOINT_DEFINE each time tracepoint.h is included

  Fix issues with missing symbols in use-cases where tracef.h is included
  before defining LTTNG_UST_TRACEPOINT_DEFINE, e.g.:

   #include 
   #define LTTNG_UST_TRACEPOINT_DEFINE
   #include 

  It is caused by the fact that tracef.h includes tracepoint.h in a
  context which has LTTNG_UST_TRACEPOINT_DEFINE undefined, and this is not
  re-evaluated for the following includes.

  Fix this by lifting the definition code in tracepoint.h outside of the
  header include guards, and #undef the old LTTNG_UST__DEFINE_TRACEPOINT
  before re-defining it to its new semantic. Use a new
  _LTTNG_UST_TRACEPOINT_DEFINE_ONCE include guard within the
  LTTNG_UST_TRACEPOINT_DEFINE defined case to ensure symbols are not
  duplicated.

  Signed-off-by: Mathieu Desnoyers 
  Change-Id: I0ef720435003a7ca0bfcf29d7bf27866c5ff8678



I applied this patch and if I use "tracef" type calls in our
application that is made up of a bunch of static libs ... the UST
trace calls work.  I verified that traces that were called from
several different static libs all worked.

But as soon as I include a "tracepoint" style tracepoint (that uses
trace provider include files etc.) then doing a "lttng list -u"
returns "None" for UST events.

Is there some kind of rule that says a file can't use both tracef and
tracepoint calls?  Is there something special you have to do to use
tracef and tracepoints in same file?  Doing so appears to have broken
everything.


It should just work.

Can you provide a minimal example of the compile unit having this
issue ?

Also you mention "static libs". Make sure you do *not* define 
"LTTNG_UST_TRACEPOINT_PROBE_DYNAMIC_LINKAGE" in this case. See the 
lttng-ust(3) man page for details (section "Statically linking the 
tracepoint provider").


Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Profiling LTTng tracepoint latency on different arm platforms

2023-06-21 Thread Mathieu Desnoyers via lttng-dev


On 6/21/23 01:39, Yitschak, Yehuda wrote:

On 6/20/23 10:20, Mathieu Desnoyers via lttng-dev wrote:

On 6/20/23 06:27, Mousa, Anas via lttng-dev wrote:

Hello,






Are there any suggestions to root cause the high latency and potentially
improve it on *platform1*?


Thanks and best regards,

Anas.



I recommend using "perf" when tracing with the sample program in a
loop to figure out the hot spots. With that information on the "fast"
and "slow" system, we might be able to figure out what differs.

Also, comparing the kernel configurations of the two systems can help.
Also comparing the glibc versions of the two systems would be relevant.



Also make sure you benchmark the lttng "snapshot" mode [1] to make sure
you don't run into a situation where the disk/network I/O throughput cannot
cope with the generated event throughput, thus causing the ring buffer to
discard events. This would therefore "speed up" tracing from the application
perspective because discarding an event is faster than writing it to a ring
buffer.


You mean we should avoid the "discard" loss mode and use "overwrite" loss mode 
since discard mode can fake fast performance ?


Yes. In addition to using the "overwrite-when-buffer-full" mode, the
"snapshot" session also ensures that no consumer daemon extracts the
trace data (unless an explicit snapshot record is performed), which
allows comparing the ring buffer producer performance with minimal noise.


If you really want to benchmark the discard-when-buffer-full mode and
the consumer daemon I/O behavior, then you need to take into account
the discarded event counts and the actual size of the trace data that
was written to disk.
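
As a rough illustration of why the discarded count matters, the per-event cost can be normalized by the events that were actually recorded rather than by the number of tracepoint calls. This is just one possible way to do the accounting; the structure and function names are hypothetical and not part of any LTTng API.

```c
#include <stdint.h>

/* Hypothetical benchmark counters; field names are illustrative. */
struct bench_result {
	uint64_t events_attempted;	/* tracepoint calls made */
	uint64_t events_discarded;	/* reported as discarded by the tracer */
	uint64_t elapsed_ns;		/* wall-clock time of the loop */
};

/* Events that actually reached the trace. */
static uint64_t events_recorded(const struct bench_result *r)
{
	return r->events_attempted - r->events_discarded;
}

/*
 * Average cost per recorded event, in ns. Returns 0 when nothing was
 * recorded, since the figure is meaningless in that case.
 */
static uint64_t ns_per_recorded_event(const struct bench_result *r)
{
	uint64_t recorded = events_recorded(r);

	return recorded ? r->elapsed_ns / recorded : 0;
}
```

For instance, 1,000,000 calls in 250 ms looks like 250 ns per call, but if 750,000 of those events were discarded, the cost per recorded event is 1000 ns.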


Thanks,

Mathieu





Thanks,

Mathieu

[1] https://lttng.org/docs/v2.13/#doc-taking-a-snapshot


Thanks,

Mathieu




--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Profiling LTTng tracepoint latency on different arm platforms

2023-06-20 Thread Mathieu Desnoyers via lttng-dev


On 6/20/23 10:20, Mathieu Desnoyers via lttng-dev wrote:

On 6/20/23 06:27, Mousa, Anas via lttng-dev wrote:

Hello,




Are there any suggestions to root cause the high latency and potentially
improve it on *platform1*?

Thanks and best regards,

Anas.



I recommend using "perf" when tracing with the sample program in a loop 
to figure out the hot spots. With that information on the "fast" and 
"slow" system, we might be able to figure out what differs.


Also, comparing the kernel configurations of the two systems can help. 
Also comparing the glibc versions of the two systems would be relevant.




Also make sure you benchmark the lttng "snapshot" mode [1] to make sure 
you don't run into a situation where the disk/network I/O throughput 
cannot cope with the generated event throughput, thus causing the ring 
buffer to discard events. This would therefore "speed up" tracing from 
the application perspective because discarding an event is faster than 
writing it to a ring buffer.


Thanks,

Mathieu

[1] https://lttng.org/docs/v2.13/#doc-taking-a-snapshot


Thanks,

Mathieu




--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Profiling LTTng tracepoint latency on different arm platforms

2023-06-20 Thread Mathieu Desnoyers via lttng-dev


On 6/20/23 06:27, Mousa, Anas via lttng-dev wrote:

Hello,




Are there any suggestions to root cause the high latency and potentially
improve it on *platform1*?

Thanks and best regards,

Anas.



I recommend using "perf" when tracing with the sample program in a loop 
to figure out the hot spots. With that information on the "fast" and 
"slow" system, we might be able to figure out what differs.


Also, comparing the kernel configurations of the two systems can help. 
Also comparing the glibc versions of the two systems would be relevant.


Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH] Fix: revise urcu_read_lock_update() comment

2023-06-15 Thread Mathieu Desnoyers via lttng-dev


On 6/13/23 21:51, Li-Kuan Ou wrote:

Read-side critical section nesting is tracked in lower-order bits
and grace-period phase number use a single high-order bit



Merged, thanks!

Mathieu


Signed-off-by: Li-Kuan Ou 
---
  include/urcu/static/urcu-bp.h | 6 +++---
  include/urcu/static/urcu-mb.h | 6 +++---
  include/urcu/static/urcu-memb.h   | 6 +++---
  include/urcu/static/urcu-signal.h | 6 +++---
  4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/urcu/static/urcu-bp.h b/include/urcu/static/urcu-bp.h
index 8ba3830..b163a90 100644
--- a/include/urcu/static/urcu-bp.h
+++ b/include/urcu/static/urcu-bp.h
@@ -137,9 +137,9 @@ static inline enum urcu_bp_state urcu_bp_reader_state(unsigned long *ctr)

  /*
   * Helper for _urcu_bp_read_lock().  The format of urcu_bp_gp.ctr (as well as
- * the per-thread rcu_reader.ctr) has the upper bits containing a count of
- * _urcu_bp_read_lock() nesting, and a lower-order bit that contains either zero
- * or URCU_BP_GP_CTR_PHASE.  The smp_mb_slave() ensures that the accesses in
+ * the per-thread rcu_reader.ctr) has the lower-order bits containing a count of
+ * _urcu_bp_read_lock() nesting, and a single high-order URCU_BP_GP_CTR_PHASE bit
+ * that contains either zero or one.  The smp_mb_slave() ensures that the accesses in
   * _urcu_bp_read_lock() happen before the subsequent read-side critical section.
   */
  static inline void _urcu_bp_read_lock_update(unsigned long tmp)
diff --git a/include/urcu/static/urcu-mb.h b/include/urcu/static/urcu-mb.h
index b97e42a..253d29b 100644
--- a/include/urcu/static/urcu-mb.h
+++ b/include/urcu/static/urcu-mb.h
@@ -63,9 +63,9 @@ extern DECLARE_URCU_TLS(struct urcu_reader, urcu_mb_reader);
  
  /*
   * Helper for _urcu_mb_read_lock().  The format of urcu_mb_gp.ctr (as well as
- * the per-thread rcu_reader.ctr) has the upper bits containing a count of
- * _urcu_mb_read_lock() nesting, and a lower-order bit that contains either zero
- * or URCU_GP_CTR_PHASE.  The cmm_smp_mb() ensures that the accesses in
+ * the per-thread rcu_reader.ctr) has the lower-order bits containing a count of
+ * _urcu_mb_read_lock() nesting, and a single high-order URCU_BP_GP_CTR_PHASE bit
+ * that contains either zero or one.  The cmm_smp_mb() ensures that the accesses in
   * _urcu_mb_read_lock() happen before the subsequent read-side critical section.
   */
  static inline void _urcu_mb_read_lock_update(unsigned long tmp)
diff --git a/include/urcu/static/urcu-memb.h b/include/urcu/static/urcu-memb.h
index c8d102f..f64cb57 100644
--- a/include/urcu/static/urcu-memb.h
+++ b/include/urcu/static/urcu-memb.h
@@ -86,9 +86,9 @@ extern DECLARE_URCU_TLS(struct urcu_reader, urcu_memb_reader);
  
  /*
   * Helper for _rcu_read_lock().  The format of urcu_memb_gp.ctr (as well as
- * the per-thread rcu_reader.ctr) has the upper bits containing a count of
- * _rcu_read_lock() nesting, and a lower-order bit that contains either zero
- * or URCU_GP_CTR_PHASE.  The smp_mb_slave() ensures that the accesses in
+ * the per-thread rcu_reader.ctr) has the lower-order bits containing a count of
+ * _rcu_read_lock() nesting, and a single high-order URCU_BP_GP_CTR_PHASE bit
+ * that contains either zero or one.  The smp_mb_slave() ensures that the accesses in
   * _rcu_read_lock() happen before the subsequent read-side critical section.
   */
  static inline void _urcu_memb_read_lock_update(unsigned long tmp)
diff --git a/include/urcu/static/urcu-signal.h b/include/urcu/static/urcu-signal.h
index c7577d3..707eaf8 100644
--- a/include/urcu/static/urcu-signal.h
+++ b/include/urcu/static/urcu-signal.h
@@ -64,9 +64,9 @@ extern DECLARE_URCU_TLS(struct urcu_reader, urcu_signal_reader);
  
  /*
   * Helper for _rcu_read_lock().  The format of urcu_signal_gp.ctr (as well as
- * the per-thread rcu_reader.ctr) has the upper bits containing a count of
- * _rcu_read_lock() nesting, and a lower-order bit that contains either zero
- * or URCU_GP_CTR_PHASE.  The cmm_barrier() ensures that the accesses in
+ * the per-thread rcu_reader.ctr) has the lower-order bits containing a count of
+ * _rcu_read_lock() nesting, and a single high-order URCU_BP_GP_CTR_PHASE bit
+ * that contains either zero or one.  The cmm_barrier() ensures that the accesses in
   * _rcu_read_lock() happen before the subsequent read-side critical section.
   */
  static inline void _urcu_signal_read_lock_update(unsigned long tmp)


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH] Fix: revise urcu_read_lock_update() comment

2023-06-13 Thread Mathieu Desnoyers via lttng-dev


On 6/13/23 11:45, Li-Kuan Ou via lttng-dev wrote:

Read-side critical section nesting is tracked in lower-order bits
and grace-period phase number use a single high-order bit



Thanks for the fix. Here is a comment below,


Signed-off-by: Li-Kuan Ou 
---
  include/urcu/static/urcu-bp.h | 4 ++--
  include/urcu/static/urcu-mb.h | 4 ++--
  include/urcu/static/urcu-memb.h   | 4 ++--
  include/urcu/static/urcu-signal.h | 4 ++--
  4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/urcu/static/urcu-bp.h b/include/urcu/static/urcu-bp.h
index 8ba3830..c90c9f1 100644
--- a/include/urcu/static/urcu-bp.h
+++ b/include/urcu/static/urcu-bp.h
@@ -137,8 +137,8 @@ static inline enum urcu_bp_state urcu_bp_reader_state(unsigned long *ctr)
  
  /*
   * Helper for _urcu_bp_read_lock().  The format of urcu_bp_gp.ctr (as well as
- * the per-thread rcu_reader.ctr) has the upper bits containing a count of
- * _urcu_bp_read_lock() nesting, and a lower-order bit that contains either zero
+ * the per-thread rcu_reader.ctr) has the lower-order bits containing a count of
+ * _urcu_bp_read_lock() nesting, and a single high-order bit that contains either zero


I think it would be clearer to state:

Helper for _urcu_bp_read_lock().  The format of urcu_bp_gp.ctr (as well as
the per-thread rcu_reader.ctr) has the lower-order bits containing a count of
urcu_bp_read_lock() nesting, and a single high-order URCU_BP_GP_CTR_PHASE bit
that contains either zero or one.  The smp_mb_slave() ensures that the accesses
in urcu_bp_read_lock() happen before the subsequent read-side critical section.

(likewise for similar comments in other files).

Can you submit an updated patch please ?

Thanks,

Mathieu




   * or URCU_BP_GP_CTR_PHASE.  The smp_mb_slave() ensures that the accesses in
   * _urcu_bp_read_lock() happen before the subsequent read-side critical section.
   */
diff --git a/include/urcu/static/urcu-mb.h b/include/urcu/static/urcu-mb.h
index b97e42a..218e2f3 100644
--- a/include/urcu/static/urcu-mb.h
+++ b/include/urcu/static/urcu-mb.h
@@ -63,8 +63,8 @@ extern DECLARE_URCU_TLS(struct urcu_reader, urcu_mb_reader);
  
  /*
   * Helper for _urcu_mb_read_lock().  The format of urcu_mb_gp.ctr (as well as
- * the per-thread rcu_reader.ctr) has the upper bits containing a count of
- * _urcu_mb_read_lock() nesting, and a lower-order bit that contains either zero
+ * the per-thread rcu_reader.ctr) has the lower-order bits containing a count of
+ * _urcu_mb_read_lock() nesting, and a single high-order bit that contains either zero
   * or URCU_GP_CTR_PHASE.  The cmm_smp_mb() ensures that the accesses in
   * _urcu_mb_read_lock() happen before the subsequent read-side critical section.
   */
diff --git a/include/urcu/static/urcu-memb.h b/include/urcu/static/urcu-memb.h
index c8d102f..b923f73 100644
--- a/include/urcu/static/urcu-memb.h
+++ b/include/urcu/static/urcu-memb.h
@@ -86,8 +86,8 @@ extern DECLARE_URCU_TLS(struct urcu_reader, urcu_memb_reader);
  
  /*
   * Helper for _rcu_read_lock().  The format of urcu_memb_gp.ctr (as well as
- * the per-thread rcu_reader.ctr) has the upper bits containing a count of
- * _rcu_read_lock() nesting, and a lower-order bit that contains either zero
+ * the per-thread rcu_reader.ctr) has the lower-order bits containing a count of
+ * _rcu_read_lock() nesting, and a single high-order bit that contains either zero
   * or URCU_GP_CTR_PHASE.  The smp_mb_slave() ensures that the accesses in
   * _rcu_read_lock() happen before the subsequent read-side critical section.
   */
diff --git a/include/urcu/static/urcu-signal.h b/include/urcu/static/urcu-signal.h
index c7577d3..00588b8 100644
--- a/include/urcu/static/urcu-signal.h
+++ b/include/urcu/static/urcu-signal.h
@@ -64,8 +64,8 @@ extern DECLARE_URCU_TLS(struct urcu_reader, urcu_signal_reader);
  
  /*
   * Helper for _rcu_read_lock().  The format of urcu_signal_gp.ctr (as well as
- * the per-thread rcu_reader.ctr) has the upper bits containing a count of
- * _rcu_read_lock() nesting, and a lower-order bit that contains either zero
+ * the per-thread rcu_reader.ctr) has the lower-order bits containing a count of
+ * _rcu_read_lock() nesting, and a single high-order bit that contains either zero
   * or URCU_GP_CTR_PHASE.  The cmm_barrier() ensures that the accesses in
   * _rcu_read_lock() happen before the subsequent read-side critical section.
   */


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


[lttng-dev] Tracing Summit - Last year's 2022 talk recordings are available online!

2023-06-09 Thread Mathieu Desnoyers via lttng-dev


Hello all,

The recordings for last year’s 2022 Tracing Summit talks were just
posted to the DiaMon Workgroup channel!

2022 Tracing Summit Talks:
https://www.youtube.com/playlist?list=PLuo4E47p5_7YbvyBpSHh-wO3KUVQ81BQR

If you did not get the chance to attend last year, we invite you to take
a look at the diverse tracing talks that included eBPF and Perfetto
developments as well as updates for the core Linux kernel tracers.


This year, we’re looking forward to hearing about your new tracing
developments and challenging use cases at the 2023 Tracing Summit! If
you’re interested in exchanging ideas with experts in state-of-the-art
tracing, we invite you to submit a talk proposal soon as the deadline is
coming up next week (June 16th).

You can submit your 2023 Tracing Summit talk abstract here:
https://cfp.tracingsummit.org/ts2023/cfp

Best regards,

Mathieu



The 2023 Tracing Summit will be held in Bilbao, Spain on September 17th
and 18th, at the Euskalduna Conference Centre, co-located with Open
Source Summit Europe.

To register, you can include the Tracing Summit as an add-on to your
Open Source Summit ticket or use these links to register solely for the
Tracing Summit: https://cvent.me/Gn0nkR (in-person, 80$),
https://cvent.me/xywylX (virtual).

For more info: https://tracingsummit.org/

The 2023 Tracing Summit is sponsored by EfficiOS and organized by Erica
Bugden (EfficiOS), Olivier Dion (EfficiOS), and Mathieu Desnoyers
(EfficiOS) on behalf of the Linux Foundation Diagnostic and Monitoring
Workgroup.

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

[lttng-dev] [RELEASE] LTTng UST 2.12.8/2.13.6 and LTTng modules 2.12.14/2.13.10 tracers

2023-06-07 Thread Mathieu Desnoyers via lttng-dev


Hi,

This is a stable release announcement for the LTTng UST and LTTng
modules tracer projects. Those contain mainly bug fixes and add support
for recent distributions and upstream kernels.

What's new in both LTTng-UST 2.12.8 and 2.13.6:

- Fix: use unaligned pointer accesses for lttng_inline_memcpy

  lttng_inline_memcpy receives pointers which can be unaligned. This
  causes issues (traps) specifically on arm 32-bit with 8-byte strings
  (including \0).

- Fix: trace events in C constructors/destructors

  Adding a priority (150) to the tracepoint and tracepoint provider
  constructors/destructors ensures that tracepoints located within C
  constructors/destructors with a higher priority value (including the
  default init priority of 65535) are traced, even when the tracepoint,
  the tracepoint definition, and the tracepoint probe provider are in
  different compile units, whatever their relative link order.

- Fix: Reevaluate LTTNG_UST_TRACEPOINT_DEFINE each time tracepoint.h is included

  Fix issues with missing symbols in use-cases where tracef.h is included
  before defining LTTNG_UST_TRACEPOINT_DEFINE.

- Fix: segmentation fault on filter interpretation in "switch" mode

  Fix a bytecode interpreter crash when building with INTERPRETER_USE_SWITCH
  defined (used mainly for debugging purposes).


What's new specifically in LTTng-UST 2.13.6:

- Fix: `ip` context is expressed as a base-10 field

  The base for UST context field `ip` was changed from 16 (hexadecimal) to
  10 (decimal), most likely an unintentional copy error in 4e48b5d.

- Various fixes to build with -std=c99.

- Fix: trace events in C++ constructors/destructors

  Wrap constructor and destructor functions to invoke them as functions with
  the constructor/destructor GNU C attributes, which ensures that those
  constructors/destructors are ordered before/after C++
  constructors/destructors.


What's new in LTTng modules 2.12.14 and 2.13.10:

- fix: kallsyms wrapper on CONFIG_PPC64_ELF_ABI_V1

  Work-around PPC64 ELF ABI v1 function descriptor issues when using kallsyms.

- Add support for RHEL 9.0 and 9.1.


What's new specifically in LTTng modules 2.12.14:

- Various tracepoint instrumentation fixes to support kernel v5.18.


What's new specifically in LTTng modules 2.13.10:

- Various tracepoint instrumentation fixes to support kernel v6.3.


Feedback is welcome!

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [lttng-dev] Trying to understand use of lttng enable-event --kernel --userspace-probe=

2023-05-18 Thread Mathieu Desnoyers via lttng-dev


On 2023-05-18 15:20, Brian Hutchinson wrote:

On Thu, May 18, 2023 at 3:07 PM Brian Hutchinson  wrote:


On Thu, May 18, 2023 at 3:03 PM Mathieu Desnoyers
 wrote:


On 2023-05-18 14:58, Brian Hutchinson wrote:

On Thu, May 18, 2023 at 11:00 AM Brian Hutchinson  wrote:


On Thu, May 18, 2023 at 10:45 AM Mathieu Desnoyers
 wrote:


On 2023-05-18 10:10, Brian Hutchinson wrote:
[...]

I updated my hello world to have a function I'd like to use the
--userspace-probe method on with the very original name of
'probe_function':

#include <stdio.h>
#include <lttng/tracef.h>

void probe_function(int i);

int main(int argc, char *argv[])
{
  unsigned int i;
  puts("Hello, World!\nPress Enter to continue...");
  /*
   * The following getchar() call only exists for the purpose of this
   * demonstration, to pause the application in order for you to have
   * time to list its tracepoints. You don't need it otherwise.
   */
  getchar();

  lttng_ust_tracef("Number %d, string %s", 23, "hi there!");
  printf("Number %d, string %s", 23, "hi there!");

  for (i = 0; i < argc; i++) {
  lttng_ust_tracef("Number %d, argv %s", i, argv[i]);
  printf("Number %d, argv %s", i, argv[i]);
  }

  puts("Quitting now!");

  probe_function(i);

  return 0;
}

void probe_function(int i) {

  lttng_ust_tracef("Number %d, string %s", i * i, "i^2");
  printf("Number %d, string %s", i * i, "i^2");

}

... and I get the same error as before when I try to enable the probe:
# lttng enable-event --kernel
--userspace-probe=/usr/local/bin/hello:probe_function
Error: Missing event name(s).


As the error states, you are missing the event name. See

man 1 lttng-enable-event

  lttng [GENERAL OPTIONS] enable-event --kernel
[--probe=SOURCE | --function=SOURCE | --syscall |
 --userspace-probe=SOURCE]
[--filter=EXPR] [--session=SESSION]
[--channel=CHANNEL] EVENT[,EVENT]...


You will want something like:

lttng enable-event --kernel --userspace-probe=/usr/local/bin/hello:probe_function my_probe_function

Where "my_probe_function" is the event name that will appear in the
collected traces.


Wow!  I must not have woken up this morning ha, ha.  Thanks for that!
The event is enabled now.  Hope to actually get tracing data now.


Well, I guess we just have the app that thwarts all attempts at tracing.

I did a dynamic probe on several functions that should be getting
called like crazy and again I get no tracing data.

Tried it with my hello world example above after Mathieu set me
straight on the event syntax and it works.

I saw this comment in the documentation "As of this version, only USDT
probes that are not surrounded by a reference counter (semaphore) are
supported."

I don't know that I can say that this function I'm probing isn't
"surrounded" by a reference counter, it's in a large multi-threaded
application so I guess it's possible.

Sigh, I'm striking out every which way.

No offense (since this is lttng list - please don't flame me ... I
want/need lttng), but I think I'm going to try just straight kprobes
and uprobes and see if trace compass can show those traces in an
attempt to get "something/anything" working.


If you attach to an ELF symbol (function), then there is no USDT in
play, so it should not be related to the issue you have.


That is what I was thinking which is why I wanted to try it.



But if your functions happen to be inlined, then there will be nothing
to attach to. Perhaps this is what happens there ?


I don't see any evidence of anything being inlined in this module.  I
grepped the code to verify.

Back to being stumped/stuck.


I can do trace-cmd stuff and it works.  The hello world above works so
I don't "think" this is a problem but again in full disclosure I'll
mention/ask about it.

Does any of the lttng tools/libs depend on kernel headers?  I ask
because old yocto (Dunfell) built lttng package against a 4.something
kernel and we're running a 5.10.69 kernel that lttng modules were
added to it with the "builtin" script and built that way.

Should probably have yocto build the local kernel too, but kernel is
being built stand alone due to vendor stuff that hasn't been mainlined
yet.

I'm running out of things to think about that could be the issue.


If lttng-modules can trace your smaller test application through 
uprobes, then the problem is likely elsewhere.


Only lttng-modules has dependencies on kernel headers. lttng-tools/ust 
don't depend on kernel headers.


Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Trying to understand use of lttng enable-event --kernel --userspace-probe=

2023-05-18 Thread Mathieu Desnoyers via lttng-dev


On 2023-05-18 15:07, Brian Hutchinson wrote:

[...]



If you attach to an ELF symbol (function), then there is no USDT in
play, so it should not be related to the issue you have.


That is what I was thinking which is why I wanted to try it.



But if your functions happen to be inlined, then there will be nothing
to attach to. Perhaps this is what happens there ?


I don't see any evidence of anything being inlined in this module.  I
grepped the code to verify.

Back to being stumped/stuck.


Make sure to check the resulting assembler and ELF symbol tables.

The compiler is free to inline various functions unless they are 
explicitly marked as __attribute__((noinline)). Also, if LTO is enabled, 
further optimization can be done at link-time.


One purpose of the UST tracepoints is to be less fragile with respect to 
specific optimizations done by the compiler and linker, thus 
guaranteeing that whatever is instrumented with a tracepoint is indeed 
available for tracing.


Also, double-check that the path you pass to --userspace-probe really 
targets your executable or .so binary file, and is not just a symbolic link.


Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Trying to understand use of lttng enable-event --kernel --userspace-probe=

2023-05-18 Thread Mathieu Desnoyers via lttng-dev


On 2023-05-18 14:58, Brian Hutchinson wrote:

On Thu, May 18, 2023 at 11:00 AM Brian Hutchinson  wrote:


On Thu, May 18, 2023 at 10:45 AM Mathieu Desnoyers
 wrote:


On 2023-05-18 10:10, Brian Hutchinson wrote:
[...]

I updated my hello world to have a function I'd like to use the
--userspace-probe method on with the very original name of
'probe_function':

#include <stdio.h>
#include <lttng/tracef.h>

void probe_function(int i);

int main(int argc, char *argv[])
{
 unsigned int i;
 puts("Hello, World!\nPress Enter to continue...");
 /*
  * The following getchar() call only exists for the purpose of this
  * demonstration, to pause the application in order for you to have
  * time to list its tracepoints. You don't need it otherwise.
  */
 getchar();

 lttng_ust_tracef("Number %d, string %s", 23, "hi there!");
 printf("Number %d, string %s", 23, "hi there!");

 for (i = 0; i < argc; i++) {
 lttng_ust_tracef("Number %d, argv %s", i, argv[i]);
 printf("Number %d, argv %s", i, argv[i]);
 }

 puts("Quitting now!");

 probe_function(i);

 return 0;
}

void probe_function(int i) {

 lttng_ust_tracef("Number %d, string %s", i * i, "i^2");
 printf("Number %d, string %s", i * i, "i^2");

}

... and I get the same error as before when I try to enable the probe:
# lttng enable-event --kernel
--userspace-probe=/usr/local/bin/hello:probe_function
Error: Missing event name(s).


As the error states, you are missing the event name. See

man 1 lttng-enable-event

 lttng [GENERAL OPTIONS] enable-event --kernel
   [--probe=SOURCE | --function=SOURCE | --syscall |
--userspace-probe=SOURCE]
   [--filter=EXPR] [--session=SESSION]
   [--channel=CHANNEL] EVENT[,EVENT]...


You will want something like:

lttng enable-event --kernel --userspace-probe=/usr/local/bin/hello:probe_function my_probe_function

Where "my_probe_function" is the event name that will appear in the
collected traces.


Wow!  I must not have woken up this morning ha, ha.  Thanks for that!
The event is enabled now.  Hope to actually get tracing data now.


Well, I guess we just have the app that thwarts all attempts at tracing.

I did a dynamic probe on several functions that should be getting
called like crazy and again I get no tracing data.

Tried it with my hello world example above after Mathieu set me
straight on the event syntax and it works.

I saw this comment in the documentation "As of this version, only USDT
probes that are not surrounded by a reference counter (semaphore) are
supported."

I don't know that I can say that this function I'm probing isn't
"surrounded" by a reference counter, it's in a large multi-threaded
application so I guess it's possible.

Sigh, I'm striking out every which way.

No offense (since this is lttng list - please don't flame me ... I
want/need lttng), but I think I'm going to try just straight kprobes
and uprobes and see if trace compass can show those traces in an
attempt to get "something/anything" working.


If you attach to an ELF symbol (function), then there is no USDT in 
play, so it should not be related to the issue you have.


But if your functions happen to be inlined, then there will be nothing 
to attach to. Perhaps this is what happens there ?


Mathieu



Regards,

Brian


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Trying to understand use of lttng enable-event --kernel --userspace-probe=

2023-05-18 Thread Mathieu Desnoyers via lttng-dev


On 2023-05-18 10:10, Brian Hutchinson wrote:
[...]

I updated my hello world to have a function I'd like to use the
--userspace-probe method on with the very original name of
'probe_function':

#include <stdio.h>
#include <lttng/tracef.h>

void probe_function(int i);

int main(int argc, char *argv[])
{
unsigned int i;
puts("Hello, World!\nPress Enter to continue...");
/*
 * The following getchar() call only exists for the purpose of this
 * demonstration, to pause the application in order for you to have
 * time to list its tracepoints. You don't need it otherwise.
 */
getchar();

lttng_ust_tracef("Number %d, string %s", 23, "hi there!");
printf("Number %d, string %s", 23, "hi there!");

for (i = 0; i < argc; i++) {
lttng_ust_tracef("Number %d, argv %s", i, argv[i]);
printf("Number %d, argv %s", i, argv[i]);
}

puts("Quitting now!");

probe_function(i);

return 0;
}

void probe_function(int i) {

lttng_ust_tracef("Number %d, string %s", i * i, "i^2");
printf("Number %d, string %s", i * i, "i^2");

}

... and I get the same error as before when I try to enable the probe:
# lttng enable-event --kernel
--userspace-probe=/usr/local/bin/hello:probe_function
Error: Missing event name(s).


As the error states, you are missing the event name. See

man 1 lttng-enable-event

   lttng [GENERAL OPTIONS] enable-event --kernel
 [--probe=SOURCE | --function=SOURCE | --syscall |
  --userspace-probe=SOURCE]
 [--filter=EXPR] [--session=SESSION]
 [--channel=CHANNEL] EVENT[,EVENT]...


You will want something like:

lttng enable-event --kernel --userspace-probe=/usr/local/bin/hello:probe_function my_probe_function

Where "my_probe_function" is the event name that will appear in the
collected traces.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Trying to understand use of lttng enable-event --kernel --userspace-probe=

2023-05-17 Thread Mathieu Desnoyers via lttng-dev


On 2023-05-17 12:37, Brian Hutchinson wrote:

On Wed, May 17, 2023 at 12:08 PM Mathieu Desnoyers
 wrote:


On 2023-05-16 22:11, Brian Hutchinson via lttng-dev wrote:

Hi,

I'm trying to figure out how to use uprobes with lttng.

I can't use a normal uprobe for a line number just using the address I
want to probe obtained from objdump?  As in:

echo 'p /usr/local/bin/my_app:0x2c3a8' >>
/sys/kernel/debug/tracing/uprobe_events

... which isn't a function entry, it's just a line of code I want to probe on.

This link says it has to be elf or sdt:
https://lttng.org/man/1/lttng-enable-event/v2.11/#doc-opt--userspace-probe

So can I not probe on just a line of code by specifying an address???

It doesn't look like these methods above will do what I'm wanting to
do.  I've tried to find examples of using enable-event --kernel
--userspace-probe= but there doesn't appear to be many.



There are examples here:

https://lttng.org/docs/v2.13/#doc-enabling-disabling-events

Indeed inserting a lttng-modules uprobe within functions is not
supported at the moment, mainly because we prefer to err towards safety
and don't have the validation in place to prevent corrupting the
program's instructions if an end user would try to insert a uprobe at an
address which is not an instruction boundary.


Hmm, was really hoping to be able to do dynamic tracing without having
to modify code.


uprobes with the proper validations about instruction boundaries would 
eventually provide this. Another approach we want to invest time in is 
to integrate libpatch from Olivier Dion into lttng-ust. This would 
provide dynamic instrumentation with the performance of a purely 
userspace tracer.


But those are all things that were never prioritized by any of our 
customers, so they progress at a "back burner" pace.




I guess if I add a function call to a debug statement or something at
the point I want to probe then I could use the elf example.


Yes.





So we only support inserting uprobe on functions and SDT probes at
the moment.


I've heard of system tap but never used it.  Will have to look into that.

I really want to get lttng-ust working but I'm getting pushback on the
time I'm spending trying to get it to work ... and would really like
to demonstrate something (was hoping kernel events and uprobes)
quickly to an audience that knows nothing about lttng or full stack
tracing to gain "buy in" for the effort.


Understood.

The main thing we are missing to help you on the UST front is a console 
log of the _application_ with LTTNG_UST_DEBUG=1. I suspect it is not 
collected in your tests.


Thanks,

Mathieu




You know, those pesky things called schedules.

Thanks,

Brian



Thanks,

Mathieu



Thanks,

Brian


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Trying to understand use of lttng enable-event --kernel --userspace-probe=

2023-05-17 Thread Mathieu Desnoyers via lttng-dev


On 2023-05-16 22:11, Brian Hutchinson via lttng-dev wrote:

Hi,

I'm trying to figure out how to use uprobes with lttng.

I can't use a normal uprobe for a line number just using the address I
want to probe obtained from objdump?  As in:

echo 'p /usr/local/bin/my_app:0x2c3a8' >>
/sys/kernel/debug/tracing/uprobe_events

... which isn't a function entry, it's just a line of code I want to probe on.

This link says it has to be elf or sdt:
https://lttng.org/man/1/lttng-enable-event/v2.11/#doc-opt--userspace-probe

So can I not probe on just a line of code by specifying an address???

It doesn't look like these methods above will do what I'm wanting to
do.  I've tried to find examples of using enable-event --kernel
--userspace-probe= but there doesn't appear to be many.



There are examples here:

https://lttng.org/docs/v2.13/#doc-enabling-disabling-events

Indeed inserting a lttng-modules uprobe within functions is not 
supported at the moment, mainly because we prefer to err towards safety 
and don't have the validation in place to prevent corrupting the 
program's instructions if an end user would try to insert a uprobe at an 
address which is not an instruction boundary.


So we only support inserting uprobe on functions and SDT probes at
the moment.

Thanks,

Mathieu



Thanks,

Brian


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


[lttng-dev] Tracing Summit 2023 Announcement and CFP

2023-05-15 Thread Mathieu Desnoyers via lttng-dev


Hello all!

This is a Call for Proposals for the Tracing Summit 2023 [0], which will
be held in Bilbao, Spain on the 17th and 18th of September, 2023. This
year, the event is co-located with Open Source Summit Europe 2023 [1].

- Event dates: Sunday, September 17th - Monday, September 18th
- Location: Bilbao, Spain and virtually (co-located with Open Source
  Summit Europe)
- Registration cost:
  - In person: $80.00 USD (Free for speakers)
  - Virtual: Free
- Call for proposals link: [2]

Important dates:
- Call for proposals close: Friday, June 16th, at 11:59PM EDT
- Call for proposals notifications: Friday, June 23rd
- Schedule announcement: Tuesday, June 27th
- Event dates: Sunday, September 17th - Monday, September 18th

Stand-alone registration is expected to open next week. In the meantime,
you can subscribe to the mailing list to get the latest information on
the event: [3]

The 2023 Tracing Summit is a two-day, single-track conference on the
topic of tracing. The event focuses on software and hardware tracing,
gathering developers and end-users of tracing and trace analysis tools.
The main goal of the Tracing Summit is to provide space for discussion
between people of the various areas that benefit from tracing, namely
parallel, distributed and/or real-time systems, as well as kernel
development.

We are welcoming 30-minute presentations from both end users and
developers, on topics covering, but not limited to:

- Investigation workflow of real-time, latency, and throughput issues,
- Trace collection and extraction,
- Trace filtering,
- Trace aggregation,
- Trace formats,
- Tracing multi-core systems,
- Trace abstraction,
- Trace modeling,
- Automated trace analysis (e.g. dependency analysis),
- Tracing large clusters and distributed systems,
- Hardware-level tracing (e.g. DSP, GPU, bare-metal),
- Trace visualization,
- Interaction between debugging and tracing,
- Tracing remote control,
- Analysis of large trace datasets,
- Cloud trace collection and analysis,
- Integration between trace tools,
- Live tracing & monitoring,
- Dynamic instrumentation,
- Programmable tracing (e.g. eBPF).

Talks can cover recently available technologies, ongoing work, and yet 
non-existing
technologies (that are compelling to end-users). Talks covering interesting or 
challenging
tracing use cases are also welcome as they can reveal future directions or 
tooling needs.

Please understand that this open forum is not the proper place to present sales 
or marketing
pitches, nor technologies which are prevented from being freely used in open 
source.

Please send any questions about this conference to .

This event is organized by EfficiOS on behalf of the Linux Foundation 
Diagnostic and
Monitoring Workgroup [4].

The organizers of this event are Mathieu Desnoyers (EfficiOS), Erica Bugden 
(EfficiOS)
and Olivier Dion (EfficiOS).

[0]: https://tracingsummit.org
[1]: https://events.linuxfoundation.org/open-source-summit-europe/
[2]: https://cfp.tracingsummit.org/ts2023/cfp
[3]: https://eepurl.com/goakfv
[4]: https://diamon.org/

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

Re: [lttng-dev] I'm still getting empty ust traces using tracef

2023-05-12 Thread Mathieu Desnoyers via lttng-dev


On 2023-05-12 10:52, Brian Hutchinson wrote:

Hi Mathieu,

On Fri, May 12, 2023 at 9:33 AM Mathieu Desnoyers
 wrote:


On 2023-05-12 00:10, Brian Hutchinson wrote:

Hmm, I missed this earlier somehow.

So, I'm not the greatest at updating OE and Yocto recipes.  I'm
currently using this recipe:
http://cgit.openembedded.org/openembedded-core/tree/meta/recipes-kernel/lttng/lttng-ust_2.13.5.bb?h=master

... and it looks like the commit you are talking about is newer.

I always think, oh, I'll just update the source URI in the recipe but
it's never that simple ... and there are patches in the recipe etc.

I've got a sdk (external toolchain) built for my embedded platform.
Would it be too hard to just download stable-2.13 of everything and
cross compile it outside of Yocto?

What do you suggest?

And do I need to do anything besides just get 2.13 stable working?  I
was kind of confused if I need to put a #define
LTTNG_TRACEPOINT_DEFINE somewhere in my code.  I'm not using any
tracepoint provider packages at this point.


Hi Brian,

You might want to provide a trimmed-down reproducer of your issue:
example .c compile unit instrumented with tracepoints, example .c
compile unit containing the tracepoint probes, and the log of the
console when this application is run with LTTNG_UST_DEBUG=1.


The code has two different areas where I'm trying to use tracef.  The
way the app is put together, each of these areas end up becoming
static libs that all get lumped together to make the final executable
(which is then linked with -llttng-ust and -ldl).

If I'm reading between the lines correctly with respect to the commit
you pointed out (that I'm missing), if I reduce the inclusion of
#include  to one instance (like with the hello world
that worked), I'm thinking the version I have might work.

I don't know how I could trim down the large multi threaded app I'm
trying to debug to share.

Another dynamic I should mention in full disclosure, the app in
question has been ported from a different OS and was on a single core
cpu.  The new host (imx8) is a quad core A53 and since the app wasn't
written for multicore, the CPUs are isolated and systemd is starting
the app on cpu 0, but once it's up it switches its affinity to cpu 1.
I don't know if that's a factor here or not, so just mentioning it.

I did try with LTTNG_UST_DEBUG=1 last night and it didn't put out much:

export LTTNG_UST_DEBUG=1
# systemctl start my_app


I suspect that because you run your application under systemctl, we are 
not seeing the console output from the application.


The console output below appears to come from liblttng-ust-ctl.so linked 
within lttng-sessiond/consumerd, not the application.


Can you find a way to run your application and capture the console output ?

Thanks,

Mathieu




#lttng create my_tc_trace --output=/tmp/my_tc_trace
Spawning a session daemon
libringbuffer-clients[711/711]
: LTT : ltt ring buffer client
"relay-metadata-mmap" init
(in lttng_ring_buffer_metadata_client_init() at
../../../lttng-ust-2.13.5/src/common/ringbuffer-clients/metadata-template.h:364)
libringbuffer-clients[711/711]: LTT : ltt ring buffer client
"relay-overwrite-mmap" init
(in lttng_ring_buffer_client_overwrite_init() at
../../../lttng-ust-2.13.5/src/common/ringbuffer-clients/template.h:826)
libringbuffer-clients[711/711]: LTT : ltt ring buffer client
"relay-overwrite-rt-mmap" init
(in lttng_ring_buffer_client_overwrite_rt_init() at
../../../lttng-ust-2.13.5/src/common/ringbuffer-clients/template.h:826)
libringbuffer-clients[711/711]: LTT : ltt ring buffer client
"relay-discard-mmap" init
(in lttng_ring_buffer_client_discard_init() at
../../../lttng-ust-2.13.5/src/common/ringbuffer-clients/template.h:826)
libringbuffer-clients[711/711]: LTT : ltt ring buffer client
"relay-discard-rt-mmap" init
(in lttng_ring_buffer_client_discard_rt_init() at
../../../lttng-ust-2.13.5/src/common/ringbuffer-clients/template.h:826)
[  179.384456] LTTng: Loaded modules v2.13.9 (Nordicité)
[  179.390366] LTTng: Experimental bitwise enum enabled.
libringbuffer-clients[711/711]: LTT : ltt ring buffer client
"relay-discard-rt-mmap" exit
(in lttng_ring_buffer_client_discard_rt_exit() at
../../../lttng-ust-2.13.5/src/common/ringbuffer-clients/template.h:833)
libringbuffer-clients[711/711]: LTT : ltt ring buffer client
"relay-discard-mmap" exit
(in lttng_ring_buffer_client_discard_exit() at
../../../lttng-ust-2.13.5/src/common/ringbuffer-clients/template.h:833)
libringbuffer-clients[711/711]: LTT : ltt ring buffer client
"relay-overwrite-rt-mmap" exit
(in lttng_ring_buffer_client_overwrite_rt_exit() at
../../../lttng-ust-2.13.5/src/common/ringbuffer-clients/template.h:833)
libringbuffer-clients[711/711]: LTT : ltt ring buffer client
"relay-overwrite-mmap" exit
(in lttng_ring_buffer_client_overwrite_exit() at
../../../lttng-ust-2.13.5/src/common/ringbuffer-clients/template.h:83

Re: [lttng-dev] I'm still getting empty ust traces using tracef

2023-05-12 Thread Mathieu Desnoyers via lttng-dev


[adding back the mailing list]

On 2023-05-12 09:33, Mathieu Desnoyers wrote:

On 2023-05-12 00:10, Brian Hutchinson wrote:

Hmm, I missed this earlier somehow.

So, I'm not the greatest at updating OE and Yocto recipes.  I'm
currently using this recipe:
http://cgit.openembedded.org/openembedded-core/tree/meta/recipes-kernel/lttng/lttng-ust_2.13.5.bb?h=master

... and it looks like the commit you are talking about is newer.

I always think, oh, I'll just update the source URI in the recipe but
it's never that simple ... and there are patches in the recipe etc.

I've got a sdk (external toolchain) built for my embedded platform.
Would it be too hard to just download stable-2.13 of everything and
cross compile it outside of Yocto?

What do you suggest?

And do I need to do anything besides just get 2.13 stable working?  I
was kind of confused if I need to put a #define
LTTNG_TRACEPOINT_DEFINE somewhere in my code.  I'm not using any
tracepoint provider packages at this point.


Hi Brian,

You might want to provide a trimmed-down reproducer of your issue: 
example .c compile unit instrumented with tracepoints, example .c 
compile unit containing the tracepoint probes, and the log of the 
console when this application is run with LTTNG_UST_DEBUG=1.






Thanks,

Mathieu



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] I'm still getting empty ust traces using tracef

2023-05-11 Thread Mathieu Desnoyers via lttng-dev


On 2023-05-11 14:13, Mathieu Desnoyers via lttng-dev wrote:

On 2023-05-11 12:36, Brian Hutchinson via lttng-dev wrote:

... more background.  I've always used ltt in the kernel so I don't
have much experience with the user side of it and especially
multi-threaded, multi-core so I'm probably missing some fundamental
concepts that I need to understand.


Which are the exact versions of LTTng-UST and LTTng-Tools you are using 
now ? (2.13.N or which git commit ?)




Also, can you try using lttng-ust stable-2.13 branch, which includes the 
following commit ?

commit be2ca8b563bab81be15cbce7b9f52422369f79f7
Author: Mathieu Desnoyers 
Date:   Tue Feb 21 14:29:49 2023 -0500

Fix: Reevaluate LTTNG_UST_TRACEPOINT_DEFINE each time tracepoint.h is 
included

Fix issues with missing symbols in use-cases where tracef.h is included

before defining LTTNG_UST_TRACEPOINT_DEFINE, e.g.:

 #include 

 #define LTTNG_UST_TRACEPOINT_DEFINE
 #include 

It is caused by the fact that tracef.h includes tracepoint.h in a

context which has LTTNG_UST_TRACEPOINT_DEFINE undefined, and this is not
re-evaluated for the following includes.

Fix this by lifting the definition code in tracepoint.h outside of the

header include guards, and #undef the old LTTNG_UST__DEFINE_TRACEPOINT
before re-defining it to its new semantic. Use a new
_LTTNG_UST_TRACEPOINT_DEFINE_ONCE include guard within the
LTTNG_UST_TRACEPOINT_DEFINE defined case to ensure symbols are not
duplicated.

Signed-off-by: Mathieu Desnoyers 

Change-Id: I0ef720435003a7ca0bfcf29d7bf27866c5ff8678

Thanks,

Mathieu



Thanks,

Mathieu



Regards,

Brian

On Thu, May 11, 2023 at 11:53 AM Brian Hutchinson 
 wrote:


Hi,

I posted a while ago (thread - Using lttng 2.11 and UST doesn't appear
to work - getting empty trace files) about this problem I'm having
with getting empty trace logs.

I've since upgraded to lttng v2.13 and while I can do a simple hello
world program with tracef and get events in the log files, my more
complicated large multi-threaded app I'm trying to debug is still
getting empty log file traces.

I can list the user space events in my app.

Next I do:

lttng enable-event --userspace 'lttng_ust_tracef:*'

... to enable the events, start lttng, start my app,  and I get a
trace directory structure that's empty.

I feel like I've read every thread in the archives about people having
the same problem.

I did try using LD_PRELOAD with various libs thinking that was the
problem but so far I'm still getting empty traces.

So far I've tried:

LD_PRELOAD=liblttng-ust-libc-wrapper.so.1:liblttng-ust-pthread-wrapper.so.1:liblttng-ust-dl.so.1:liblttng-ust-fork.so.1:liblttng-ust-fd.so.1
/usr/local/bin/my_app

I guess one question I have is how do I determine which "helper libs"
I need to preload?

The application I'm working on is made up of a bunch of smaller static
libs linked together into one big executable and that is linked with
-llttng-ust and -ldl.

I'm pretty stuck at the moment.  Anyone have any wisdom on what I
might be doing wrong or how I can tell why I'm not getting events in
the logs?

Thanks,

Brian





--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] I'm still getting empty ust traces using tracef

2023-05-11 Thread Mathieu Desnoyers via lttng-dev

On 2023-05-11 12:36, Brian Hutchinson via lttng-dev wrote:

... more background. I've always used ltt in the kernel so I don't
have much experience with the user side of it and especially
multi-threaded, multi-core so I'm probably missing some fundamental
concepts that I need to understand.

Which are the exact versions of LTTng-UST and LTTng-Tools you are using
now ? (2.13.N or which git commit ?)

Thanks,

Mathieu

Regards,

Brian

On Thu, May 11, 2023 at 11:53 AM Brian Hutchinson wrote:

Hi,

I posted a while ago (thread - Using lttng 2.11 and UST doesn't appear
to work - getting empty trace files) about this problem I'm having
with getting empty trace logs.

I've since upgraded to lttng v2.13 and while I can do a simple hello
world program with tracef and get events in the log files, my more
complicated large multi-threaded app I'm trying to debug is still
getting empty log file traces.

I can list the user space events in my app.

Next I do:

lttng enable-event --userspace 'lttng_ust_tracef:*'

... to enable the events, start lttng, start my app, and I get a
trace directory structure that's empty.

I feel like I've read every thread in the archives about people having
the same problem.

I did try using LD_PRELOAD with various libs thinking that was the
problem but so far I'm still getting empty traces.

So far I've tried:

LD_PRELOAD=liblttng-ust-libc-wrapper.so.1:liblttng-ust-pthread-wrapper.so.1:liblttng-ust-dl.so.1:liblttng-ust-fork.so.1:liblttng-ust-fd.so.1
/usr/local/bin/my_app

I guess one question I have is how do I determine which "helper libs"
I need to preload?

The application I'm working on is made up of a bunch of smaller static
libs linked together into one big executable and that is linked with
-llttng-ust and -ldl.

I'm pretty stuck at the moment. Anyone have any wisdom on what I
might be doing wrong or how I can tell why I'm not getting events in
the logs?

Thanks,

Brian


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] https://lists.lttng.org/pipermail/lttng-dev/2020-May/029631.html

2023-03-27 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-26 11:00, yashvardhan kukreti wrote:


Hi Mathieu,

I have a question about this patch for lttng-modules and the use of
register_kprobe() to fetch the function ptr.
The question in this regard is especially from PPC64 ELF_ABI_v1
perspective.

Functions on PPC64 are called through a function descriptor, whereas
what register_kprobe() returns is the entry point of the function.
Using the returned pointer therefore treats that address as the
address of a function descriptor, dereferences the instruction words
(ppc_inst) as if they were the function entry point, and crashes:

[ 4145.483594] kernel tried to execute exec-protected page
(7c0802a6fb81ffe0) - exploit attempt? (uid: 0)
here 7c0802a6 is the mfspr instruction from the code text section of
the kallsyms_lookup_name()
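
As a rough illustration of the failure described above (a sketch only, assuming a big-endian PPC64 ELFv1 target; the `sketch_` names are hypothetical and do not come from lttng-modules or the kernel): an ELFv1 function descriptor starts with a 64-bit entry address, so a load from the function text instead of a descriptor picks up two consecutive 32-bit instruction words as if they were that address.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ELFv1 function descriptor layout (three doublewords).
 * Shown for context; the names are illustrative.
 */
struct sketch_func_desc {
	uint64_t entry;	/* address of the first instruction */
	uint64_t toc;	/* TOC base used by the function */
	uint64_t env;	/* environment pointer */
};

/*
 * What a big-endian 64-bit load of the "entry" field yields when the
 * pointer actually targets code: the first two instruction words,
 * concatenated.
 */
uint64_t sketch_misread_entry(uint32_t insn0, uint32_t insn1)
{
	return ((uint64_t) insn0 << 32) | insn1;
}
```

With the first word being 7c0802a6, the value 7c0802a6fb81ffe0 from the kernel log is consistent with two instruction words being taken as a branch target.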

Note that for PPC_ELF_ABI_v1, register_kprobe() searches for the dot
variant of the symbol, and only if it cannot find the dot variant
does it look for the normal symbol.
register_kprobe() -> kprobe_addr() -> kprobe_lookup_name() [arch
variant replaces weak symbol]
https://elixir.bootlin.com/linux/v5.10.174/C/ident/kprobe_lookup_name 
<https://elixir.bootlin.com/linux/v5.10.174/C/ident/kprobe_lookup_name>

Please let me know if I make sense or if I may have missed something.

I have looked at the code of versions 2.12.8 and 2.12.3 of
lttng-modules.


Please have a look at commits (from stable-2.12 branch of lttng-modules):

commit 53772db24facd84f1f3ddcf21a1ef5f162608721
Author: He Zhe 
Date:   Tue Sep 27 15:59:42 2022 +0800

wrapper: powerpc64: fix kernel crash caused by do_get_kallsyms

commit 8fe888d86ccad4226b05a536efb73d71bb091062
Author: Michael Jeanson 
Date:   Thu Nov 24 14:25:33 2022 -0500

fix: kallsyms wrapper on ppc64el

I suspect you'll also need this change currently in review:

https://review.lttng.org/c/lttng-modules/+/9113

Please let us know if especially this last change fixes things on your side.

Thanks,

Mathieu




Regards,
Shashank



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] ThreadSanitizer: data race between urcu_mb_synchronize_rcu and urcu_adaptative_wake_up

2023-03-22 Thread Mathieu Desnoyers via lttng-dev

nqueue.
 */
	while ((next = CMM_LOAD_SHARED(node->next)) == NULL) {
		if (___cds_wfcq_busy_wait(&attempt, blocking))
			return CDS_WFCQ_WOULDBLOCK;
	}

	return next;
}

So the release semantic is provided by the implicit SEQ_CST barrier in:

___cds_wfcq_append():
  old_tail = uatomic_xchg(&tail->p, new_tail); (release)

and the acquire semantic is provided by the implicit SEQ_CST barrier in:

___cds_wfcq_splice():

/*
 * Memory barrier implied before uatomic_xchg() orders store to
 * src_q->head before store to src_q->tail. This is required by
 * concurrent enqueue on src_q, which exchanges the tail before
 * updating the previous tail's next pointer.
 */
tail = uatomic_xchg(&src_q_tail->p, &src_q_head->node);

Notice how the release/acquire semantic is provided by tail->p, which is 
atomically modified _before_ we set the node->next pointer.
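
The ordering described above can be sketched with the GCC `__atomic` builtins. This is a simplified illustration, not the liburcu implementation: the `sketch_` types and functions are hypothetical, the enqueue is split into two functions only to make the intermediate state observable, and the memory orders used on `->next` are placeholders rather than a statement about what the C11 model requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types: not the liburcu cds_wfcq structures. */
struct sketch_node {
	struct sketch_node *next;
};

struct sketch_queue {
	struct sketch_node head;	/* dummy first node */
	struct sketch_node *tail;
};

void sketch_init(struct sketch_queue *q)
{
	q->head.next = NULL;
	q->tail = &q->head;
}

/* Step 1 of an enqueue: atomically swing the tail to the new node. */
struct sketch_node *sketch_xchg_tail(struct sketch_queue *q,
		struct sketch_node *n)
{
	n->next = NULL;
	return __atomic_exchange_n(&q->tail, n, __ATOMIC_SEQ_CST);
}

/* Step 2 of an enqueue: link the previous tail to the new node. */
void sketch_link(struct sketch_node *old_tail, struct sketch_node *n)
{
	__atomic_store_n(&old_tail->next, n, __ATOMIC_RELAXED);
}

/* Reader side: NULL means an enqueue may still be in progress. */
struct sketch_node *sketch_peek_next(struct sketch_node *n)
{
	return __atomic_load_n(&n->next, __ATOMIC_RELAXED);
}
```

Between the two steps, the tail already points to the new node while the previous node's `next` is still NULL, which is the window the busy-wait loop quoted earlier has to cover.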

With this information, is there a specific annotation that would make sense ?

Thanks,

Mathieu




Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] ThreadSanitizer: data race between urcu_mb_synchronize_rcu and urcu_adaptative_wake_up

2023-03-22 Thread Mathieu Desnoyers via lttng-dev

he stack for other uses.

So somehow we should add an annotation about the lifetime of this object, which begins with 
DEFINE_URCU_WAIT_QUEUE() and ends right after 
"urcu_posix_assert(uatomic_read(&wait->state) & URCU_WAIT_TEARDOWN);".

Thanks,

Mathieu


 which lead me to the fact that

ThreadSanitizer doesn't intercept futex, but we can annotate the futexes:

https://groups.google.com/g/thread-sanitizer/c/T0G_NyyZ3s4

Oh boy...

Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] RCU API usage from call_rcu callbacks?

2023-03-22 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-22 07:08, Ondřej Surý via lttng-dev wrote:

Hi,

the documentation is pretty silent on this, and asking here is probably going 
to be faster
than me trying to use the source to figure this out.

Is it legal to call_rcu() from within the call_rcu() callback?


Yes. call_rcu callbacks can be chained.

Note that you'll need to issue rcu_barrier() on program exit as many times as 
you chained call_rcu callbacks if you intend to make sure no queued callbacks 
still exist on program clean shutdown. See this comment above 
urcu_call_rcu_exit():

 * Teardown the default call_rcu worker thread if there are no queued
 * callbacks on process exit. This prevents leaking memory.
 *
 * Here is how an application can ensure graceful teardown of this
 * worker thread:
 *
 * - An application queuing call_rcu callbacks should invoke
 *   rcu_barrier() before it exits.
 * - When chaining call_rcu callbacks, the number of calls to
 *   rcu_barrier() on application exit must match at least the maximum
 *   number of chained callbacks.
 * - If an application chains callbacks endlessly, it would have to be
 *   modified to stop chaining callbacks when it detects an application
 *   exit (e.g. with a flag), and wait for quiescence with rcu_barrier()
 *   after setting that flag.
 * - The statements above apply to a library which queues call_rcu
 *   callbacks, only it needs to invoke rcu_barrier in its library
 *   destructor.
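
To see why the number of rcu_barrier() calls has to follow the chaining depth, here is a toy model. It is not liburcu code: every `toy_` name is hypothetical, callbacks run synchronously instead of in a worker thread, and the only property modeled is that a barrier covers the callbacks queued before it started, not the ones those callbacks queue in turn.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PENDING 16

/* Hypothetical toy model, not the liburcu API. */
struct toy_cb {
	void (*func)(struct toy_cb *cb);
	int remaining;	/* how many more times this callback re-queues itself */
};

static struct toy_cb *toy_pending[TOY_MAX_PENDING];
static size_t toy_nr_pending;

static void toy_call_rcu(struct toy_cb *cb, void (*func)(struct toy_cb *cb))
{
	assert(toy_nr_pending < TOY_MAX_PENDING);
	cb->func = func;
	toy_pending[toy_nr_pending++] = cb;
}

/*
 * Model of a barrier: run only the callbacks queued before the barrier
 * started. Callbacks queued while it runs stay pending.
 */
static void toy_barrier(void)
{
	struct toy_cb *batch[TOY_MAX_PENDING];
	size_t i, n = toy_nr_pending;

	for (i = 0; i < n; i++)
		batch[i] = toy_pending[i];
	toy_nr_pending = 0;
	for (i = 0; i < n; i++)
		batch[i]->func(batch[i]);
}

/* A callback that chains itself while remaining > 0. */
static void toy_chained(struct toy_cb *cb)
{
	if (cb->remaining-- > 0)
		toy_call_rcu(cb, toy_chained);
}

/* Count the barriers needed to drain one callback that re-queues itself. */
size_t toy_barriers_needed(int nr_requeues)
{
	struct toy_cb cb = { .func = NULL, .remaining = nr_requeues };
	size_t barriers = 0;

	toy_nr_pending = 0;
	toy_call_rcu(&cb, toy_chained);
	while (toy_nr_pending > 0) {
		toy_barrier();
		barriers++;
	}
	return barriers;
}
```

In this model a callback that re-queues itself twice is only fully drained by the third barrier, which matches the rule quoted in the comment.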




What about the other RCU (and CDS) API calls?


They can be unless stated otherwise. For instance, rcu_barrier() cannot be 
called from a call_rcu worker thread.



How does that interact with create_call_rcu_data()?  I have  event loops and 
I am
initializing  1:1 call_rcu helper threads as I need to do some per-thread 
initialization
as some of the destroy-like functions use random numbers (don't ask).


As I recall, set_thread_call_rcu_data() will associate a call_rcu worker 
instance for the current thread. So all following call_rcu() invocations from 
that thread will be queued into this per-thread call_rcu queue, and handled by 
the call_rcu worker thread.

But I wonder why you inherently need this 1:1 mapping, rather than using the 
content of the structure containing the rcu_head to figure out which per-thread 
data should be used ?

If you manage to separate the context from the worker thread instances, then 
you could use per-cpu call_rcu worker threads, which will eventually scale even 
better when I integrate the liburcu call_rcu API with sys_rseq concurrency ids 
[1].



If it's legal to call_rcu() from call_rcu thread, which thread is going to be 
used?


The call_rcu invoked from the call_rcu worker thread will queue the call_rcu 
callback onto the queue handled by that worker thread. It does so by setting

  URCU_TLS(thread_call_rcu_data) = crdp;

early in call_rcu_thread(). So any chained call_rcu is handled by the same 
call_rcu worker thread doing the chaining, with the exception of teardown where 
the pending callbacks are moved to the default worker thread.

Thanks,

Mathieu

[1] 
https://lore.kernel.org/lkml/20221122203932.231377-1-mathieu.desnoy...@efficios.com/




Thank you,
Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] Fwd: how to disable local file writing in relayd?

2023-03-22 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-22 02:39, Yuan Bin via lttng-dev wrote:



  Can I disable local-file-writing in lttng-relayd to avoid the disk 
space overhead, only using it as a live viewer?




I am not sure why you bump this email thread. I already answered here. 
Perhaps you did not receive my reply ?


https://lists.lttng.org/pipermail/lttng-dev/2023-March/030358.html

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 2/7] Use gcc __atomic builtis for implementation

2023-03-21 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-20 15:38, Duncan Sands via lttng-dev wrote:

Hi Mathieu,

While OK for the general case, I would recommend that we immediately 
implement something more efficient on x86 32/64 which takes into 
account that __ATOMIC_ACQ_REL atomic operations are implemented with 
LOCK prefixed atomic ops, which imply the barrier already, leaving the 
before/after_uatomic_*() as no-ops.


maybe first check whether the GCC optimizers merge them.  I believe some 
optimizations of atomic primitives are allowed and implemented, but I 
couldn't say which ones.


Best wishes, Duncan.


Tested on godbolt.org with:

int a;

void fct(void)
{
        (void) __atomic_add_fetch(&a, 1, __ATOMIC_RELAXED);
        __atomic_thread_fence(__ATOMIC_SEQ_CST);
}

x86-64 gcc 12.2 -O2 -std=c11:

fct:
        lock add        DWORD PTR a[rip], 1
        lock or QWORD PTR [rsp], 0
        ret
a:
        .zero   4

x86-64 clang 16.0.0 -O2 -std=c11:

fct:                                    # @fct
        lock inc        dword ptr [rip + a]
        mfence
        ret
a:
        .long   0

So none of gcc/clang optimize this today, hence the need for an 
x86-specific implementation.
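
One possible shape for such an x86-specific variant is sketched below. The `sketch_` macro names are made up for illustration and are not the liburcu macros; the sketch assumes GCC or Clang, and assumes that a compiler-only barrier is what "no-op" would mean in practice, since the LOCK-prefixed instruction already orders memory at the hardware level.

```c
#include <assert.h>

/* Compiler-only barrier: emits no instruction. */
#define sketch_barrier()	__asm__ __volatile__ ("" : : : "memory")

#if defined(__i386__) || defined(__x86_64__)
/* x86: the LOCK-prefixed read-modify-write already implies a full barrier. */
#define sketch_mb__before_add()	sketch_barrier()
#define sketch_mb__after_add()	sketch_barrier()
#else
/* Generic fallback: explicit fence around the relaxed operation. */
#define sketch_mb__before_add()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
#define sketch_mb__after_add()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

#define sketch_add(addr, v) \
	(void) __atomic_add_fetch((addr), (v), __ATOMIC_RELAXED)

static int sketch_counter;

/* Increment with ordering requested on both sides; returns the new value. */
int sketch_ordered_increment(void)
{
	sketch_mb__before_add();
	sketch_add(&sketch_counter, 1);
	sketch_mb__after_add();
	return __atomic_load_n(&sketch_counter, __ATOMIC_RELAXED);
}
```

On x86 this should avoid the second fenced instruction seen in the compiler output above, while other architectures keep the explicit fence.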


Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 7/7] Fix: uatomic_or() need retyping to uintptr_t in rculfhash.c

2023-03-21 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-21 10:51, Ondřej Surý via lttng-dev wrote:

When adding REMOVED_FLAG to the pointers in the rculfhash
implementation, retype the generic pointer to unsigned long to fix the
following compiler error:


You will need to update the patch subject as well.

Thanks,

Mathieu



rculfhash.c:1201:2: error: address argument to atomic operation must be a 
pointer to integer ('struct cds_lfht_node **' invalid)
uatomic_or(&node->next, REMOVED_FLAG);
^
../include/urcu/uatomic.h:60:8: note: expanded from macro 'uatomic_or'
(void)__atomic_or_fetch((addr), (mask), __ATOMIC_RELAXED)
  ^ ~~
rculfhash.c:1444:3: error: address argument to atomic operation must be a 
pointer to integer ('struct cds_lfht_node **' invalid)
uatomic_or(&fini_bucket->next, REMOVED_FLAG);
^~~~
../include/urcu/uatomic.h:60:8: note: expanded from macro 'uatomic_or'
(void)__atomic_or_fetch((addr), (mask), __ATOMIC_RELAXED)
  ^ ~~

This was not a problem before because the way the uatomic_or was
implemented, but now we directly pass the addr to __atomic_or_fetch()
and the compiler doesn't like the implicit conversion from pointer to
pointer to integer.

Signed-off-by: Ondřej Surý 
---
  src/rculfhash.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/rculfhash.c b/src/rculfhash.c
index b456415..5292725 100644
--- a/src/rculfhash.c
+++ b/src/rculfhash.c
@@ -1198,7 +1198,7 @@ int _cds_lfht_del(struct cds_lfht *ht, unsigned long size,
 * Knowing which wins the race will be known after the garbage
 * collection phase, stay tuned!
 */
-   uatomic_or(&node->next, REMOVED_FLAG);
+   uatomic_or((unsigned long *)&node->next, REMOVED_FLAG);
/* We performed the (logical) deletion. */
  
  	/*

@@ -1441,7 +1441,7 @@ void remove_table_partition(struct cds_lfht *ht, unsigned long i,
dbg_printf("remove entry: order %lu index %lu hash %lu\n",
   i, j, j);
/* Set the REMOVED_FLAG to freeze the ->next for gc */
-   uatomic_or(&fini_bucket->next, REMOVED_FLAG);
+   uatomic_or((unsigned long *)&fini_bucket->next, REMOVED_FLAG);
_cds_lfht_gc_bucket(parent_bucket, fini_bucket);
}
ht->flavor->read_unlock();
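
The retyping done in the patch above can be reproduced in isolation. The following is a minimal sketch with hypothetical `sketch_` names and an assumed flag value in the low bit of the pointer; it assumes `unsigned long` has the same size as a pointer, which the static assertion checks.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_REMOVED_FLAG	(1UL << 0)	/* assumed flag value */

_Static_assert(sizeof(unsigned long) == sizeof(void *),
	"sketch assumes unsigned long is pointer-sized");

struct sketch_node {
	struct sketch_node *next;
};

/*
 * __atomic_or_fetch() wants a pointer to an integer type, so the
 * address of the pointer field is retyped before the call.
 */
void sketch_mark_removed(struct sketch_node *node)
{
	(void) __atomic_or_fetch((unsigned long *) &node->next,
			SKETCH_REMOVED_FLAG, __ATOMIC_RELAXED);
}

/* Read the field back as an integer, through the same retyped view. */
static unsigned long sketch_load_next(struct sketch_node *node)
{
	return __atomic_load_n((unsigned long *) &node->next,
			__ATOMIC_RELAXED);
}

int sketch_is_removed(struct sketch_node *node)
{
	return (sketch_load_next(node) & SKETCH_REMOVED_FLAG) != 0;
}

/* Recover the real successor by masking the flag out. */
struct sketch_node *sketch_next(struct sketch_node *node)
{
	return (struct sketch_node *)
		(sketch_load_next(node) & ~SKETCH_REMOVED_FLAG);
}
```

Passing `&node->next` without the cast is what Clang rejects in the error output quoted in the patch description.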


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 5/7] Replace the arch-specific memory barriers with __atomic builtins

2023-03-21 Thread Mathieu Desnoyers via lttng-dev

: : : "memory")
-
-#define cmm_mb()   membar_safe("#LoadLoad | #LoadStore | #StoreStore | #StoreLoad")
-#define cmm_rmb()  membar_safe("#LoadLoad")
-#define cmm_wmb()  membar_safe("#StoreStore")


Same comment as for ppc.


-
  #ifdef __cplusplus
  }
  #endif
diff --git a/include/urcu/arch/x86.h b/include/urcu/arch/x86.h
index 744f9f9..af4487d 100644
--- a/include/urcu/arch/x86.h
+++ b/include/urcu/arch/x86.h
@@ -46,44 +46,8 @@ extern "C" {
  /* For backwards compat */
  #define CONFIG_RCU_HAVE_FENCE 1
  
-#define cmm_mb()__asm__ __volatile__ ("mfence":::"memory")

-
-/*
- * Define cmm_rmb/cmm_wmb to "strict" barriers that may be needed when
- * using SSE or working with I/O areas.  cmm_smp_rmb/cmm_smp_wmb are
- * only compiler barriers, which is enough for general use.
- */
-#define cmm_rmb() __asm__ __volatile__ ("lfence":::"memory")
-#define cmm_wmb() __asm__ __volatile__ ("sfence"::: "memory")
-#define cmm_smp_rmb() cmm_barrier()
-#define cmm_smp_wmb() cmm_barrier()


Relying on the generic barrier for rmb and wmb would slow things down on 
x86, we may want to do like I suggest for ppc.



-
-#else
-
-/*
- * We leave smp_rmb/smp_wmb as full barriers for processors that do not have
- * fence instructions.
- *
- * An empty cmm_smp_rmb() may not be enough on old PentiumPro multiprocessor
- * systems, due to an erratum.  The Linux kernel says that "Even distro
- * kernels should think twice before enabling this", but for now let's
- * be conservative and leave the full barrier on 32-bit processors.  Also,
- * IDT WinChip supports weak store ordering, and the kernel may enable it
- * under our feet; cmm_smp_wmb() ceases to be a nop for these processors.
- */
-#if (CAA_BITS_PER_LONG == 32)
-#define cmm_mb()__asm__ __volatile__ ("lock; addl $0,0(%%esp)":::"memory")
-#define cmm_rmb()__asm__ __volatile__ ("lock; addl $0,0(%%esp)":::"memory")
-#define cmm_wmb()__asm__ __volatile__ ("lock; addl $0,0(%%esp)":::"memory")
-#else
-#define cmm_mb()__asm__ __volatile__ ("lock; addl $0,0(%%rsp)":::"memory")
-#define cmm_rmb()__asm__ __volatile__ ("lock; addl $0,0(%%rsp)":::"memory")
-#define cmm_wmb()__asm__ __volatile__ ("lock; addl $0,0(%%rsp)":::"memory")
-#endif


Removing this removes support for older i686 and for URCU_ARCH_K1OM 
(Xeon Phi). Do we intend to remove that support ?


Thanks,

Mathieu


  #endif
  
-#define caa_cpu_relax()	__asm__ __volatile__ ("rep; nop" : : : "memory")

-
  #define HAS_CAA_GET_CYCLES
  
  #define rdtscll(val)			  \

@@ -98,10 +62,10 @@ typedef uint64_t caa_cycles_t;
  
  static inline caa_cycles_t caa_get_cycles(void)

  {
-caa_cycles_t ret = 0;
+   caa_cycles_t ret = 0;
  
-rdtscll(ret);

-return ret;
+   rdtscll(ret);
+   return ret;
  }


This whitespace to tab cleanup should be moved to its own patch.

Thanks,

Mathieu

  
  /*


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 2/7] Use gcc __atomic builtis for implementation

2023-03-21 Thread Mathieu Desnoyers via lttng-dev

LAXED)
+
+#define uatomic_sub_return(addr, v) \
+   __atomic_sub_fetch((addr), (v), __ATOMIC_SEQ_CST)
+
+#define uatomic_sub(addr, v) \
+   (void)__atomic_sub_fetch((addr), (v), __ATOMIC_RELAXED)
+
+#define uatomic_and(addr, mask) \
+   (void)__atomic_and_fetch((addr), (mask), __ATOMIC_RELAXED)
+
+#define uatomic_or(addr, mask) \
+   (void)__atomic_or_fetch((addr), (mask), __ATOMIC_RELAXED)
+
+#define uatomic_inc(addr) (void)__atomic_add_fetch((addr), 1, __ATOMIC_RELAXED)
+#define uatomic_dec(addr) (void)__atomic_sub_fetch((addr), 1, __ATOMIC_RELAXED)
+
+#define cmm_smp_mb__before_uatomic_and()   __atomic_thread_fence(__ATOMIC_SEQ_CST)
+#define cmm_smp_mb__after_uatomic_and()    __atomic_thread_fence(__ATOMIC_SEQ_CST)
+#define cmm_smp_mb__before_uatomic_or()    __atomic_thread_fence(__ATOMIC_SEQ_CST)
+#define cmm_smp_mb__after_uatomic_or()     __atomic_thread_fence(__ATOMIC_SEQ_CST)
+#define cmm_smp_mb__before_uatomic_add()   __atomic_thread_fence(__ATOMIC_SEQ_CST)
+#define cmm_smp_mb__after_uatomic_add()    __atomic_thread_fence(__ATOMIC_SEQ_CST)
+#define cmm_smp_mb__before_uatomic_sub()   cmm_smp_mb__before_uatomic_add()
+#define cmm_smp_mb__after_uatomic_sub()    cmm_smp_mb__after_uatomic_add()
+#define cmm_smp_mb__before_uatomic_inc()   cmm_smp_mb__before_uatomic_add()
+#define cmm_smp_mb__after_uatomic_inc()    cmm_smp_mb__after_uatomic_add()
+#define cmm_smp_mb__before_uatomic_dec()   cmm_smp_mb__before_uatomic_add()
+#define cmm_smp_mb__after_uatomic_dec()    cmm_smp_mb__after_uatomic_add()
+
+#define cmm_smp_mb()   cmm_mb()
  
  #endif /* _URCU_UATOMIC_H */


[...]

Thanks,

Mathieu



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 1/7] Require __atomic builtins to build

2023-03-21 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-21 09:30, Ondřej Surý via lttng-dev wrote:

Add autoconf checks for all __atomic builtins that urcu require, and
adjust the gcc and clang versions in the README.md.

Signed-off-by: Ondřej Surý 
---
  README.md| 33 +
  configure.ac | 15 +++
  2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/README.md b/README.md
index ba5bb08..a65a07a 100644
--- a/README.md
+++ b/README.md
@@ -68,30 +68,15 @@ Should also work on:
  
  (more testing needed before claiming support for these OS).
  
-Linux ARM depends on running a Linux kernel 2.6.15 or better, GCC 4.4 or

-better.
-
-The C compiler used needs to support at least C99. The C++ compiler used
-needs to support at least C++11.
-
-The GCC compiler versions 3.3, 3.4, 4.0, 4.1, 4.2, 4.3, 4.4 and 4.5 are
-supported, with the following exceptions:
-
-  - GCC 3.3 and 3.4 have a bug that prevents them from generating volatile
-accesses to offsets in a TLS structure on 32-bit x86. These versions are
-therefore not compatible with `liburcu` on x86 32-bit
-(i386, i486, i586, i686).
-The problem has been reported to the GCC community:
-<http://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg281255.html>
-  - GCC 3.3 cannot match the "xchg" instruction on 32-bit x86 build.
-See <http://kerneltrap.org/node/7507>
-  - Alpha, ia64 and ARM architectures depend on GCC 4.x with atomic builtins
-support. For ARM this was introduced with GCC 4.4:
-<http://gcc.gnu.org/gcc-4.4/changes.html>.
-  - Linux aarch64 depends on GCC 5.1 or better because prior versions
-perform unsafe access to deallocated stack.
-
-Clang version 3.0 (based on LLVM 3.0) is supported.
+Linux ARM depends on running a Linux kernel 2.6.15 or better.
+
+The C compiler used needs to support at least C99 and __atomic
+builtins. The C++ compiler used needs to support at least C++11
+and __atomic builtins.
+
+The GCC compiler versions 4.7 or better are supported.
+
+Clang version 3.1 (based on LLVM 3.1) is supported.
  
  Glibc >= 2.4 should work but the older version we test against is
  currently 2.17.
diff --git a/configure.ac b/configure.ac
index 909cf1d..cb7ba18 100644
--- a/configure.ac
+++ b/configure.ac
@@ -198,6 +198,21 @@ AC_SEARCH_LIBS([clock_gettime], [rt], [
    AC_DEFINE([CONFIG_RCU_HAVE_CLOCK_GETTIME], [1], [clock_gettime() is detected.])
  ])
  
+# Require __atomic builtins
+AC_COMPILE_IFELSE(
+   [AC_LANG_PROGRAM(
+   [[int x, y;]],
+   [[__atomic_store_n(&x, 0, __ATOMIC_RELEASE);
+ __atomic_load_n(&x, __ATOMIC_CONSUME);
+ y = __atomic_exchange_n(&x, 1, __ATOMIC_ACQ_REL);
+ __atomic_compare_exchange_n(&x, &y, 0, 0, __ATOMIC_ACQ_REL, __ATOMIC_CONSUME);
+ __atomic_add_fetch(&x, 1, __ATOMIC_ACQ_REL);
+ __atomic_sub_fetch(&x, 1, __ATOMIC_ACQ_REL);
+ __atomic_and_fetch(&x, 0x01, __ATOMIC_ACQ_REL);
+ __atomic_or_fetch(&x, 0x01, __ATOMIC_ACQ_REL);
+ __atomic_thread_fence(__ATOMIC_ACQ_REL)]])],


I think we also want to test for __atomic_signal_fence here.
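As an illustration of the check with that addition, here is a minimal sketch of a C function exercising the same builtins plus __atomic_signal_fence(). The function name atomic_builtins_probe() is hypothetical and is not part of the patch; the real check lives inside AC_LANG_PROGRAM in configure.ac.

```c
#include <assert.h>

/*
 * Hypothetical probe function (not part of the patch): calls every
 * builtin the configure check lists, plus __atomic_signal_fence().
 */
static int atomic_builtins_probe(void)
{
	int x = 0, y = 0;

	__atomic_store_n(&x, 0, __ATOMIC_RELEASE);
	(void) __atomic_load_n(&x, __ATOMIC_CONSUME);
	y = __atomic_exchange_n(&x, 1, __ATOMIC_ACQ_REL);	/* y = 0, x = 1 */
	/* x (1) != y (0): the exchange fails and y is updated to 1. */
	(void) __atomic_compare_exchange_n(&x, &y, 0, 0,
			__ATOMIC_ACQ_REL, __ATOMIC_CONSUME);
	(void) __atomic_add_fetch(&x, 1, __ATOMIC_ACQ_REL);	/* x = 2 */
	(void) __atomic_sub_fetch(&x, 1, __ATOMIC_ACQ_REL);	/* x = 1 */
	(void) __atomic_and_fetch(&x, 0x01, __ATOMIC_ACQ_REL);	/* x = 1 */
	(void) __atomic_or_fetch(&x, 0x01, __ATOMIC_ACQ_REL);	/* x = 1 */
	__atomic_thread_fence(__ATOMIC_ACQ_REL);
	__atomic_signal_fence(__ATOMIC_ACQ_REL);	/* the suggested addition */
	assert(x == 1 && y == 1);
	return x + y;
}
```

If a compiler lacks any of these builtins, a translation unit like this one fails to build, which is the condition the configure check is meant to detect.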

Thanks,

Mathieu



+   [],
+   [AC_MSG_ERROR([The compiler does not support __atomic builtins])])
  
  ## ##

  ## Optional features selection ##


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 7/7] Experiment: Add explicit memory barrier in free_completion()

2023-03-21 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-21 10:48, Ondřej Surý wrote:

On 21. 3. 2023, at 15:46, Mathieu Desnoyers  
wrote:

On 2023-03-21 06:21, Ondřej Surý wrote:

On 20. 3. 2023, at 19:37, Mathieu Desnoyers  
wrote:

On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

FIXME: This is an experiment that adds an explicit memory barrier to
free_completion() in workqueue.c, so ThreadSanitizer knows it is OK to
free the resources.
Signed-off-by: Ondřej Surý 
---
  src/workqueue.c | 1 +
  1 file changed, 1 insertion(+)
diff --git a/src/workqueue.c b/src/workqueue.c
index 1039d72..f21907f 100644
--- a/src/workqueue.c
+++ b/src/workqueue.c
@@ -377,6 +377,7 @@ void free_completion(struct urcu_ref *ref)
   struct urcu_workqueue_completion *completion;
 completion = caa_container_of(ref, struct urcu_workqueue_completion, ref);
+ assert(!urcu_ref_get_unless_zero(&completion->ref));


Perhaps what we really want here is an ANNOTATE_UNPUBLISH_MEMORY_RANGE() of 
some sort ?

I guess?
My experience with TSAN tells me that you need some kind of memory barrier
when using acquire-release semantics and you do:
if (__atomic_sub_fetch(&obj->ref, 1, __ATOMIC_RELEASE) == 0) {
   /* __ATOMIC_ACQUIRE needed here */
   free(obj);
}
We ended up using the following code in BIND 9:
if (__atomic_sub_fetch(&obj->ref, 1, __ATOMIC_ACQ_REL) == 0) {
   free(obj);
}
So I am guessing that after the change of uatomic_sub_return() to __ATOMIC_ACQ_REL,
this patch should no longer be needed.


Actually we want __ATOMIC_SEQ_CST, which is even stronger than ACQ_REL.


Yeah, I think I already did that, but wrote the email before that. 
Nevertheless, my main
point was that it should not be needed anymore.


Agreed, let's see how it holds up to testing under TSAN. :)

Thanks,

Mathieu



Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 7/7] Experiment: Add explicit memory barrier in free_completion()

2023-03-21 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-21 06:21, Ondřej Surý wrote:

On 20. 3. 2023, at 19:37, Mathieu Desnoyers  
wrote:

On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

FIXME: This is an experiment that adds an explicit memory barrier to
free_completion() in workqueue.c, so ThreadSanitizer knows it is OK to
free the resources.
Signed-off-by: Ondřej Surý 
---
  src/workqueue.c | 1 +
  1 file changed, 1 insertion(+)
diff --git a/src/workqueue.c b/src/workqueue.c
index 1039d72..f21907f 100644
--- a/src/workqueue.c
+++ b/src/workqueue.c
@@ -377,6 +377,7 @@ void free_completion(struct urcu_ref *ref)
   struct urcu_workqueue_completion *completion;
 completion = caa_container_of(ref, struct urcu_workqueue_completion, ref);
+ assert(!urcu_ref_get_unless_zero(&completion->ref));


Perhaps what we really want here is an ANNOTATE_UNPUBLISH_MEMORY_RANGE() of 
some sort ?


I guess?

My experience with TSAN tells me that you need some kind of memory barrier
when using acquire-release semantics and you do:

if (__atomic_sub_fetch(&obj->ref, 1, __ATOMIC_RELEASE) == 0) {
   /* __ATOMIC_ACQUIRE needed here */
   free(obj);
}

We ended up using the following code in BIND 9:

if (__atomic_sub_fetch(&obj->ref, 1, __ATOMIC_ACQ_REL) == 0) {
   free(obj);
}

So I am guessing that after the change of uatomic_sub_return() to __ATOMIC_ACQ_REL,
this patch should no longer be needed.


Actually we want __ATOMIC_SEQ_CST, which is even stronger than ACQ_REL.
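To make the pattern under discussion concrete, here is a minimal sketch of a reference count whose final decrement uses __ATOMIC_SEQ_CST before freeing the object. The struct obj type and the obj_new(), obj_get() and obj_put() helpers are hypothetical names chosen for illustration; this is not the urcu_ref implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; not the urcu_ref type. */
struct obj {
	long ref;
	int payload;
};

static struct obj *obj_new(int payload)
{
	struct obj *o = malloc(sizeof(*o));

	assert(o != NULL);
	o->ref = 1;		/* the creator holds the first reference */
	o->payload = payload;
	return o;
}

static void obj_get(struct obj *o)
{
	/* Taking a reference needs atomicity only, so relaxed is enough. */
	(void) __atomic_add_fetch(&o->ref, 1, __ATOMIC_RELAXED);
}

/* Returns 1 if this call dropped the last reference and freed the object. */
static int obj_put(struct obj *o)
{
	/*
	 * SEQ_CST includes both release (earlier writes to *o are
	 * published) and acquire (the freeing thread observes them).
	 */
	if (__atomic_sub_fetch(&o->ref, 1, __ATOMIC_SEQ_CST) == 0) {
		free(o);
		return 1;
	}
	return 0;
}
```

The decrement itself carries the ordering here, so no separate fence is placed before free(); this mirrors the ACQ_REL variant quoted above, only with the stronger ordering.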

Thanks,

Mathieu



Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 6/7] Fix: uatomic_or() need retyping to uintptr_t in rculfhash.c

2023-03-21 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-21 10:44, Mathieu Desnoyers wrote:

On 2023-03-21 06:15, Ondřej Surý wrote:


On 20. 3. 2023, at 19:31, Mathieu Desnoyers 
 wrote:


On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

When adding REMOVED_FLAG to the pointers in the rculfhash
implementation, retype the generic pointer to uintptr_t to fix the
compiler error.


What is the compiler error ? I'm wondering whether the expected choice
to match the rest of this file's content would be to use "uintptr_t 
*" or "unsigned long *" ?


This is the error:

rculfhash.c:1201:2: error: address argument to atomic operation must be a pointer to integer ('struct cds_lfht_node **' invalid)
 uatomic_or(&node->next, REMOVED_FLAG);
 ^
../include/urcu/uatomic.h:60:8: note: expanded from macro 'uatomic_or'
 (void)__atomic_or_fetch((addr), (mask), __ATOMIC_RELAXED)
   ^ ~~
rculfhash.c:1444:3: error: address argument to atomic operation must be a pointer to integer ('struct cds_lfht_node **' invalid)
 uatomic_or(&fini_bucket->next, REMOVED_FLAG);
 ^~~~
../include/urcu/uatomic.h:60:8: note: expanded from macro 'uatomic_or'
 (void)__atomic_or_fetch((addr), (mask), __ATOMIC_RELAXED)
   ^ ~~

uintptr_t is defined as an "unsigned integer type capable of holding a 
pointer to void", while unsigned long is only guaranteed to be at least 32 bits wide.

I guess that works in practice, but using unsigned long to retype 
the pointers might blow up (thinking of x32, which I know little about, 
but it's kind of a hybrid architecture, isn't it?)


x32 uses 4 bytes for unsigned long, uintptr_t, and void * size. So even 
that architecture is OK with casting pointer to unsigned long.


I agree with you that uintptr_t is the semantically correct type, but it 
should come as a separate change across the urcu code base: currently 
there are many places where void * is cast to unsigned long to do 
bitwise operations.


I therefore recommend to use unsigned long here to stay similar to the 
rest of the code base, and keep the transition from unsigned long to 
uintptr_t for the future, as it is not an immediate issue we have to 
address.


I forgot to mention: you should add the compiler error to the commit 
message.


You should also explain why this was not an issue until now. It's 
probably related to the introduced use of __atomic builtins.


Thanks,

Mathieu



Thanks,

Mathieu




Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org





--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 6/7] Fix: uatomic_or() need retyping to uintptr_t in rculfhash.c

2023-03-21 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-21 06:15, Ondřej Surý wrote:



On 20. 3. 2023, at 19:31, Mathieu Desnoyers  
wrote:

On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

When adding REMOVED_FLAG to the pointers in the rculfhash
implementation, retype the generic pointer to uintptr_t to fix the
compiler error.


What is the compiler error ? I'm wondering whether the expected choice
to match the rest of this file's content would be to use "uintptr_t *" or "unsigned 
long *" ?


This is the error:

rculfhash.c:1201:2: error: address argument to atomic operation must be a pointer to integer ('struct cds_lfht_node **' invalid)
 uatomic_or(&node->next, REMOVED_FLAG);
 ^
../include/urcu/uatomic.h:60:8: note: expanded from macro 'uatomic_or'
 (void)__atomic_or_fetch((addr), (mask), __ATOMIC_RELAXED)
   ^ ~~
rculfhash.c:1444:3: error: address argument to atomic operation must be a pointer to integer ('struct cds_lfht_node **' invalid)
 uatomic_or(&fini_bucket->next, REMOVED_FLAG);
 ^~~~
../include/urcu/uatomic.h:60:8: note: expanded from macro 'uatomic_or'
 (void)__atomic_or_fetch((addr), (mask), __ATOMIC_RELAXED)
   ^ ~~

uintptr_t is defined as an "unsigned integer type capable of holding a pointer to 
void", while unsigned long is only guaranteed to be at least 32 bits wide.

I guess that works in practice, but using unsigned long to retype the 
pointers might blow up (thinking of x32, which I know little about, 
but it's kind of a hybrid architecture, isn't it?)


x32 uses 4 bytes for unsigned long, uintptr_t, and void * size. So even 
that architecture is OK with casting pointer to unsigned long.


I agree with you that uintptr_t is the semantically correct type, but it 
should come as a separate change across the urcu code base: currently 
there are many places where void * is cast to unsigned long to do 
bitwise operations.


I therefore recommend to use unsigned long here to stay similar to the 
rest of the code base, and keep the transition from unsigned long to 
uintptr_t for the future, as it is not an immediate issue we have to 
address.


Thanks,

Mathieu




Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 2/7] Use gcc __atomic builtis for implementation

2023-03-20 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-20 14:38, Mathieu Desnoyers via lttng-dev wrote:

On 2023-03-20 14:28, Ondřej Surý wrote:


On 20. 3. 2023, at 19:03, Mathieu Desnoyers 
 wrote:


In doc/uatomic-api.md, we document:

"```c
type uatomic_cmpxchg(type *addr, type old, type new);
```

An atomic read-modify-write operation that performs this
sequence of operations atomically: check if `addr` contains `old`.
If true, then replace the content of `addr` by `new`. Return the
value previously contained by `addr`. This function implies a full
memory barrier before and after the atomic operation."

This would map to a "__ATOMIC_ACQ_REL" semantic on cmpxchg failure
rather than "__ATOMIC_CONSUME".



From: https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

If desired is written into *ptr then true is returned and memory is 
affected according to the memory order specified by success_memorder. 
There are no restrictions on what memory order can be used here.


Otherwise, false is returned and memory is affected according to 
failure_memorder. This memory order cannot be __ATOMIC_RELEASE nor 
__ATOMIC_ACQ_REL. It also cannot be a stronger order than that 
specified by success_memorder.


I think it makes sense that the failure_memorder has the same memorder 
as uatomic_read(), but it definitely cannot be __ATOMIC_ACQ_REL - 
it's the same as with __atomic_load_n, where only the following are permitted:


The valid memory order variants are __ATOMIC_RELAXED, 
__ATOMIC_SEQ_CST, __ATOMIC_ACQUIRE, and __ATOMIC_CONSUME.


Based on my other reply, we want "SEQ_CST" rather than ACQ_REL everywhere.


And it _would_ make sense to use the same memorder on cmpxchg failure as 
uatomic_read if we were exposing a new API, but we are modifying an 
already exposed documented API, so I would stick to SEQ_CST for both 
cmpxchg success/failure.


If we want to expose a new cmpxchg_relaxed_failure with a relaxed 
memorder on failure that would be fine, but we cannot change the 
semantic that is already documented.


Thanks,

Mathieu



Thanks,

Mathieu



Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org





--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 2/7] Use gcc __atomic builtis for implementation

2023-03-20 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-20 14:28, Ondřej Surý wrote:



On 20. 3. 2023, at 19:03, Mathieu Desnoyers  
wrote:

In doc/uatomic-api.md, we document:

"```c
type uatomic_cmpxchg(type *addr, type old, type new);
```

An atomic read-modify-write operation that performs this
sequence of operations atomically: check if `addr` contains `old`.
If true, then replace the content of `addr` by `new`. Return the
value previously contained by `addr`. This function implies a full
memory barrier before and after the atomic operation."

This would map to a "__ATOMIC_ACQ_REL" semantic on cmpxchg failure
rather than "__ATOMIC_CONSUME".



From: https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html


If desired is written into *ptr then true is returned and memory is affected 
according to the memory order specified by success_memorder. There are no 
restrictions on what memory order can be used here.

Otherwise, false is returned and memory is affected according to 
failure_memorder. This memory order cannot be __ATOMIC_RELEASE nor 
__ATOMIC_ACQ_REL. It also cannot be a stronger order than that specified by 
success_memorder.


I think it makes sense that the failure_memorder has the same memorder as 
uatomic_read(), but it definitely cannot be __ATOMIC_ACQ_REL - it's the same as 
with __atomic_load_n, where only the following are permitted:


The valid memory order variants are __ATOMIC_RELAXED, __ATOMIC_SEQ_CST, 
__ATOMIC_ACQUIRE, and __ATOMIC_CONSUME.


Based on my other reply, we want "SEQ_CST" rather than ACQ_REL everywhere.

Thanks,

Mathieu



Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 7/7] Experiment: Add explicit memory barrier in free_completion()

2023-03-20 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

FIXME: This is an experiment that adds an explicit memory barrier to
free_completion() in workqueue.c, so ThreadSanitizer knows it is OK to
free the resources.

Signed-off-by: Ondřej Surý 
---
  src/workqueue.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/workqueue.c b/src/workqueue.c
index 1039d72..f21907f 100644
--- a/src/workqueue.c
+++ b/src/workqueue.c
@@ -377,6 +377,7 @@ void free_completion(struct urcu_ref *ref)
struct urcu_workqueue_completion *completion;
  
  	completion = caa_container_of(ref, struct urcu_workqueue_completion, ref);

+   assert(!urcu_ref_get_unless_zero(&completion->ref));


Perhaps what we really want here is an ANNOTATE_UNPUBLISH_MEMORY_RANGE() 
of some sort ?


Thanks,

Mathieu


free(completion);
  }
  


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 6/7] Fix: uatomic_or() need retyping to uintptr_t in rculfhash.c

2023-03-20 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

When adding REMOVED_FLAG to the pointers in the rculfhash
implementation, retype the generic pointer to uintptr_t to fix the
compiler error.


What is the compiler error ? I'm wondering whether the expected choice
to match the rest of this file's content would be to use "uintptr_t *" 
or "unsigned long *" ?


Thanks,

Mathieu



Signed-off-by: Ondřej Surý 
---
  src/rculfhash.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/rculfhash.c b/src/rculfhash.c
index b456415..863387e 100644
--- a/src/rculfhash.c
+++ b/src/rculfhash.c
@@ -1198,7 +1198,7 @@ int _cds_lfht_del(struct cds_lfht *ht, unsigned long size,
 * Knowing which wins the race will be known after the garbage
 * collection phase, stay tuned!
 */
-   uatomic_or(&node->next, REMOVED_FLAG);
+   uatomic_or((uintptr_t *)&node->next, REMOVED_FLAG);
/* We performed the (logical) deletion. */
  
  	/*

@@ -1441,7 +1441,7 @@ void remove_table_partition(struct cds_lfht *ht, unsigned long i,
    dbg_printf("remove entry: order %lu index %lu hash %lu\n",
       i, j, j);
    /* Set the REMOVED_FLAG to freeze the ->next for gc */
-   uatomic_or(&fini_bucket->next, REMOVED_FLAG);
+   uatomic_or((uintptr_t *)&fini_bucket->next, REMOVED_FLAG);
_cds_lfht_gc_bucket(parent_bucket, fini_bucket);
    }
    ht->flavor->read_unlock();


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 5/7] Use __atomic builtins to implement CMM_{LOAD, STORE}_SHARED

2023-03-20 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

Instead of using CMM_ACCESS_ONCE() with memory barriers, use __atomic
builtins with relaxed memory ordering to implement CMM_LOAD_SHARED() and
CMM_STORE_SHARED().

Signed-off-by: Ondřej Surý 
---
  include/urcu/system.h | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/urcu/system.h b/include/urcu/system.h
index faae390..99e7443 100644
--- a/include/urcu/system.h
+++ b/include/urcu/system.h
@@ -26,7 +26,7 @@
   * Identify a shared load. A cmm_smp_rmc() or cmm_smp_mc() should come
   * before the load.
   */
-#define _CMM_LOAD_SHARED(p)   CMM_ACCESS_ONCE(p)
+#define _CMM_LOAD_SHARED(p)   __atomic_load_n(&(p), __ATOMIC_RELAXED)
  
  /*

   * Load a data from shared memory, doing a cache flush if required.
@@ -42,7 +42,7 @@
   * Identify a shared store. A cmm_smp_wmc() or cmm_smp_mc() should
   * follow the store.
   */
-#define _CMM_STORE_SHARED(x, v)   __extension__ ({ CMM_ACCESS_ONCE(x) = (v); })
+#define _CMM_STORE_SHARED(x, v)   __atomic_store_n(&(x), (v), __ATOMIC_RELAXED)


__atomic_store_n() is void. _CMM_STORE_SHARED() should evaluate to (v) 
(unless we decide to change the semantic, which I would rather avoid).


Thanks,

Mathieu

  
  /*

   * Store v into x, where x is located in shared memory. Performs the
@@ -51,9 +51,8 @@
  #define CMM_STORE_SHARED(x, v)                          \
    __extension__                                       \
    ({                                                  \
-       __typeof__(x) _v = _CMM_STORE_SHARED(x, v);     \
+       _CMM_STORE_SHARED(x, v);                        \
        cmm_smp_wmc();                                  \
-       _v = _v;    /* Work around clang "unused result" */   \
    })
  
  #endif /* _URCU_SYSTEM_H */


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 4/7] Replace the internal pointer manipulation with __atomic builtins

2023-03-20 Thread Mathieu Desnoyers via lttng-dev

EREFERENCE_USE_VOLATILE, the user requires use of
- * volatile access to implement rcu_dereference rather than
- * memory_order_consume load from the C11/C++11 standards.
- *
   * This may improve performance on weakly-ordered architectures where
   * the compiler implements memory_order_consume as a
   * memory_order_acquire, which is stricter than required by the
@@ -83,35 +73,7 @@ extern "C" {
   * meets the 10-line criterion in LGPL, allowing this function to be
   * expanded directly in non-LGPL code.
   */
-
-#if !defined (URCU_DEREFERENCE_USE_VOLATILE) &&\
-   ((defined (__cplusplus) && __cplusplus >= 201103L) ||\
-   (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L))
-# define __URCU_DEREFERENCE_USE_ATOMIC_CONSUME
-#endif
-
-/*
- * If p is const (the pointer itself, not what it points to), using
- * __typeof__(p) would declare a const variable, leading to
- * -Wincompatible-pointer-types errors.  Using the statement expression
- * makes it an rvalue and gets rid of the const-ness.
- */
-#ifdef __URCU_DEREFERENCE_USE_ATOMIC_CONSUME
-# define _rcu_dereference(p) __extension__ ({                          \
-               __typeof__(__extension__ ({                             \
-                       __typeof__(p) __attribute__((unused)) _p0 = { 0 }; \
-                       _p0;                                            \
-               })) _p1;                                                \
-               __atomic_load(&(p), &_p1, __ATOMIC_CONSUME);            \
-               (_p1);                                                  \
-       })
-#else
-# define _rcu_dereference(p) __extension__ ({                          \
-               __typeof__(p) _p1 = CMM_LOAD_SHARED(p);                 \
-               cmm_smp_read_barrier_depends();                         \
-               (_p1);                                                  \
-       })
-#endif
+#define _rcu_dereference(p) _rcu_get_pointer(&(p))
  
  /**

   * _rcu_cmpxchg_pointer - same as rcu_assign_pointer, but tests if the pointer
@@ -126,12 +88,12 @@ extern "C" {
   * meets the 10-line criterion in LGPL, allowing this function to be
   * expanded directly in non-LGPL code.
   */
-#define _rcu_cmpxchg_pointer(p, old, _new) \
-   __extension__   \
-   ({  \
-   __typeof__(*p) _pold = (old);   \
-   __typeof__(*p) _pnew = (_new);  \
-   uatomic_cmpxchg(p, _pold, _pnew);   \
+#define _rcu_cmpxchg_pointer(p, old, _new)                             \
+       ({                                                              \
+               __typeof__(*(p)) __old = old;                           \
+               __atomic_compare_exchange_n(p, &__old, _new, 0,         \
+                               __ATOMIC_ACQ_REL, __ATOMIC_CONSUME);    \


__ATOMIC_SEQ_CST on both success and failure.


+               __old;                                                  \
})
  
  /**

@@ -145,22 +107,11 @@ extern "C" {
   * meets the 10-line criterion in LGPL, allowing this function to be
   * expanded directly in non-LGPL code.
   */
-#define _rcu_xchg_pointer(p, v)\
-   __extension__   \
-   ({  \
-   __typeof__(*p) _pv = (v);   \
-   uatomic_xchg(p, _pv);   \
-   })
-
+#define _rcu_xchg_pointer(p, v) \
+   __atomic_exchange_n(p, v, __ATOMIC_ACQ_REL)


__ATOMIC_SEQ_CST.

  
-#define _rcu_set_pointer(p, v)\
-   do {\
-   __typeof__(*p) _pv = (v);   \
-   if (!__builtin_constant_p(v) || \
-   ((v) != NULL))  \
-   cmm_wmb();  \
-   uatomic_set(p, _pv);\
-   } while (0)
+#define _rcu_set_pointer(p, v) \
+   __atomic_store_n(p, v, __ATOMIC_RELEASE)


OK.
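Taken together, the review comments on this patch amount to a release store for publishing and a sequentially consistent exchange. A minimal sketch under those assumptions is shown below; struct cfg and the sketch_*() functions are hypothetical and stand in for the _rcu_set_pointer(), _rcu_xchg_pointer() and _rcu_get_pointer() macros rather than reproducing them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical RCU-protected payload. */
struct cfg {
	int value;
};

/* Publish v: the release store orders earlier writes to *v before it. */
static void sketch_set_pointer(struct cfg **p, struct cfg *v)
{
	assert(p != NULL);
	__atomic_store_n(p, v, __ATOMIC_RELEASE);
}

/* Swap in v and return the previous pointer, with full ordering. */
static struct cfg *sketch_xchg_pointer(struct cfg **p, struct cfg *v)
{
	assert(p != NULL);
	return __atomic_exchange_n(p, v, __ATOMIC_SEQ_CST);
}

/* Reader side: a consume load of the published pointer. */
static struct cfg *sketch_get_pointer(struct cfg **p)
{
	assert(p != NULL);
	return __atomic_load_n(p, __ATOMIC_CONSUME);
}
```

Real functions are used instead of macros only to keep the sketch type-checked; the macros in the patch stay generic over the pointer type.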

Thanks,

Mathieu

  
  /**

   * _rcu_assign_pointer - assign (publicize) a pointer to a new data structure
@@ -178,7 +129,7 @@ extern "C" {
   * meets the 10-line criterion in LGPL, allowing this function to be
   * expanded directly in non-LGPL code.
   */
-#define _rcu_assign

Re: [lttng-dev] [PATCH 3/7] Use __atomic_thread_fence() for cmm_barrier()

2023-03-20 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-20 14:06, Mathieu Desnoyers via lttng-dev wrote:

On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

Use __atomic_thread_fence(__ATOMIC_ACQ_REL) for cmm_barrier(), so
ThreadSanitizer can understand the memory synchronization.


You should update the patch subject and commit message to replace 
"thread" by "signal".




FIXME: What should be the correct memory ordering here?


ACQ_REL is what we want here, I think this is fine. We want to prevent
the compiler from reordering loads/stores across the fence, but don't
want any barrier instructions issued.


We should probably make it SEQ_CST here as well, even though I doubt it 
changes anything in this very particular case of atomic_signal_fence.


Thanks,

Mathieu



Thanks,

Mathieu



Signed-off-by: Ondřej Surý 
---
  include/urcu/compiler.h | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/urcu/compiler.h b/include/urcu/compiler.h
index 2f32b38..ede909f 100644
--- a/include/urcu/compiler.h
+++ b/include/urcu/compiler.h
@@ -28,7 +28,8 @@
  #define caa_likely(x)    __builtin_expect(!!(x), 1)
  #define caa_unlikely(x)    __builtin_expect(!!(x), 0)
-#define    cmm_barrier()    __asm__ __volatile__ ("" : : : "memory")
+/* FIXME: What would be a correct memory ordering here? */
+#define    cmm_barrier()    __atomic_signal_fence(__ATOMIC_ACQ_REL)
  /*
   * Instruct the compiler to perform only a single access to a variable




--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 2/7] Use gcc __atomic builtis for implementation

2023-03-20 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-20 14:03, Mathieu Desnoyers via lttng-dev wrote:

On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

Replace the custom assembly code in include/urcu/uatomic/ with __atomic
builtins provided by C11-compatible compiler.


[...]

+#define UATOMIC_HAS_ATOMIC_BYTE
+#define UATOMIC_HAS_ATOMIC_SHORT
+
+#define uatomic_set(addr, v) __atomic_store_n(addr, v, __ATOMIC_RELEASE)
+
+#define uatomic_read(addr) __atomic_load_n((addr), __ATOMIC_CONSUME)
+
+#define uatomic_xchg(addr, v) __atomic_exchange_n((addr), (v), __ATOMIC_ACQ_REL)

+
+#define uatomic_cmpxchg(addr, old, new) \
+    ({    \
+    __typeof__(*(addr)) __old = old;    \
+    __atomic_compare_exchange_n(addr, &__old, new, 0,    \
+    __ATOMIC_ACQ_REL, __ATOMIC_CONSUME);\




Actually, I suspect we'd want to change __ATOMIC_ACQ_REL to 
__ATOMIC_SEQ_CST everywhere, because we want total order.


Thanks,

Mathieu


In doc/uatomic-api.md, we document:

"```c
type uatomic_cmpxchg(type *addr, type old, type new);
```

An atomic read-modify-write operation that performs this
sequence of operations atomically: check if `addr` contains `old`.
If true, then replace the content of `addr` by `new`. Return the
value previously contained by `addr`. This function implies a full
memory barrier before and after the atomic operation."

This would map to a "__ATOMIC_ACQ_REL" semantic on cmpxchg failure
rather than "__ATOMIC_CONSUME".


+    __old;    \
+    })
+
+#define uatomic_add_return(addr, v) \
+    __atomic_add_fetch((addr), (v), __ATOMIC_ACQ_REL)
+
+#define uatomic_add(addr, v) \
+    (void)__atomic_add_fetch((addr), (v), __ATOMIC_RELAXED)
+
+#define uatomic_sub_return(addr, v) \
+    __atomic_sub_fetch((addr), (v), __ATOMIC_ACQ_REL)
+
+#define uatomic_sub(addr, v) \
+    (void)__atomic_sub_fetch((addr), (v), __ATOMIC_RELAXED)
+
+#define uatomic_and(addr, mask) \
+    (void)__atomic_and_fetch((addr), (mask), __ATOMIC_RELAXED)
+
+#define uatomic_or(addr, mask)    \
+    (void)__atomic_or_fetch((addr), (mask), __ATOMIC_RELAXED)
+
+#define uatomic_inc(addr) (void)__atomic_add_fetch((addr), 1, __ATOMIC_RELAXED)
+#define uatomic_dec(addr) (void)__atomic_sub_fetch((addr), 1, __ATOMIC_RELAXED)

+
+#define cmm_smp_mb__before_uatomic_and()   __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__after_uatomic_and()    __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__before_uatomic_or()    __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__after_uatomic_or()     __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__before_uatomic_add()   __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__after_uatomic_add()    __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__before_uatomic_sub()   cmm_smp_mb__before_uatomic_add()
+#define cmm_smp_mb__after_uatomic_sub()    cmm_smp_mb__after_uatomic_add()
+#define cmm_smp_mb__before_uatomic_inc()   cmm_smp_mb__before_uatomic_add()
+#define cmm_smp_mb__after_uatomic_inc()    cmm_smp_mb__after_uatomic_add()
+#define cmm_smp_mb__before_uatomic_dec()   cmm_smp_mb__before_uatomic_add()
+#define cmm_smp_mb__after_uatomic_dec()    cmm_smp_mb__after_uatomic_add()

+
+#define cmm_smp_mb()    cmm_mb()


While OK for the general case, I would recommend that we immediately 
implement something more efficient on x86 32/64 which takes into account 
that __ATOMIC_ACQ_REL atomic operations are implemented with LOCK 
prefixed atomic ops, which imply the barrier already, leaving the 
before/after_uatomic_*() as no-ops.


Thanks,

Mathieu



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 3/7] Use __atomic_thread_fence() for cmm_barrier()

2023-03-20 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

Use __atomic_thread_fence(__ATOMIC_ACQ_REL) for cmm_barrier(), so
ThreadSanitizer can understand the memory synchronization.


You should update the patch subject and commit message to replace 
"thread" by "signal".




FIXME: What should be the correct memory ordering here?


ACQ_REL is what we want here, I think this is fine. We want to prevent
the compiler from reordering loads/stores across the fence, but don't
want any barrier instructions issued.

Thanks,

Mathieu



Signed-off-by: Ondřej Surý 
---
  include/urcu/compiler.h | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/urcu/compiler.h b/include/urcu/compiler.h
index 2f32b38..ede909f 100644
--- a/include/urcu/compiler.h
+++ b/include/urcu/compiler.h
@@ -28,7 +28,8 @@
  #define caa_likely(x) __builtin_expect(!!(x), 1)
  #define caa_unlikely(x)   __builtin_expect(!!(x), 0)
  
-#define cmm_barrier()   __asm__ __volatile__ ("" : : : "memory")
+/* FIXME: What would be a correct memory ordering here? */
+#define cmm_barrier()   __atomic_signal_fence(__ATOMIC_ACQ_REL)
  
  /*

   * Instruct the compiler to perform only a single access to a variable


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] [PATCH 2/7] Use gcc __atomic builtis for implementation

2023-03-20 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-17 17:37, Ondřej Surý via lttng-dev wrote:

Replace the custom assembly code in include/urcu/uatomic/ with __atomic
builtins provided by C11-compatible compiler.


[...]

+#define UATOMIC_HAS_ATOMIC_BYTE
+#define UATOMIC_HAS_ATOMIC_SHORT
+
+#define uatomic_set(addr, v) __atomic_store_n(addr, v, __ATOMIC_RELEASE)
+
+#define uatomic_read(addr) __atomic_load_n((addr), __ATOMIC_CONSUME)
+
+#define uatomic_xchg(addr, v) __atomic_exchange_n((addr), (v), __ATOMIC_ACQ_REL)
+
+#define uatomic_cmpxchg(addr, old, new) \
+   ({                                                          \
+       __typeof__(*(addr)) __old = old;                        \
+       __atomic_compare_exchange_n(addr, &__old, new, 0,       \
+               __ATOMIC_ACQ_REL, __ATOMIC_CONSUME);            \


In doc/uatomic-api.md, we document:

"```c
type uatomic_cmpxchg(type *addr, type old, type new);
```

An atomic read-modify-write operation that performs this
sequence of operations atomically: check if `addr` contains `old`.
If true, then replace the content of `addr` by `new`. Return the
value previously contained by `addr`. This function implies a full
memory barrier before and after the atomic operation."

This would map to a "__ATOMIC_ACQ_REL" semantic on cmpxchg failure
rather than "__ATOMIC_CONSUME".


+   __old;                                                      \
+   })
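
A side note on the failure ordering discussed above, as a sketch rather than the liburcu implementation: GCC documents that the failure memory order passed to `__atomic_compare_exchange_n` cannot be `__ATOMIC_RELEASE` nor `__ATOMIC_ACQ_REL`. One way a macro could aim for the documented "full memory barrier before and after" behaviour is to request `__ATOMIC_SEQ_CST` on both paths. The `sketch_` names below are hypothetical, and whether `__ATOMIC_SEQ_CST` is strong enough on every architecture is left open here.

```c
#include <assert.h>

/*
 * Hypothetical macro (not the liburcu implementation): compare-and-exchange
 * that returns the previous value, with __ATOMIC_SEQ_CST requested on both
 * the success and the failure path.
 */
#define sketch_cmpxchg(addr, old, new)					\
	__extension__ ({						\
		__typeof__(*(addr)) __old = (old);			\
		__atomic_compare_exchange_n((addr), &__old, (new), 0,	\
				__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);	\
		__old;							\
	})

/* Exercises both paths; returns the final value stored in the variable. */
static long sketch_cmpxchg_demo(void)
{
	long v = 5;
	long prev;

	prev = sketch_cmpxchg(&v, 5, 7);	/* match: v becomes 7 */
	assert(prev == 5);
	prev = sketch_cmpxchg(&v, 5, 9);	/* mismatch: v is left at 7 */
	assert(prev == 7);
	return v;
}
```

On a mismatch the builtin writes the current content of `*addr` into `__old`, which is why the macro can return `__old` on both paths.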
+
+#define uatomic_add_return(addr, v) \
+   __atomic_add_fetch((addr), (v), __ATOMIC_ACQ_REL)
+
+#define uatomic_add(addr, v) \
+   (void)__atomic_add_fetch((addr), (v), __ATOMIC_RELAXED)
+
+#define uatomic_sub_return(addr, v) \
+   __atomic_sub_fetch((addr), (v), __ATOMIC_ACQ_REL)
+
+#define uatomic_sub(addr, v) \
+   (void)__atomic_sub_fetch((addr), (v), __ATOMIC_RELAXED)
+
+#define uatomic_and(addr, mask) \
+   (void)__atomic_and_fetch((addr), (mask), __ATOMIC_RELAXED)
+
+#define uatomic_or(addr, mask) \
+   (void)__atomic_or_fetch((addr), (mask), __ATOMIC_RELAXED)
+
+#define uatomic_inc(addr) (void)__atomic_add_fetch((addr), 1, __ATOMIC_RELAXED)
+#define uatomic_dec(addr) (void)__atomic_sub_fetch((addr), 1, __ATOMIC_RELAXED)
+
+#define cmm_smp_mb__before_uatomic_and()   __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__after_uatomic_and()    __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__before_uatomic_or()    __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__after_uatomic_or()     __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__before_uatomic_add()   __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__after_uatomic_add()    __atomic_thread_fence(__ATOMIC_ACQ_REL)
+#define cmm_smp_mb__before_uatomic_sub()   cmm_smp_mb__before_uatomic_add()
+#define cmm_smp_mb__after_uatomic_sub()    cmm_smp_mb__after_uatomic_add()
+#define cmm_smp_mb__before_uatomic_inc()   cmm_smp_mb__before_uatomic_add()
+#define cmm_smp_mb__after_uatomic_inc()    cmm_smp_mb__after_uatomic_add()
+#define cmm_smp_mb__before_uatomic_dec()   cmm_smp_mb__before_uatomic_add()
+#define cmm_smp_mb__after_uatomic_dec()    cmm_smp_mb__after_uatomic_add()
+
+#define cmm_smp_mb()   cmm_mb()


While OK for the general case, I would recommend that we immediately 
implement something more efficient on x86 32/64 which takes into account 
that __ATOMIC_ACQ_REL atomic operations are implemented with LOCK 
prefixed atomic ops, which imply the barrier already, leaving the 
before/after_uatomic_*() as no-ops.
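
A minimal sketch of that x86 special case, assuming the compiler emits a LOCK-prefixed instruction for the relaxed add. All `sketch_` names are hypothetical. The "no-op" is kept as a compiler-only barrier here so the compiler itself does not move accesses across the relaxed operation; that choice is an assumption of this sketch, not something stated in the thread.

```c
#include <assert.h>

/* Hypothetical name; relaxed add as in the patch under review. */
#define sketch_uatomic_add(addr, v) \
	((void) __atomic_add_fetch((addr), (v), __ATOMIC_RELAXED))

#if defined(__i386__) || defined(__x86_64__)
/*
 * Assumption: the add is emitted as a LOCK-prefixed instruction, so no
 * fence instruction is needed. Only a compiler barrier remains.
 */
#define sketch_mb__before_uatomic_add()	__atomic_signal_fence(__ATOMIC_SEQ_CST)
#define sketch_mb__after_uatomic_add()	__atomic_signal_fence(__ATOMIC_SEQ_CST)
#else
/* Generic case: thread fence, as in the patch under review. */
#define sketch_mb__before_uatomic_add()	__atomic_thread_fence(__ATOMIC_ACQ_REL)
#define sketch_mb__after_uatomic_add()	__atomic_thread_fence(__ATOMIC_ACQ_REL)
#endif

static unsigned long sketch_counter;

/* Adds `v` to the counter with barriers on both sides; returns the new value. */
static unsigned long sketch_ordered_add(unsigned long v)
{
	sketch_mb__before_uatomic_add();
	sketch_uatomic_add(&sketch_counter, v);
	sketch_mb__after_uatomic_add();
	return __atomic_load_n(&sketch_counter, __ATOMIC_RELAXED);
}

/* Single-threaded self-check of the sketch. */
static void sketch_ordered_add_check(void)
{
	assert(sketch_ordered_add(3) == 3);
	assert(sketch_ordered_add(4) == 7);
}
```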


Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] userspace-rcu and ThreadSanitizer

2023-03-17 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-17 13:02, Ondřej Surý wrote:

On 17. 3. 2023, at 14:44, Mathieu Desnoyers wrote:

I would indeed like to remove all the custom atomics assembly code from liburcu
now that there is good atomics support in the major compilers (gcc and clang).


Here's very preliminary implementation:

https://gitlab.isc.org/isc-projects/userspace-rcu/-/merge_requests/2

I just did something wrong somewhere along the path and it doesn't compile now,
but it did for me locally.

I am submitting this now as it's 18:00 Friday evening and my kids are starting
to be angry at me :).

This will need some more work - I think some of the cmm_ macros might be dropped
now, and somebody who does that more often than I should take a look at the
memory orderings.


A few comments:

cmm_barrier() should rather be __atomic_signal_fence().

Also I notice this macro pattern (coding style):

#define uatomic_set(addr, v) __atomic_store_n((addr), (v), __ATOMIC_RELEASE)

The extra parentheses for parameters are not needed, because the comma is pretty
much the last operator in terms of priority. The following would be preferred
specifically because those are separated by comma:

#define uatomic_set(addr, v) __atomic_store_n(addr, v, __ATOMIC_RELEASE)

Our memory barrier semantics are similar to the Linux kernel, where the following
imply ACQ_REL because they return something: cmpxchg, add_return, sub_return,
xchg.

The rest (add, sub, and, or, inc, dec) are __ATOMIC_RELAXED. Note that
cmm_smp_mb__before/after_uatomic_*() need to be implemented as
__atomic_thread_fence(__ATOMIC_ACQ_REL).

There are some architectures where we will want to keep a specialized version
of those add, sub, and, or, inc, dec operations which include the ACQ_REL
semantic, e.g. x86, where this is implied by the LOCK prefix. For those the
cmm_smp_mb__before/after_uatomic_*() will be no-ops.

The CMM_STORE_SHARED is not meant to have a RELEASE semantic. It is meant to
update variables that don't need the release ordering. The __ATOMIC_CONSUME was
not the intent at the CMM_LOAD_SHARED level either.
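
For illustration only, one possible reading of "no release, no consume" for shared loads and stores is a pair of relaxed accessors. The `sketch_` names are hypothetical and this is not how liburcu defines CMM_STORE_SHARED or CMM_LOAD_SHARED; it only shows what dropping the ordering looks like with the `__atomic` builtins.

```c
#include <assert.h>

/*
 * Hypothetical relaxed accessors: each access is performed as a single
 * atomic load or store, but without release or consume ordering.
 */
#define sketch_store_shared(x, v)	__atomic_store_n(&(x), (v), __ATOMIC_RELAXED)
#define sketch_load_shared(x)		__atomic_load_n(&(x), __ATOMIC_RELAXED)

static int sketch_flag;

/* Stores a value with no ordering guarantee and reads it back. */
static int sketch_set_and_get(int v)
{
	sketch_store_shared(sketch_flag, v);
	return sketch_load_shared(sketch_flag);
}

/* Single-threaded self-check of the sketch. */
static void sketch_shared_check(void)
{
	assert(sketch_set_and_get(42) == 42);
}
```

Code that needs to publish a pointer to initialized data would still pair such accesses with an explicit barrier or use a primitive that carries the ordering.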

(this is just from looking around at the patches, it would be better if we can
have the patches posted to the mailing list for further discussion)

Thanks!

Mathieu




Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] userspace-rcu and ThreadSanitizer

2023-03-17 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-17 11:50, Ondřej Surý wrote:

On 17. 3. 2023, at 14:44, Mathieu Desnoyers wrote:

Sure, can you please submit the patch as a separate email with subject/commit
message/signed-off-by tag ?



https://gitlab.isc.org/isc-projects/userspace-rcu/-/merge_requests/1.patch

Would this work for you?

Or do you need to have the patch attached?


Having the patch attached (e.g. using git send-email) would be better, 
but I don't mind downloading the file for this time. Merged into liburcu 
master branch, thanks!


Mathieu



Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] userspace-rcu and ThreadSanitizer

2023-03-17 Thread Mathieu Desnoyers via lttng-dev

ded with clang-17 as noreturn is now reserved word:


Sure, can you please submit the patch as a separate email with 
subject/commit message/signed-off-by tag ?


Thanks!

Mathieu



diff --git a/include/urcu/uatomic/generic.h b/include/urcu/uatomic/generic.h
index 89d1cfa..c3762b0 100644
--- a/include/urcu/uatomic/generic.h
+++ b/include/urcu/uatomic/generic.h
@@ -38,7 +38,7 @@ extern "C" {
  #endif

  #if !defined __OPTIMIZE__  || defined UATOMIC_NO_LINK_ERROR
-static inline __attribute__((always_inline, noreturn))
+static inline __attribute__((always_inline, __noreturn__))
  void _uatomic_link_error(void)
  {
  #ifdef ILLEGAL_INSTR
diff --git a/src/urcu-call-rcu-impl.h b/src/urcu-call-rcu-impl.h
index 187727e..cc76f53 100644
--- a/src/urcu-call-rcu-impl.h
+++ b/src/urcu-call-rcu-impl.h
@@ -1055,7 +1055,7 @@ void urcu_register_rculfhash_atfork(struct urcu_atfork *atfork)
   * This unregistration function is deprecated, meant only for internal
   * use by rculfhash.
   */
-__attribute__((noreturn))
+__attribute__((__noreturn__))
  void urcu_unregister_rculfhash_atfork(struct urcu_atfork *atfork __attribute__((unused)))
  {
 urcu_die(EPERM);
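
The hunks above switch to the double-underscore attribute spelling. One situation where that spelling matters for certain, shown as a sketch: C11's `<stdnoreturn.h>` defines `noreturn` as a macro expanding to `_Noreturn`, so `__attribute__((noreturn))` gets rewritten by the preprocessor while `__attribute__((__noreturn__))` does not. Whether this is exactly what clang-17 tripped over is not stated in the truncated message; the `sketch_` functions are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <stdnoreturn.h>	/* C11: defines the macro `noreturn` as `_Noreturn` */

/*
 * With the macro above in scope, __attribute__((noreturn)) would be
 * rewritten by the preprocessor; the reserved __noreturn__ spelling is not.
 */
static __attribute__((__noreturn__))
void sketch_die(int err)
{
	exit(err);
}

/* Returns 0 when `err` is 0; otherwise terminates the process. */
static int sketch_check(int err)
{
	if (err)
		sketch_die(err);
	return 0;
}

/* Self-check of the non-terminating path only. */
static void sketch_noreturn_check(void)
{
	assert(sketch_check(0) == 0);
}
```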


Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] urcu/rculist.h clarifications - for implementing LRU

2023-03-13 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-13 11:30, Ondřej Surý wrote:

Hi Matthieu,

I spent some more time with the userspace-rcu on Friday and over the weekend and
now I am in a much better place.


On 13. 3. 2023, at 15:29, Mathieu Desnoyers wrote:

On 2023-03-11 01:04, Ondřej Surý via lttng-dev wrote:

Hey,
so, we are integrating userspace-rcu to BIND 9 (yay!) and as an experiment,
I am rewriting the internal address database (keeps the infrastructure
information about names and addresses).


That's indeed very interesting !


Thanks for the userspace-rcu! It saves a lot of time - while my colleague Tony
Finch already wrote our internal QSBR implementation from scratch, it would be
a waste of time to try to reimplement the CDS part of the library.

This is part of a larger work to replace the internal BIND 9 database that's
currently implemented as an rwlocked RBT with qptrie; if you are interested, Tony
has a good summary here: https://dotat.at/@/2023-02-28-qp-bind.html


Speaking of tries, I have implemented RCU Judy arrays in liburcu feature
branches a while back. Those never made it to the liburcu master branch because
I had no real-life use for those so far, and I did not want to expose a public
API that would bitrot without real-life user feedback.

The lookups and ordered traversals (next/prev) are entirely RCU, and updates are
either single-threaded, or use a strategy where locking is distributed within
the trie so updates to spatially discontinuous data would not contend with each
other.

My original implementation supported integer keys as well as variable-length
string keys.

The advantage of Judy arrays is that they minimize the number of cache lines
touched on lookup traversal. Let me know if this would be useful for your
use-cases, and if so I can provide links to prototype branches.

[...]




So this is the part with the hashtable lookup which seems to work well:
 rcu_read_lock();
 struct cds_lfht_iter iter;
 struct cds_lfht_node *ht_node;
 cds_lfht_lookup(adb->names_ht, hashval, names_match, [...], &iter);
 ht_node = cds_lfht_iter_get_node(&iter);
 bool unlink = false;
 if (ht_node == NULL) {
         /* Allocate a new name and add it to the hash table. */
         adbname = new_adbname(adb, name, start_at_zone);
         ht_node = cds_lfht_add_unique(adb->names_ht, hashval,
                                       names_match, [...],
                                       &adbname->ht_node);
         if (ht_node != &adbname->ht_node) {
                 /* ISC_R_EXISTS */
                 destroy_adbname(adbname);
                 adbname = NULL;
         }
 }
 if (adbname == NULL) {
         INSIST(ht_node != NULL);
         adbname = caa_container_of(ht_node, dns_adbname_t, ht_node);
         unlink = true;
 }
 dns_adbname_ref(adbname);


What is this dns_adbname_ref() supposed to do ? And is there a reference to
adbname that is still used after rcu_read_unlock() ? What guarantees the
existence of the adbname after rcu_read_unlock() ?


This is part of the internal reference counting - there's a macro that expects
an `isc_refcount_t references;` member on the struct and it creates _ref, _unref,
_attach and _detach functions for each struct.

The last _detach/_unref calls a destroy function.


 rcu_read_unlock();
and here's the part where LRU gets updated:
 LOCK(>lock); /* Must be unlocked by the caller */


I suspect you use a scheme where you hold the RCU read-side lock to perform the
lookup, and then you use the object with an internal lock held. But expecting
the object to still exist after rcu read unlock is incorrect, unless some other
reference counting scheme is used.


Yeah, I was trying to minimize the sections where we hold the rcu_read locks,
but I gave up and now there's rcu_read lock held for longer periods of time.


We've used that kind of scheme in LTTng lttng-relayd, where we use RCU for
short-term existence guarantee, and reference counting for longer-term
existence guarantee. An example can be found here:

https://github.com/lttng/lttng-tools/blob/master/src/bin/lttng-relayd/viewer-stream.cpp

viewer_stream_get_by_id() attempts lookup from the hash table, and re-validates
that the object exists with viewer_stream_get(), which checks if the refcount
is already 0 as it tries to increment it with urcu_ref_get_unless_zero(). If
zero, it does as if the object was not found. I recommend this kind of scheme if
you intend to use both RCU and reference counting.
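
The "get unless zero" step described above can be sketched with the GCC `__atomic` builtins. This is a self-contained illustration with hypothetical `sketch_` names, not the liburcu `urcu_ref` code: the increment only succeeds while the count is non-zero, so an object whose last reference was already dropped is treated as not found.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference counter embedded in an object. */
struct sketch_ref {
	long refcount;
};

/*
 * Take a reference only if the count is not already zero. Meant to be
 * called under the RCU read-side lock, right after a successful lookup.
 */
static bool sketch_ref_get_unless_zero(struct sketch_ref *ref)
{
	long old = __atomic_load_n(&ref->refcount, __ATOMIC_RELAXED);

	while (old != 0) {
		/* On failure, `old` is reloaded with the current count. */
		if (__atomic_compare_exchange_n(&ref->refcount, &old, old + 1, 0,
				__ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
			return true;
	}
	return false;	/* object is being destroyed: treat as "not found" */
}

/* Drop a reference; returns true when the caller released the last one. */
static bool sketch_ref_put(struct sketch_ref *ref)
{
	return __atomic_sub_fetch(&ref->refcount, 1, __ATOMIC_SEQ_CST) == 0;
}

/* Single-threaded walk through the life cycle of one reference count. */
static void sketch_ref_check(void)
{
	struct sketch_ref r = { 1 };

	assert(sketch_ref_get_unless_zero(&r));		/* 1 -> 2 */
	assert(!sketch_ref_put(&r));			/* 2 -> 1 */
	assert(sketch_ref_put(&r));			/* 1 -> 0: last reference */
	assert(!sketch_ref_get_unless_zero(&r));	/* stays at 0 */
}
```

The caller that sees `sketch_ref_put()` return true would run the destroy handler; freeing the memory still has to wait for a grace period so that concurrent readers holding only the RCU read-side lock never touch freed memory.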

Then you can place a mutex within the object, and use that mutex to provide
mutual exclusion between concurrent accesses to the object that need to be
serialized.

In the destroy handler (called when the reference count reaches 0), you will
typically want to unlink your object from the various data structures holding
references to it (hash tables, lists), and th

Re: [lttng-dev] urcu/rculist.h clarifications - for implementing LRU

2023-03-13 Thread Mathieu Desnoyers via lttng-dev

start_at_zone=true, now=) at adb.c:1446
#2  0x7fae87a392bf in dns_adb_createfind (adb=0x7fae830142a0, loop=0x7fae842c3a20, cb=cb@entry=0x7fae87b28d9f , cbarg=0x7fae7c679000, name=name@entry=0x7fae804fc9b0, qname=0x7fae7c679010, qtype=1, options=63, now=, target=0x0, port=53, depth=1, qc=0x7fae7c651060, findp=0x7fae804fc698) at adb.c:2149

(gdb) frame 0
#0  0x7fae87a34c96 in cds_list_del_rcu (elem=0x7fae37e78880) at /usr/include/x86_64-linux-gnu/urcu/rculist.h:71
71  elem->next->prev = elem->prev;
(gdb) print elem->next
$1 = (struct cds_list_head *) 0x0
(gdb) print elem
$2 = (struct cds_list_head *) 0x7fae37e78880

So, I suspect I am doing something wrong when updating the position of the
name in the LRU list.

There are a couple of places where we iterate through the LRU list (overmem
cleaning can kick in, the user-initiated cleaning can start, shutdown can be
happening...)



It gets me to wonder whether you really need RCU for the LRU list ? Are those
lookups very frequent ? And do they typically end up needing to grab a lock to
protect against concurrent list modifications ?


Is there perhaps already some LRU implementation using Userspace-RCU that I can
take a look at?


I don't have an example implementing an LRU with a linked list specifically,
but this is not different from other linked-list uses.

Thanks,

Mathieu



Thank you!
Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] how to disable local file writing in relayd?

2023-03-08 Thread Mathieu Desnoyers via lttng-dev


On 2023-03-06 00:12, Yuan Bin via lttng-dev wrote:
  Can I disable local-file-writing in lttng-relayd to avoid the disk 
space overhead, only using it as a live viewer?


Not explicitly, but you can store your temporary files on a tmpfs file 
system (see lttng-relayd(8) --output command line parameter), which will 
only keep the relayd files in memory, and use the tracefile rotation 
feature to prevent the files from growing forever, e.g.:


https://lttng.org/docs/v2.13/#doc-enabling-disabling-channels

Example: Create a Linux kernel channel which rotates eight trace files of
4 MiB each for each stream.


lttng enable-channel --kernel --tracefile-count=8 \
 --tracefile-size=4194304 my-channel

See lttng-enable-channel(1) for more info.

I hope this helps!

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


[lttng-dev] [RELEASE] LTTng-modules 2.12.13 and 2.13.9 (Linux kernel tracer)

2023-03-03 Thread Mathieu Desnoyers via lttng-dev

This is a release announcement for the two currently maintained stable 
branches of the LTTng-modules project.


* New in these releases

LTTng-modules v2.13.9 contains a fix required to build against Linux v6.2.

Both v2.12.13 and v2.13.9 contain a set of build fixes to follow 
evolution of the jbd2 tracepoint instrumentation within the Linux kernel 
5.4 and 5.10 stable branches.


* Changelog

2023-03-03 (Canadian Bacon Day) LTTng modules 2.13.9
* fix: jbd2: use the correct print format (v5.4.229)
* fix: jbd2 upper bound for v5.10.163
* fix: jbd2: use the correct print format (v5.10.163)
* fix: btrfs: move accessor helpers into accessors.h (v6.2)

2023-03-03 (Canadian Bacon Day) LTTng modules 2.12.13
* fix: jbd2: use the correct print format (v5.4.229)
* fix: jbd2 upper bound for v5.10.163
* fix: jbd2: use the correct print format (v5.10.163)

Project website: https://lttng.org
Documentation: https://lttng.org/docs
Download link: https://lttng.org/download

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [lttng-dev] Filtering tracing by process name or PID/TID

2023-02-15 Thread Mathieu Desnoyers via lttng-dev


On 2023-02-15 04:09, Rengar Stinkt via lttng-dev wrote:

Dear community,
I only recently started working with lttng tracing due to work related 
projects, so I am very new to this. I have done some research before 
posting this but I can't seem to find an answer.
I am running several CPU load tests for specific processes on different 
devices using lttng and TraceCompass for visualization. I am running 
into the issue that 99.9% of traced processes are not of value to me and 
the tracing files get extremely big and hard to work with (filtering 
with TraceCompass is very slow).
Now I thought of filtering the processes before tracing and I found 
filtering by PID and TID. The issue with this is that the PIDs and TIDs 
are unique on each device but change between devices.
I then found the command "htop -d 0.1 -u **String**" to see currently 
running processes with a certain name.
Now if I run this it shows me the running process IF they are running. I 
have time triggered and event triggered processes. There are many 
inconvenient workarounds to make it work, like triggering the events and 
finding out the PID and then manually copying all of the IDs and pasting 
them into "lttng track --kernel --pid=""". But I am trying to find a way 
to either filter by name right away, avoiding relying on PIDs or at 
least to have an automated process of doing it. But I am unfamiliar with 
running code in the PuTTY terminal that we are using, so I am trying to 
avoid this (for now). If this is the only option though, I will have to 
look into it.
Is there any way to filter by name right away like in the mentioned htop 
command?

Thank you so much in advance.


This would be:

lttng enable-event -k event_name --filter '$ctx.procname == "string"'

Where "string" can include wildcards as well.

Hoping this helps,

Mathieu



Dom



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


[lttng-dev] [RELEASE] Userspace RCU 0.14.0, 0.13.3, 0.12.5 [EOL]

2023-02-14 Thread Mathieu Desnoyers via lttng-dev


Hi,

This is a release announcement for the Userspace RCU project.

This is a set of releases, including the new 0.14 branch with the 0.14.0 
release, and bug fix releases for the 0.13 and 0.12 branches. The 0.12.5 
release is the last of the 0.12 branch, which reaches end of life with 
the release of 0.14.


Here are the new features introduced in urcu 0.14.0:

- C99 and C++11 are now the baseline requirements, as documented in
  README.md.

- Introduce public APIs for C++.

  An important point to consider: urcu/compiler.h needs to include
  <type_traits> in C++, which prevents including urcu/compiler.h
  from extern "C" code.

- Introduce new grace period polling APIs in urcu-memb,mb,signal,qsbr,bp
  flavors:

  struct urcu_gp_poll_state start_poll_synchronize_rcu(void);
  bool poll_state_synchronize_rcu(struct urcu_gp_poll_state state);

  This allows periodic polling to check whether a started grace period has
  completed, so a caller can check for grace period completion and some
  other condition at the same time.

- rculfhash: introduce cds_lfht_node_init_deleted

  Allow initializing lfht node to "removed" state to allow querying
  whether the node is published in a hash table before it is added to
  the hash table and after it has been removed from the hash table.

- Disable signals in URCU background threads

  Applications using signalfd depend on signals being blocked in all
  threads of the process, otherwise threads with unblocked signals
  can receive them and starve the signalfd.

  While some threads in URCU do block signals (e.g. workqueue
  worker for rculfhash), the call_rcu, defer_rcu, and rculfhash
  partition_resize_helper threads do not.

  Always block all signals before creating threads, and only unblock
  SIGRCU when registering a urcu-signal thread. Restore the SIGRCU
  signal to its pre-registration blocked state on unregistration.

  For rculfhash, cds_lfht_worker_init can be removed, because its only
  effect is to block all signals except SIGRCU. Blocking all signals is
  already done by the workqueue code, and unblocking SIGRCU is now done
  by the urcu signal flavor thread registration.

- Always use '__thread' for Thread local storage except on MSVC

  Use the GCC extension '__thread' [1] for Thread local storage on all C
  and C++ compilers except MSVC.

  While C11 and C++11 respectively offer '_Thread_local' and
  'thread_local' as potentially faster implementations, they offer no
  guarantees of compatibility when used in a library interface which
  might be used by both C and C++ client code.

- Various test framework improvements.

- Wire up membarrier system call on Alpha. The only remaining architecture
  without membarrier wired up is MIPS. https://bugs.lttng.org/issues/940


Here are the fixes introduced in urcu 0.14.0, 0.13.3 and 0.12.5:

- Fix: auto-resize hash table destroy deadlock

  Fix a deadlock for auto-resize hash tables when cds_lfht_destroy
  is called with RCU read-side lock held.

- Join call_rcu worker thread in call_rcu_data_free (eliminate leaks)

- Teardown default call_rcu worker on application exit

  Teardown the default call_rcu worker thread if there are no queued
  callbacks on process exit. This prevents leaking memory.

  Here is how an application can ensure graceful teardown of this
  worker thread:

  - An application queuing call_rcu callbacks should invoke
rcu_barrier() before it exits.
  - When chaining call_rcu callbacks, the number of calls to
rcu_barrier() on application exit must match at least the maximum
number of chained callbacks.
  - If an application chains callbacks endlessly, it would have to be
modified to stop chaining callbacks when it detects an application
exit (e.g. with a flag), and wait for quiescence with rcu_barrier()
after setting that flag.
  - The statements above apply to a library which queues call_rcu
callbacks, only it needs to invoke rcu_barrier in its library
destructor.

- Allow building on MSYS2

  Update cygwin libtool config in `configure.ac` to match MSYS2 build
  environments as well. MSYS2 is also a Windows build environment that
  produces DLLs.

Feedback is welcome!

Mathieu


Project website: https://liburcu.org
Git repository: git://git.liburcu.org/urcu.git

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [lttng-dev] lttng-consumerd crash on aarch64 due to x86 arch specific optimization

2023-02-06 Thread Mathieu Desnoyers via lttng-dev


Hi Micke,

I did tweaks to make the code C++ compatible even though it's currently 
only built in C. It makes it more future-proof.


I've merged the resulting patch into lttng-ust 
master/stable-2.13/stable-2.12. Thanks for testing !


Mathieu

On 2023-02-06 11:15, Beckius, Mikael wrote:

Hello Mathieu!

I added your latest implementation to my test and it seems to perform well on 
both arm and arm64. Since the test was written in C++ I had to make a small 
change to the cast in order for the test to compile.

Micke


-----Original Message-----
From: Mathieu Desnoyers
Sent: 2 February 2023 17:26
To: Beckius, Mikael; lttng-d...@lists.lttng.org
Subject: Re: [lttng-dev] lttng-consumerd crash on aarch64 due to x86 arch
specific optimization


Hi  Mikael,

I just tried another approach to fix this issue, see:

https://review.lttng.org/c/lttng-ust/+/9413 Fix: use unaligned pointer
accesses for lttng_inline_memcpy

It is less intrusive than other approaches, and does not change the generated
code on the most relevant architectures.

Feedback is welcome,

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] lttng-consumerd crash on aarch64 due to x86 arch specific optimization

2023-02-02 Thread Mathieu Desnoyers via lttng-dev


Hi  Mikael,

I just tried another approach to fix this issue, see:

https://review.lttng.org/c/lttng-ust/+/9413 Fix: use unaligned pointer
accesses for lttng_inline_memcpy

It is less intrusive than other approaches, and does not change the generated
code on the most relevant architectures.
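
As a rough illustration of what "unaligned pointer accesses" can look like in GNU C (a sketch under stated assumptions, not the content of change 9413): wrapping the value in a packed struct tells GCC and Clang that the address may be misaligned, so the compiler picks instructions that are safe for the target. The `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical wrapper: `packed` drops the alignment requirement of the
 * field, `may_alias` allows using it on buffers of any type.
 */
struct sketch_unaligned_u64 {
	uint64_t v;
} __attribute__((__packed__, __may_alias__));

static inline uint64_t sketch_load_u64(const void *p)
{
	return ((const struct sketch_unaligned_u64 *) p)->v;
}

static inline void sketch_store_u64(void *p, uint64_t v)
{
	((struct sketch_unaligned_u64 *) p)->v = v;
}

/* Copies 8 bytes between possibly misaligned addresses. */
static inline void sketch_copy8(void *dest, const void *src)
{
	sketch_store_u64(dest, sketch_load_u64(src));
}

/* Returns 1 when an 8-byte copy between odd offsets is byte-exact. */
static int sketch_copy8_check(void)
{
	unsigned char src[16], dst[16] = { 0 };
	unsigned int i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char) i;
	sketch_copy8(dst + 3, src + 1);
	return memcmp(dst + 3, src + 1, 8) == 0;
}
```

On targets where plain unaligned loads are legal the compiler can still emit a single load and store, which would be consistent with the remark that generated code does not change on the most relevant architectures.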

Feedback is welcome,

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] lttng-consumerd crash on aarch64 due to x86 arch specific optimization

2023-01-31 Thread Mathieu Desnoyers via lttng-dev


On 2023-01-31 11:18, Mathieu Desnoyers wrote:

On 2023-01-31 11:08, Mathieu Desnoyers wrote:

On 2023-01-30 01:50, Beckius, Mikael via lttng-dev wrote:

Hello Matthieu!

I have looked at this in place of Anders and as far as I can tell 
this is not an arm64 issue but an arm issue. And even on arm 
__ARM_FEATURE_UNALIGNED is 1 so it seems the problem only occurs if 
size equals 8.


So for ARM, perhaps we should do the following in 
include/lttng/ust-arch.h:


#if defined(LTTNG_UST_ARCH_ARM) && defined(__ARM_FEATURE_UNALIGNED)
#define LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1
#endif

And refer to 
https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html#ARM-Options


Based on that documentation, it is possible to build with
-mno-unaligned-access, and for all pre-ARMv6, all ARMv6-M and for ARMv8-M
Baseline architectures, unaligned accesses are not enabled.

I would only push this kind of change into the master branch though, due to
its impact and the fact that this is only a performance improvement.


But setting LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 for arm32
when __ARM_FEATURE_UNALIGNED is defined would still cause issues for
8-byte lttng_inline_memcpy with my proposed patch right ?

AFAIU 32-bit arm with __ARM_FEATURE_UNALIGNED has unaligned accesses for
2 and 4 bytes accesses, but somehow traps for unaligned 8-bytes
accesses ?


Re-reading your analysis, I may have mistakenly concluded that using the
lttng ust ring buffer in "packed" mode would be faster than aligned mode 
on arm32 and aarch64, but that's not really what you have benchmarked there.


So forget what I said about setting 
LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS to 1 for arm32 and aarch64.


There is a distinction between having efficient unaligned access and
supporting unaligned accesses at all.

For aarch64, it appears to support unaligned accesses, but it may be
slower than aligned accesses AFAIU.

For arm32, it supports unaligned accesses for 2 and 4 bytes when 
__ARM_FEATURE_UNALIGNED is set, but not for 8 bytes (it traps). Then 
it's not clear whether a 2 or 4 bytes access is slower when unaligned 
compared to aligned.


At the end of the day, it's a question of compactness of the generated 
trace data (added throughput overhead) vs cpu time required to perform 
an unaligned access vs aligned.


Thoughts ?

Thanks,

Mathieu



Thanks,

Mathieu





In addition I did some performance testing of lttng_inline_memcpy by 
extracting it and adding it to a simple test program. It appears that 
the general performance increases on arm, arm64, arm on arm64 
hardware and x86-64. But it also appears that on arm if you end up in 
memcpy the old code where you call memcpy directly is actually 
slightly faster.


Nothing unexpected here. Just make sure that your test program does not call
lttng_inline_memcpy with constant size values which end up optimizing away
branches. In the context where lttng_inline_memcpy is used, most of the time
its arguments are not constants.



Skipping the memcpy fallback on arm for unaligned copies of sizes 2 
and 4 further improves the performance


This would be naturally done on your board if we conditionally
set LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 for
__ARM_FEATURE_UNALIGNED, right ?

and setting LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 yields the 
best performance on arm64.


This could go into lttng-ust master branch as well, e.g.:

#if defined(LTTNG_UST_ARCH_AARCH64)
#define LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1
#endif

Thanks!

Mathieu



Micke






--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] lttng-consumerd crash on aarch64 due to x86 arch specific optimization

2023-01-31 Thread Mathieu Desnoyers via lttng-dev


On 2023-01-31 11:08, Mathieu Desnoyers wrote:

On 2023-01-30 01:50, Beckius, Mikael via lttng-dev wrote:

Hello Matthieu!

I have looked at this in place of Anders and as far as I can tell this 
is not an arm64 issue but an arm issue. And even on arm 
__ARM_FEATURE_UNALIGNED is 1 so it seems the problem only occurs if 
size equals 8.


So for ARM, perhaps we should do the following in include/lttng/ust-arch.h:

#if defined(LTTNG_UST_ARCH_ARM) && defined(__ARM_FEATURE_UNALIGNED)
#define LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1
#endif

And refer to 
https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html#ARM-Options


Based on that documentation, it is possible to build with
-mno-unaligned-access, and for all pre-ARMv6, all ARMv6-M and for ARMv8-M
Baseline architectures, unaligned accesses are not enabled.

I would only push this kind of change into the master branch though, due to
its impact and the fact that this is only a performance improvement.


But setting LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 for arm32
when __ARM_FEATURE_UNALIGNED is defined would still cause issues for
8-byte lttng_inline_memcpy with my proposed patch right ?

AFAIU 32-bit arm with __ARM_FEATURE_UNALIGNED has unaligned accesses for
2 and 4 bytes accesses, but somehow traps for unaligned 8-bytes
accesses ?

Thanks,

Mathieu





In addition I did some performance testing of lttng_inline_memcpy by 
extracting it and adding it to a simple test program. It appears that 
the general performance increases on arm, arm64, arm on arm64 hardware 
and x86-64. But it also appears that on arm if you end up in memcpy 
the old code where you call memcpy directly is actually slightly faster.


Nothing unexpected here. Just make sure that your test program does not call
lttng_inline_memcpy with constant size values which end up optimizing away
branches. In the context where lttng_inline_memcpy is used, most of the time
its arguments are not constants.



Skipping the memcpy fallback on arm for unaligned copies of sizes 2 
and 4 further improves the performance


This would be naturally done on your board if we conditionally
set LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 for
__ARM_FEATURE_UNALIGNED, right ?

and setting LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 yields the 
best performance on arm64.


This could go into lttng-ust master branch as well, e.g.:

#if defined(LTTNG_UST_ARCH_AARCH64)
#define LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1
#endif

Thanks!

Mathieu



Micke




--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] lttng-consumerd crash on aarch64 due to x86 arch specific optimization

2023-01-31 Thread Mathieu Desnoyers via lttng-dev


On 2023-01-30 01:50, Beckius, Mikael via lttng-dev wrote:

Hello Mathieu!

I have looked at this in place of Anders and as far as I can tell this is not 
an arm64 issue but an arm issue. And even on arm __ARM_FEATURE_UNALIGNED is 1 
so it seems the problem only occurs if size equals 8.


So for ARM, perhaps we should do the following in include/lttng/ust-arch.h:

#if defined(LTTNG_UST_ARCH_ARM) && defined(__ARM_FEATURE_UNALIGNED)
#define LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1
#endif

And refer to https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html#ARM-Options

Based on that documentation, it is possible to build with -mno-unaligned-access,
and for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures,
unaligned accesses are not enabled.

I would only push this kind of change into the master branch though, due to
its impact and the fact that this is only a performance improvement.



In addition I did some performance testing of lttng_inline_memcpy by extracting 
it and adding it to a simple test program. It appears that the general 
performance increases on arm, arm64, arm on arm64 hardware and x86-64. But it 
also appears that on arm if you end up in memcpy the old code where you call 
memcpy directly is actually slightly faster.


Nothing unexpected here. Just make sure that your test program does not call
lttng_inline_memcpy with constant size values which end up optimizing away
branches. In the context where lttng_inline_memcpy is used, most of the time
its arguments are not constants.
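
A test loop along these lines is one way to keep the sizes non-constant. It is
a minimal sketch, not the test program discussed in this thread: every name in
it is hypothetical, and sketch_copy_under_test() is a placeholder to be
replaced by the routine being measured. Sizes are read through a volatile
array so the compiler cannot fold the size-dependent branches away.

```c
#include <stddef.h>
#include <string.h>

/* Read through volatile so each size is unknown at compile time. */
static volatile size_t sketch_sizes[] = { 1, 2, 4, 8, 3, 16 };

/* Placeholder: substitute the copy routine being measured. */
static inline void sketch_copy_under_test(void *dest, const void *src,
		unsigned long len)
{
	memcpy(dest, src, len);
}

/*
 * Run the copy `iterations` times, cycling through the sizes above, and
 * return a checksum so the copies cannot be discarded as dead code.
 */
static unsigned long sketch_run_copies(unsigned long iterations)
{
	static unsigned char src[64], dest[64];
	const size_t nr_sizes = sizeof(sketch_sizes) / sizeof(sketch_sizes[0]);
	unsigned long i, checksum = 0;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char) i;
	for (i = 0; i < iterations; i++) {
		size_t len = sketch_sizes[i % nr_sizes];

		sketch_copy_under_test(dest, src, len);
		checksum += dest[len - 1];	/* last byte copied */
	}
	return checksum;
}
```

Timing would be wrapped around the call to sketch_run_copies(), with an
iteration count large enough to dominate the measurement overhead.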



Skipping the memcpy fallback on arm for unaligned copies of sizes 2 and 4 
further improves the performance


This would be naturally done on your board if we conditionally
set LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 for __ARM_FEATURE_UNALIGNED
right ?

and setting LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 yields the best 
performance on arm64.

This could go into lttng-ust master branch as well, e.g.:

#if defined(LTTNG_UST_ARCH_AARCH64)
#define LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1
#endif

Thanks!

Mathieu



Micke


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] lttng-consumerd crash on aarch64 due to x86 arch specific optimization

2023-01-26 Thread Mathieu Desnoyers via lttng-dev


On 2023-01-26 14:32, Anders Wallin wrote:

Hi Mathieu,

I've retired and no longer have access to any aarch64 target to test it on.



Thanks for your reply Anders,

I've talked to Henrik and Pär today and they are already testing it out.

Enjoy your retirement :)

Best regards,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] lttng-consumerd crash on aarch64 due to x86 arch specific optimization

2023-01-25 Thread Mathieu Desnoyers via lttng-dev


Hi Anders,

Sorry for the long delay on this one, can you have a look at the following fix ?

https://review.lttng.org/c/lttng-ust/+/9319
"Fix: aarch64: do not perform unaligned stores"

If it passes your testing, I'll merge this into lttng-ust.

Thanks,

Mathieu

On 2017-12-28 09:13, Anders Wallin wrote:

Hi Mathieu,

I finally got some time to dig into this issue. The crash only happens
when metadata is written AND the size of the metadata ends up in a
write that is 8, 4, 2 or 1 bytes long AND the source or destination is
not aligned correctly according to the HW limitations. I have not found
any simple way to keep the performance enhancement code that is run
most of the time.

Maybe the metadata writes should have their own write function instead.

Here is an example of a crash (code is from lttng-ust 2.9.1 and
lttng-tools 2.9.6) where the size is 8 bytes and the src address is
unaligned at 0xf3b7eeb2:


#0  lttng_inline_memcpy (len=8, src=0xf3b7eeb2, dest=<optimized out>) at
/usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend_internal.h:610
No locals.

#1  lib_ring_buffer_write (len=8, src=0xf3b7eeb2, ctx=0xf57c47d0,
config=0xf737c560) at
/usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend.h:100
         __len = 8
         handle = 0xf3b2e0c0
         backend_pages = <optimized out>
         chanb = 0xf3b2e2e0
         offset = <optimized out>

#2  lttng_event_write (ctx=0xf57c47d0, src=0xf3b7eeb2, len=8) at
/usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust/lttng-ring-buffer-metadata-client.h:267
No locals.

#3  0xf7337ef8 in ustctl_write_one_packet_to_channel (channel=<optimized out>,
metadata_str=0xf3b7eeb2 "", len=<optimized out>) at
/usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust-ctl/ustctl.c:1183
         ctx = {chan = 0xf3b2e290, priv = 0x0, handle = 0xf3b2e0c0,
data_size = 8, largest_align = 1, cpu = -1, buf = 0xf6909000, slot_size
= 8, buf_offset = 163877, pre_offset = 163877, tsc = 0, rflags = 0,
ctx_len = 80, ip = 0x0, priv2 = 0x0, padding2 = '\000' <repeats ... times>,
backend_pages = 0xf690c000}
         chan = 0xf3b2e4d8
         str = 0xf3b7eeb2 ""
         reserve_len = 8
         ret = <optimized out>
         __func__ = '\000'
         __PRETTY_FUNCTION__ = '\000'

#4  0x000344cc in commit_one_metadata_packet
(stream=stream@entry=0xf3b2e560) at ust-consumer.c:2206
         write_len = <optimized out>
         ret = <optimized out>
         __PRETTY_FUNCTION__ = "commit_one_metadata_packet"

#5  0x00036538 in lttng_ustconsumer_read_subbuffer
(stream=stream@entry=0xf3b2e560, ctx=ctx@entry=0x25e6e8) at
ust-consumer.c:2452
         len = 4096
         subbuf_size = 4093
         padding = <optimized out>
         err = -11
         write_index = 1
         ret = <optimized out>
         ustream = <optimized out>
         index = {offset = 0, packet_size = 575697416355872,
content_size = 17564043391468256584, timestamp_begin =
17564043425827782792, timestamp_end = 34359738496,
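
To make the alignment of the src address in the backtrace above concrete, here
is a small sketch. The helper name sketch_is_aligned() is hypothetical and only
illustrates the arithmetic: an address is aligned on a power-of-two size when
its low bits are zero.

```c
#include <stdint.h>

/* Nonzero when addr is a multiple of size (size must be a power of two). */
static inline int sketch_is_aligned(uintptr_t addr, unsigned long size)
{
	return (addr & (size - 1)) == 0;
}
```

For src=0xf3b7eeb2 the low three bits are 010, so the address is aligned on
2 bytes but not on 4 or 8, which matches the failing 8-byte copy in frame #0.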

Regards
Anders

On Fri, 24 Nov 2017 at 20:18, Mathieu Desnoyers
<mathieu.desnoy...@efficios.com> wrote:


On Nov 24, 2017, at 3:23 AM, Anders Wallin <walli...@gmail.com> wrote:

Hi,
Architectures that have memory alignment restrictions may/will fail with
the optimization done in 51b8f2fa2b972e62117caa946dd3e3565b6ca4a3.
Please revert the patch or make it x86-specific.


Hi Anders,

This was added in the development cycle of lttng-ust 2.9. We could perhaps
add a test on the pointer alignment for architectures that care about it,
and fall back to memcpy in those cases.

The revert approach would have been justified if this commit had been
backported as a "fix" to a stable branch, which is not the case here. We
should work on finding an acceptable solution that takes care of dealing
with unaligned pointers on architectures that care about the difference.
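
The idea of testing pointer alignment and falling back to memcpy could look
roughly like this. It is a sketch only, not the fix that was eventually
proposed: the names sketch_inline_memcpy(), sketch_both_aligned() and
SKETCH_UNALIGNED_OK are hypothetical. The type-punned accesses also assume a
build that tolerates this kind of aliasing (for instance one using
-fno-strict-aliasing).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical knob: define to 1 where misaligned accesses are known safe. */
#ifndef SKETCH_UNALIGNED_OK
#define SKETCH_UNALIGNED_OK 0
#endif

/* Nonzero when both pointers are multiples of len (len a power of two). */
static inline int sketch_both_aligned(const void *dest, const void *src,
		unsigned long len)
{
	return ((((uintptr_t) dest) | ((uintptr_t) src)) & (len - 1)) == 0;
}

static inline void sketch_inline_memcpy(void *dest, const void *src,
		unsigned long len)
{
	/* Fast path: small power-of-two sizes on suitably aligned pointers. */
	if ((len == 1 || len == 2 || len == 4 || len == 8)
			&& (SKETCH_UNALIGNED_OK
				|| sketch_both_aligned(dest, src, len))) {
		switch (len) {
		case 1:
			*(uint8_t *) dest = *(const uint8_t *) src;
			return;
		case 2:
			*(uint16_t *) dest = *(const uint16_t *) src;
			return;
		case 4:
			*(uint32_t *) dest = *(const uint32_t *) src;
			return;
		case 8:
			*(uint64_t *) dest = *(const uint64_t *) src;
			return;
		}
	}
	/* Everything else, including misaligned pointers, goes through memcpy. */
	memcpy(dest, src, len);
}
```

The alignment test costs a couple of instructions per call, so whether the
fast path remains worthwhile would have to be measured on each architecture.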

Thanks,

Mathieu



Regards

Anders Wallin


commit 51b8f2fa2b972e62117caa946dd3e3565b6ca4a3
Author: Mathieu Desnoyers <mathieu.desnoy...@efficios.com>
Date:   Sun Sep 25 12:31:11 2016 -0400

     Performance: implement lttng_inline_memcpy

     Because all length parameters received for serializing data coming from
     applications go through a callback, they are never constant, and it
     hurts performance to perform a call to memcpy each time.

     Signed-off-by: Mathieu Desnoyers <mathieu.desnoy...@efficios.com>

diff --git a/libringbuffer/backend_internal.h b/libringbuffer/backend_internal.h
index 90088b89..e597cf4d 100644
--- a/libringbuffer/backend_internal.h
+++ b/libringbuffer/backend_internal.h
@@ -592,6 +592,28 @@ int update_read_sb_index(const struct lttng_ust_lib_ring_buffer_config

[lttng-dev] [RELEASE] LTTng-modules 2.12.12 and 2.13.8 (Linux kernel tracer)

2023-01-13 Thread Mathieu Desnoyers via lttng-dev


Hi,

These are stable release updates of the LTTng modules project.

The most relevant changes are that version 2.13.8 introduces support
for the 6.1 Linux kernel, and that both releases update the kernel
version ranges for the RHEL kernels and fix the kallsyms wrapper on ppc64el.

The LTTng modules provide Linux kernel tracing capability to the LTTng
tracer toolset.

* New in these releases:

2023-01-13 (National Sticker Day) LTTng modules 2.13.8
* fix: jbd2: use the correct print format
* Fix: in_x32_syscall was introduced in v4.7.0
* Explicitly skip tracing x32 system calls
* fix: kallsyms wrapper on ppc64el
* fix: Adjust ranges for RHEL 8.6 kernels
* fix: kvm-x86 requires CONFIG_KALLSYMS_ALL
* fix: mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using (v6.1)

2023-01-13 (National Sticker Day) LTTng modules 2.12.12
* fix: jbd2: use the correct print format
* Fix: in_x32_syscall was introduced in v4.7.0
* Explicitly skip tracing x32 system calls
* fix: kallsyms wrapper on ppc64el
* fix: Adjust ranges for RHEL 8.6 kernels
* fix: kvm-x86 requires CONFIG_KALLSYMS_ALL

Project website: https://lttng.org
Documentation: https://lttng.org/docs
Download link: https://lttng.org/download

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Re: [lttng-dev] LTTng UST structure support

2023-01-12 Thread Mathieu Desnoyers via lttng-dev


On 2023-01-09 09:02, chafraysse--- via lttng-dev wrote:

Hi,

I'm looking for a CTF writer to serialize instrumentation in an
embedded Linux/Rust framework.
LTTng UST looked like a very strong option, but I want to serialize
structures as CTF compound type structures and I did not see those
supported in the doc or API.


This is correct. I am currently working on a new project called 
"libside" (see https://git.efficios.com/?p=libside.git;a=summary) which 
features support for compound types.


However, we still need to do the heavy-lifting implementation work of 
integrating this with LTTng-UST. This is the plan towards supporting 
compound types in LTTng-UST.



I'd love to have confirmation that I did not just miss something :)
If LTTng UST is out for me I will probably try to use the ctf-writer 
module of babeltrace 2 instead


For now, the ctf-writer module of Babeltrace 2 would be an alternative to
consider, but remember that, unlike lttng-ust, it is not designed for
low-impact tracing. So it depends on how much tracer overhead/runtime
impact you can afford in your use-case.


Thanks,

Mathieu



Best regards,

Charles


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


[lttng-dev] lttv: Document project status as unmaintained

2023-01-10 Thread Mathieu Desnoyers via lttng-dev


Hi Florian,

I'll pull your patch into the lttv master branch, but please be aware
that the LTTV project has not seen activity since 2013, is currently
unmaintained, and that we do not plan on doing further releases of this
project. Our efforts were diverted elsewhere on the trace analysis
front, namely into Trace Compass and Babeltrace.

In order to clarify the situation, I will introduce a commit into
LTTV's master branch which will remove Yannick Brosseau from the
maintainer role in the README file, and add this section at the
beginning:

PROJECT STATUS


The LTTV project is currently unmaintained. If you need up-to-date tools
to view/analyze LTTng traces, please consider the following alternatives:

- Trace Compass (https://www.eclipse.org/tracecompass)
- Babeltrace (https://babeltrace.org)


Thank you Yannick for stepping into the role of maintainer near the
end of this project's lifetime.

Michael Jeanson noticed that the lttv Fedora package was orphaned.
He just adopted it and is currently investigating the Fedora
documentation to figure out how to request its removal from Fedora.

For those interested in historical artifacts, I created the lttv
svn repository back in 2003 when I was sitting at the Decelles building
at Ecole Polytechnique, working for Prof. Michel Dagenais:

commit bbdf43d6e0e3bd3f9ade420e81915408cbe4fbba
Author: compudj 
Date:   Thu May 15 13:07:17 2003 +

Initial repository layout

git-svn-id: http://ltt.polymtl.ca/svn@1 04897980-b3bd-0310-b5e0-8ef037075253


This was the beginning of a fun ride which turned out to motivate the
creation of LTTng, the Linux kernel Tracepoints, the Common Trace Format,
Trace Compass, Babeltrace, liburcu, and the membarrier(2) and rseq(2)
system calls.

LTTV had a good 10 years of activity from 2003 to 2013, but it is now high
time to redirect users to Trace Compass and Babeltrace instead.

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Re: [lttng-dev] README issue of liburcu

2022-11-10 Thread Mathieu Desnoyers via lttng-dev


On 2022-11-02 03:24, Yongwei Wu via lttng-dev wrote:
I apologize if this is not the right place. I do not see an Issues page 
on GitHub.


The README on GitHub now says MacOS is among "Tested on", so should we 
remove Darwin from "Should also work on"?


Removed from README file in the master branch.

Thanks,

Mathieu



Best regards,

Yongwei

--
Yongwei Wu
URL: http://wyw.dcweb.cn/



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

