[PATCHv3] Introduced new tracing mode KCOV_MODE_UNIQUE.

2021-03-26 Thread Alexander Lochmann
It simply stores the executed PCs.
The execution order is discarded.
Each bit in the shared buffer represents every fourth
byte of the text segment.
Since a call instruction on every supported
architecture is at least four bytes, it is safe
to just store every fourth byte of the text segment.
In contrast to KCOV_MODE_TRACE_PC, the shared buffer
cannot overflow. Thus, all executed PCs are recorded.

Signed-off-by: Alexander Lochmann 
---
 Documentation/dev-tools/kcov.rst | 79 +
 include/linux/kcov.h | 12 ++--
 include/uapi/linux/kcov.h| 10 
 kernel/kcov.c| 98 
 4 files changed, 172 insertions(+), 27 deletions(-)

diff --git a/Documentation/dev-tools/kcov.rst b/Documentation/dev-tools/kcov.rst
index d2c4c27e1702..9b0df2f8474c 100644
--- a/Documentation/dev-tools/kcov.rst
+++ b/Documentation/dev-tools/kcov.rst
@@ -127,6 +127,85 @@ That is, a parent process opens /sys/kernel/debug/kcov, 
enables trace mode,
 mmaps coverage buffer and then forks child processes in a loop. Child processes
 only need to enable coverage (disable happens automatically on thread end).
 
+If someone is interested in a set of executed PCs, and does not care about
+execution order, he or she can use KCOV_INIT_UNIQUE to do so:
+
+.. code-block:: c
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define KCOV_INIT_UNIQUE_IOR('c', 2, unsigned long)
+#define KCOV_ENABLE_IO('c', 100)
+#define KCOV_DISABLE   _IO('c', 101)
+
+#define BITS_PER_LONG 64
+#define KCOV_TRACE_PC  0
+#define KCOV_TRACE_CMP 1
+#define KCOV_UNIQUE_PC 2
+/*
+ * Determine start of text segment via 'nm vmlinux | grep _stext | cut -d 
" " -f1',
+ * and fill in.
+ */
+#define STEXT_START 0x8100
+
+
+
+int main(int argc, char **argv)
+{
+   int fd;
+   unsigned long *cover, n, i;
+
+   /* A single fd descriptor allows coverage collection on a single
+* thread.
+*/
+   fd = open("/sys/kernel/debug/kcov", O_RDWR);
+   if (fd == -1)
+   perror("open"), exit(1);
+   /* Setup trace mode and trace size. */
+   if ((n = ioctl(fd, KCOV_INIT_UNIQUE, 0)) < 0)
+   perror("ioctl"), exit(1);
+   /* Mmap buffer shared between kernel- and user-space. */
+   cover = (unsigned long*)mmap(NULL, n,
+PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+   if ((void*)cover == MAP_FAILED)
+   perror("mmap"), exit(1);
+   /* Enable coverage collection on the current thread. */
+   if (ioctl(fd, KCOV_ENABLE, KCOV_UNIQUE_PC))
+   perror("ioctl"), exit(1);
+   /* That's the target syscal call. */
+   read(-1, NULL, 0);
+   /* Disable coverage collection for the current thread. After this call
+* coverage can be enabled for a different thread.
+*/
+   if (ioctl(fd, KCOV_DISABLE, 0))
+   perror("ioctl"), exit(1);
+/* Convert byte size into element size */
+n /= sizeof(unsigned long);
+/* Print executed PCs in sorted order */
+for (i = 0; i < n; i++) {
+for (int j = 0; j < BITS_PER_LONG; j++) {
+if (cover[i] & (1L << j)) {
+printf("0x%jx\n", (uintmax_t)(STEXT_START + (i * 
BITS_PER_LONG + j) * 4));
+}
+}
+}
+   /* Free resources. */
+   if (munmap(cover, n * sizeof(unsigned long)))
+   perror("munmap"), exit(1);
+   if (close(fd))
+   perror("close"), exit(1);
+   return 0;
+}
+
 Comparison operands collection
 --
 
diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index 4e3037dc1204..99c309b3a53b 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -12,17 +12,21 @@ enum kcov_mode {
/* Coverage collection is not enabled yet. */
KCOV_MODE_DISABLED = 0,
/* KCOV was initialized, but tracing mode hasn't been chosen yet. */
-   KCOV_MODE_INIT = 1,
+   KCOV_MODE_INIT_TRACE = 1,
+   /* KCOV was initialized, but recording of unique PCs hasn't been chosen 
yet. */
+   KCOV_MODE_INIT_UNIQUE = 2,
/*
 * Tracing coverage collection mode.
 * Covered PCs are collected in a per-task buffer.
 */
-   KCOV_MODE_TRACE_PC = 2,
+   KCOV_MODE_TRACE_PC = 4,
/* Collecting comparison operands mode. */
-   KCOV_MODE_TRACE_CMP = 3,
+   KCOV_MODE_TRACE_CMP = 8,
+   /* Collecting unique covered PCs. Execution order is not saved. */
+   KCOV_MODE_UN

Re: [RFC] inode.i_opflags - Usage of two different locking schemes

2021-03-26 Thread Alexander Lochmann



On 16.03.21 18:14, Jan Kara wrote:
> 
> So i_lock is supposed to protect i_opflags for writing AFAICT. For reading
> we don't seem to bother in some cases and I agree that is potentially
> problematic. It is *mostly* OK because we initialize i_opflags when loading
> inode into memory / adding it to dcache. But sometimes we also update them
> while the inode is alive. Now this is fine for the particular flag we
> update but in theory, if the compiler wants to screw us and stores
> temporarily some nonsensical value in i_opflags we'd have a problem. This
> is mostly a theoretical issue but eventually we probably want to fix this.
> 
>   Honza
> 
Thx for the detailed explanation. :-)

- Alex

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al


Re: [PATCH 1/2] Updated locking documentation for transaction_t

2021-03-26 Thread Alexander Lochmann
On 11.02.21 10:30, Jan Kara wrote:
>> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
>> index 99d3cd051ac3..18f77d9b1745 100644
>> --- a/include/linux/jbd2.h
>> +++ b/include/linux/jbd2.h
>> @@ -594,18 +594,18 @@ struct transaction_s
>>   */
>>  unsigned long   t_log_start;
>>  
>> -/* Number of buffers on the t_buffers list [j_list_lock] */
>> +/* Number of buffers on the t_buffers list [j_list_lock, no lock for 
>> quick racy checks] */
>>  int t_nr_buffers;
> 
> So this case is actually somewhat different now that I audited the uses.
> There are two types of users - commit code (fs/jbd2/commit.c) and others.
> Other users properly use j_list_lock to access t_nr_buffers. Commit code
> does not use any locks because committing transaction is fully in
> ownership of the jbd2 thread and all other users need to check & wait for
> commit to be finished before doing anything with the transaction's buffers.

I'm still trying understand how thinks work:
Accesses to transaction_t might occur from different contexts. Thus,
locks are necessary. If it comes to the commit phase, every other
context has to wait until jbd2 thread has done its work. Therefore, jbd2
thread does not need any locks to access a transaction_t (or just parts
of it?) during commit phase.
Is that correct?

If so: I was thinking whether it make sense to ignore all memory
accesses to a transaction_t (or parts of it) that happen in the commit
phase. They deliberately ignore the locking policy, and would confuse
our approach.

Is the commit phase performed by jbd2_journal_commit_transaction()?
We would add this function to our blacklist for transaction_t.

- Alex
-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al


Re: [PATCH] Introduced new tracing mode KCOV_MODE_UNIQUE.

2021-03-23 Thread Alexander Lochmann
;
>> kcov->remote = false;
>> kcov->remote_size = 0;
>> kcov->sequence++;
>> @@ -468,12 +477,13 @@ static int kcov_mmap(struct file *filep, struct 
>> vm_area_struct *vma)
>>
>> spin_lock_irqsave(&kcov->lock, flags);
>> size = kcov->size * sizeof(unsigned long);
>> -   if (kcov->mode != KCOV_MODE_INIT || vma->vm_pgoff != 0 ||
>> +   if (kcov->mode & ~(KCOV_INIT_TRACE | KCOV_INIT_UNIQUE) || 
>> vma->vm_pgoff != 0 ||
>> vma->vm_end - vma->vm_start != size) {
>> res = -EINVAL;
>> goto exit;
>> }
>> if (!kcov->area) {
>> +   kcov_debug("mmap(): Allocating 0x%lx bytes\n", size);
>> kcov->area = area;
>> vma->vm_flags |= VM_DONTEXPAND;
>> spin_unlock_irqrestore(&kcov->lock, flags);
>> @@ -515,6 +525,8 @@ static int kcov_get_mode(unsigned long arg)
>>  {
>> if (arg == KCOV_TRACE_PC)
>> return KCOV_MODE_TRACE_PC;
>> +   else if (arg == KCOV_UNIQUE_PC)
>> +   return KCOV_MODE_UNIQUE_PC;
>> else if (arg == KCOV_TRACE_CMP)
>>  #ifdef CONFIG_KCOV_ENABLE_COMPARISONS
>> return KCOV_MODE_TRACE_CMP;
>> @@ -562,12 +574,14 @@ static int kcov_ioctl_locked(struct kcov *kcov, 
>> unsigned int cmd,
>>  {
>> struct task_struct *t;
>> unsigned long size, unused;
>> -   int mode, i;
>> +   int mode, i, text_size, ret = 0;
>> struct kcov_remote_arg *remote_arg;
>> struct kcov_remote *remote;
>> unsigned long flags;
>>
>> switch (cmd) {
>> +   case KCOV_INIT_UNIQUE:
>> +   /* fallthrough here */
> 
> Looking at "git log --grep fallthrough", it seems that the modern way
> to say this is to use the fallthrough keyword.
> 
> Please run checkpatch, it shows a bunch of other warnings as well:
> 
> git diff HEAD^ | scripts/checkpatch.pl -
Yeah. I'll do that.

-- 
Alexander LochmannPGP key: 0xBC3EF6FD
Heiliger Weg 72   phone:  +49.231.28053964
D-44141 Dortmund  mobile: +49.151.15738323


Re: [PATCH] Introduced new tracing mode KCOV_MODE_UNIQUE.

2021-03-22 Thread Alexander Lochmann



On 22.03.21 13:17, Miguel Ojeda wrote:
> Hi Alexander,
> 
> On Sun, Mar 21, 2021 at 8:14 PM Alexander Lochmann
>  wrote:
>>
>> diff --git a/Documentation/dev-tools/kcov.rst 
>> b/Documentation/dev-tools/kcov.rst
>> index d2c4c27e1702..e105ffe6b6e3 100644
>> --- a/Documentation/dev-tools/kcov.rst
>> +++ b/Documentation/dev-tools/kcov.rst
>> @@ -127,6 +127,86 @@ That is, a parent process opens /sys/kernel/debug/kcov, 
>> enables trace mode,
>>  mmaps coverage buffer and then forks child processes in a loop. Child 
>> processes
>>  only need to enable coverage (disable happens automatically on thread end).
>>
>> +If someone is interested in a set of executed PCs, and does not care about
>> +execution order, he or she can advise KCOV to do so:
> 
> Please mention explicitly that KCOV_INIT_UNIQUE should be used for
> that, i.e. readers of the example shouldn't need to read every line to
> figure it out.
> 
>> +#define KCOV_INIT_TRACE_IOR('c', 1, unsigned long)
> 
> Trace is not used in the example.
> 
>> +   /* KCOV was initialized, but recording of unique PCs hasn't been 
>> chosen yet. */
>> +   KCOV_MODE_INIT_UNQIUE = 2,
> 
> Typo? It isn't used?
It is a typo. It should be used...
> 
> PS: not sure why I was Cc'd, but I hope that helps.
Thx for your feedback. get_maintainer.pl told me to include you in Cc.

Cheers,
Alex
> 
> Cheers,
> Miguel
> 

-- 
Alexander LochmannPGP key: 0xBC3EF6FD
Heiliger Weg 72   phone:  +49.231.28053964
D-44141 Dortmund  mobile: +49.151.15738323


[PATCH] Introduced new tracing mode KCOV_MODE_UNIQUE.

2021-03-21 Thread Alexander Lochmann
It simply stores the executed PCs.
The execution order is discarded.
Each bit in the shared buffer represents every fourth
byte of the text segment.
Since a call instruction on every supported
architecture is at least four bytes, it is safe
to just store every fourth byte of the text segment.
In contrast to KCOV_MODE_TRACE_PC, the shared buffer
cannot overflow. Thus, all executed PCs are recorded.

Signed-off-by: Alexander Lochmann 
---
 Documentation/dev-tools/kcov.rst | 80 +++
 include/linux/kcov.h | 12 ++--
 include/uapi/linux/kcov.h| 10 
 kernel/kcov.c| 94 
 4 files changed, 169 insertions(+), 27 deletions(-)

diff --git a/Documentation/dev-tools/kcov.rst b/Documentation/dev-tools/kcov.rst
index d2c4c27e1702..e105ffe6b6e3 100644
--- a/Documentation/dev-tools/kcov.rst
+++ b/Documentation/dev-tools/kcov.rst
@@ -127,6 +127,86 @@ That is, a parent process opens /sys/kernel/debug/kcov, 
enables trace mode,
 mmaps coverage buffer and then forks child processes in a loop. Child processes
 only need to enable coverage (disable happens automatically on thread end).
 
+If someone is interested in a set of executed PCs, and does not care about
+execution order, he or she can advise KCOV to do so:
+
+.. code-block:: c
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define KCOV_INIT_TRACE_IOR('c', 1, unsigned long)
+#define KCOV_INIT_UNIQUE_IOR('c', 2, unsigned long)
+#define KCOV_ENABLE_IO('c', 100)
+#define KCOV_DISABLE   _IO('c', 101)
+
+#define BITS_PER_LONG 64
+#define KCOV_TRACE_PC  0
+#define KCOV_TRACE_CMP 1
+#define KCOV_UNIQUE_PC 2
+/*
+ * Determine start of text segment via 'nm vmlinux | grep _stext | cut -d 
" " -f1',
+ * and fill in.
+ */
+#define STEXT_START 0x8100
+
+
+
+int main(int argc, char **argv)
+{
+   int fd;
+   unsigned long *cover, n, i;
+
+   /* A single fd descriptor allows coverage collection on a single
+* thread.
+*/
+   fd = open("/sys/kernel/debug/kcov", O_RDWR);
+   if (fd == -1)
+   perror("open"), exit(1);
+   /* Setup trace mode and trace size. */
+   if ((n = ioctl(fd, KCOV_INIT_UNIQUE, 0)) < 0)
+   perror("ioctl"), exit(1);
+   /* Mmap buffer shared between kernel- and user-space. */
+   cover = (unsigned long*)mmap(NULL, n,
+PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+   if ((void*)cover == MAP_FAILED)
+   perror("mmap"), exit(1);
+   /* Enable coverage collection on the current thread. */
+   if (ioctl(fd, KCOV_ENABLE, KCOV_UNIQUE_PC))
+   perror("ioctl"), exit(1);
+   /* That's the target syscal call. */
+   read(-1, NULL, 0);
+   /* Disable coverage collection for the current thread. After this call
+* coverage can be enabled for a different thread.
+*/
+   if (ioctl(fd, KCOV_DISABLE, 0))
+   perror("ioctl"), exit(1);
+/* Convert byte size into element size */
+n /= sizeof(unsigned long);
+/* Print executed PCs in sorted order */
+for (i = 0; i < n; i++) {
+for (int j = 0; j < BITS_PER_LONG; j++) {
+if (cover[i] & (1L << j)) {
+printf("0x%jx\n", (uintmax_t)(STEXT_START + (i * 
BITS_PER_LONG + j) * 4));
+}
+}
+}
+   /* Free resources. */
+   if (munmap(cover, n * sizeof(unsigned long)))
+   perror("munmap"), exit(1);
+   if (close(fd))
+   perror("close"), exit(1);
+   return 0;
+}
+
 Comparison operands collection
 --
 
diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index 4e3037dc1204..d72dd73388d1 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -12,17 +12,21 @@ enum kcov_mode {
/* Coverage collection is not enabled yet. */
KCOV_MODE_DISABLED = 0,
/* KCOV was initialized, but tracing mode hasn't been chosen yet. */
-   KCOV_MODE_INIT = 1,
+   KCOV_MODE_INIT_TRACE = 1,
+   /* KCOV was initialized, but recording of unique PCs hasn't been chosen 
yet. */
+   KCOV_MODE_INIT_UNQIUE = 2,
/*
 * Tracing coverage collection mode.
 * Covered PCs are collected in a per-task buffer.
 */
-   KCOV_MODE_TRACE_PC = 2,
+   KCOV_MODE_TRACE_PC = 4,
/* Collecting comparison operands mode. */
-   KCOV_MODE_TRACE_CMP = 3,
+   KCOV_MODE_TRACE_CMP = 8,
+ 

[PATCH v2] Updated locking documentation for journal_head

2021-03-19 Thread Alexander Lochmann
LockDoc's results show that t_list_lock has been
replaced by j_list_lock for b_next_transaction,
b_tnext, and b_tprev.
We, therefore, updated the documentation
accordingly.

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
Reviewed-by: Jan Kara 
---
 include/linux/journal-head.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/journal-head.h b/include/linux/journal-head.h
index 75bc56109031..d68ae72784eb 100644
--- a/include/linux/journal-head.h
+++ b/include/linux/journal-head.h
@@ -80,13 +80,13 @@ struct journal_head {
 * Pointer to the running compound transaction which is currently
 * modifying the buffer's metadata, if there was already a transaction
 * committing it when the new transaction touched it.
-* [t_list_lock] [b_state_lock]
+* [j_list_lock] [b_state_lock]
 */
transaction_t *b_next_transaction;
 
/*
 * Doubly-linked list of buffers on a transaction's data, metadata or
-* forget queue. [t_list_lock] [b_state_lock]
+* forget queue. [j_list_lock] [b_state_lock]
 */
struct journal_head *b_tnext, *b_tprev;
 
-- 
2.20.1



[PATCH] Updated locking documentation for journal_head

2021-03-19 Thread Alexander Lochmann
LockDoc's results show that t_list_lock has been
replaced by j_list_lock for b_next_transaction,
b_tnext, and b_tprev.
We, therefore, updated the documentation
accordingly.

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 include/linux/journal-head.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/journal-head.h b/include/linux/journal-head.h
index 75bc56109031..d68ae72784eb 100644
--- a/include/linux/journal-head.h
+++ b/include/linux/journal-head.h
@@ -80,13 +80,13 @@ struct journal_head {
 * Pointer to the running compound transaction which is currently
 * modifying the buffer's metadata, if there was already a transaction
 * committing it when the new transaction touched it.
-* [t_list_lock] [b_state_lock]
+* [j_list_lock] [b_state_lock]
 */
transaction_t *b_next_transaction;
 
/*
 * Doubly-linked list of buffers on a transaction's data, metadata or
-* forget queue. [t_list_lock] [b_state_lock]
+* forget queue. [j_list_lock] [b_state_lock]
 */
struct journal_head *b_tnext, *b_tprev;
 
-- 
2.20.1



Re: [PATCH] KCOV: Introduced tracing unique covered PCs

2021-03-17 Thread Alexander Lochmann



On 15.03.21 09:02, Dmitry Vyukov wrote:
>>> Does this introduce an additional real of t->kcov_mode?
>>> If yes, please reuse the value read in check_kcov_mode.
>> Okay. How do I get that value from check_kcov_mode() to the caller?
>> Shall I add an additional parameter to check_kcov_mode()?
> 
> Yes, I would try to add an additional pointer parameter for mode. I
> think after inlining the compiler should be able to regestrize it.
First, I'll go for the extra argument. However, the compiler doesn't
seem to inline check_kcov_mode(). Can I enforce inlining?
I'm using GCC 9.3 on Debian Testing.

- Alex

-- 
Alexander LochmannPGP key: 0xBC3EF6FD
Heiliger Weg 72   phone:  +49.231.28053964
D-44141 Dortmund  mobile: +49.151.15738323


Re: [PATCH v3] Updated locking documentation for transaction_t

2021-03-17 Thread Alexander Lochmann
Does this patch look good to you either?

- Alex

On 11.02.21 18:14, Alexander Lochmann wrote:
> Some members of transaction_t are allowed to be read without
> any lock being held if accessed from the correct context.
> We used LockDoc's findings to determine those members.
> Each member of them is marked with a short comment:
> "no lock needed for jbd2 thread".
> 
> Signed-off-by: Alexander Lochmann 
> Signed-off-by: Horst Schirmeier 
> Reviewed-by: Jan Kara 
> ---
>  include/linux/jbd2.h | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index 99d3cd051ac3..1f19d19f6435 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -594,18 +594,18 @@ struct transaction_s
>*/
>   unsigned long   t_log_start;
>  
> - /* Number of buffers on the t_buffers list [j_list_lock] */
> + /* Number of buffers on the t_buffers list [j_list_lock, no locks 
> needed for jbd2 thread] */
>   int t_nr_buffers;
>  
>   /*
>* Doubly-linked circular list of all buffers reserved but not yet
> -  * modified by this transaction [j_list_lock]
> +  * modified by this transaction [j_list_lock, no locks needed for jbd2 
> thread]
>*/
>   struct journal_head *t_reserved_list;
>  
>   /*
>* Doubly-linked circular list of all metadata buffers owned by this
> -  * transaction [j_list_lock]
> +  * transaction [j_list_lock, no locks needed for jbd2 thread]
>*/
>   struct journal_head *t_buffers;
>  
> @@ -631,7 +631,7 @@ struct transaction_s
>   /*
>* Doubly-linked circular list of metadata buffers being shadowed by log
>* IO.  The IO buffers on the iobuf list and the shadow buffers on this
> -  * list match each other one for one at all times. [j_list_lock]
> +  * list match each other one for one at all times. [j_list_lock, no 
> locks needed for jbd2 thread]
>*/
>   struct journal_head *t_shadow_list;
>  
> 

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al


Re: [PATCH v2] Updated locking documentation for journal_t

2021-03-17 Thread Alexander Lochmann
Does this patch look good to you?
Might it be ready to be merged?

- Alex

On 11.02.21 10:51, Alexander Lochmann wrote:
> Some members of transaction_t are allowed to be read without
> any lock being held if consistency doesn't matter.
> Based on LockDoc's findings, we extended the locking
> documentation of those members.
> Each one of them is marked with a short comment:
> "no lock for quick racy checks".
> 
> Signed-off-by: Alexander Lochmann 
> Signed-off-by: Horst Schirmeier 
> Reviewed-by: Jan Kara 
> ---
>  include/linux/jbd2.h | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index 18f77d9b1745..4dca33a063dd 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -768,7 +768,7 @@ enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY};
>  struct journal_s
>  {
>  /**
> - * @j_flags: General journaling state flags [j_state_lock]
> + * @j_flags: General journaling state flags [j_state_lock, no lock
> for quick racy checks]
>   */
>  unsigned long    j_flags;
> 
> @@ -808,7 +808,7 @@ struct journal_s
>  /**
>   * @j_barrier_count:
>   *
> - * Number of processes waiting to create a barrier lock [j_state_lock]
> + * Number of processes waiting to create a barrier lock
> [j_state_lock, no lock for quick racy checks]
>   */
>  int    j_barrier_count;
> 
> @@ -821,7 +821,7 @@ struct journal_s
>   * @j_running_transaction:
>   *
>   * Transactions: The current running transaction...
> - * [j_state_lock] [caller holding open handle]
> + * [j_state_lock, no lock for quick racy checks] [caller holding
> open handle]
>   */
>  transaction_t    *j_running_transaction;
> 
> @@ -1033,7 +1033,7 @@ struct journal_s
>   * @j_commit_sequence:
>   *
>   * Sequence number of the most recently committed transaction
> - * [j_state_lock].
> + * [j_state_lock, no lock for quick racy checks].
>   */
>  tid_t    j_commit_sequence;
> 
> @@ -1041,7 +1041,7 @@ struct journal_s
>   * @j_commit_request:
>   *
>   * Sequence number of the most recent transaction wanting commit
> - * [j_state_lock]
> + * [j_state_lock, no lock for quick racy checks]
>   */
>  tid_t    j_commit_request;
> 

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al


Re: [PATCH] KCOV: Introduced tracing unique covered PCs

2021-03-15 Thread Alexander Lochmann



On 15.03.21 09:02, Dmitry Vyukov wrote:
>>>>  static notrace unsigned long canonicalize_ip(unsigned long ip)
>>>> @@ -191,18 +192,26 @@ void notrace __sanitizer_cov_trace_pc(void)
>>>> struct task_struct *t;
>>>> unsigned long *area;
>>>> unsigned long ip = canonicalize_ip(_RET_IP_);
>>>> -   unsigned long pos;
>>>> +   unsigned long pos, idx;
>>>>
>>>> t = current;
>>>> -   if (!check_kcov_mode(KCOV_MODE_TRACE_PC, t))
>>>> +   if (!check_kcov_mode(KCOV_MODE_TRACE_PC | KCOV_MODE_UNIQUE_PC, t))
>>>> return;
>>>>
>>>> area = t->kcov_area;
>>>> -   /* The first 64-bit word is the number of subsequent PCs. */
>>>> -   pos = READ_ONCE(area[0]) + 1;
>>>> -   if (likely(pos < t->kcov_size)) {
>>>> -   area[pos] = ip;
>>>> -   WRITE_ONCE(area[0], pos);
>>>> +   if (likely(t->kcov_mode == KCOV_MODE_TRACE_PC)) {
>>>
>>> Does this introduce an additional real of t->kcov_mode?
>>> If yes, please reuse the value read in check_kcov_mode.
>> Okay. How do I get that value from check_kcov_mode() to the caller?
>> Shall I add an additional parameter to check_kcov_mode()?
> 
> Yes, I would try to add an additional pointer parameter for mode. I
> think after inlining the compiler should be able to regestrize it.
> 
Should kcov->mode be written directly to that ptr?
Otherwise, it must be written to the already present variable mode, and
than copied to the ptr (if not NULL).

- Alex

-- 
Alexander LochmannPGP key: 0xBC3EF6FD
Heiliger Weg 72   phone:  +49.231.28053964
D-44141 Dortmund  mobile: +49.151.15738323


Re: [PATCH] KCOV: Introduced tracing unique covered PCs

2021-03-14 Thread Alexander Lochmann



On 12.02.21 13:54, Dmitry Vyukov wrote:
> 
> I think we could make KCOV_IN_CTXSW sign bit and then express the check as:
> 
> void foo2(unsigned mode) {
>   if (((int)(mode & 0x800a)) > 0)
> foo();
> }
> 
> 0020 :
>   20: 81 e7 0a 00 00 80and$0x800a,%edi
>   26: 7f 08jg 30 
>   28: c3retq
> 
So ((int)(mode & (KCOV_IN_CTXSW | needed_mode))) > 0?
> 
> 
> 
>>  }
>>
>>  static notrace unsigned long canonicalize_ip(unsigned long ip)
>> @@ -191,18 +192,26 @@ void notrace __sanitizer_cov_trace_pc(void)
>> struct task_struct *t;
>> unsigned long *area;
>> unsigned long ip = canonicalize_ip(_RET_IP_);
>> -   unsigned long pos;
>> +   unsigned long pos, idx;
>>
>> t = current;
>> -   if (!check_kcov_mode(KCOV_MODE_TRACE_PC, t))
>> +   if (!check_kcov_mode(KCOV_MODE_TRACE_PC | KCOV_MODE_UNIQUE_PC, t))
>> return;
>>
>> area = t->kcov_area;
>> -   /* The first 64-bit word is the number of subsequent PCs. */
>> -   pos = READ_ONCE(area[0]) + 1;
>> -   if (likely(pos < t->kcov_size)) {
>> -   area[pos] = ip;
>> -   WRITE_ONCE(area[0], pos);
>> +   if (likely(t->kcov_mode == KCOV_MODE_TRACE_PC)) {
> 
> Does this introduce an additional real of t->kcov_mode?
> If yes, please reuse the value read in check_kcov_mode.
Okay. How do I get that value from check_kcov_mode() to the caller?
Shall I add an additional parameter to check_kcov_mode()?
> 
> 
>> +   /* The first 64-bit word is the number of subsequent PCs. */
>> +   pos = READ_ONCE(area[0]) + 1;
>> +   if (likely(pos < t->kcov_size)) {
>> +   area[pos] = ip;
>> +   WRITE_ONCE(area[0], pos);
>> +   }
>> +   } else {
>> +   idx = (ip - canonicalize_ip((unsigned long)&_stext)) / 4;
>> +   pos = idx % BITS_PER_LONG;
>> +   idx /= BITS_PER_LONG;
>> +   if (likely(idx < t->kcov_size))
>> +   WRITE_ONCE(area[idx], READ_ONCE(area[idx]) | 1L << 
>> pos);
>> }
>>  }
>>  EXPORT_SYMBOL(__sanitizer_cov_trace_pc);
>> @@ -474,6 +483,7 @@ static int kcov_mmap(struct file *filep, struct 
>> vm_area_struct *vma)
>> goto exit;
>> }
>> if (!kcov->area) {
>> +   kcov_debug("mmap(): Allocating 0x%lx bytes\n", size);
>> kcov->area = area;
>> vma->vm_flags |= VM_DONTEXPAND;
>> spin_unlock_irqrestore(&kcov->lock, flags);
>> @@ -515,6 +525,8 @@ static int kcov_get_mode(unsigned long arg)
>>  {
>> if (arg == KCOV_TRACE_PC)
>> return KCOV_MODE_TRACE_PC;
>> +   else if (arg == KCOV_UNIQUE_PC)
>> +   return KCOV_MODE_UNIQUE_PC;
> 
> As far as I understand, users can first do KCOV_INIT_UNIQUE and then
> enable KCOV_TRACE_PC, or vice versa.
> It looks somewhat strange. Is it intentional? 
I'll fix that.
It's not possible to
> specify buffer size for KCOV_INIT_UNIQUE, so most likely the buffer
> will be either too large or too small for a trace.
No, the buffer will be calculated by KCOV_INIT_UNIQUE based on the size
of the text segment.

- Alex

-- 
Alexander LochmannPGP key: 0xBC3EF6FD
Heiliger Weg 72   phone:  +49.231.28053964
D-44141 Dortmund  mobile: +49.151.15738323


Re: [RFC] inode.i_opflags - Usage of two different locking schemes

2021-03-08 Thread Alexander Lochmann



On 05.03.21 17:04, Theodore Ts'o wrote:

On Fri, Mar 05, 2021 at 04:35:47PM +0100, Alexander Lochmann wrote:



On 05.03.21 16:18, Theodore Ts'o wrote:

1)  I don't see where i_opflags is being read in ipc/mqueue.c at all,
either with or without i_rwsem.


It is read in fs/dcache.c


So why is this unique to the mqueue inode then?  It might be helpful
to have explicit call stacks in the e-mail, in text form, when you
resend to LKML.
It is unique to mqeue inode, because the control flow goes through 
ipc/mqueue.c where almost always the i_rwsem is taken.

Hence, we see more memory accesses to an mqueue inode with the i_rwsem.
The i_lock is less often hold compared to the i_rwsem.
We conclude the i_rwsem is needed. So it might not be a contradiction at 
all. It rather could be a flaw in our approach. :-/


Besides from our current discussion:
Does the i_lock protect i_opflags for both reading and writing?

Cheers,
Alex



That's because the HTML file is ***huge*** (1.7Meg), and I'm having
trouble with my browser properly rendering it.  In any case, the html
claims to be showing the counter examples and I'm still stuck on the
*example*?

Cheers,

- Ted



--
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



OpenPGP_signature
Description: OpenPGP digital signature


Re: [RFC] inode.i_opflags - Usage of two different locking schemes

2021-03-05 Thread Alexander Lochmann



On 05.03.21 16:18, Theodore Ts'o wrote:

1)  I don't see where i_opflags is being read in ipc/mqueue.c at all,
either with or without i_rwsem.


It is read in fs/dcache.c

2)  I'm not sure what this has to do with ext4?

  - Ted


Yeah. You're right. That was my fault. Sry.
I should have sent it to linux-kernel@... only.

- Alex

--
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



OpenPGP_signature
Description: OpenPGP digital signature


[RFC] inode.i_opflags - Usage of two different locking schemes

2021-03-05 Thread Alexander Lochmann

Hi folks,

I've stumbled across an interesting locking scheme. It's related to 
struct inode, more precisely it is an mqueue inode.
Our results show that inode:mqueue.i_opflags is read with i_rwsem being 
hold.
In d_flags_for_inode, and do_inode_permission the i_lock is used to read 
and write i_opflags.

Is this a real locking scheme? Is a lock needed to access i_opflags at all?
What is the magic behind this contradiction?

I've put the report of the counterexamples on our webserver: 
https://ess.cs.tu-dortmund.de/lockdoc-bugs/cex-inode-mqueue.html.
It contains the stacktraces leading to those accesses, and the locks 
that were actually held.


Regards,
Alex

--
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



OpenPGP_signature
Description: OpenPGP digital signature


[PATCH v3] Updated locking documentation for transaction_t

2021-02-11 Thread Alexander Lochmann
Some members of transaction_t are allowed to be read without
any lock being held if accessed from the correct context.
We used LockDoc's findings to determine those members.
Each member of them is marked with a short comment:
"no lock needed for jbd2 thread".

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
Reviewed-by: Jan Kara 
---
 include/linux/jbd2.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 99d3cd051ac3..1f19d19f6435 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -594,18 +594,18 @@ struct transaction_s
 */
unsigned long   t_log_start;
 
-   /* Number of buffers on the t_buffers list [j_list_lock] */
+   /* Number of buffers on the t_buffers list [j_list_lock, no locks 
needed for jbd2 thread] */
int t_nr_buffers;
 
/*
 * Doubly-linked circular list of all buffers reserved but not yet
-* modified by this transaction [j_list_lock]
+* modified by this transaction [j_list_lock, no locks needed for jbd2 
thread]
 */
struct journal_head *t_reserved_list;
 
/*
 * Doubly-linked circular list of all metadata buffers owned by this
-* transaction [j_list_lock]
+* transaction [j_list_lock, no locks needed for jbd2 thread]
 */
struct journal_head *t_buffers;
 
@@ -631,7 +631,7 @@ struct transaction_s
/*
 * Doubly-linked circular list of metadata buffers being shadowed by log
 * IO.  The IO buffers on the iobuf list and the shadow buffers on this
-* list match each other one for one at all times. [j_list_lock]
+* list match each other one for one at all times. [j_list_lock, no 
locks needed for jbd2 thread]
 */
struct journal_head *t_shadow_list;
 
-- 
2.20.1



[PATCH v2] Updated locking documentation for transaction_t

2021-02-11 Thread Alexander Lochmann
Some members of transaction_t are allowed to be read without
any lock being held if accessed from the correct context.
We used LockDoc's findings to determine those members.
Each member of them is marked with a short comment:
"no lock needed for jbd2 thread".

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 include/linux/jbd2.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 99d3cd051ac3..1f19d19f6435 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -594,18 +594,18 @@ struct transaction_s
 */
unsigned long   t_log_start;
 
-   /* Number of buffers on the t_buffers list [j_list_lock] */
+   /* Number of buffers on the t_buffers list [j_list_lock, no locks 
needed for jbd2 thread] */
int t_nr_buffers;
 
/*
 * Doubly-linked circular list of all buffers reserved but not yet
-* modified by this transaction [j_list_lock]
+* modified by this transaction [j_list_lock, no locks needed for jbd2 
thread]
 */
struct journal_head *t_reserved_list;
 
/*
 * Doubly-linked circular list of all metadata buffers owned by this
-* transaction [j_list_lock]
+* transaction [j_list_lock, no locks needed for jbd2 thread]
 */
struct journal_head *t_buffers;
 
@@ -631,7 +631,7 @@ struct transaction_s
/*
 * Doubly-linked circular list of metadata buffers being shadowed by log
 * IO.  The IO buffers on the iobuf list and the shadow buffers on this
-* list match each other one for one at all times. [j_list_lock]
+* list match each other one for one at all times. [j_list_lock, no 
locks needed for jbd2 thread]
 */
struct journal_head *t_shadow_list;
 
-- 
2.20.1



Re: [PATCH 1/2] Updated locking documentation for transaction_t

2021-02-11 Thread Alexander Lochmann



On 11.02.21 10:30, Jan Kara wrote:

 */
unsigned long   t_log_start;
  
-	/* Number of buffers on the t_buffers list [j_list_lock] */

+   /* Number of buffers on the t_buffers list [j_list_lock, no lock for 
quick racy checks] */
int t_nr_buffers;


So this case is actually somewhat different now that I audited the uses.
There are two types of users - commit code (fs/jbd2/commit.c) and others.
Other users properly use j_list_lock to access t_nr_buffers. Commit code
does not use any locks because committing transaction is fully in
ownership of the jbd2 thread and all other users need to check & wait for
commit to be finished before doing anything with the transaction's buffers.

Mhm I see.
What about '[..., no locks needed for jbd2 thread]'?

How do the others wait for the commit to be finished?

- Alex

--
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



OpenPGP_signature
Description: OpenPGP digital signature


[PATCH v2] Updated locking documentation for journal_t

2021-02-11 Thread Alexander Lochmann

Some members of transaction_t are allowed to be read without
any lock being held if consistency doesn't matter.
Based on LockDoc's findings, we extended the locking
documentation of those members.
Each one of them is marked with a short comment:
"no lock for quick racy checks".

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
Reviewed-by: Jan Kara 
---
 include/linux/jbd2.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 18f77d9b1745..4dca33a063dd 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -768,7 +768,7 @@ enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY};
 struct journal_s
 {
/**
-* @j_flags: General journaling state flags [j_state_lock]
+	 * @j_flags: General journaling state flags [j_state_lock, no lock for 
quick racy checks]

 */
unsigned long   j_flags;

@@ -808,7 +808,7 @@ struct journal_s
/**
 * @j_barrier_count:
 *
-* Number of processes waiting to create a barrier lock [j_state_lock]
+	 * Number of processes waiting to create a barrier lock [j_state_lock, 
no lock for quick racy checks]

 */
int j_barrier_count;

@@ -821,7 +821,7 @@ struct journal_s
 * @j_running_transaction:
 *
 * Transactions: The current running transaction...
-* [j_state_lock] [caller holding open handle]
+	 * [j_state_lock, no lock for quick racy checks] [caller holding open 
handle]

 */
transaction_t   *j_running_transaction;

@@ -1033,7 +1033,7 @@ struct journal_s
 * @j_commit_sequence:
 *
 * Sequence number of the most recently committed transaction
-* [j_state_lock].
+* [j_state_lock, no lock for quick racy checks].
 */
tid_t   j_commit_sequence;

@@ -1041,7 +1041,7 @@ struct journal_s
 * @j_commit_request:
 *
 * Sequence number of the most recent transaction wanting commit
-* [j_state_lock]
+* [j_state_lock, no lock for quick racy checks]
 */
tid_t   j_commit_request;

--
2.20.1





OpenPGP_signature
Description: OpenPGP digital signature


[PATCH] KCOV: Introduced tracing unique covered PCs

2021-02-11 Thread Alexander Lochmann
Introduced new tracing mode KCOV_MODE_UNIQUE.
It simply stores the executed PCs.
The execution order is discarded.
Each bit in the shared buffer represents every fourth
byte of the text segment.
Since a call instruction on every supported
architecture is at least four bytes, it is safe
to just store every fourth byte of the text segment.
In contrast to KCOV_MODE_TRACE_PC, the shared buffer
cannot overflow. Thus, all executed PCs are recorded.

Signed-off-by: Alexander Lochmann 
---
 Documentation/dev-tools/kcov.rst | 80 
 include/linux/kcov.h |  4 +-
 include/uapi/linux/kcov.h| 10 
 kernel/kcov.c| 67 --
 4 files changed, 147 insertions(+), 14 deletions(-)

diff --git a/Documentation/dev-tools/kcov.rst b/Documentation/dev-tools/kcov.rst
index 8548b0b04e43..4712a730a06a 100644
--- a/Documentation/dev-tools/kcov.rst
+++ b/Documentation/dev-tools/kcov.rst
@@ -127,6 +127,86 @@ That is, a parent process opens /sys/kernel/debug/kcov, 
enables trace mode,
 mmaps coverage buffer and then forks child processes in a loop. Child processes
 only need to enable coverage (disable happens automatically on thread end).
 
+If someone is interested in a set of executed PCs, and does not care about
+execution order, he or she can advise KCOV to do so:
+
+.. code-block:: c
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define KCOV_INIT_TRACE_IOR('c', 1, unsigned long)
+#define KCOV_INIT_UNIQUE_IOR('c', 2, unsigned long)
+#define KCOV_ENABLE_IO('c', 100)
+#define KCOV_DISABLE   _IO('c', 101)
+
+#define BITS_PER_LONG 64
+#define KCOV_TRACE_PC  0
+#define KCOV_TRACE_CMP 1
+#define KCOV_UNIQUE_PC 2
+/*
+ * Determine start of text segment via 'nm vmlinux | grep _stext | cut -d 
" " -f1',
+ * and fill in.
+ */
+#define STEXT_START 0x8100
+
+
+
+int main(int argc, char **argv)
+{
+   int fd;
+   unsigned long *cover, n, i;
+
+   /* A single fd descriptor allows coverage collection on a single
+* thread.
+*/
+   fd = open("/sys/kernel/debug/kcov", O_RDWR);
+   if (fd == -1)
+   perror("open"), exit(1);
+   /* Setup trace mode and trace size. */
+   if ((n = ioctl(fd, KCOV_INIT_UNIQUE, 0)) < 0)
+   perror("ioctl"), exit(1);
+   /* Mmap buffer shared between kernel- and user-space. */
+   cover = (unsigned long*)mmap(NULL, n,
+PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+   if ((void*)cover == MAP_FAILED)
+   perror("mmap"), exit(1);
+   /* Enable coverage collection on the current thread. */
+   if (ioctl(fd, KCOV_ENABLE, KCOV_UNIQUE_PC))
+   perror("ioctl"), exit(1);
+   /* That's the target syscal call. */
+   read(-1, NULL, 0);
+   /* Disable coverage collection for the current thread. After this call
+* coverage can be enabled for a different thread.
+*/
+   if (ioctl(fd, KCOV_DISABLE, 0))
+   perror("ioctl"), exit(1);
+/* Convert byte size into element size */
+n /= sizeof(unsigned long);
+/* Print executed PCs in sorted order */
+for (i = 0; i < n; i++) {
+for (int j = 0; j < BITS_PER_LONG; j++) {
+if (cover[i] & (1L << j)) {
+printf("0x%jx\n", (uintmax_t)(STEXT_START + (i * 
BITS_PER_LONG + j) * 4));
+}
+}
+}
+   /* Free resources. */
+   if (munmap(cover, n * sizeof(unsigned long)))
+   perror("munmap"), exit(1);
+   if (close(fd))
+   perror("close"), exit(1);
+   return 0;
+}
+
 Comparison operands collection
 --
 
diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index a10e84707d82..aa0c8bcf8299 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -19,7 +19,9 @@ enum kcov_mode {
 */
KCOV_MODE_TRACE_PC = 2,
/* Collecting comparison operands mode. */
-   KCOV_MODE_TRACE_CMP = 3,
+   KCOV_MODE_TRACE_CMP = 4,
+   /* Collecting unique covered PCs. Execution order is not saved. */
+   KCOV_MODE_UNIQUE_PC = 8,
 };
 
 #define KCOV_IN_CTXSW  (1 << 30)
diff --git a/include/uapi/linux/kcov.h b/include/uapi/linux/kcov.h
index 1d0350e44ae3..5b99b6d1a1ac 100644
--- a/include/uapi/linux/kcov.h
+++ b/include/uapi/linux/kcov.h
@@ -19,6 +19,7 @@ struct kcov_remote_arg {
 #define KCOV_REMOTE_MAX_HANDLES0x100
 
 #define KCOV_INIT_TRACE   

[PATCH 2/2] Updated locking documentation for journal_t

2021-02-10 Thread Alexander Lochmann
Some members of transaction_t are allowed to be read without
any lock being held if consistency doesn't matter.
Based on LockDoc's findings, we extended the locking
documentation of those members.
Each one of them is marked with a short comment:
"no lock for quick racy checks".

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 include/linux/jbd2.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 18f77d9b1745..4dca33a063dd 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -768,7 +768,7 @@ enum passtype {PASS_SCAN, PASS_REVOKE, PASS_REPLAY};
 struct journal_s
 {
/**
-* @j_flags: General journaling state flags [j_state_lock]
+* @j_flags: General journaling state flags [j_state_lock, no lock for 
quick racy checks]
 */
unsigned long   j_flags;
 
@@ -808,7 +808,7 @@ struct journal_s
/**
 * @j_barrier_count:
 *
-* Number of processes waiting to create a barrier lock [j_state_lock]
+* Number of processes waiting to create a barrier lock [j_state_lock, 
no lock for quick racy checks]
 */
int j_barrier_count;
 
@@ -821,7 +821,7 @@ struct journal_s
 * @j_running_transaction:
 *
 * Transactions: The current running transaction...
-* [j_state_lock] [caller holding open handle]
+* [j_state_lock, no lock for quick racy checks] [caller holding open 
handle]
 */
transaction_t   *j_running_transaction;
 
@@ -1033,7 +1033,7 @@ struct journal_s
 * @j_commit_sequence:
 *
 * Sequence number of the most recently committed transaction
-* [j_state_lock].
+* [j_state_lock, no lock for quick racy checks].
 */
tid_t   j_commit_sequence;
 
@@ -1041,7 +1041,7 @@ struct journal_s
 * @j_commit_request:
 *
 * Sequence number of the most recent transaction wanting commit
-* [j_state_lock]
+* [j_state_lock, no lock for quick racy checks]
 */
tid_t   j_commit_request;
 
-- 
2.20.1



[PATCH 1/2] Updated locking documentation for transaction_t

2021-02-10 Thread Alexander Lochmann
Some members of transaction_t are allowed to be read without
any lock being held if consistency doesn't matter.
Based on LockDoc's findings, we extended the locking
documentation of those members.
Each one of them is marked with a short comment:
"no lock for quick racy checks".

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 include/linux/jbd2.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 99d3cd051ac3..18f77d9b1745 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -594,18 +594,18 @@ struct transaction_s
 */
unsigned long   t_log_start;
 
-   /* Number of buffers on the t_buffers list [j_list_lock] */
+   /* Number of buffers on the t_buffers list [j_list_lock, no lock for 
quick racy checks] */
int t_nr_buffers;
 
/*
 * Doubly-linked circular list of all buffers reserved but not yet
-* modified by this transaction [j_list_lock]
+* modified by this transaction [j_list_lock, no lock for quick racy 
checks]
 */
struct journal_head *t_reserved_list;
 
/*
 * Doubly-linked circular list of all metadata buffers owned by this
-* transaction [j_list_lock]
+* transaction [j_list_lock, no lock for quick racy checks]
 */
struct journal_head *t_buffers;
 
@@ -631,7 +631,7 @@ struct transaction_s
/*
 * Doubly-linked circular list of metadata buffers being shadowed by log
 * IO.  The IO buffers on the iobuf list and the shadow buffers on this
-* list match each other one for one at all times. [j_list_lock]
+* list match each other one for one at all times. [j_list_lock, no 
lock for quick racy checks]
 */
struct journal_head *t_shadow_list;
 
-- 
2.20.1



[PATCH 1/2] Updated locking documentation for transaction_t

2021-02-10 Thread Alexander Lochmann
Some members of transaction_t are allowed to be read without
any lock being held if consistency doesn't matter.
Based on LockDoc's findings, we extended the locking
documentation of those members.
Each one of them is marked with a short comment:
"no lock for quick racy checks".

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 include/linux/jbd2.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 99d3cd051ac3..18f77d9b1745 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -594,18 +594,18 @@ struct transaction_s
 */
unsigned long   t_log_start;
 
-   /* Number of buffers on the t_buffers list [j_list_lock] */
+   /* Number of buffers on the t_buffers list [j_list_lock, no lock for 
quick racy checks] */
int t_nr_buffers;
 
/*
 * Doubly-linked circular list of all buffers reserved but not yet
-* modified by this transaction [j_list_lock]
+* modified by this transaction [j_list_lock, no lock for quick racy 
checks]
 */
struct journal_head *t_reserved_list;
 
/*
 * Doubly-linked circular list of all metadata buffers owned by this
-* transaction [j_list_lock]
+* transaction [j_list_lock, no lock for quick racy checks]
 */
struct journal_head *t_buffers;
 
@@ -631,7 +631,7 @@ struct transaction_s
/*
 * Doubly-linked circular list of metadata buffers being shadowed by log
 * IO.  The IO buffers on the iobuf list and the shadow buffers on this
-* list match each other one for one at all times. [j_list_lock]
+* list match each other one for one at all times. [j_list_lock, no 
lock for quick racy checks]
 */
struct journal_head *t_shadow_list;
 
-- 
2.20.1



Re: [PATCH v3] Updated locking documentation for transaction_t

2020-12-03 Thread Alexander Lochmann



On 03.12.20 15:04, Theodore Y. Ts'o wrote:

On Thu, Oct 15, 2020 at 03:26:28PM +0200, Alexander Lochmann wrote:

Hi folks,

I've updated the lock documentation according to our finding for
transaction_t.
Does this patch look good to you?


I updated the annotations to match with the local usage, e.g:

 * When commit was requested [journal_t.j_state_lock]

became:

 * When commit was requested [j_state_lock]What do you mean by local 
usage?

The annotations of other members of transaction_t?

Shouldn't the annotation look like this?
[t_journal->j_state_lock]
It would be more precise.


Otherwise, looks good.  Thanks for the patch!

Thanks!

- Alex


   - Ted



--
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



OpenPGP_signature
Description: OpenPGP digital signature


[PATCH v3] Updated locking documentation for transaction_t

2020-10-15 Thread Alexander Lochmann
Hi folks,

I've updated the lock documentation according to our finding for
transaction_t.
Does this patch look good to you?

Cheers,
Alex
commit 13ac907c45c5da7d691f6e10972de5e56e0072c6
Author: Alexander Lochmann 
Date:   Thu Oct 15 15:24:52 2020 +0200

Updated locking documentation for transaction_t

We used LockDoc to derive locking rules for each member
of struct transaction_t.
Based on those results, we extended the existing documentation
by more members of struct transaction_t, and updated the existing
documentation.

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 08f904943ab2..a11c78e4af4e 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -532,6 +532,7 @@ struct transaction_chp_stats_s {
  * The transaction keeps track of all of the buffers modified by a
  * running transaction, and all of the buffers committed but not yet
  * flushed to home for finished transactions.
+ * (Locking Documentation improved by LockDoc)
  */
 
 /*
@@ -650,12 +651,12 @@ struct transaction_s
 	unsigned long		t_start;
 
 	/*
-	 * When commit was requested
+	 * When commit was requested [journal_t.j_state_lock]
 	 */
 	unsigned long		t_requested;
 
 	/*
-	 * Checkpointing stats [j_checkpoint_sem]
+	 * Checkpointing stats [journal_t.j_list_lock]
 	 */
 	struct transaction_chp_stats_s t_chp_stats;
 


signature.asc
Description: OpenPGP digital signature


Re: [PATCH v3] Updated locking documentation for transaction_t

2019-06-20 Thread Alexander Lochmann
Hi Ted,

Have you had the chance to review the most recent version of the patch?
Does it look reasonable to you?

Cheers,
Alex

On 08.04.19 10:35, Alexander Lochmann wrote:
> We used LockDoc to derive locking rules for each member
> of struct transaction_t.
> Based on those results, we extended the existing documentation
> by more members of struct transaction_t, and updated the existing
> documentation.
> 
> Signed-off-by: Alexander Lochmann 
> Signed-off-by: Horst Schirmeier 
> ---
>  include/linux/jbd2.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index 0f919d5fe84f..34b728e2b702 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -534,6 +534,7 @@ struct transaction_chp_stats_s {
>   * The transaction keeps track of all of the buffers modified by a
>   * running transaction, and all of the buffers committed but not yet
>   * flushed to home for finished transactions.
> + * (Locking Documentation improved by LockDoc)
>   */
>  
>  /*
> @@ -652,12 +653,12 @@ struct transaction_s
>   unsigned long   t_start;
>  
>   /*
> -  * When commit was requested
> +  * When commit was requested [journal_t.j_state_lock]
>*/
>   unsigned long   t_requested;
>  
>   /*
> -  * Checkpointing stats [j_checkpoint_sem]
> +  * Checkpointing stats [journal_t.j_list_lock]
>    */
>   struct transaction_chp_stats_s t_chp_stats;
>  
> 

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



signature.asc
Description: OpenPGP digital signature


[PATCH v3] Updated locking documentation for transaction_t

2019-04-08 Thread Alexander Lochmann
We used LockDoc to derive locking rules for each member
of struct transaction_t.
Based on those results, we extended the existing documentation
by more members of struct transaction_t, and updated the existing
documentation.

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 include/linux/jbd2.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 0f919d5fe84f..34b728e2b702 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -534,6 +534,7 @@ struct transaction_chp_stats_s {
  * The transaction keeps track of all of the buffers modified by a
  * running transaction, and all of the buffers committed but not yet
  * flushed to home for finished transactions.
+ * (Locking Documentation improved by LockDoc)
  */
 
 /*
@@ -652,12 +653,12 @@ struct transaction_s
unsigned long   t_start;
 
/*
-* When commit was requested
+* When commit was requested [journal_t.j_state_lock]
 */
unsigned long   t_requested;
 
/*
-* Checkpointing stats [j_checkpoint_sem]
+* Checkpointing stats [journal_t.j_list_lock]
 */
struct transaction_chp_stats_s t_chp_stats;
 
-- 
2.20.1



Re: [v2] Updated locking documentation for transaction_t

2019-04-08 Thread Alexander Lochmann
Thanks, Ted, for your feedback!
I'll submit a modified version.

- Alex

On 07.04.19 18:52, Theodore Ts'o wrote:
> On Mon, Mar 18, 2019 at 07:42:37PM +0100, Alexander Lochmann wrote:
>>  /*t
>> - * Where in the log does this transaction's commit start? [no locking]
>> + * Where in the log does this transaction's commit start?
>> + * [journal_t.j_state_lock]
>>   */
>>  unsigned long   t_log_start;
> 
> Well, technically, that's not quite right.  It's only assigned in one
> location, and we hold j_state_lock, yes.  But that's because we need
> to access journal->j_head.  At the point where we set t_log_start, the
> transaction has already been locked down (transaction->t_state >
> T_LOCKED).
> 
> Similarly, we happen to be holding j_state where it is currently being
> accessed, but it's not because we needed the lock in order to access
> t_log_start safely.
> 
>>  /*
>> - * When transaction started
>> + * When transaction started [journal_t.j_state_lock]
>>   */
>>  unsigned long   t_start;
> 
> And again, not really.  The primary place where t_start is set is when
> the transaction is firstt created, before it's visible anywhere else.
> after that, it is used exclusively by the commit thread, and so no
> locking is necessary.  It's true that in the places where it is used,
> j_state_lock happens to be taken, but it's strictly not necessary.
> 
>>  
>>  /*
>> - * When commit was requested
>> + * When commit was requested [journal_t.j_state_lock]
>>   */
>>  unsigned long   t_requested;
> 
> Yes, that appears to be correct.
> 
>>  
>>  /*
>> - * Checkpointing stats [j_checkpoint_sem]
>> +     * Checkpointing stats [journal_t.j_list_lock]
>>   */
>>  struct transaction_chp_stats_s t_chp_stats;
>>
> 
> This appears to be correct.
> 
>   - Ted
> 

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



signature.asc
Description: OpenPGP digital signature


[PATCH v2] Updated locking documentation for transaction_t

2019-03-18 Thread Alexander Lochmann
We used LockDoc to derive locking rules for each member
of struct transaction_t.
Based on those results, we extended the existing documentation
by more members of struct inode, and updated the existing
documentation.

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 include/linux/jbd2.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 0f919d5fe84f..72f689746340 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -534,6 +534,7 @@ struct transaction_chp_stats_s {
  * The transaction keeps track of all of the buffers modified by a
  * running transaction, and all of the buffers committed but not yet
  * flushed to home for finished transactions.
+ * (Locking Documentation improved by LockDoc)
  */
 
 /*
@@ -585,7 +586,8 @@ struct transaction_s
}   t_state;
 
/*
-* Where in the log does this transaction's commit start? [no locking]
+* Where in the log does this transaction's commit start?
+* [journal_t.j_state_lock]
 */
unsigned long   t_log_start;
 
@@ -647,17 +649,17 @@ struct transaction_s
unsigned long   t_max_wait;
 
/*
-* When transaction started
+* When transaction started [journal_t.j_state_lock]
 */
unsigned long   t_start;
 
/*
-* When commit was requested
+* When commit was requested [journal_t.j_state_lock]
 */
unsigned long   t_requested;
 
/*
-* Checkpointing stats [j_checkpoint_sem]
+* Checkpointing stats [journal_t.j_list_lock]
 */
struct transaction_chp_stats_s t_chp_stats;
 
-- 
2.20.1



[PATCH 1/1] Updated locking documentation for struct inode

2019-03-08 Thread Alexander Lochmann
We used LockDoc to derive locking rules for each member
of struct inode.
Based on those results, we extended the existing documentation
by more members of struct inode, and updated the existing
documentation.

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 fs/inode.c | 49 --
 include/linux/fs.h |  2 +-
 2 files changed, 40 insertions(+), 11 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 0cd47fe0dbe5..ade9d3aa1ada 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -24,17 +24,46 @@
 
 /*
  * Inode locking rules:
+ * (Documentation improved by LockDoc)
  *
- * inode->i_lock protects:
- *   inode->i_state, inode->i_hash, __iget()
- * Inode LRU list locks protect:
- *   inode->i_sb->s_inode_lru, inode->i_lru
- * inode->i_sb->s_inode_list_lock protects:
- *   inode->i_sb->s_inodes, inode->i_sb_list
- * bdi->wb.list_lock protects:
- *   bdi->wb.b_{dirty,io,more_io,dirty_time}, inode->i_io_list
- * inode_hash_lock protects:
- *   inode_hashtable, inode->i_hash
+ * No locks needed for:
+ *   i_data.a_ops, i_data.nrexceptional, i_rdev, i_data.gfp_mask,
+ *   i_generation, i_security, i_nlink, __i_nlink, i_flctx,
+ *   i_size, i_atime, i_mtime, i_data.host, i_sb
+ *
+ * backing_dev_info.wb.list_lock protects:
+ *   dirtied_when, i_io_list
+ *
+ * inode.i_rwsem protects:
+ *   i_flags, i_uid, i_gid, i_version, i_ctime, i_size_seqcount
+ *
+ * inode.i_rwsem -> rcu protects:
+ *   i_default_acl
+ *
+ * block_device.bd_mutex protects:
+ *   i_blkbits
+ *
+ * cdev_lock protects:
+ *   i_cdev, i_devices
+ *
+ * inode.i_data.i_mmap_rwsem protects:
+ *   i_data.i_mmap, i_data.i_mmap
+ *
+ * hardirq -> inode.i_data.tree_lock protects:
+ *   i_data.page_tree, i_wb_list, i_data.nrpages
+ *
+ * hardirq -> super_block.s_inode_wblist_lock protects:
+ *   i_wb_list
+ *
+ * inode.i_lock protects:
+ *   i_pipe, i_fsnotify_mask, i_fsnotify_marks, i_blocks, i_opflags,
+ *   i_state, i_bytes
+ *
+ * super_block.s_inode_list_lock protects:
+ *   i_sb_list
+ *
+ * inode_hash_lock -> inode.i_lock protects:
+ *   i_hash
  *
  * Lock ordering:
  *
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 92966678539d..744e2a817ad3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -639,7 +639,7 @@ struct inode {
struct timespec64   i_atime;
struct timespec64   i_mtime;
struct timespec64   i_ctime;
-   spinlock_t  i_lock; /* i_blocks, i_bytes, maybe i_size */
+   spinlock_t  i_lock;
unsigned short  i_bytes;
u8  i_blkbits;
u8  i_write_hint;
-- 
2.20.1



[PATCH 1/1] Updated locking documentation for struct journal_t

2019-03-08 Thread Alexander Lochmann
We used LockDoc to derive locking rules for each member
of struct journal_t. Based on those results, we extended
the existing documentation by more members
of struct inode, and updated the existing documentation.

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 include/linux/jbd2.h | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 0f919d5fe84f..c1ba44ca515a 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -585,7 +585,8 @@ struct transaction_s
}   t_state;
 
/*
-* Where in the log does this transaction's commit start? [no locking]
+* Where in the log does this transaction's commit start?
+* [journal_t->j_state_lock]
 */
unsigned long   t_log_start;
 
@@ -647,17 +648,17 @@ struct transaction_s
unsigned long   t_max_wait;
 
/*
-* When transaction started
+* When transaction started [j_state_lock]
 */
unsigned long   t_start;
 
/*
-* When commit was requested
+* When commit was requested [j_state_lock]
 */
unsigned long   t_requested;
 
/*
-* Checkpointing stats [j_checkpoint_sem]
+* Checkpointing stats [j_list_lock]
 */
struct transaction_chp_stats_s t_chp_stats;
 
@@ -744,6 +745,7 @@ jbd2_time_diff(unsigned long start, unsigned long end)
 /**
  * struct journal_s - The journal_s type is the concrete type associated with
  * journal_t.
+ * (Locking Documentation improved by LockDoc)
  */
 struct journal_s
 {
@@ -762,6 +764,7 @@ struct journal_s
 
/**
 * @j_sb_buffer: The first part of the superblock buffer.
+* [j_checkpoint_mutex]
 */
struct buffer_head  *j_sb_buffer;
 
@@ -1015,6 +1018,7 @@ struct journal_s
 * @j_commit_interval:
 *
 * What is the maximum transaction lifetime before we begin a commit?
+* [super_block->s_umount]
 */
unsigned long   j_commit_interval;
 
@@ -1032,7 +1036,7 @@ struct journal_s
 * @j_revoke:
 *
 * The revoke table - maintains the list of revoked blocks in the
-* current transaction.
+* current transaction. [j_state_lock]
 */
struct jbd2_revoke_table_s *j_revoke;
 
-- 
2.20.1



Re: [PATCH] Abort file_remove_privs() for non-reg. files

2019-03-08 Thread Alexander Lochmann
Hello Al,

Meanwhile, have you had the opportunity to review our patch?

Regards,
Alex

On 11.01.19 16:42, Alexander Lochmann wrote:
> Hello Al,
> 
> Have you had the opportunity to review our patch?
> 
> Cheers,
> Alex
> 
> On 17.12.18 09:28, Jan Kara wrote:
>> On Fri 14-12-18 11:55:52, Alexander Lochmann wrote:
>>>
>>> file_remove_privs() might be called for non-regular files, e.g.
>>> blkdev inode. There is no reason to do its job on things
>>> like blkdev inodes, pipes, or cdevs. Hence, abort if
>>> file does not refer to a regular inode.
>>> The following stacktrace shows how to get there:
>>> 13: entry_SYSENTER_32:460
>>> 12: do_fast_syscall_32:410
>>> 11: _static_cpu_has:146
>>> 10: do_syscall_32_irqs_on:322
>>> 09: SyS_pwrite64:636
>>> 08: SYSC_pwrite64:650
>>> 07: fdput:38
>>> 06: vfs_write:560
>>> 05: __vfs_write:512
>>> 04: new_sync_write:500
>>> 03: blkdev_write_iter:1977
>>> 02: __generic_file_write_iter:2897
>>> 01: file_remove_privs:1818
>>> 00: inode_has_no_xattr:3163
>>>
>>> Found by LockDoc (Alexander Lochmann, Horst Schirmeier and Olaf
>>> Spinczyk)
>>>
>>> Signed-off-by: Alexander Lochmann 
>>> Signed-off-by: Horst Schirmeier 
>>
>> The patch looks good to me. You can add:
>>
>> Reviewed-by: Jan Kara 
>>
>>  Honza
>>
>>> ---
>>>  fs/inode.c | 9 +++--
>>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/inode.c b/fs/inode.c
>>> index 35d2108d567c..682088190413 100644
>>> --- a/fs/inode.c
>>> +++ b/fs/inode.c
>>> @@ -1820,8 +1820,13 @@ int file_remove_privs(struct file *file)
>>> int kill;
>>> int error = 0;
>>>
>>> -   /* Fast path for nothing security related */
>>> -   if (IS_NOSEC(inode))
>>> +   /*
>>> +* Fast path for nothing security related.
>>> +* As well for non-regular files, e.g. blkdev inodes.
>>> +* For example, blkdev_write_iter() might get here
>>> +* trying to remove privs which it is not allowed to.
>>> +*/
>>> +   if (IS_NOSEC(inode) || !S_ISREG(inode->i_mode))
>>> return 0;
>>>
>>> kill = dentry_needs_remove_privs(dentry);
>>> -- 
>>> 2.19.2
>>>
>>
>>
>>
> 

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] Abort file_remove_privs() for non-reg. files

2019-01-11 Thread Alexander Lochmann
Hello Al,

Have you had the opportunity to review our patch?

Cheers,
Alex

On 17.12.18 09:28, Jan Kara wrote:
> On Fri 14-12-18 11:55:52, Alexander Lochmann wrote:
>>
>> file_remove_privs() might be called for non-regular files, e.g.
>> blkdev inode. There is no reason to do its job on things
>> like blkdev inodes, pipes, or cdevs. Hence, abort if
>> file does not refer to a regular inode.
>> The following stacktrace shows how to get there:
>> 13: entry_SYSENTER_32:460
>> 12: do_fast_syscall_32:410
>> 11: _static_cpu_has:146
>> 10: do_syscall_32_irqs_on:322
>> 09: SyS_pwrite64:636
>> 08: SYSC_pwrite64:650
>> 07: fdput:38
>> 06: vfs_write:560
>> 05: __vfs_write:512
>> 04: new_sync_write:500
>> 03: blkdev_write_iter:1977
>> 02: __generic_file_write_iter:2897
>> 01: file_remove_privs:1818
>> 00: inode_has_no_xattr:3163
>>
>> Found by LockDoc (Alexander Lochmann, Horst Schirmeier and Olaf
>> Spinczyk)
>>
>> Signed-off-by: Alexander Lochmann 
>> Signed-off-by: Horst Schirmeier 
> 
> The patch looks good to me. You can add:
> 
> Reviewed-by: Jan Kara 
> 
>   Honza
> 
>> ---
>>  fs/inode.c | 9 +++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/inode.c b/fs/inode.c
>> index 35d2108d567c..682088190413 100644
>> --- a/fs/inode.c
>> +++ b/fs/inode.c
>> @@ -1820,8 +1820,13 @@ int file_remove_privs(struct file *file)
>>  int kill;
>>  int error = 0;
>>
>> -/* Fast path for nothing security related */
>> -if (IS_NOSEC(inode))
>> +/*
>> + * Fast path for nothing security related.
>> + * As well for non-regular files, e.g. blkdev inodes.
>> + * For example, blkdev_write_iter() might get here
>> + * trying to remove privs which it is not allowed to.
>> + */
>> +if (IS_NOSEC(inode) || !S_ISREG(inode->i_mode))
>>  return 0;
>>
>>  kill = dentry_needs_remove_privs(dentry);
>> -- 
>> 2.19.2
>>
> 
> 
> 

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



signature.asc
Description: OpenPGP digital signature


[PATCH] Abort file_remove_privs() for non-reg. files

2018-12-14 Thread Alexander Lochmann

file_remove_privs() might be called for non-regular files, e.g.
blkdev inode. There is no reason to do its job on things
like blkdev inodes, pipes, or cdevs. Hence, abort if
file does not refer to a regular inode.
The following stacktrace shows how to get there:
13: entry_SYSENTER_32:460
12: do_fast_syscall_32:410
11: _static_cpu_has:146
10: do_syscall_32_irqs_on:322
09: SyS_pwrite64:636
08: SYSC_pwrite64:650
07: fdput:38
06: vfs_write:560
05: __vfs_write:512
04: new_sync_write:500
03: blkdev_write_iter:1977
02: __generic_file_write_iter:2897
01: file_remove_privs:1818
00: inode_has_no_xattr:3163

Found by LockDoc (Alexander Lochmann, Horst Schirmeier and Olaf
Spinczyk)

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 fs/inode.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 35d2108d567c..682088190413 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1820,8 +1820,13 @@ int file_remove_privs(struct file *file)
int kill;
int error = 0;

-   /* Fast path for nothing security related */
-   if (IS_NOSEC(inode))
+   /*
+* Fast path for nothing security related.
+* As well for non-regular files, e.g. blkdev inodes.
+* For example, blkdev_write_iter() might get here
+* trying to remove privs which it is not allowed to.
+*/
+   if (IS_NOSEC(inode) || !S_ISREG(inode->i_mode))
return 0;

kill = dentry_needs_remove_privs(dentry);
-- 
2.19.2



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] Fix sync. in blkdev_write_iter() acessing i_flags

2018-12-07 Thread Alexander Lochmann


On 07.12.18 18:58, Al Viro wrote:
> On Fri, Dec 07, 2018 at 05:10:15PM +0100, Alexander Lochmann wrote:
>>
>> inode.i_flags might be altered without proper
>> synchronisation when the inode belongs to devtmpfs.
>> blkdev_write_iter() starts writing via __generic_file_write_iter()
>> which sets S_NOSEC bit without any synchronisation.
>> The following stacktrace shows how to get there:
>> 13: entry_SYSENTER_32:460
>> 12: do_fast_syscall_32:410
>> 11: _static_cpu_has:146
>> 10: do_syscall_32_irqs_on:322
>> 09: SyS_pwrite64:636
>> 08: SYSC_pwrite64:650
>> 07: fdput:38
>> 06: vfs_write:560
>> 05: __vfs_write:512
>> 04: new_sync_write:500
>> 03: blkdev_write_iter:1977
>> 02: __generic_file_write_iter:2897
>> 01: file_remove_privs:1818
>> 00: inode_has_no_xattr:3163
>> If S_NOSEC is *not* set, i_rwsem is acquired around
>> __generic_file_write_iter().
> 
>> + * Ensure excl. access to i_flags in __generic_file_write_iter().
>> + * Otherwise, it would race with chmod adding SUID bit.
>> + */
> 
> _What_ SUID bit?  We are talking about a write to block device, for fsck 
> sake...
> 
That's the way I understood Jan's explanation:
"
Thinking more about this I'm not sure if this is actually the right
solution. Because for example the write(2) can set S_NOSEC flag wrongly
when it would race with chmod adding SUID bit. So probably we rather need
to acquire i_rwsem in blkdev_write_iter() if file does not have S_NOSEC set
(we don't want to acquire it unconditionally as that would heavily impact
scalability of block device writes).

Honza"

If I'm mistaking, pls help me with fixing it.

- Alex

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



signature.asc
Description: OpenPGP digital signature


[PATCH] Fix sync. in blkdev_write_iter() acessing i_flags

2018-12-07 Thread Alexander Lochmann

inode.i_flags might be altered without proper
synchronisation when the inode belongs to devtmpfs.
blkdev_write_iter() starts writing via __generic_file_write_iter()
which sets S_NOSEC bit without any synchronisation.
The following stacktrace shows how to get there:
13: entry_SYSENTER_32:460
12: do_fast_syscall_32:410
11: _static_cpu_has:146
10: do_syscall_32_irqs_on:322
09: SyS_pwrite64:636
08: SYSC_pwrite64:650
07: fdput:38
06: vfs_write:560
05: __vfs_write:512
04: new_sync_write:500
03: blkdev_write_iter:1977
02: __generic_file_write_iter:2897
01: file_remove_privs:1818
00: inode_has_no_xattr:3163
If S_NOSEC is *not* set, i_rwsem is acquired around
__generic_file_write_iter().

Found by LockDoc (Alexander Lochmann, Horst Schirmeier and Olaf
Spinczyk)

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 fs/block_dev.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index a80b4f0ee7c4..b4ece62e3c05 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1894,8 +1894,10 @@ ssize_t blkdev_write_iter(struct kiocb *iocb,
struct iov_iter *from)
 {
struct file *file = iocb->ki_filp;
struct inode *bd_inode = bdev_file_inode(file);
+   struct inode *inode = file_inode(file);
loff_t size = i_size_read(bd_inode);
struct blk_plug plug;
+   int locked = 0;
ssize_t ret;

if (bdev_read_only(I_BDEV(bd_inode)))
@@ -1913,9 +1915,19 @@ ssize_t blkdev_write_iter(struct kiocb *iocb,
struct iov_iter *from)
iov_iter_truncate(from, size - iocb->ki_pos);

blk_start_plug(&plug);
+   /*
+* Ensure excl. access to i_flags in __generic_file_write_iter().
+* Otherwise, it would race with chmod adding SUID bit.
+*/
+   if (!IS_NOSEC(inode)) {
+   inode_lock(inode);
+   locked = 1;
+   }
ret = __generic_file_write_iter(iocb, from);
if (ret > 0)
ret = generic_write_sync(iocb, ret);
+   if (locked)
+   inode_unlock(inode);
blk_finish_plug(&plug);
return ret;
 }
-- 
2.19.2



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] Fix sync. in inode_has_no_xattr()

2018-12-07 Thread Alexander Lochmann
Am 05.12.18 um 16:32 schrieb Jan Kara:
> 
> Thinking more about this I'm not sure if this is actually the right
> solution. Because for example the write(2) can set S_NOSEC flag wrongly
> when it would race with chmod adding SUID bit. So probably we rather need
> to acquire i_rwsem in blkdev_write_iter() if file does not have S_NOSEC set
> (we don't want to acquire it unconditionally as that would heavily impact
> scalability of block device writes).
> 
>   Honza
> 
Trying to implement your suggestion, I'm not sure which inode to use:
In blkdev_write_iter() there is the "bd_inode = bdev_file_inode(file)".
file_remove_privs() uses "inode = file_inode(file)" as a parameter for
inode_has_no_xattr().
So, do file->f_mapping->host and f->f_inode refer to the identical inode?

- Alex

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



signature.asc
Description: OpenPGP digital signature


[PATCH] Fix sync. in inode_has_no_xattr()

2018-12-05 Thread Alexander Lochmann

inode.i_flags might be altered without proper
synchronisation when the inode belongs to devtmpfs.
The following stacktrace shows how to get there:
13: entry_SYSENTER_32:460
12: do_fast_syscall_32:410
11: _static_cpu_has:146
10: do_syscall_32_irqs_on:322
09: SyS_pwrite64:636
08: SYSC_pwrite64:650
07: fdput:38
06: vfs_write:560
05: __vfs_write:512
04: new_sync_write:500
03: blkdev_write_iter:1977
02: __generic_file_write_iter:2897
01: file_remove_privs:1818
00: inode_has_no_xattr:3163

Found by LockDoc (Alexander Lochmann, Horst Schirmeier and Olaf
Spinczyk)

Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
---
 include/linux/fs.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index c95c0807471f..40722678d741 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3443,10 +3443,14 @@ static inline int check_sticky(struct inode
*dir, struct inode *inode)
return __check_sticky(dir, inode);
 }

+/*
+ * blkdev_write_iter() can call this without i_rwsem, need to be
+ * careful with i_flags update.
+ */
 static inline void inode_has_no_xattr(struct inode *inode)
 {
if (!is_sxid(inode->i_mode) && (inode->i_sb->s_flags & SB_NOSEC))
-   inode->i_flags |= S_NOSEC;
+   inode_set_flags(inode, S_NOSEC, S_NOSEC);
 }

 static inline bool is_root_inode(struct inode *inode)
-- 
2.19.1



signature.asc
Description: OpenPGP digital signature


Re: [x86] BUG()/BUG_ON() macros cannot be disabled

2018-09-25 Thread Alexander Lochmann


Am 25.09.2018 um 14:20 schrieb Arnd Bergmann:
> 
> I think it's the most reasonable implementation, otherwise a
> function like
> 
> int something(void)
> {
>   if (x)
>  return 0;
>  else
>   BUG();
> }
> 
> will return an uninitialized value.
> 
> The arch specific implementations usually just contain a trapping
> instruction. With CONFIG_BUG() you get a nice console output
> that indicates where this happened, but without CONFIG_BUG(),
> this is just reported as an invalid instruction (if CONFIG_PRINTK
> is still enabled), killing the current process.
> 
>Arnd
> 
I see. In that case, you should really update the documentation and help
page of CONFIG_BUG. In its current version it is misleading. It can be
understood as 'It disables that macro completely.'.

Although I know what the purpose of BUG()/BUG_ON() is, I would not
consider the above example as valid C code
Defining BUG as an endless loop to overcome GCC warnings about not
returning a value is a dirty hack for me.

- Alex

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



signature.asc
Description: OpenPGP digital signature


Re: [x86] BUG()/BUG_ON() macros cannot be disabled

2018-09-25 Thread Alexander Lochmann


Am 25.09.2018 um 14:06 schrieb Arnd Bergmann:
> On Tue, Sep 25, 2018 at 1:35 PM Alexander Lochmann
>  wrote:
>>
>> Hi Arnd,
>>
>> I recently tried to disable the BUG macros on x86-32. I disabled the
>> config item in 'General Setup -> Configure standard kernel features
>> (expert users) -> BUG() support'.
>> However, BUG/BUG_ON is still defined. In the !CONFIG_BUG branch in
>> include/asm-generic/bug.h (line 181) the code checks for the existence
>> of architecture-specific variants of those macros.
>> Since x86 defines its own version of BUG(), line 182 is *not* used.
>> In the following, the x86 variant of BUG() is then used to define the
>> BUG_ON() macro.
>>
>> I propose a patch that really disables both macros if the developer
>> wants it.
>> It undefines the respective x86 variants, and defines both macros as
>> 'empty' macros.
> 
> Maybe we should update the documentation instead. Note that the
> generic version is using 'while (1)' rather than 'while (0)', so this is
> not an empty macro but rather one that does more than the
> arch-specific one-instruction version does.
> 
> We don't really want an empty macro any more, this was used in
> the past, but causes compile-time warnings and undefined behavior
> for no good reason
I see. So the documentation of the CONFIG_BUG option is wrong?
It currently states that 'Disabling this option eliminates support for
BUG and WARN'.
Is the current implemention (an endless loop) desired behavior?

- Alex
> 
>Arnd
> 

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



signature.asc
Description: OpenPGP digital signature


[x86] BUG()/BUG_ON() macros cannot be disabled

2018-09-25 Thread Alexander Lochmann
Hi Arnd,

I recently tried to disable the BUG macros on x86-32. I disabled the
config item in 'General Setup -> Configure standard kernel features
(expert users) -> BUG() support'.
However, BUG/BUG_ON is still defined. In the !CONFIG_BUG branch in
include/asm-generic/bug.h (line 181) the code checks for the existence
of architecture-specific variants of those macros.
Since x86 defines its own version of BUG(), line 182 is *not* used.
In the following, the x86 variant of BUG() is then used to define the
BUG_ON() macro.

I propose a patch that really disables both macros if the developer
wants it.
It undefines the respective x86 variants, and defines both macros as
'empty' macros.

Regards,
Alex

-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 20561a60db9c..1e7977582277 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -178,14 +178,17 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
 })
 
 #else /* !CONFIG_BUG */
-#ifndef HAVE_ARCH_BUG
-#define BUG() do {} while (1)
+#ifdef HAVE_ARCH_BUG
+#undef BUG
 #endif
 
-#ifndef HAVE_ARCH_BUG_ON
-#define BUG_ON(condition) do { if (condition) BUG(); } while (0)
+#ifdef HAVE_ARCH_BUG_ON
+#undef BUG_ON
 #endif
 
+#define BUG_ON(condition) do { if (condition) BUG(); } while (0)
+#define BUG() do {} while (1)
+
 #ifndef HAVE_ARCH_WARN_ON
 #define WARN_ON(condition) ({		\
 	int __ret_warn_on = !!(condition);\


signature.asc
Description: OpenPGP digital signature


Non-zero preemption count on voluntary preemptive kernel

2016-12-02 Thread Alexander Lochmann

Hi folks!

I'm sorry to bother you, guys, but neither the mailing list
'kpreempt-t...@lists.sourceforge.net' nor Robert's address
'r...@tech9.net' exist anymore.
If I have to address someone else, pls let me know.

We're currently investigating the locking of data structures in the
Linux Kernel -- namely 3.16.
Doing so, we've been logging every call to *lock() or *unlock(). For
further analysis, we log the preempt count as well.
Although we use a voluntary preemptive kernel
(defined(CONFIG_PREEMPT_VOLUNTARY) && !defined(CONFIG_PREEMPT) &&
!defined(CONFIG_PREEMPT_COUNT), the actual preemption count, the first
eight bits, aren't zero all time.
I'd expect the kernel to not touch those bits, because the
preempt_{enable,disable} functions do nearly nothing in my case.
They actually insert a memory barrier, but nothing more.
As you can see in the file (query.txt) attached to this mail, there are
several occasions where the preemption count is above zero. I'm a bit
puzzled why this happens...

The format is as follows:
The first three columns donate the position within the source code where
a lock has been acquired. The fourth column is the preemption count, as
returned by preemption_count(), after acquiring the lock.
The last column tells how often this happens.
I attached my kernel config as well.

Can you please tell why this happens?
Have I missed something? Might this be a bug?

Thanks!

Regards,
Alex
-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al


#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 3.16.0 Kernel Configuration
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf32-i386"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
# CONFIG_ZONE_DMA32 is not set
# CONFIG_AUDIT_ARCH is not set
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_32_SMP=y
CONFIG_X86_HT=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-ecx -fcall-saved-edx"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION="-al"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
# CONFIG_KERNEL_GZIP is not set
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
CONFIG_KERNEL_XZ=y
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_LEGACY_ALLOC_HWIRQ=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_CHIP=y
CONFIG_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_KTIME_SCALAR=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_IRQ_TIME_ACCOUNTING is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_PREEMPT_RCU is not set
CONFIG_RCU_STALL_COMMON=y
C

[RFC] perf record: missing buildid for callstack modules -- again

2016-06-25 Thread Alexander Lochmann
Hi folks!

As mentioned by this discussion https://lkml.org/lkml/2016/1/7/661, perf
does not record the build id for modules, which are hit on the way down
the callstack.
Sadly that issue is not completely fixed by the cmd argument --buildid-all.
In my case, I have my own kernel module starting a kernel thread. It
does some stuff, and calls schedule from time to time.
If I start perf like this

perf record -e sched:sched_stat_sleep -e sched:sched_switch -a -g -o
perf.data.raw -p  -- sleep 10

, it records the callstacks for those three events.
Afterwards I want to do some post processing to generate a cold graph:

perf inject -v -s -i perf.data.raw -o perf.data

That step fails ('failed to write feature 2') due to the lack of a
build-id for my module, which isn't included in perf.data.raw.

Can anyone confirm that this is a bug?
Does anyone have a workaround for it?

Thanks in advance!

Cheers,
Alex

P.S.: Pls put me in CC, because I'm not subscribed to the mailinglist.
-- 
Technische Universität Dortmund
Alexander LochmannPGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone:  +49.231.7556141
D-44227 Dortmund  fax:+49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al



signature.asc
Description: OpenPGP digital signature