RE: [PATCH] makedumpfile/s390: Add get_kaslr_offset() for s390x

2019-12-09 Thread Kazuhito Hagio
Hi Mikhail,

Sorry for late reply.

> -Original Message-
> Since kernel v5.2 KASLR is supported on s390. In makedumpfile however no
> support has been added yet. This patch adds the arch specific function
> get_kaslr_offset() for s390x.
> Since the values in vmcoreinfo are already relocated, the patch is
> mainly relevant for vmlinux processing (-x option).

In the current implementation of makedumpfile, the get_kaslr_offset(vaddr)
is supposed to return the KASLR offset only when the offset is needed to
add to the vaddr.  So generally symbols from kernel (vmlinux) need it, but
symbols from modules are resolved dynamically and don't need the offset.

This patch always returns the offset if any, as a result, I guess this patch
will not work as expected with module symbols in filter config file.

So... How about making get_kaslr_offset_arm64() general for other archs
(get_kaslr_offset_general() or something), then using it also for s390?
If OK, I can do that generalization.

Thanks,
Kazu

> 
> Signed-off-by: Philipp Rudo 
> Signed-off-by: Mikhail Zaslonko 
> ---
>  arch/s390x.c   | 32 
>  makedumpfile.h |  3 ++-
>  2 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390x.c b/arch/s390x.c
> index bf9d58e..892df14 100644
> --- a/arch/s390x.c
> +++ b/arch/s390x.c
> @@ -122,6 +122,38 @@ get_machdep_info_s390x(void)
>   return TRUE;
>  }
> 
> +unsigned long
> +get_kaslr_offset_s390x(unsigned long vaddr)
> +{
> + unsigned int i;
> + char buf[BUFSIZE_FGETS], *endp;
> +
> + if (!info->file_vmcoreinfo)
> + return FALSE;
> +
> + if (fseek(info->file_vmcoreinfo, 0, SEEK_SET) < 0) {
> + ERRMSG("Can't seek the vmcoreinfo file(%s). %s\n",
> +info->name_vmcoreinfo, strerror(errno));
> + return FALSE;
> + }
> +
> + while (fgets(buf, BUFSIZE_FGETS, info->file_vmcoreinfo)) {
> + i = strlen(buf);
> + if (!i)
> + break;
> + if (buf[i - 1] == '\n')
> + buf[i - 1] = '\0';
> + if (strncmp(buf, STR_KERNELOFFSET,
> + strlen(STR_KERNELOFFSET)) == 0) {
> + info->kaslr_offset =
> + strtoul(buf + strlen(STR_KERNELOFFSET), , 
> 16);
> + DEBUG_MSG("info->kaslr_offset: %lx\n", 
> info->kaslr_offset);
> + }
> + }
> +
> + return info->kaslr_offset;
> +}
> +
>  static int
>  is_vmalloc_addr_s390x(unsigned long vaddr)
>  {
> diff --git a/makedumpfile.h b/makedumpfile.h
> index ac11e90..26f6247 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -1071,11 +1071,12 @@ unsigned long long vaddr_to_paddr_ppc(unsigned long 
> vaddr);
>  int get_machdep_info_s390x(void);
>  unsigned long long vaddr_to_paddr_s390x(unsigned long vaddr);
>  int is_iomem_phys_addr_s390x(unsigned long addr);
> +unsigned long get_kaslr_offset_s390x(unsigned long vaddr);
>  #define find_vmemmap()   stub_false()
>  #define get_phys_base()  stub_true()
>  #define get_machdep_info()   get_machdep_info_s390x()
>  #define get_versiondep_info()stub_true()
> -#define get_kaslr_offset(X)  stub_false()
> +#define get_kaslr_offset(X)  get_kaslr_offset_s390x(X)
>  #define vaddr_to_paddr(X)vaddr_to_paddr_s390x(X)
>  #define paddr_to_vaddr(X)paddr_to_vaddr_general(X)
>  #define is_phys_addr(X)  is_iomem_phys_addr_s390x(X)
> --
> 2.17.1
> 



___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


RE: [PATCH] makedumpfile/Makefile: remove -lebl from LIBS

2019-12-09 Thread Kazuhito Hagio
> -Original Message-
> 
> On 12/07/2019 12:28 AM, Kazuhito Hagio wrote:
> >> -Original Message-
> >>
> >> On 12/05/2019 06:36 AM, Kazuhito Hagio wrote:
> >>> Hi Pingfan,
> >>>
> >>> Thank you for the patch.
> >>>
>  -Original Message-
>  since the following commit, -lebl has been removed from elfutils.
>  commit b833c731359af12af9f16bcb621b3cdc170eafbc
>  Author: Mark Wielaard 
>  Date:   Thu Aug 29 23:34:11 2019 +0200
> 
>  libebl: Don't install libebl.a, libebl.h and remove backends from 
>  spec.
> 
>  All archive members from libebl.a are now in libdw.a. We don't 
>  generate
>  separate backend shared libraries anymore. So remove them from the
>  elfutils.spec file.
> 
>  Signed-off-by: Mark Wielaard 
> 
>  So remove it from LIBS for makedumpfile
> >>>
> >>> It seems that this is ok with the latest elfutils, but with older ones?
> >>> Is it possible to remove -lebl when elfutils does not have libebl.a?
> >> I have no idea about it for now. The method to check version depends on
> >> distribution. Is it doable by checking /usr/lib64/libebl ?
> >
> > We have 'try-run' function written by Petr in the Makefile, which checks
> > if clock_gettime() requies -lrt.  How about utilizing it like this?
> >
> > diff --git a/Makefile b/Makefile
> > index 1fdb6286e85d..d4d1fb563209 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -50,7 +50,7 @@ OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
> >  SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c 
> > arch/ppc64.c arch/s390x.c
> arch/ppc.c arch/sparc64.c
> >  OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
> >
> > -LIBS = -ldw -lbz2 -lebl -ldl -lelf -lz
> > +LIBS = -ldw -lbz2 -ldl -lelf -lz
> >  ifneq ($(LINKTYPE), dynamic)
> >  LIBS := -static $(LIBS)
> >  endif
> > @@ -79,6 +79,11 @@ LINK_TEST_PROG="int clock_gettime(); int main(){ return 
> > clock_gettime(); }"
> >  LIBS := $(LIBS) $(call try-run,\
> > echo $(LINK_TEST_PROG) | $(CC) $(CFLAGS) -o "$$TMP" -x c -,,-lrt)
> >
> > +# elfutils-0.178 or later does not install libebl.a.
> > +LINK_TEST_PROG="int main() { return 0; }"
> > +LIBS := $(LIBS) $(call try-run,\
> > +   echo $(LINK_TEST_PROG) | $(CC) -o "$$TMP" -x c - -lebl,-lebl,)
> > +
> >  all: makedumpfile
> >
> >  $(OBJ_PART): $(SRC_PART)
> >
> >
> > If libebl.a does not exist (gcc with -lebl fails), it will not append
> > -lebl to LIBS.
> >
> Yes, it sounds a good idea.
> 
> Should I sumbit another patch or you will do by yourself?

Modified and applied.
https://sourceforge.net/p/makedumpfile/code/ci/71e798cb1b85e4879a19607ebb0a061cbc92f70f/

Thanks!
Kazu

> 
> Thanks,
> Pingfan
> 



___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


RE: [PATCH] makedumpfile: assign bitmap2 fd for sub process during refiltering

2019-12-09 Thread Kazuhito Hagio


> -Original Message-
> From: piliu 
> Sent: Monday, December 9, 2019 1:06 AM
> To: Hagio Kazuhito(萩尾 一仁) ; kexec@lists.infradead.org
> Subject: Re: [PATCH] makedumpfile: assign bitmap2 fd for sub process during 
> refiltering
> 
> 
> 
> On 12/07/2019 06:11 AM, Kazuhito Hagio wrote:
> > Hi Pingfan,
> >
> >> -Original Message-
> >> In refiltering mode, each sub process inherits bitmap2->fd from parent.
> >> Then they lseek()/read() on the same fd, which means that they interference
> >> with each other.
> >>
> >> This breaks the purpose of SPLITTING_FD_BITMAP(i) for each sub process.
> >> Fix it by assigning a sub process dedicated fd to bitmap2->fd.
> >>
> >> Signed-off-by: Pingfan Liu 
> >
> > Thanks for the patch.
> > I'm still reading the code, but it might be better to apply this to 
> > bitmap1->fd
> > as well?  see you next week..
> Yes. Although during my test, bitmap1 is not touched, but it is a
> reasonable step to against any future bug.

Reading the code, I think
- the issue might occur not only in refiltering, but also the first filtering
  with --split and --work-dir option (forced non-cyclic mode).
- pefer to gather things for --split option into writeout_multiple_dumpfiles()
  if we can, for readability.

So does the following patch work for you and your test?
I could not have reproduced the issue yet.

diff --git a/makedumpfile.c b/makedumpfile.c
index b9e9dfbd45ba..674c6a00e2dd 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -10091,6 +10091,10 @@ writeout_multiple_dumpfiles(void)
info->split_start_pfn = SPLITTING_START_PFN(i);
info->split_end_pfn   = SPLITTING_END_PFN(i);
 
+   if (!info->flag_cyclic) {
+   info->bitmap1->fd = info->fd_bitmap;
+   info->bitmap2->fd = info->fd_bitmap;
+   }
if (!reopen_dump_memory())
exit(1);
if ((status = writeout_dumpfile()) == FALSE)


BTW, what do you see when the issue occurs? an error or broken dump?

Thanks,
Kazu

> 
> Thanks,
> Pingfan
> >
> > Thanks,
> > Kazu
> >
> >> ---
> >>  makedumpfile.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/makedumpfile.c b/makedumpfile.c
> >> index d76a435..1dc8640 100644
> >> --- a/makedumpfile.c
> >> +++ b/makedumpfile.c
> >> @@ -8857,7 +8857,8 @@ write_kdump_pages_and_bitmap_cyclic(struct 
> >> cache_data *cd_header, struct cache_d
> >>if (info->flag_cyclic) {
> >>if (!prepare_bitmap2_buffer())
> >>return FALSE;
> >> -  }
> >> +  } else if (info->flag_refiltering)
> >> +  info->bitmap2->fd = info->fd_bitmap;
> >>
> >>/*
> >> * Write pages and bitmap cyclically.
> >> --
> >> 2.7.5
> >>
> >
> >
> >
> > ___
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
> >
> 



___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC PATCH v5 1/3] printk-rb: new printk ringbuffer implementation (writer)

2019-12-09 Thread John Ogness
On 2019-12-09, Sergey Senozhatsky  wrote:
>> + * Sample reader code::
>> + *
>> + *  struct printk_info info;
>> + *  char text_buf[32];
>> + *  char dict_buf[32];
>> + *  u64 next_seq = 0;
>> + *  struct printk_record r = {
>> + *  .info   = ,
>> + *  .text_buf   = _buf[0],
>> + *  .dict_buf   = _buf[0],
>> + *  .text_buf_size  = sizeof(text_buf),
>> + *  .dict_buf_size  = sizeof(dict_buf),
>> + *  };
>> + *
>> + *  while (prb_read_valid(, next_seq, )) {
>> + *  if (info.seq != next_seq)
>> + *  pr_warn("lost %llu records\n", info.seq - next_seq);
>> + *
>> + *  if (info.text_len > r.text_buf_size) {
>> + *  pr_warn("record %llu text truncated\n", info.seq);
>> + *  text_buf[sizeof(text_buf) - 1] = 0;
>> + *  }
>> + *
>> + *  if (info.dict_len > r.dict_buf_size) {
>> + *  pr_warn("record %llu dict truncated\n", info.seq);
>> + *  dict_buf[sizeof(dict_buf) - 1] = 0;
>> + *  }
>> + *
>> + *  pr_info("%llu: %llu: %s;%s\n", info.seq, info.ts_nsec,
>> + *  _buf[0], info.dict_len ? _buf[0] : "");
>> + *
>> + *  next_seq = info.seq + 1;
>> + *  }
>> + */
>
> Will this loop ever end? :)
>
> pr_info() adds data to ringbuffer, which prb_read_valid() reads, so
> pr_info() can add more data, which prb_read_valid() will read, so
> pr_info()...

The sample code is assuming that @rb is not the same ringbuffer used by
kernel/printk/printk.c. (For example, the test module is doing that to
stress test the ringbuffer code without actually affecting printk.) I
can add a sentence to clarify that.

John Ogness

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC PATCH v5 1/3] printk-rb: new printk ringbuffer implementation (writer)

2019-12-09 Thread Sergey Senozhatsky
On (19/11/28 02:58), John Ogness wrote:
> + * Sample reader code::
> + *
> + *   struct printk_info info;
> + *   char text_buf[32];
> + *   char dict_buf[32];
> + *   u64 next_seq = 0;
> + *   struct printk_record r = {
> + *   .info   = ,
> + *   .text_buf   = _buf[0],
> + *   .dict_buf   = _buf[0],
> + *   .text_buf_size  = sizeof(text_buf),
> + *   .dict_buf_size  = sizeof(dict_buf),
> + *   };
> + *
> + *   while (prb_read_valid(, next_seq, )) {
> + *   if (info.seq != next_seq)
> + *   pr_warn("lost %llu records\n", info.seq - next_seq);
> + *
> + *   if (info.text_len > r.text_buf_size) {
> + *   pr_warn("record %llu text truncated\n", info.seq);
> + *   text_buf[sizeof(text_buf) - 1] = 0;
> + *   }
> + *
> + *   if (info.dict_len > r.dict_buf_size) {
> + *   pr_warn("record %llu dict truncated\n", info.seq);
> + *   dict_buf[sizeof(dict_buf) - 1] = 0;
> + *   }
> + *
> + *   pr_info("%llu: %llu: %s;%s\n", info.seq, info.ts_nsec,
> + *   _buf[0], info.dict_len ? _buf[0] : "");
> + *
> + *   next_seq = info.seq + 1;
> + *   }
> + */

Will this loop ever end? :)

pr_info() adds data to ringbuffer, which prb_read_valid() reads, so pr_info()
can add more data, which prb_read_valid() will read, so pr_info()...

-ss

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC PATCH v5 1/3] printk-rb: new printk ringbuffer implementation (writer)

2019-12-09 Thread Sergey Senozhatsky
On (19/12/02 16:48), Petr Mladek wrote:
> Hi,
> 
> I have seen few prelimitary versions before this public one.
> I am either happy with it or blind to see new problems[*].

I guess I'm happy with it.

-ss

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC PATCH v5 2/3] printk-rb: new printk ringbuffer implementation (reader)

2019-12-09 Thread John Ogness
On 2019-12-09, Sergey Senozhatsky  wrote:
>> +/* Given @blk_lpos, copy an expected @len of data into the provided buffer. 
>> */
>> +static bool copy_data(struct prb_data_ring *data_ring,
>> +  struct prb_data_blk_lpos *blk_lpos, u16 len, char *buf,
>> +  unsigned int buf_size)
>> +{
>> +unsigned long data_size;
>> +char *data;
>> +
>> +/* Caller might not want the data. */
>> +if (!buf || !buf_size)
>> +return true;
>> +
>> +data = get_data(data_ring, blk_lpos, _size);
>> +if (!data)
>> +return false;
>> +
>> +/* Actual cannot be less than expected. */
>> +if (WARN_ON_ONCE(data_size < len))
>> +return false;
>> +
>> +data_size = min_t(u16, buf_size, len);
>> +
>> +if (!WARN_ON_ONCE(!data_size))
>> +memcpy([0], data, data_size);
>> +return true;
>> +}
>> +
>> +/*
>> + * Read the record @id and verify that it is committed and has the sequence
>> + * number @seq.
>> + *
>> + * Error return values:
>> + * -EINVAL: The record @seq does not exist.
>> + * -ENOENT: The record @seq exists, but its data is not available. This is a
>> + *  valid record, so readers should continue with the next seq.
>> + */
>> +static int desc_read_committed(struct prb_desc_ring *desc_ring, u32 id,
>> +   u64 seq, struct prb_desc *desc)
>> +{
>> +enum desc_state d_state;
>> +
>> +d_state = desc_read(desc_ring, id, desc);
>> +if (desc->info.seq != seq)
>> +return -EINVAL;
>> +else if (d_state == desc_reusable)
>> +return -ENOENT;
>> +else if (d_state != desc_committed)
>> +return -EINVAL;
>> +
>> +return 0;
>> +}
>> +
>> +/*
>> + * Copy the ringbuffer data from the record with @seq to the provided
>> + * @r buffer. On success, 0 is returned.
>> + *
>> + * See desc_read_committed() for error return values.
>> + */
>> +static int prb_read(struct printk_ringbuffer *rb, u64 seq,
>> +struct printk_record *r)
>> +{
>> +struct prb_desc_ring *desc_ring = >desc_ring;
>> +struct prb_desc *rdesc = to_desc(desc_ring, seq);
>> +atomic_t *state_var = >state_var;
>> +struct prb_desc desc;
>> +int err;
>> +u32 id;
>> +
>> +/* Get a reliable local copy of the descriptor and check validity. */
>> +id = DESC_ID(atomic_read(state_var));
>> +err = desc_read_committed(desc_ring, id, seq, );
>> +if (err)
>> +return err;
>> +
>> +/* If requested, copy meta data. */
>> +if (r->info)
>> +memcpy(r->info, , sizeof(*(r->info)));
>
> I wonder if those WARN_ON-s will trigger false positive sometimes.
>
> A theoretical case.
>
> What if reader gets preempted/interrupted in the middle of
> desc_read_committed()->desc_read()->memcpy(). The context which
> interrupts the reader recycles the descriptor and pushes new
> data. Suppose that reader was interrupted right after it copied
> ->info.seq and ->info.text_len.  So the first desc_read_committed()
> will pass - we have matching ->seq and committed state. copy_data(),
> however, most likely, will generate WARNs. The final
> desc_read_committed() will notice that local copy of desc was in
> non-consistent state and everything is fine, but we have WARNs in the
> log buffer now.

Be aware that desc_read_committed() is filling a copy of the descriptor
in the local variable @desc. If desc_read_committed() succeeded, that
local copy _must_ be consistent. If the WARNs trigger, either
desc_read_committed() or the writer code is broken.

John Ogness

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC PATCH v5 2/3] printk-rb: new printk ringbuffer implementation (reader)

2019-12-09 Thread Sergey Senozhatsky
On (19/12/09 17:43), Sergey Senozhatsky wrote:
> > +static int desc_read_committed(struct prb_desc_ring *desc_ring, u32 id,
> > +  u64 seq, struct prb_desc *desc)
> > +{
> > +   enum desc_state d_state;
> > +
> > +   d_state = desc_read(desc_ring, id, desc);
> > +   if (desc->info.seq != seq)
> > +   return -EINVAL;
> > +   else if (d_state == desc_reusable)
> > +   return -ENOENT;
> > +   else if (d_state != desc_committed)
> > +   return -EINVAL;
> > +
> > +   return 0;
> > +}
> > +
> > +/*
> > + * Copy the ringbuffer data from the record with @seq to the provided
> > + * @r buffer. On success, 0 is returned.
> > + *
> > + * See desc_read_committed() for error return values.
> > + */
> > +static int prb_read(struct printk_ringbuffer *rb, u64 seq,
> > +   struct printk_record *r)
> > +{
> > +   struct prb_desc_ring *desc_ring = >desc_ring;
> > +   struct prb_desc *rdesc = to_desc(desc_ring, seq);
> > +   atomic_t *state_var = >state_var;
> > +   struct prb_desc desc;
> > +   int err;
> > +   u32 id;
> > +
> > +   /* Get a reliable local copy of the descriptor and check validity. */
> > +   id = DESC_ID(atomic_read(state_var));
> > +   err = desc_read_committed(desc_ring, id, seq, );
> > +   if (err)
> > +   return err;
> > +
> > +   /* If requested, copy meta data. */
> > +   if (r->info)
> > +   memcpy(r->info, , sizeof(*(r->info)));
> 
> I wonder if those WARN_ON-s will trigger false positive sometimes.
> 
> A theoretical case.
> 
> What if reader gets preempted/interrupted in the middle of
> desc_read_committed()->desc_read()->memcpy(). The context which interrupts
> the reader recycles the descriptor and pushes new data. Suppose that
> reader was interrupted right after it copied ->info.seq and ->info.text_len.
> So the first desc_read_committed() will pass - we have matching ->seq
> and committed state. copy_data(), however, most likely, will generate
> WARNs. The final desc_read_committed() will notice that local copy
> of desc was in non-consistent state and everything is fine, but we have
> WARNs in the log buffer now.

Hmm. No, that won't happen. We should get desc_miss first, and then -EINVAL.

-ss

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC PATCH v5 1/3] printk-rb: new printk ringbuffer implementation (writer)

2019-12-09 Thread John Ogness
On 2019-12-09, Sergey Senozhatsky  wrote:
>> +#define _DATA_SIZE(sz_bits) (1UL << (sz_bits))
>> +#define _DESCS_COUNT(ct_bits)   (1U << (ct_bits))
>> +#define DESC_SV_BITS(sizeof(int) * 8)
>> +#define DESC_COMMITTED_MASK (1U << (DESC_SV_BITS - 1))
>
> What does SV state for? State Value?

Yes. Originally this thing was just called the state. But it was a bit
confusing in the code because there is also an enum desc_state (used for
state queries), which is _not_ the value that is stored in the state
variable. That's why the code is using state_var/state_val (SV) for the
actual data values, keeping it separate from desc_state/d_state for the
the state queries.

John Ogness

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC PATCH v5 3/3] printk-rb: add test module

2019-12-09 Thread Sergey Senozhatsky
On (19/11/28 02:58), John Ogness wrote:
> + * This is a test module that starts "num_online_cpus()" writer threads
> + * that each write data of varying length. They do this as fast as
> + * they can.

Can we also add some IRQ (hard or soft) writers and (if possible) some NMI
writers?

-ss

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC PATCH v5 2/3] printk-rb: new printk ringbuffer implementation (reader)

2019-12-09 Thread Sergey Senozhatsky
On (19/11/28 02:58), John Ogness wrote:
> +/* Given @blk_lpos, copy an expected @len of data into the provided buffer. 
> */
> +static bool copy_data(struct prb_data_ring *data_ring,
> +   struct prb_data_blk_lpos *blk_lpos, u16 len, char *buf,
> +   unsigned int buf_size)
> +{
> + unsigned long data_size;
> + char *data;
> +
> + /* Caller might not want the data. */
> + if (!buf || !buf_size)
> + return true;
> +
> + data = get_data(data_ring, blk_lpos, _size);
> + if (!data)
> + return false;
> +
> + /* Actual cannot be less than expected. */
> + if (WARN_ON_ONCE(data_size < len))
> + return false;
> +
> + data_size = min_t(u16, buf_size, len);
> +
> + if (!WARN_ON_ONCE(!data_size))
> + memcpy([0], data, data_size);
> + return true;
> +}
> +
> +/*
> + * Read the record @id and verify that it is committed and has the sequence
> + * number @seq.
> + *
> + * Error return values:
> + * -EINVAL: The record @seq does not exist.
> + * -ENOENT: The record @seq exists, but its data is not available. This is a
> + *  valid record, so readers should continue with the next seq.
> + */
> +static int desc_read_committed(struct prb_desc_ring *desc_ring, u32 id,
> +u64 seq, struct prb_desc *desc)
> +{
> + enum desc_state d_state;
> +
> + d_state = desc_read(desc_ring, id, desc);
> + if (desc->info.seq != seq)
> + return -EINVAL;
> + else if (d_state == desc_reusable)
> + return -ENOENT;
> + else if (d_state != desc_committed)
> + return -EINVAL;
> +
> + return 0;
> +}
> +
> +/*
> + * Copy the ringbuffer data from the record with @seq to the provided
> + * @r buffer. On success, 0 is returned.
> + *
> + * See desc_read_committed() for error return values.
> + */
> +static int prb_read(struct printk_ringbuffer *rb, u64 seq,
> + struct printk_record *r)
> +{
> + struct prb_desc_ring *desc_ring = >desc_ring;
> + struct prb_desc *rdesc = to_desc(desc_ring, seq);
> + atomic_t *state_var = >state_var;
> + struct prb_desc desc;
> + int err;
> + u32 id;
> +
> + /* Get a reliable local copy of the descriptor and check validity. */
> + id = DESC_ID(atomic_read(state_var));
> + err = desc_read_committed(desc_ring, id, seq, );
> + if (err)
> + return err;
> +
> + /* If requested, copy meta data. */
> + if (r->info)
> + memcpy(r->info, , sizeof(*(r->info)));

I wonder if those WARN_ON-s will trigger false positive sometimes.

A theoretical case.

What if reader gets preempted/interrupted in the middle of
desc_read_committed()->desc_read()->memcpy(). The context which interrupts
the reader recycles the descriptor and pushes new data. Suppose that
reader was interrupted right after it copied ->info.seq and ->info.text_len.
So the first desc_read_committed() will pass - we have matching ->seq
and committed state. copy_data(), however, most likely, will generate
WARNs. The final desc_read_committed() will notice that local copy
of desc was in non-consistent state and everything is fine, but we have
WARNs in the log buffer now.

-ss

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec