Re: [PATCH] selftests/bpf: fix warning comparing pointer to 0

2021-03-10 Thread Yauheni Kaliuta
On Wed, Mar 10, 2021 at 8:23 AM Jiapeng Chong
 wrote:
>
> Fix the following coccicheck warning:
>
> ./tools/testing/selftests/bpf/progs/test_global_func10.c:17:12-13:
> WARNING comparing pointer to 0.

but it's ok from the C standard point of view

>
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 
> ---
>  tools/testing/selftests/bpf/progs/test_global_func10.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/bpf/progs/test_global_func10.c 
> b/tools/testing/selftests/bpf/progs/test_global_func10.c
> index 61c2ae9..97b7031 100644
> --- a/tools/testing/selftests/bpf/progs/test_global_func10.c
> +++ b/tools/testing/selftests/bpf/progs/test_global_func10.c
> @@ -14,7 +14,7 @@ struct Big {
>
>  __noinline int foo(const struct Big *big)
>  {
> -   if (big == 0)
> +   if (!big)
> return 0;
>
> return bpf_get_prandom_u32() < big->y;
> --
> 1.8.3.1
>


-- 
WBR, Yauheni



Re: [PATCH v2] arm64: bpf: Fix branch offset in JIT

2020-09-16 Thread Yauheni Kaliuta
Hi, Jean-Philippe!

>>>>> On Wed, 16 Sep 2020 15:17:02 +0200, Jean-Philippe Brucker  wrote:

 > On Wed, Sep 16, 2020 at 03:39:37PM +0300, Yauheni Kaliuta wrote:
 >> If you start to amend extables, could you consider a change like
 >> 
 >> 05a68e892e89 ("s390/kernel: expand exception table logic to allow
 >> new handling options")
 >> 
 >> and implementation of BPF_PROBE_MEM then?

 > Commit 800834285361 ("bpf, arm64: Add BPF exception tables") should have
 > taken care of that, no?

Ah, missed that. Thanks!

-- 
WBR,
Yauheni Kaliuta



Re: [PATCH v2] arm64: bpf: Fix branch offset in JIT

2020-09-16 Thread Yauheni Kaliuta
Hi, Ilias!

>>>>> On Tue, 15 Sep 2020 22:23:11 +0300, Ilias Apalodimas  wrote:

 > Hi Will, 
 > On Tue, Sep 15, 2020 at 03:17:08PM +0100, Will Deacon wrote:
 >> On Tue, Sep 15, 2020 at 04:53:44PM +0300, Ilias Apalodimas wrote:
 >> > On Tue, Sep 15, 2020 at 02:11:03PM +0100, Will Deacon wrote:
 >> > > Hi Ilias,
 >> > > 
 >> > > On Mon, Sep 14, 2020 at 07:03:55PM +0300, Ilias Apalodimas wrote:
 >> > > > Running the eBPF test_verifier leads to random errors looking like 
 >> > > > this:
 >> > > > 
 >> > > > [ 6525.735488] Unexpected kernel BRK exception at EL1
 >> > > > [ 6525.735502] Internal error: ptrace BRK handler: f2000100 [#1] SMP
 >> > > 
 >> > > Does this happen because we poison the BPF memory with BRK instructions?
 >> > > Maybe we should look at using a special immediate so we can detect this,
 >> > > rather than end up in the ptrace handler.
 >> > 
 >> > As discussed offline this is what aarch64_insn_gen_branch_imm() will 
 >> > return for
 >> > offsets > 128M and yes replacing the handler with a more suitable message 
 >> > would 
 >> > be good.
 >> 
 >> Can you give the diff below a shot, please? Hopefully printing a more useful
 >> message will mean these things get triaged/debugged better in future.

 > [...]

 > The error print is going to be helpful imho. At least it will help
 > people notice something is wrong a lot faster than the previous one.


If you start to amend extables, could you consider a change like

05a68e892e89 ("s390/kernel: expand exception table logic to allow new handling 
options")

and implementation of BPF_PROBE_MEM then?

 > [ 575.273203] BPF JIT generated an invalid instruction at
 > bpf_prog_64e6f4ba80861823_F+0x2e4/0x9a4!
 > [  575.281996] Unexpected kernel BRK exception at EL1
 > [  575.286786] Internal error: BRK handler: f2000100 [#5] PREEMPT SMP
 > [ 575.292965] Modules linked in: crct10dif_ce drm ip_tables x_tables
 > ipv6 btrfs blake2b_generic libcrc32c xor xor_neon zstd_compress
 > raid6_pq nvme nvme_core realtek
 > [ 575.307516] CPU: 21 PID: 11760 Comm: test_verifier Tainted: G D W
 > 5.9.0-rc3-01410-ged6d9b022813-dirty #1
 > [ 575.318125] Hardware name: Socionext SynQuacer E-series
 > DeveloperBox, BIOS build #1 Jun 6 2020
 > [  575.326825] pstate: 2005 (nzCv daif -PAN -UAO BTYPE=--)
 > [  575.332396] pc : bpf_prog_64e6f4ba80861823_F+0x2e4/0x9a4
 > [  575.337705] lr : bpf_prog_d3e125b76c96daac+0x40/0xdec
 > [  575.342752] sp : 8000144e3ba0
 > [  575.346061] x29: 8000144e3bd0 x28: 
 > [  575.351371] x27: 0085f19dc08d x26: 
 > [  575.356681] x25: 8000144e3ba0 x24: 800011fdf038
 > [  575.361991] x23: 8000144e3d20 x22: 0001
 > [  575.367301] x21: 800011fdf000 x20: 0009609d4740
 > [  575.372611] x19:  x18: 
 > [  575.377921] x17:  x16: 
 > [  575.383231] x15:  x14: 
 > [  575.388540] x13:  x12: 
 > [  575.393850] x11:  x10: 800bc65c
 > [  575.399160] x9 :  x8 : 8000144e3c58
 > [  575.404469] x7 :  x6 : 000dd7ae967a
 > [  575.409779] x5 : 00ff x4 : 0007fabd6992cf96
 > [  575.415088] x3 : 0018 x2 : 800ba214
 > [  575.420398] x1 : 000a x0 : 0009
 > [  575.425708] Call trace:
 > [  575.428152]  bpf_prog_64e6f4ba80861823_F+0x2e4/0x9a4
 > [  575.433114]  bpf_prog_d3e125b76c96daac+0x40/0xdec
 > [  575.437822]  bpf_dispatcher_xdp_func+0x10/0x1c
 > [  575.442265]  bpf_test_run+0x80/0x240
 > [  575.445838]  bpf_prog_test_run_xdp+0xe8/0x190
 > [  575.450196]  __do_sys_bpf+0x8e8/0x1b00
 > [  575.453943]  __arm64_sys_bpf+0x24/0x510
 > [  575.457780]  el0_svc_common.constprop.0+0x6c/0x170
 > [  575.462570]  do_el0_svc+0x24/0x90
 > [  575.465883]  el0_sync_handler+0x90/0x19c
 > [  575.469802]  el0_sync+0x158/0x180
 > [  575.473118] Code: d4202000 d4202000 d4202000 d4202000 (d4202000)
 > [  575.479211] ---[ end trace 8cd54c7d5c0ffda4 ]---

 > Cheers
 > /Ilias


-- 
WBR,
Yauheni Kaliuta



Re: [selftests] 7cb32086e5: kernel-selftests.x86.check_initial_reg_state_32.fail

2020-07-13 Thread Yauheni Kaliuta
Hi, Shuah!

>>>>> On Fri, 10 Jul 2020 08:18:49 -0600, Shuah Khan  wrote:

 > On 7/10/20 12:02 AM, Yauheni Kaliuta wrote:
 >> On Thu, Jul 9, 2020 at 6:36 PM Shuah Khan  wrote:
 >>> 
 >>> On 7/9/20 12:49 AM, kernel test robot wrote:
 >>>> Greeting,
 >>>> 
 >>>> FYI, we noticed the following commit (built with gcc-9):
 >>>> 
 >>>> commit: 7cb32086e59b514a832a3e11f5370d37e7cfe022 ("selftests:
 >>>> simplify run_tests")
 >>>> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
 >>>> 
 >>>> 
 >>> 
 >>> Thanks for the report. I will drop this patch for now from next.
 >>> 
 >>> Yauheni,
 >>> 
 >>> This patch broke x86 32-bit test run
 >>> make run_tests -C x86
 >>> 
 >>> Please resubmit the patch with the fix.
 >> 
 >> I did not check carefully the report, but isn't it expected that some
 >> tests are moved after the patch since they originally were placed
 >> incorrectly?
 >> 
 >> 
 > The failure doesn't have anything to do with test being moved. You can
 > reproduce this very easily by running make as shown below in x86 dir
 > under tools/testing/selftests

 > make run_tests -C x86

 > I reproduced the problem with your and patch and verified that the
 > problem tracks your patch. I dropped the patch from linux-next
 > Your other two patches in the series are fine.

 > In any case, this patch isn't really adding any functionality and
 > is a good cleanup. Let's do the cleanup right or not.


Checked.

That is because with the patch both lib.mk and x86/Makefile add
the $(OUTPUT) prefix.

So the question is to agree about the convention, should lib.mk
targets expect short test names for TEST_PROGS or full path from
the subtests' Makefiles.

The existing code is hackish (incorrectly -- adding $(OUTPUT)
only to the first list members -- tries to handle it only for
out-of-tree build).

I can make the patch without adding $(OUTPUT). It will require to
fix possible tests which provided only one test and rely on that
behaviour for the OOT build. Do you have an easy way to get a
list of such tests?


-- 
WBR,
Yauheni Kaliuta



Re: [selftests] 7cb32086e5: kernel-selftests.x86.check_initial_reg_state_32.fail

2020-07-09 Thread Yauheni Kaliuta
On Thu, Jul 9, 2020 at 6:36 PM Shuah Khan  wrote:
>
> On 7/9/20 12:49 AM, kernel test robot wrote:
> > Greeting,
> >
> > FYI, we noticed the following commit (built with gcc-9):
> >
> > commit: 7cb32086e59b514a832a3e11f5370d37e7cfe022 ("selftests: simplify 
> > run_tests")
> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> >
> >
>
> Thanks for the report. I will drop this patch for now from next.
>
> Yauheni,
>
> This patch broke x86 32-bit test run
> make run_tests -C x86
>
> Please resubmit the patch with the fix.

I did not check carefully the report, but isn't it expected that some
tests are moved after the patch since they originally were placed
incorrectly?


-- 
WBR, Yauheni



[RFC PATCH] gcc-9: workaround array bounds warning on boot_params cleaning

2019-08-06 Thread Yauheni Kaliuta
The code is supposed to clear several fields in the structure with
memset() and it produces the warning:

In file included from arch/x86/kernel/head64.c:17:
In function ‘memset’,
inlined from ‘sanitize_boot_params’ at 
./arch/x86/include/asm/bootparam_utils.h:40:3,
inlined from ‘copy_bootdata’ at arch/x86/kernel/head64.c:391:2:
./include/linux/string.h:344:9: warning: ‘__builtin_memset’ offset [197, 448] 
from the object at ‘boot_params’ is out of the bounds of referenced subobject 
‘ext_ramdisk_image’ with type ‘unsigned int’ at offset 192 [-Warray-bounds]
  344 |  return __builtin_memset(p, c, size);
  | ^~~~

since gcc is aware of the field sizes of the structure.

Blind gcc treating the structure as a byte array and calculate
positions with offsetof().

Suggested-by: Ed Kellett 
Signed-off-by: Yauheni Kaliuta 
---
 arch/x86/include/asm/bootparam_utils.h | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/bootparam_utils.h 
b/arch/x86/include/asm/bootparam_utils.h
index 101eb944f13c..16e567a08b54 100644
--- a/arch/x86/include/asm/bootparam_utils.h
+++ b/arch/x86/include/asm/bootparam_utils.h
@@ -37,12 +37,14 @@ static void sanitize_boot_params(struct boot_params 
*boot_params)
if (boot_params->sentinel) {
/* fields in boot_params are left uninitialized, clear them */
boot_params->acpi_rsdp_addr = 0;
-   memset(&boot_params->ext_ramdisk_image, 0,
-  (char *)&boot_params->efi_info -
-   (char *)&boot_params->ext_ramdisk_image);
-   memset(&boot_params->kbd_status, 0,
-  (char *)&boot_params->hdr -
-  (char *)&boot_params->kbd_status);
+   memset((u8 *)boot_params + offsetof(struct boot_params, 
ext_ramdisk_image),
+  0,
+  offsetof(struct boot_params,  efi_info) -
+   offsetof(struct boot_params, ext_ramdisk_image));
+   memset((u8 *)boot_params + offsetof(struct boot_params, 
_pad7[0]),
+  0,
+  offsetof(struct boot_params, edd_mbr_sig_buffer[0]) -
+   offsetof(struct boot_params, _pad7[0]));
memset(&boot_params->_pad7[0], 0,
   (char *)&boot_params->edd_mbr_sig_buffer[0] -
(char *)&boot_params->_pad7[0]);
-- 
2.22.0



Re: bpf: test_verifier: sanitation: alu with different scalars

2019-07-26 Thread Yauheni Kaliuta
Hi, Daniel,

On Tue, Jun 25, 2019 at 12:39 PM Daniel Borkmann  wrote:
>
> On 06/25/2019 10:29 AM, Yauheni Kaliuta wrote:
> > Hi!
> >
> > I'm wondering, how the sanitaion tests (#903 5.2-rc6 for example)
> > are supposed to work on BE arches:
> >
> > {
> >   "sanitation: alu with different scalars 1",
> >   .insns = {
> >   BPF_MOV64_IMM(BPF_REG_0, 1),
> >   BPF_LD_MAP_FD(BPF_REG_ARG1, 0),
> >   BPF_MOV64_REG(BPF_REG_ARG2, BPF_REG_FP),
> >   BPF_ALU64_IMM(BPF_ADD, BPF_REG_ARG2, -16),
> >   BPF_ST_MEM(BPF_DW, BPF_REG_FP, -16, 0),
> >   BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
> >   BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
> >   BPF_EXIT_INSN(),
> >   BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
> >
> > reads one byte 0 on BE and 28 on LE (from ->index) since
> >
> >   struct test_val {
> >   unsigned int index;
> >   int foo[MAX_ENTRIES];
> >   };
> >
> > struct test_val value = {
> >   .index = (6 + 1) * sizeof(int),
> >   .foo[6] = 0xabcdef12,
> >   };
> >
> >   BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 3),
> >
> > So different branches are taken depending of the endianness.
> >
> >   BPF_MOV64_IMM(BPF_REG_2, 0),
> >   BPF_MOV64_IMM(BPF_REG_3, 0x10),
> >   BPF_JMP_A(2),
> >   BPF_MOV64_IMM(BPF_REG_2, 42),
> >   BPF_MOV64_IMM(BPF_REG_3, 0x11),
> >   BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_3),
> >   BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
> >   BPF_EXIT_INSN(),
> >   },
> >   .fixup_map_array_48b = { 1 },
> >   .result = ACCEPT,
> >   .retval = 0x10,
> > },
>
> Let me get my hands on a s390x box later today and get back to you.

Any progress with that?

-- 
WBR, Yauheni


Re: ebpf: BPF_ALU32 | BPF_ARSH on BE arches

2019-06-25 Thread Yauheni Kaliuta
Hi, Jiong!

>>>>> On Tue, 25 Jun 2019 11:20:07 +0100, Jiong Wang  wrote:

 > Yauheni Kaliuta writes:

 >> Hi!
 >> 
 >> Looks like the code:
 >> 
 >> ALU_ARSH_X:
 >> DST = (u64) (u32) ((*(s32 *) &DST) >> SRC);
 >> CONT;
 >> ALU_ARSH_K:
 >> DST = (u64) (u32) ((*(s32 *) &DST) >> IMM);
 >> CONT;
 >> 
 >> works incorrectly on BE arches since it must operate on lower
 >> parts of 64bit registers.
 >> 
 >> See failure of test_verifier test 'arsh32 on imm 2' (#23 on
 >> 5.2-rc6).

 > Ah, thanks for reporting this.

 > Should not taken the address directly, does the following fix resolved the
 > failure?

 > ALU_ARSH_X:
 > DST = (u64) (u32) ((s32) DST) >> SRC);
 >     CONT;
 >     ALU_ARSH_K:
 > DST = (u64) (u32) ((s32) DST) >> IMM);
 > CONT;

Yes, thanks (just add the missing braces).

-- 
WBR,
Yauheni Kaliuta


ebpf: BPF_ALU32 | BPF_ARSH on BE arches

2019-06-25 Thread Yauheni Kaliuta
Hi!

Looks like the code:

   ALU_ARSH_X:
   DST = (u64) (u32) ((*(s32 *) &DST) >> SRC);
   CONT;
   ALU_ARSH_K:
   DST = (u64) (u32) ((*(s32 *) &DST) >> IMM);
   CONT;

works incorrectly on BE arches since it must operate on lower
parts of 64bit registers.

See failure of test_verifier test 'arsh32 on imm 2' (#23 on
5.2-rc6).


-- 
WBR,
Yauheni Kaliuta


bpf: test_verifier: sanitation: alu with different scalars

2019-06-25 Thread Yauheni Kaliuta
Hi!

I'm wondering, how the sanitaion tests (#903 5.2-rc6 for example)
are supposed to work on BE arches:

{
"sanitation: alu with different scalars 1",
.insns = {
BPF_MOV64_IMM(BPF_REG_0, 1),
BPF_LD_MAP_FD(BPF_REG_ARG1, 0),
BPF_MOV64_REG(BPF_REG_ARG2, BPF_REG_FP),
BPF_ALU64_IMM(BPF_ADD, BPF_REG_ARG2, -16),
BPF_ST_MEM(BPF_DW, BPF_REG_FP, -16, 0),
BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
BPF_EXIT_INSN(),
BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),

reads one byte 0 on BE and 28 on LE (from ->index) since

struct test_val {
unsigned int index;
int foo[MAX_ENTRIES];
};

struct test_val value = {
.index = (6 + 1) * sizeof(int),
.foo[6] = 0xabcdef12,
};

BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 3),

So different branches are taken depending of the endianness.

BPF_MOV64_IMM(BPF_REG_2, 0),
BPF_MOV64_IMM(BPF_REG_3, 0x10),
BPF_JMP_A(2),
BPF_MOV64_IMM(BPF_REG_2, 42),
BPF_MOV64_IMM(BPF_REG_3, 0x11),
BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_3),
BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
BPF_EXIT_INSN(),
},
.fixup_map_array_48b = { 1 },
.result = ACCEPT,
.retval = 0x10,
},



-- 
WBR,
Yauheni Kaliuta


[tip:locking/core] tools/memory-model/Documentation: Fix typo, smb->smp

2018-07-17 Thread tip-bot for Yauheni Kaliuta
Commit-ID:  0fcff1715bec7593a0ba86f3fef46cd89af37a8b
Gitweb: https://git.kernel.org/tip/0fcff1715bec7593a0ba86f3fef46cd89af37a8b
Author: Yauheni Kaliuta 
AuthorDate: Mon, 16 Jul 2018 11:06:04 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 17 Jul 2018 09:30:35 +0200

tools/memory-model/Documentation: Fix typo, smb->smp

The tools/memory-model/Documentation/explanation.txt file says
"For each other CPU C', smb_wmb() forces all po-earlier stores"
This commit therefore replaces the "smb_wmb()" with "smp_wmb()".

Signed-off-by: Yauheni Kaliuta 
Signed-off-by: Paul E. McKenney 
Acked-by: Alan Stern 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: aki...@gmail.com
Cc: boqun.f...@gmail.com
Cc: dhowe...@redhat.com
Cc: j.algl...@ucl.ac.uk
Cc: linux-a...@vger.kernel.org
Cc: luc.maran...@inria.fr
Cc: npig...@gmail.com
Cc: parri.and...@gmail.com
Cc: will.dea...@arm.com
Link: 
http://lkml.kernel.org/r/20180716180605.16115-13-paul...@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar 
---
 tools/memory-model/Documentation/explanation.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/memory-model/Documentation/explanation.txt 
b/tools/memory-model/Documentation/explanation.txt
index 1b09f3175a1f..0cbd1ef8f86d 100644
--- a/tools/memory-model/Documentation/explanation.txt
+++ b/tools/memory-model/Documentation/explanation.txt
@@ -804,7 +804,7 @@ type of fence:
 Second, some types of fence affect the way the memory subsystem
 propagates stores.  When a fence instruction is executed on CPU C:
 
-   For each other CPU C', smb_wmb() forces all po-earlier stores
+   For each other CPU C', smp_wmb() forces all po-earlier stores
on C to propagate to C' before any po-later stores do.
 
For each other CPU C', any store which propagates to C before


Re: [GIT PULL] isolation: 1Hz residual tick offloading v4

2018-05-22 Thread Yauheni Kaliuta
Hi, Frederic!

>>>>> On Mon, 29 Jan 2018 02:10:26 +0100, Frederic Weisbecker  wrote:
 > On Wed, Jan 24, 2018 at 10:46:08AM -0500, Luiz Capitulino wrote:

[...]

 >> Since the 1Hz tick offload worked for you, I must be missing
 >> a way to disable this timer or the kernel is thinking my CPU
 >> has unstable TSC (which it doesn't AFAIK).

 > It's beyond the scope of this patchset but indeed that's
 > right, I run my kernels with tsc=reliable because my CPUs
 > don't have the TSC_RELIABLE flag.  That's the only way I found
 > to shutdown the tick completely on my test machine, otherwise
 > I keep having that clocksource watchdog.

[...]

Thanks, it helps. But I have accounting problem:

if I run user busy loop on the nohz cpu, the task accounting works
correctly (top shows the task takes 100% cpu), but cpu accounting is
wrong (cpu is 100% idle, in the per-core view as well).

If I understand correctly, the stats are updated by account_user_time()
-> task_group_account_field() but there is no call for it in case of
offloading (it is called from irqtime_account_process_tick,
account_process_tick, vtime_user_exit).

Moreover, task_group_account_field() uses __this_cpu_add() which will be
wrong for offloading.

For testing I used kcpustat_cpu(task_cpu(p)) in
task_group_account_field() and added call account_user_time(curr, delta)
to the sched_tick_remote() what fixes it for me, but what would be the
proper fix?

-- 
WBR,
Yauheni Kaliuta


Re: [PATCH] libkmod-elf: resolve CRC if module is built with MODULE_REL_CRCS

2017-07-19 Thread Yauheni Kaliuta
Hi, Ard!

>>>>> On Wed, 19 Jul 2017 20:10:17 +0100, Ard Biesheuvel  wrote:

 > On 19 July 2017 at 20:04, Yauheni Kaliuta  wrote:
 >> Hi, Lucas!
 >> 
 >>>>>>> On Wed, 19 Jul 2017 10:51:44 -0700, Lucas De Marchi  wrote:
 >> > On Wed, Jul 19, 2017 at 7:56 AM, Yauheni Kaliuta
 >> >  wrote:
 >> >> Normally exported symbol's crc is stored as absolute (SHN_ABS)
 >> >> value of special named symbol __crc_.
 >> >>
 >> >> When the kernel and modules are built with the config option
 >> >> CONFIG_MODULE_REL_CRCS, all the CRCs are put in a special section
 >> >> and the __crc_ symbols values are offsets in the
 >> >> section. See patch description of the commit:
 >> >>
 >> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=56067812d5b0e737ac2063e94a50f76b810d6ca3
 >> >>
 >> >> Add kmod support of this configuration.
 >> 
 >> [...]
 >> 
 >> > LGTM. I'll give this a try and apply.
 >> 
 >> Thanks! I would also like to get some feedback from the kernel feature
 >> author, Ard Biesheuvel, that this does not miss other usecases.
 >> 

 > The patch looks correct to me. My only comment is that
 > kmod_elf_resolve_crc() should probably return a uint32_t instead,
 > given the size of a CRC32.

Well, it's a specific of the internal libkmod implementation. Both
struct kmod_modversion::crc and return value of elf_get_uint()
are uint64_t, but in the patch elf_get_uint() reads correct
sizeof(uint32_t) value from the ELF.

 > In any case,

 > Reviewed-by: Ard Biesheuvel 

Thanks a lot!

-- 
WBR,
Yauheni Kaliuta


Re: [PATCH] libkmod-elf: resolve CRC if module is built with MODULE_REL_CRCS

2017-07-19 Thread Yauheni Kaliuta
Hi, Lucas!

>>>>> On Wed, 19 Jul 2017 10:51:44 -0700, Lucas De Marchi  wrote:
 > On Wed, Jul 19, 2017 at 7:56 AM, Yauheni Kaliuta
 >  wrote:
 >> Normally exported symbol's crc is stored as absolute (SHN_ABS)
 >> value of special named symbol __crc_.
 >> 
 >> When the kernel and modules are built with the config option
 >> CONFIG_MODULE_REL_CRCS, all the CRCs are put in a special section
 >> and the __crc_ symbols values are offsets in the
 >> section. See patch description of the commit:
 >> 
 >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=56067812d5b0e737ac2063e94a50f76b810d6ca3
 >> 
 >> Add kmod support of this configuration.

[...]

 > LGTM. I'll give this a try and apply.

Thanks! I would also like to get some feedback from the kernel feature
author, Ard Biesheuvel, that this does not miss other usecases.

 > Do you think it's possible to build an out-of-tree module with this
 > option so we can add one to the testsuite?

I'll check what I can do.

The problem is seen if you run depmod with -e -E, which I guess is not used
too often.

-- 
WBR,
Yauheni Kaliuta


[PATCH] libkmod-elf: resolve CRC if module is built with MODULE_REL_CRCS

2017-07-19 Thread Yauheni Kaliuta
Normally exported symbol's crc is stored as absolute (SHN_ABS)
value of special named symbol __crc_.

When the kernel and modules are built with the config option
CONFIG_MODULE_REL_CRCS, all the CRCs are put in a special section
and the __crc_ symbols values are offsets in the
section. See patch description of the commit:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=56067812d5b0e737ac2063e94a50f76b810d6ca3

Add kmod support of this configuration.

Signed-off-by: Yauheni Kaliuta 
---
 libkmod/libkmod-elf.c | 30 +-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/libkmod/libkmod-elf.c b/libkmod/libkmod-elf.c
index 90da89aebbaf..ef4a8a3142a1 100644
--- a/libkmod/libkmod-elf.c
+++ b/libkmod/libkmod-elf.c
@@ -747,6 +747,31 @@ static inline uint8_t kmod_symbol_bind_from_elf(uint8_t 
elf_value)
}
 }
 
+static uint64_t kmod_elf_resolve_crc(const struct kmod_elf *elf, uint64_t crc, 
uint16_t shndx)
+{
+   int err;
+   uint64_t off, size;
+   uint32_t nameoff;
+
+   if (shndx == SHN_ABS || shndx == SHN_UNDEF)
+   return crc;
+
+   err = elf_get_section_info(elf, shndx, &off, &size, &nameoff);
+   if (err < 0) {
+   ELFDBG("Cound not find section index %"PRIu16" for crc", shndx);
+   return (uint64_t)-1;
+   }
+
+   if (crc > (size - sizeof(uint32_t))) {
+   ELFDBG("CRC offset %"PRIu64" is too big, section %"PRIu16" size 
is %"PRIu64"\n",
+  crc, shndx, size);
+   return (uint64_t)-1;
+   }
+
+   crc = elf_get_uint(elf, off + crc, sizeof(uint32_t));
+   return crc;
+}
+
 /* array will be allocated with strings in a single malloc, just free *array */
 int kmod_elf_get_symbols(const struct kmod_elf *elf, struct kmod_modversion 
**array)
 {
@@ -830,6 +855,7 @@ int kmod_elf_get_symbols(const struct kmod_elf *elf, struct 
kmod_modversion **ar
uint32_t name_off;
uint64_t crc;
uint8_t info, bind;
+   uint16_t shndx;
 
 #define READV(field)   \
elf_get_uint(elf, sym_off + offsetof(typeof(*s), field),\
@@ -839,11 +865,13 @@ int kmod_elf_get_symbols(const struct kmod_elf *elf, 
struct kmod_modversion **ar
name_off = READV(st_name);
crc = READV(st_value);
info = READV(st_info);
+   shndx = READV(st_shndx);
} else {
Elf64_Sym *s;
name_off = READV(st_name);
crc = READV(st_value);
info = READV(st_info);
+   shndx = READV(st_shndx);
}
 #undef READV
name = elf_get_mem(elf, str_off + name_off);
@@ -856,7 +884,7 @@ int kmod_elf_get_symbols(const struct kmod_elf *elf, struct 
kmod_modversion **ar
else
bind = ELF64_ST_BIND(info);
 
-   a[count].crc = crc;
+   a[count].crc = kmod_elf_resolve_crc(elf, crc, shndx);
a[count].bind = kmod_symbol_bind_from_elf(bind);
a[count].symbol = itr;
slen = strlen(name);
-- 
2.14.0.rc0



Re: depmod gets stuck in symlink cycle in arch/arm/boot/dts/include

2017-05-12 Thread Yauheni Kaliuta
Hi, Omar!

>>>>> On Fri, 12 May 2017 15:23:07 -0700, Omar Sandoval  wrote:

 > Hi,
 > Linux kernel commit 4027494ae6e3 ("ARM: dts: add arm/arm64 include
 > symlinks") introduced a couple of symlink cycles:

 > $ ls -al arch/arm{,64}/boot/dts/include
 > arch/arm64/boot/dts/include:
 > total 12
 > drwxr-xr-x 1 osandov users  38 May 11 14:01 .
 > drwxr-xr-x 1 osandov users 320 Jan 25 20:44 ..
 > lrwxrwxrwx 1 osandov users  24 May 11 14:01 arm -> ../../../../arm/boot/dts
 > lrwxrwxrwx 1 osandov users   2 May 11 14:01 arm64 -> ..
 > lrwxrwxrwx 1 osandov users 34 Nov 23 12:07 dt-bindings ->
 > ../../../../../include/dt-bindings

 > arch/arm/boot/dts/include:
 > total 12
 > drwxr-xr-x 1 osandov users38 May 11 14:01 .
 > drwxr-xr-x 1 osandov users 63102 May 11 14:01 ..
 > lrwxrwxrwx 1 osandov users 2 May 11 14:01 arm -> ..
 > lrwxrwxrwx 1 osandov users26 May 11 14:01 arm64 -> 
 > ../../../../arm64/boot/dts
 > lrwxrwxrwx 1 osandov users 34 Nov 23 12:07 dt-bindings ->
 > ../../../../../include/dt-bindings

 > On my system, /lib/modules/$(uname -r)/kernel is a symlink to
 > /lib/modules/$(uname -r)/build, which is my built linux.git. depmod
 > doesn't like these symlink cycles and ends up in an infinite loop.

 > Maybe I shouldn't be symlinking kernel to build like this, but depmod
 > shouldn't be getting stuck like this either. I wonder if anything else
 > is going to barf on these symlink cycles, too.

depmod checks for special names "build" and "source" to skip them.

I wonder if it worth to implement full symlink loop check.

-- 
WBR,
Yauheni Kaliuta


Re: [PATCH v1 2/3] usb: gadget: NCM: link socket buffers to the device for received packets

2016-09-15 Thread Yauheni Kaliuta
Hi, Harish!

>>>>> On Thu, 15 Sep 2016 16:22:57 +0200, Harish Jenny K N  wrote:

 > diff --git a/drivers/usb/gadget/function/u_ether.c 
 > b/drivers/usb/gadget/function/u_ether.c
   ^^^
Is subject a bit misleading, it's more generic, then NCM?

-- 
WBR,
Yauheni Kaliuta


[PATCH RFC 2/2] rlimits: report resource limits violations

2016-09-07 Thread Yauheni Kaliuta
The patch instrument different places of resource limits checks with
reporting using the infrastructure from the previous patch.

Signed-off-by: Yauheni Kaliuta 
---
 arch/ia64/kernel/perfmon.c |  4 +++-
 arch/powerpc/kvm/book3s_64_vio.c   |  6 --
 arch/powerpc/mm/mmu_context_iommu.c|  6 --
 drivers/android/binder.c   |  7 ++-
 drivers/infiniband/core/umem.c |  1 +
 drivers/infiniband/hw/hfi1/user_pages.c|  5 -
 drivers/infiniband/hw/qib/qib_user_pages.c |  1 +
 drivers/infiniband/hw/usnic/usnic_uiom.c   |  1 +
 drivers/misc/mic/scif/scif_rma.c   |  1 +
 drivers/vfio/vfio_iommu_spapr_tce.c|  6 --
 drivers/vfio/vfio_iommu_type1.c|  4 
 fs/attr.c  |  4 +++-
 fs/binfmt_aout.c   |  4 +++-
 fs/binfmt_flat.c   |  1 +
 fs/coredump.c  |  4 +++-
 fs/exec.c  | 14 ++
 fs/file.c  | 26 +-
 fs/select.c|  4 +++-
 include/linux/mm.h |  7 ++-
 ipc/mqueue.c   | 10 --
 ipc/shm.c  |  1 +
 kernel/bpf/syscall.c   | 15 ---
 kernel/events/core.c   |  1 +
 kernel/fork.c  |  9 ++---
 kernel/sched/core.c| 17 +
 kernel/signal.c|  7 ---
 kernel/sys.c   |  9 ++---
 kernel/time/posix-cpu-timers.c |  8 
 mm/mlock.c | 14 +-
 mm/mmap.c  | 19 +++
 mm/mremap.c|  4 +++-
 net/unix/af_unix.c |  9 ++---
 32 files changed, 179 insertions(+), 50 deletions(-)

diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 2436ad5f92c1..c765e94a7089 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -2259,8 +2259,10 @@ pfm_smpl_buffer_alloc(struct task_struct *task, struct 
file *filp, pfm_context_t
 * if ((mm->total_vm << PAGE_SHIFT) + len> 
task->rlim[RLIMIT_AS].rlim_cur)
 *  return -ENOMEM;
 */
-   if (size > task_rlimit(task, RLIMIT_MEMLOCK))
+   if (size > task_rlimit(task, RLIMIT_MEMLOCK)) {
+   rlimit_exceeded_task(RLIMIT_MEMLOCK, size, task);
return -ENOMEM;
+   }
 
/*
 * We do the easy to undo allocations first.
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index c379ff5a4438..a0477260d398 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -67,10 +67,12 @@ static long kvmppc_account_memlimit(unsigned long 
stt_pages, bool inc)
 
locked = current->mm->locked_vm + stt_pages;
lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
-   if (locked > lock_limit && !capable(CAP_IPC_LOCK))
+   if (locked > lock_limit && !capable(CAP_IPC_LOCK)) {
+   rlimit_exceeded(RLIMIT_MEMLOCK, locked << PAGE_SHIFT);
ret = -ENOMEM;
-   else
+   } else {
current->mm->locked_vm += stt_pages;
+   }
} else {
if (WARN_ON_ONCE(stt_pages > current->mm->locked_vm))
stt_pages = current->mm->locked_vm;
diff --git a/arch/powerpc/mm/mmu_context_iommu.c 
b/arch/powerpc/mm/mmu_context_iommu.c
index da6a2168ae9e..421890d325df 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -42,10 +42,12 @@ static long mm_iommu_adjust_locked_vm(struct mm_struct *mm,
if (incr) {
locked = mm->locked_vm + npages;
lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
-   if (locked > lock_limit && !capable(CAP_IPC_LOCK))
+   if (locked > lock_limit && !capable(CAP_IPC_LOCK)) {
+   rlimit_exceeded(RLIMIT_MEMLOCK, locked << PAGE_SHIFT);
ret = -ENOMEM;
-   else
+   } else {
mm->locked_vm += npages;
+   }
} else {
if (WARN_ON_ONCE(npages > mm->locked_vm))
npages = mm->locked_vm;
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index 16288e777ec3..a44021051a02 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -379,6 +379,7 @@ static int task_get_unused_fd_flags(struct binder_proc 
*proc, int f

[PATCH RFC 0/2] rlimit exceed notification events

2016-09-07 Thread Yauheni Kaliuta
http://marc.info/?l=linux-kernel&m=147211966914100&w=2 follow up.

More complete rlimits violations reporting RFC using tracing
infrastructure.

Yauheni Kaliuta (2):
  rlimits: add infra to report violations
  rlimits: report resource limits violations

 arch/ia64/kernel/perfmon.c |   4 +-
 arch/powerpc/kvm/book3s_64_vio.c   |   6 +-
 arch/powerpc/mm/mmu_context_iommu.c|   6 +-
 drivers/android/binder.c   |   7 +-
 drivers/infiniband/core/umem.c |   1 +
 drivers/infiniband/hw/hfi1/user_pages.c|   5 +-
 drivers/infiniband/hw/qib/qib_user_pages.c |   1 +
 drivers/infiniband/hw/usnic/usnic_uiom.c   |   1 +
 drivers/misc/mic/scif/scif_rma.c   |   1 +
 drivers/vfio/vfio_iommu_spapr_tce.c|   6 +-
 drivers/vfio/vfio_iommu_type1.c|   4 ++
 fs/attr.c  |   4 +-
 fs/binfmt_aout.c   |   4 +-
 fs/binfmt_flat.c   |   1 +
 fs/coredump.c  |   4 +-
 fs/exec.c  |  14 ++--
 fs/file.c  |  26 +--
 fs/select.c|   4 +-
 include/linux/mm.h |   7 +-
 include/linux/resource.h   |   5 ++
 ipc/mqueue.c   |  10 ++-
 ipc/shm.c  |   1 +
 kernel/Makefile|   4 +-
 kernel/bpf/syscall.c   |  15 +++-
 kernel/events/core.c   |   1 +
 kernel/fork.c  |   9 ++-
 kernel/rlimit.c|  26 +++
 kernel/sched/core.c|  17 +++--
 kernel/signal.c|   7 +-
 kernel/sys.c   |   9 ++-
 kernel/time/posix-cpu-timers.c |   8 +++
 kernel/trace-rlimit.h  | 112 +
 mm/mlock.c |  14 +++-
 mm/mmap.c  |  19 +++--
 mm/mremap.c|   4 +-
 net/unix/af_unix.c |   9 ++-
 36 files changed, 325 insertions(+), 51 deletions(-)
 create mode 100644 kernel/rlimit.c
 create mode 100644 kernel/trace-rlimit.h

-- 
2.7.4



[PATCH RFC 1/2] rlimits: add infra to report violations

2016-09-07 Thread Yauheni Kaliuta
The patch defines tracepoints for resource limits (rlimits) violations
reporting and adds a thin layer to be called from rlimits aware code
without direct dependency of the tracepoints.

Signed-off-by: Yauheni Kaliuta 
---
 include/linux/resource.h |   5 +++
 kernel/Makefile  |   4 +-
 kernel/rlimit.c  |  26 +++
 kernel/trace-rlimit.h| 112 +++
 4 files changed, 146 insertions(+), 1 deletion(-)
 create mode 100644 kernel/rlimit.c
 create mode 100644 kernel/trace-rlimit.h

diff --git a/include/linux/resource.h b/include/linux/resource.h
index 5bc3116e649c..f4de2827e647 100644
--- a/include/linux/resource.h
+++ b/include/linux/resource.h
@@ -9,5 +9,10 @@ struct task_struct;
 int getrusage(struct task_struct *p, int who, struct rusage __user *ru);
 int do_prlimit(struct task_struct *tsk, unsigned int resource,
struct rlimit *new_rlim, struct rlimit *old_rlim);
+void rlimit_exceeded_task(int rlimit_id, u64 req, struct task_struct *task);
+void rlimit_exceeded(int rlimit_id, u64 req);
+void rlimit_hard_exceeded_task(int rlimit_id, u64 req,
+  struct task_struct *task);
+void rlimit_hard_exceeded(int rlimit_id, u64 req);
 
 #endif
diff --git a/kernel/Makefile b/kernel/Makefile
index e2ec54e2b952..30999d83a261 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -9,7 +9,7 @@ obj-y = fork.o exec_domain.o panic.o \
extable.o params.o \
kthread.o sys_ni.o nsproxy.o \
notifier.o ksysfs.o cred.o reboot.o \
-   async.o range.o smpboot.o
+   async.o range.o smpboot.o rlimit.o
 
 obj-$(CONFIG_MULTIUSER) += groups.o
 
@@ -18,6 +18,8 @@ ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_irq_work.o = $(CC_FLAGS_FTRACE)
 endif
 
+CFLAGS_rlimit.o := -I$(src)
+
 # Prevents flicker of uninteresting __do_softirq()/__local_bh_disable_ip()
 # in coverage traces.
 KCOV_INSTRUMENT_softirq.o := n
diff --git a/kernel/rlimit.c b/kernel/rlimit.c
new file mode 100644
index ..0b42ebc3a9d5
--- /dev/null
+++ b/kernel/rlimit.c
@@ -0,0 +1,26 @@
+
+#include 
+
+#define CREATE_TRACE_POINTS
+#include "trace-rlimit.h"
+
+void rlimit_exceeded_task(int rlimit_id, u64 req, struct task_struct *task)
+{
+   trace_rlimit_exceeded(rlimit_id, task_rlimit(task, rlimit_id), req,
+ task_pid_nr(task), task->comm);
+}
+
+void rlimit_exceeded(int rlimit_id, u64 req)
+{
+   rlimit_exceeded_task(rlimit_id, req, current);
+}
+
+void rlimit_hard_exceeded_task(int rlimit_id, u64 req, struct task_struct 
*task)
+{
+   trace_rlimit_hard_exceeded(rlimit_id, task_rlimit_max(task, rlimit_id),
+  req, task_pid_nr(task), task->comm);
+}
+void rlimit_hard_exceeded(int rlimit_id, u64 req)
+{
+   rlimit_hard_exceeded_task(rlimit_id, req, current);
+}
diff --git a/kernel/trace-rlimit.h b/kernel/trace-rlimit.h
new file mode 100644
index ..e7433ae8a09e
--- /dev/null
+++ b/kernel/trace-rlimit.h
@@ -0,0 +1,112 @@
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM rlimit
+
+#if !defined(_TRACE_RLIMIT_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_RLIMIT_H
+#include 
+
+TRACE_DEFINE_ENUM(RLIMIT_CPU);
+TRACE_DEFINE_ENUM(RLIMIT_FSIZE);
+TRACE_DEFINE_ENUM(RLIMIT_DATA);
+TRACE_DEFINE_ENUM(RLIMIT_STACK);
+TRACE_DEFINE_ENUM(RLIMIT_CORE);
+TRACE_DEFINE_ENUM(RLIMIT_RSS);
+TRACE_DEFINE_ENUM(RLIMIT_NPROC);
+TRACE_DEFINE_ENUM(RLIMIT_NOFILE);
+TRACE_DEFINE_ENUM(RLIMIT_MEMLOCK);
+TRACE_DEFINE_ENUM(RLIMIT_AS);
+TRACE_DEFINE_ENUM(RLIMIT_LOCKS);
+TRACE_DEFINE_ENUM(RLIMIT_SIGPENDING);
+TRACE_DEFINE_ENUM(RLIMIT_MSGQUEUE);
+TRACE_DEFINE_ENUM(RLIMIT_NICE);
+TRACE_DEFINE_ENUM(RLIMIT_RTPRIO);
+TRACE_DEFINE_ENUM(RLIMIT_RTTIME);
+
+
+#define __print_rlimit_name(id_var)\
+   __print_symbolic(id_var,\
+{ RLIMIT_CPU, "CPU" }, \
+{ RLIMIT_FSIZE, "FSIZE" }, \
+{ RLIMIT_DATA, "DATA" },   \
+{ RLIMIT_STACK, "STACK" }, \
+{ RLIMIT_CORE, "CORE" },   \
+{ RLIMIT_RSS, "RSS" }, \
+{ RLIMIT_NPROC, "NPROC" }, \
+{ RLIMIT_NOFILE, "NOFILE" },   \
+{ RLIMIT_MEMLOCK, "MEMLOCK" }, \
+{ RLIMIT_AS, "AS" },   \
+{ RLIMIT_LOCKS, "LOCKS" }, \
+{ RLIMIT_SIGPENDING, "SIGPENDING" },   \
+{ RLIMIT_MSGQUEUE, "MSGQUEUE" },   \
+{ RLIMIT_NICE, "NICE" },   \
+  

Re: [BUG] perf tool: uprobe displays wrong argument value

2016-08-25 Thread Yauheni Kaliuta
Hi, Jiri!

>>>>> On Thu, 25 Aug 2016 14:02:50 +0200, Jiri Olsa  wrote:

[...]

 > Now here's where I got confused..  please continue reading only on your own 
 > risk ;-)

 > The uprobe_events shows following record:
 > # cat uprobe_events 
 > p:probe_ex/func 
 > /home/jolsa/linux-perf/tools/perf/ex:0x04f6 par=-12(%sp):s32

 > I can't see how ($rsp - 12) address could hold the 'par' value,
 > when we stop at the 'func' addreess, so where did it come from?

 > I figured it's the debug info, namely the par argument's CU:

 > $ readelf --debug-dump ./ex | less
 > ...

 >  <2><94>: Abbrev Number: 5 (DW_TAG_formal_parameter)
 > <95>   DW_AT_name: par
 > <99>   DW_AT_decl_file   : 1
 > <9a>   DW_AT_decl_line   : 3
 > <9b>   DW_AT_type: <0x57>
 > <9f>   DW_AT_location: 2 byte block: 91 6c  
 > (DW_OP_fbreg: -20)

 > which says the value is frame buffer reg -20.. I can't see
 > this will get the proper value for any of $rbp or $rsp even
 > after new func's stack frame is set..

I see how it can do that:

Breakpoint 1, main () at 1.c:10
10int a = 1;
(gdb) p &a
$1 = (int *) 0x7fffe9cc
(gdb) br *0x004004d6
Breakpoint 2 at 0x4004d6: file 1.c, line 4.

(before the prologue, as you did)

Breakpoint 2, func (par=0) at 1.c:4
4   {
(gdb) p $rsp + 20
$2 = (void *) 0x7fffe9cc (same as &a).


So DW_OP_call_frame_cfa restores %rsp before the function call.



 > Also if I set gdb to stop directly on the function address,
 > it shows wrong value:

 > # gdb ./ex
 > (gdb) b *0x4004f6
 > Breakpoint 1 at 0x4004f6: file ex.c, line 4.
 > (gdb) r
 > Starting program: /home/jolsa/linux-perf/tools/perf/ex 
 > Missing separate debuginfos, use: dnf debuginfo-install 
 > glibc-2.21-13.fc22.x86_64

 > Breakpoint 1, func (par=0) at ex.c:4
 > 4   {
 > (gdb) 

 > Apart from when I set the breakpoint after the new stack frame is set:

 > (gdb) b func
 > Breakpoint 1 at 0x4004fd: file ex.c, line 5.
 > (gdb) r
 > Starting program: /home/jolsa/linux-perf/tools/perf/ex 
 > Missing separate debuginfos, use: dnf debuginfo-install 
 > glibc-2.21-13.fc22.x86_64

 >     Breakpoint 1, func (par=1) at ex.c:5
 > 5   return par;


 > I'm clearly missing something..

 > thanks for help,
 > jirka


 > ---
 > kernel version: 4.8.0-rc2
 > perf version: latest Arnaldo's perf/core


-- 
WBR,
Yauheni Kaliuta


Re: [RFC] rlimit exceed notification events

2016-08-25 Thread Yauheni Kaliuta
Hi, Jiri!

>>>>> On Wed, 24 Aug 2016 13:24:28 +0200, Jiri Olsa  wrote:

 > On Fri, Aug 19, 2016 at 05:41:20PM +0300, Yauheni Kaliuta wrote:
 >> 
 >> At the moment there is no clear indication if a process exceeds resource
 >> limit. In some cases the problematic syscall can return a error, in some 
 >> cases
 >> the process can be just killed.

[...]

 >> 2) Using tracepoints. I've used a simple program, which dup()s until gets 
 >> the
 >> error 3 times:

 > just to start up the discussion.. ;-)

 > I'd think this one (2) is the proper way,

>From the options I checked, I like it most as well. Probably I should
prepare an RFC PATCH with it.

 > but generaly you need to
 > come with good justification/usecase to add new tracepoint

 > also rlimit seems to be difficult to add tracepoints to,
 > because the checks are spread all over the code.. 

 > can't think of a good solution ATM

Yes, every place should be instrumented. I just introduce some indirection
to have some flexibility for the final output.

Still it's good to know if there are objections for such a
instrumentation in any of the resource check places, like file operations
for example.

 >> $ sudo ./perf record -e rlimit:rlimit_exceeded ./a.out

[...]

 >> index 6b1acdfe59da..a358de041ac4 100644
 >> --- a/fs/file.c
 >> +++ b/fs/file.c
 >> @@ -947,6 +947,9 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
 >> else
 >> fput(file);
 >> }
 >> +   if (ret == -EMFILE)
 >> +   rlimit_exceeded(RLIMIT_NOFILE,
 >> +   rlimit(RLIMIT_NOFILE), (u64)-1);
 >> return ret;

 > how about other places? alloc_fd/get_unused_fd_flags/replace_fd..

This is very good question. Initially I just wanted something for demo, but
I run into a dilemma even here. Ideally it must be a place, which is

a) aware of RLIMIT and
b) responsible for the decision making:

1) It would be good to place it into __alloc_fd() since it is a final point
and performs the check to against the limit, but it's not aware of the
RLIMIT, the limit is passed to it from upper levels.

2) get_unused_fd_flags() is aware of RLIMIT and entry point for many other
fd allocations, but doesn't do any decision.

3) the dup() syscall is not aware of RLIMIT, but makes the final decision.

That was the reason, why I put it here for the prototype code, but it
doesn't look as a good place for final solution.

In many other cases both a) and b) are in one place, so there is no such
problem.


-- 
WBR,
Yauheni Kaliuta


Re: [PATCH 2/5] Drivers: hv: balloon: account for gaps in hot add regions

2016-08-24 Thread Yauheni Kaliuta
Hi, kys!

>>>>> On Wed, 24 Aug 2016 16:23:10 -0700, kys   wrote:

[...]

 > -static bool pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
 > +static int pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
 >  {
 >  struct list_head *cur;
 >  struct hv_hotadd_state *has;
 > +struct hv_hotadd_gap *gap;
 >  unsigned long residual, new_inc;
 
 >  if (list_empty(&dm_device.ha_region_list))

One "return false;" left here.


[...]

 > -if (has->covered_end_pfn != start_pfn)
 > -has->covered_end_pfn = start_pfn;
 > -
 > -return true;
 > -
 > +return 1;
 >  }
 
 > -return false;
 > +return 0;
 >  }

[...]

-- 
WBR,
Yauheni Kaliuta


[RFC] rlimit exceed notification events

2016-08-19 Thread Yauheni Kaliuta
ader, &sample, event);
+
+   ret = perf_output_begin(&handle, event, rlimit_event->header.size);
+   if (ret)
+   return;
+
+   perf_output_put(&handle, *rlimit_event);
+   perf_event__output_id_sample(event, &handle, &sample);
+   perf_output_end(&handle);
+}
+
+void perf_event_rlimit(int id, u64 cur, u64 req)
+{
+   struct perf_rlimit_event rlimit_event;
+
+   rlimit_event = (struct perf_rlimit_event) {
+   .header = {
+   .type = PERF_RECORD_RLIMIT,
+   .size = sizeof(rlimit_event),
+   },
+   .rlimit_id = id,
+   .r_current = cur,
+   .r_requested = req,
+   };
+
+   perf_iterate_sb(perf_event_rlimit_output,
+   &rlimit_event,
+   NULL);
+}
+
+/*
  * comm tracking
  */
 
diff --git a/kernel/rlimit.c b/kernel/rlimit.c
new file mode 100644
index ..3ac043ea5383
--- /dev/null
+++ b/kernel/rlimit.c
@@ -0,0 +1,8 @@
+
+#include 
+#include 
+
+void rlimit_exceeded(int rlimit_id, u64 cur, u64 req)
+{
+   perf_event_rlimit(rlimit_id, cur, req);
+}


3.3) Create a new pmu for type PERF_TYPE_RLIMIT with own events and own record.

All perf changes will require some special userspace and/or perf utility
changes.

4) Something else.


So, any input about upstreamable way of the feature would be appreciated.


-- 
WBR,
Yauheni Kaliuta