Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

2015-08-05 Thread Alexei Starovoitov
On Thu, Aug 06, 2015 at 12:35:30PM +0800, Wangnan (F) wrote:
> 
> 
> On 2015/8/6 11:22, Alexei Starovoitov wrote:
> >On Wed, Aug 05, 2015 at 04:28:13PM +0800, Wangnan (F) wrote:
> >>It doesn't work for me at first since in my llvm there's only
> >>llvm.bpf.load.*.
> >>
> >>I think llvm.bpf.store.* belongs to some patches you haven't posted yet?
> >nope. only loads have special instructions ld_abs/ld_ind
> >which are represented by these intrinsics.
> >stores, so far, are done via single bpf_store_bytes() helper function.
> >
> >>>the typeid changing ids with order is surprising.
> >>>I think the assertion in ExtractTypeInfo() is not hard.
> >>>Just there were no such use cases. May be we can do something
> >>>similar to what LowerIntrinsicCall() does and lower it differently
> >>>in the backend.
> >>>
> >>But in backend can we still get type information? I thought type is
> >>meaningful in frontend only, and backend behaviors is unable to affect
> >>DWARF generation, right?
> >why do we need to affect type generation? we just need to know dwarf
> >type id in the backend, so we can emit it as a constant.
> >I still think lowering eh_typeid_for differently may work.
> >Like instead of doing
> >GV = ExtractTypeInfo(I.getArgOperand(0)) followed by
> >getMachineFunction().getMMI().getTypeIDFor(GV)
> >we can get dwarf type id from I.getArgOperand(0) if it's
> >any pointer to struct type.
> 
> I have some bad news:
> 
> #include <stdio.h>
> struct my_str {
> int x;
> int y;
> } __gv_my_str;
> struct my_str __gv_my_str_;
> 
> struct my_str2 {
> int x;
> int y;
> } __gv_my_str2;
> 
> int typeid(void *p) asm("llvm.eh.typeid.for");
> 
> int main()
> {
> printf("%d\n", typeid(&__gv_my_str));
> printf("%d\n", typeid(&__gv_my_str_));
> printf("%d\n", typeid(&__gv_my_str2));
> return 0;
> }
> 
> Compiled with clang into x86 executable, then:
> 
> $ ./a.out
> 3
> 2
> 1
> 
> See? There are only two types, but three IDs are reported.

that's expected. We don't have to use the default lowering
of typeid_for with getTypeIDFor. bpf backend specific
lowering can be different, though in this case it's odd
that the ids for __gv_my_str and __gv_my_str_ are different.
Only __gv_my_str and __gv_my_str2 should get different ids.
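
For illustration only, here is a rough sketch of what "a different lowering" could mean if ids were keyed on the struct type instead of on the global. Everything in it (type_id_table, type_id_for_name, MAX_TYPES) is hypothetical and only models the idea in plain C. It is not LLVM code, and the ids it hands out are not real dwarf type ids.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Arbitrary limit for this demo table. */
#define MAX_TYPES 16

/* Hypothetical table of struct type names seen so far. */
struct type_id_table {
	const char *names[MAX_TYPES];
	size_t count;
};

/* Return a 1-based id for a struct type name. Equal names share an id,
 * no matter which global the pointer came from. */
unsigned type_id_for_name(struct type_id_table *t, const char *name)
{
	for (size_t i = 0; i < t->count; i++)
		if (strcmp(t->names[i], name) == 0)
			return (unsigned)(i + 1);

	assert(t->count < MAX_TYPES);
	t->names[t->count++] = name;	/* caller keeps the string alive */
	return (unsigned)t->count;
}
```

With a table like this, two lookups for "my_str" return the same id and "my_str2" gets a different one, which is the behaviour wanted for __gv_my_str, __gv_my_str_ and __gv_my_str2.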

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

2015-08-05 Thread Alexei Starovoitov
On Thu, Aug 06, 2015 at 12:31:26PM +0800, Wangnan (F) wrote:
> 
> 
> What about hacking ELF binary in memory?
> 
> 1. load the object into memory;
> 2. twist the e_machine field to EM_X86_64;
> 3. load it using elf_begin;
> 4. return the twisted elf memory image using libdwfl's find_elf callback.
> 
> Then libdw will recognise BPF's object file as an X86_64 object file. If
> required, relocation sections can also be twisted in this way. It should
> not be very hard since we only need to consider one relocation type.
> 
> Then let's start thinking about how to introduce EM_BPF. We can rely on
> this hack until the EM_BPF symbol reaches the elfutils that perf uses.
> 
> What do you think?

sounds crazy, but may work. let's try it :)



Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

2015-08-05 Thread Wangnan (F)



On 2015/8/6 11:22, Alexei Starovoitov wrote:
>On Wed, Aug 05, 2015 at 04:28:13PM +0800, Wangnan (F) wrote:
>>It doesn't work for me at first since in my llvm there's only
>>llvm.bpf.load.*.
>>
>>I think llvm.bpf.store.* belongs to some patches you haven't posted yet?
>nope. only loads have special instructions ld_abs/ld_ind
>which are represented by these intrinsics.
>stores, so far, are done via single bpf_store_bytes() helper function.
>
>>>the typeid changing ids with order is surprising.
>>>I think the assertion in ExtractTypeInfo() is not hard.
>>>Just there were no such use cases. May be we can do something
>>>similar to what LowerIntrinsicCall() does and lower it differently
>>>in the backend.
>>>
>>But in backend can we still get type information? I thought type is
>>meaningful in frontend only, and backend behaviors is unable to affect
>>DWARF generation, right?
>why do we need to affect type generation? we just need to know dwarf
>type id in the backend, so we can emit it as a constant.
>I still think lowering eh_typeid_for differently may work.
>Like instead of doing
>GV = ExtractTypeInfo(I.getArgOperand(0)) followed by
>getMachineFunction().getMMI().getTypeIDFor(GV)
>we can get dwarf type id from I.getArgOperand(0) if it's
>any pointer to struct type.


I have some bad news:

#include <stdio.h>
struct my_str {
int x;
int y;
} __gv_my_str;
struct my_str __gv_my_str_;

struct my_str2 {
int x;
int y;
} __gv_my_str2;

int typeid(void *p) asm("llvm.eh.typeid.for");

int main()
{
printf("%d\n", typeid(&__gv_my_str));
printf("%d\n", typeid(&__gv_my_str_));
printf("%d\n", typeid(&__gv_my_str2));
return 0;
}

Compiled with clang into x86 executable, then:

$ ./a.out
3
2
1

See? There are only two types, but three IDs are reported.

And here is the implementation of getTypeIDFor, in
lib/CodeGen/MachineModuleInfo.cpp:

unsigned MachineModuleInfo::getTypeIDFor(const GlobalValue *TI) {
  for (unsigned i = 0, N = TypeInfos.size(); i != N; ++i)
    if (TypeInfos[i] == TI) return i + 1;

  TypeInfos.push_back(TI);
  return TypeInfos.size();
}

It only compares GlobalValue pointers, so two globals of the same
struct type still get two different ids.
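
To make the effect easier to see outside of LLVM, here is a small stand-alone model of that lookup in plain C. The names (type_info_table, type_id_for, MAX_TYPE_INFOS, gv_a, gv_b) are made up for this sketch; only the search-then-append logic is taken from the function above.

```c
#include <assert.h>
#include <stddef.h>

/* Arbitrary limit for this demo; the real TypeInfos vector grows freely. */
#define MAX_TYPE_INFOS 16

/* Toy stand-in for MachineModuleInfo::TypeInfos: addresses seen so far. */
struct type_info_table {
	const void *seen[MAX_TYPE_INFOS];
	size_t count;
};

/* Same logic as getTypeIDFor(): linear search by pointer identity,
 * append on a miss, ids are 1-based. */
unsigned type_id_for(struct type_info_table *t, const void *gv)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->seen[i] == gv)
			return (unsigned)(i + 1);

	assert(t->count < MAX_TYPE_INFOS);
	t->seen[t->count++] = gv;
	return (unsigned)t->count;
}

struct my_str { int x; int y; };

/* Two globals that share one struct type. */
struct my_str gv_a, gv_b;
```

Looking up &gv_a and then &gv_b in an empty table gives two different ids even though both have type struct my_str. Asking for &gv_a again returns its first id. The type never enters the comparison.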

Now the dwarf side becomes clear (see my other response), but the
frontend side may need to be totally reconsidered.

Do you know someone on llvm-dev who can help us?

Thank you.


>I'm not familiar with dwarf handling part of llvm, but feels possible.






Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

2015-08-05 Thread Wangnan (F)



On 2015/8/6 11:41, Alexei Starovoitov wrote:
>On Wed, Aug 05, 2015 at 04:59:01PM +0800, He Kuang wrote:
>>Hi, Alexei
>>
>>On 2015/7/30 1:13, Alexei Starovoitov wrote:
>>>On 7/29/15 2:38 AM, He Kuang wrote:
>>>>Hi, Alexei
>>>>
>>>>On 2015/7/28 10:18, Alexei Starovoitov wrote:
>>>>>On 7/25/15 3:04 AM, He Kuang wrote:
>>I noticed that for 64-bit elf format, the reloc sections have
>>'Addend' in the entry, but there's no 'Addend' info in bpf elf
>>file(64bit). I think there must be something wrong in the process
>>of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>>AT_name now, DW_AT_LOCATION still missed and need your help.
>>
>>Another thing about DW_AT_name, we've already found that the name
>>string is stored indirectly and needs relocation which is
>>architecture specific, while the e_machine info in bpf obj file
>>is "unknown", both objdump and libdw cannot parse DW_AT_name
>>correctly.
>>
>>Should we just use a known architecture for bpf object file
>>instead of "unknown"? If so, we can use the existing relocation
>>codes in libdw and get DIE name by simply invoking
>>dwarf_diename(). The drawback of this method is that, e.g. we
>>use "x86-64" instead, is hard to distinguish bpf obj file with
>>x86-64 elf file. Do you think this is ok?
>The only clean way would be to register bpf as an architecture
>with elf standards committee. I have no idea who is doing that and
>how much such new e_machine registration may cost.
>So far using EM_NONE is a hack to avoid bureaucracy.
>Are dwarf relocations processor specific?
>Then simple hack to elfutils/libdw to treat EM_NONE as X64
>should do the trick, right?
>If that indeed works, we can tweak bpf backend to use EM_X86_64,
>but then there is the danger that such .o file will be wrongly
>recognized by elf utils. imo it's safer to keep it as EM_NONE
>until real number is assigned, but even after it's assigned it
>will take time to propagate that value. So for now I would try
>to find a solution keeping EM_NONE hack.



What about hacking ELF binary in memory?

1. load the object into memory;
2. twist the e_machine field to EM_X86_64;
3. load it using elf_begin;
4. return the twisted elf memory image using libdwfl's find_elf callback.

Then libdw will recognise BPF's object file as an X86_64 object file. If
required, relocation sections can also be twisted in this way. It should
not be very hard since we only need to consider one relocation type.
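
A minimal sketch of step 2, assuming the object has already been read into a writable buffer and that the object and the host share the same byte order. patch_elf_machine is a hypothetical helper name, not something that exists in elfutils or perf. Handing the patched buffer to libelf (for example through elf_memory()) and returning it from the find_elf callback are not shown.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite e_machine in a 64-bit ELF image held
 * in memory. Returns 0 on success, or -1 if the buffer is too small or
 * does not look like a 64-bit ELF image. Byte order is assumed to match
 * the host (an assumption of this sketch, it is not checked). */
int patch_elf_machine(unsigned char *image, size_t len, Elf64_Half machine)
{
	Elf64_Ehdr ehdr;

	assert(image != NULL);
	if (len < sizeof(ehdr))
		return -1;

	/* Copy the header out so an unaligned buffer is not a problem. */
	memcpy(&ehdr, image, sizeof(ehdr));

	if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
	    ehdr.e_ident[EI_CLASS] != ELFCLASS64)
		return -1;

	ehdr.e_machine = machine;
	memcpy(image, &ehdr, sizeof(ehdr));
	return 0;
}
```

Calling patch_elf_machine(buf, len, EM_X86_64) on the in-memory copy leaves the .o file on disk untouched, so it still carries whatever e_machine the bpf backend wrote.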

Then let's start thinking about how to introduce EM_BPF. We can rely on
this hack until the EM_BPF symbol reaches the elfutils that perf uses.

What do you think?

Thank you.



Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

2015-08-05 Thread Alexei Starovoitov
On Wed, Aug 05, 2015 at 04:59:01PM +0800, He Kuang wrote:
> Hi, Alexei
> 
> On 2015/7/30 1:13, Alexei Starovoitov wrote:
> >On 7/29/15 2:38 AM, He Kuang wrote:
> >>Hi, Alexei
> >>
> >>On 2015/7/28 10:18, Alexei Starovoitov wrote:
> >>>On 7/25/15 3:04 AM, He Kuang wrote:
> I noticed that for 64-bit elf format, the reloc sections have
> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
> file(64bit). I think there must be something wrong in the process
> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
> AT_name now, DW_AT_LOCATION still missed and need your help.
> >>>
> 
> Another thing about DW_AT_name, we've already found that the name
> string is stored indirectly and needs relocation which is
> architecture specific, while the e_machine info in bpf obj file
> is "unknown", both objdump and libdw cannot parse DW_AT_name
> correctly.
> 
> Should we just use a known architecture for bpf object file
> instead of "unknown"? If so, we can use the existing relocation
> codes in libdw and get DIE name by simply invoking
> dwarf_diename(). The drawback of this method is that, e.g. we
> use "x86-64" instead, is hard to distinguish bpf obj file with
> x86-64 elf file. Do you think this is ok?

The only clean way would be to register bpf as an architecture
with elf standards committee. I have no idea who is doing that and
how much such new e_machine registration may cost.
So far using EM_NONE is a hack to avoid bureaucracy.
Are dwarf relocations processor specific?
Then simple hack to elfutils/libdw to treat EM_NONE as X64
should do the trick, right?
If that indeed works, we can tweak bpf backend to use EM_X86_64,
but then there is the danger that such .o file will be wrongly
recognized by elf utils. imo it's safer to keep it as EM_NONE
until real number is assigned, but even after it's assigned it
will take time to propagate that value. So for now I would try
to find a solution keeping EM_NONE hack.



Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

2015-08-05 Thread Alexei Starovoitov
On Wed, Aug 05, 2015 at 04:28:13PM +0800, Wangnan (F) wrote:
> 
> It doesn't work for me at first since in my llvm there's only
> llvm.bpf.load.*.
> 
> I think llvm.bpf.store.* belongs to some patches you haven't posted yet?

nope. only loads have special instructions ld_abs/ld_ind
which are represented by these intrinsics.
stores, so far, are done via single bpf_store_bytes() helper function.

> >the typeid changing ids with order is surprising.
> >I think the assertion in ExtractTypeInfo() is not hard.
> >Just there were no such use cases. May be we can do something
> >similar to what LowerIntrinsicCall() does and lower it differently
> >in the backend.
> >
> But in backend can we still get type information? I thought type is
> meaningful in frontend only, and backend behaviors is unable to affect
> DWARF generation, right?

why do we need to affect type generation? we just need to know dwarf
type id in the backend, so we can emit it as a constant.
I still think lowering eh_typeid_for differently may work.
Like instead of doing
GV = ExtractTypeInfo(I.getArgOperand(0)) followed by
getMachineFunction().getMMI().getTypeIDFor(GV)
we can get dwarf type id from I.getArgOperand(0) if it's
any pointer to struct type.
I'm not familiar with dwarf handling part of llvm, but feels possible.
