Jesse,

Many thanks for your suggestions.
For the openvswitch.ko module built from the Open vSwitch git repository, the 
line of code causing the kernel oops appears to be the following in the 
function ovs_flow_to_nlattrs():
        if (nla_put_u32(skb, OVS_KEY_ATTR_PRIORITY, output->phy.priority))
                goto nla_put_failure;
This is fully reproducible: it happens every time.

For the openvswitch.ko module built from the kernel 3.3 sources, the problem is 
different, but again fully reproducible.
The kernel oops is as follows:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = de3e8000
[00000000] *pgd=9e3c7831, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1] PREEMPT
Modules linked in:
CPU: 0    Not tainted  (3.3.0 #1)
PC is at ovs_flow_tbl_alloc+0x4e/0x94
LR is at ovs_flow_tbl_alloc+0x4b/0x94
pc : [<c027f5d6>]    lr : [<c027f5d3>]    psr: 80000033
sp : de277c50  ip : 6c6c6c6c  fp : 00000000
r10: 00000004  r9 : de33bf10  r8 : de32f280
r7 : 00000000  r6 : de342000  r5 : 00000400  r4 : 00000002
r3 : de342000  r2 : 00000000  r1 : 00000002  r0 : 00000000
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment user
Control: 50c5387d  Table: 9e3e8019  DAC: 00000015
Process ovs-vswitchd (pid: 454, stack limit = 0xde2762e8)
Stack: (0xde277c50 to 0xde278000)
…
[<c027f5d6>] (ovs_flow_tbl_alloc+0x4e/0x94) from [<c027eda5>] 
(ovs_dp_cmd_new+0x51/0x130)
[<c027eda5>] (ovs_dp_cmd_new+0x51/0x130) from [<c01c9347>] 
(genl_rcv_msg+0x15f/0x17c)
[<c01c9347>] (genl_rcv_msg+0x15f/0x17c) from [<c01c8c39>] 
(netlink_rcv_skb+0x65/0x70)
[<c01c8c39>] (netlink_rcv_skb+0x65/0x70) from [<c01c91df>] (genl_rcv+0x17/0x20)
[<c01c91df>] (genl_rcv+0x17/0x20) from [<c01c888f>] 
(netlink_unicast+0x117/0x150)
[<c01c888f>] (netlink_unicast+0x117/0x150) from [<c01c8ab1>] 
(netlink_sendmsg+0x185/0x1cc)
[<c01c8ab1>] (netlink_sendmsg+0x185/0x1cc) from [<c018f08b>] 
(sock_sendmsg+0x5f/0x74)

In this case, the line of code causing the problem in ovs_flow_tbl_alloc() is:
    table->buckets = alloc_buckets(new_size);
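
For reference when comparing the two traces: on 32-bit ARM the hex number printed after "Oops:" is the fault status value, so "817" and "5" can be decoded. The sketch below assumes the standard short-descriptor layout (bit 11 = write access, fault status in bits 3:0 plus bit 10) and only names the two status codes seen in this thread; decode_fsr is a hypothetical helper, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: decode the hex value printed after "Oops:" on 32-bit ARM.
# Assumes the short-descriptor fault status layout.
decode_fsr() {
    fsr=$((0x$1))
    # Bit 11 distinguishes a write access from a read.
    if [ $(( (fsr >> 11) & 1 )) -eq 1 ]; then access=write; else access=read; fi
    # Fault status is bits 3:0, with bit 10 acting as the fifth bit.
    fs=$(( (fsr & 15) | ((fsr >> 6) & 16) ))
    case $fs in
        5) kind="section translation fault" ;;
        7) kind="page translation fault" ;;
        *) kind="fault status $fs (not decoded in this sketch)" ;;
    esac
    echo "$access, $kind"
}

decode_fsr 817   # value from the 3.3 in-tree module oops
decode_fsr 5     # value from the git-built module oops
```

Under that assumption, 817 is a write through an unmapped page (which fits an assignment through a NULL pointer), while 5 is a read from an unmapped section.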

When I added a printk to dump the value of new_size in this second scenario, 
the problem moved again.
What else can I try?
Regards
Michele Bozier


-----Original Message-----
From: Jesse Gross [mailto:je...@nicira.com] 
Sent: 01 October 2013 20:35
To: Michele Bozier
Cc: discuss@openvswitch.org
Subject: Re: [ovs-discuss] Kernel oops running Open vSwitch on 3.3 Kernel (ARM)

On Tue, Oct 1, 2013 at 2:25 AM, Michele Bozier <mboz...@airspan.com> wrote:
> I am having trouble running Open vSwitch on the ARM platform after 
> cross-compiling on an i686 platform.  I am using the latest code from 
> master from the Open vSwitch git repository - commit Sept 26th
> (6a8a8528acb05d6d0a520e09ad1ec67e62b99e5e) and the Arago Kernel 3.3.
>
>
>
> The problem I am seeing when running on the target and trying to 
> create a switch is as follows:
>
>
>
> insmod ./openvswitch.ko
>
> The module seems to load fine; on the console I get:
>
> openvswitch: Open vSwitch switching datapath 2.0.90, built Sep 30 2013 11:33:05
>
>
>
> ./ovsdb-tool create /usr/local/etc/openvswitch/conf.db ./vswitch.ovsschema
>
> ./ovsdb-server --remote=ptcp:6634 \
>     --remote=db:Open_vSwitch,Open_vSwitch,manager_options \
>     --pidfile=/home/opf/server.pid --detach
>
> ./ovs-vsctl --db=tcp:127.0.0.1:6634 --no-wait init
>
> ./ovs-vswitchd tcp:127.0.0.1:6634 --pidfile=/home/opf/switch.pid \
>     --log-file=/home/opf/switch.log --detach
>
>
>
> On the console I see the following:
>
> 1970-01-01T00:01:15Z|00001|vlog|INFO|opened log file /home/opf/switch.log
>
> 1970-01-01T00:01:15Z|00002|reconnect|INFO|tcp:127.0.0.1:6634: connecting...
>
> 1970-01-01T00:01:15Z|00003|reconnect|INFO|tcp:127.0.0.1:6634: connected
>
>
>
> I then enter the command to create a switch:
>
> ./ovs-vsctl --db=tcp:127.0.0.1:6634 add-br opfbr
>
>
>
> I get the following output on the console:
>
> device: 'ovs-system': device_add
>
> device ovs-system entered promiscuous mode
>
> device: 'opfbr0': device_add
>
> device opfbr0 entered promiscuous mode
>
>
>
> Followed shortly afterwards by a kernel oops.
>
>
>
> [root@synergy opf]# Unable to handle kernel paging request at virtual address 8d10051d
> pgd = dd840000
> [8d10051d] *pgd=00000000
> Internal error: Oops: 5 [#1] PREEMPT
> Modules linked in: openvswitch(O)
>
> CPU: 0    Tainted: G           O  (3.3.0 #7)
>
> PC is at ovs_flow_to_nlattrs+0x5/0x430 [openvswitch]
> LR is at ovs_flow_cmd_fill_info+0x114/0x208 [openvswitch]
>
> pc : [<bf80524e>]    lr : [<bf801669>]    psr: 80000033
>
> sp : de273c30  ip : 00000058  fp : 00000018
>
> r10: de36e540  r9 : 0001fffb  r8 : dd8b8000
>
> r7 : 00000013  r6 : 000001cd  r5 : dd8b8088  r4 : 00000070
>
> r3 : 00000000  r2 : de36e540  r1 : 8d100505  r0 : 0002001b
>
> Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment user
>
> Control: 50c5387d  Table: 9d840019  DAC: 00000015
> Process ovs-vswitchd (pid: 461, stack limit = 0xde2722e8)
>
> Stack: (0xde273c30 to 0xde274000)
>
> ...
>
> [<bf80524e>] (ovs_flow_to_nlattrs+0x5/0x430 [openvswitch]) from [<bf801669>] (ovs_flow_cmd_fill_info+0x114/0x208 [openvswitch])
> [<bf801669>] (ovs_flow_cmd_fill_info+0x114/0x208 [openvswitch]) from [<bf80179f>] (ovs_flow_cmd_dump+0x42/0x7c [openvswitch])
> [<bf80179f>] (ovs_flow_cmd_dump+0x42/0x7c [openvswitch]) from [<c01c90fb>] (netlink_dump+0x3b/0x130)
> [<c01c90fb>] (netlink_dump+0x3b/0x130) from [<c01c9983>] (netlink_dump_start+0xc7/0x108)
> [<c01c9983>] (netlink_dump_start+0xc7/0x108) from [<c01cb069>] (genl_rcv_msg+0xc1/0x17c)
> [<c01cb069>] (genl_rcv_msg+0xc1/0x17c) from [<c01ca9f9>] (netlink_rcv_skb+0x65/0x70)
> [<c01ca9f9>] (netlink_rcv_skb+0x65/0x70) from [<c01caf9f>] (genl_rcv+0x17/0x20)
> [<c01caf9f>] (genl_rcv+0x17/0x20) from [<c01ca64f>] (netlink_unicast+0x117/0x150)
> [<c01ca64f>] (netlink_unicast+0x117/0x150) from [<c01ca871>] (netlink_sendmsg+0x185/0x1cc)
> [<c01ca871>] (netlink_sendmsg+0x185/0x1cc) from [<c0190e4b>] (sock_sendmsg+0x5f/0x74)
> [<c0190e4b>] (sock_sendmsg+0x5f/0x74) from [<c01921c1>] (sys_sendto+0x6d/0x80)
> [<c01921c1>] (sys_sendto+0x6d/0x80) from [<c01921e3>] (sys_send+0xf/0x14)
> [<c01921e3>] (sys_send+0xf/0x14) from [<c000c521>] (ret_fast_syscall+0x1/0x46)
>
> Code: bf00 e92d 47f0 b086 (698f) ab06
>
> ---[ end trace c6309ab77c3d706d ]---
>
>
>
> The process I followed to cross-compile the code base is as follows:
>
>
>
> ./boot.sh
>
>
>
> ./configure CC=arm-none-linux-gnueabi-gcc 
> --host=arm-none-linux-gnueabi --target=arm-none-linux-gnueabi 
> --build=i686-linux --with-linux=/home/mbozier/synergy/kernel/ti 
> KARCH=arm --disable-ssl 
> CPPFLAGS=-I/home/mbozier/tirootfs/usr/inc-L/home/mbozier/tirootfs/usr/
> lib
>
>
>
> make CROSS_COMPILE="arm-none-linux-gnueabi-" ARCH="arm"
> KCC="arm-none-linux-gnueabi-gcc" GCC="arm-none-linux-gnueabi-gcc"
>
>
>
> The kernel used on the target is built without Open vSwitch support 
> and the 802.1d bridging support is configured to be loaded as a module.
>
>
>
> I also tried running the Open vSwitch kernel module built from the 
> sources distributed with the 3.3 kernel, but had no success with that either.

Is it the exact same problem on this kernel or is it a different one?

Probably the place to start is to use GDB to find exactly where it is faulting, 
based on the address in the stack trace. Is the problem reproducible?
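
One way to do this is sketched below. It turns the "symbol+offset/size" token from the "PC is at" line into a gdb "list" expression and runs gdb in batch mode against the module. The helper names (gdb_list_expr, locate_fault), the gdb binary name (derived from the cross-toolchain prefix quoted above) and the module path are assumptions for illustration, and the module needs to be built with debug information for gdb to report source lines.

```shell
#!/bin/sh
# Hypothetical helper: "ovs_flow_to_nlattrs+0x5/0x430" -> "list *(ovs_flow_to_nlattrs+0x5)".
# The "/0x430" suffix is the function size and is not part of the address.
gdb_list_expr() {
    printf 'list *(%s)\n' "${1%%/*}"
}

# Hypothetical wrapper: run the cross gdb in batch mode on a module file.
# Assumes a gdb named after the toolchain prefix is on PATH.
locate_fault() {
    module=$1   # e.g. path to openvswitch.ko (assumed location)
    token=$2    # e.g. ovs_flow_to_nlattrs+0x5/0x430
    "${CROSS_COMPILE:-arm-none-linux-gnueabi-}gdb" -batch \
        -ex "$(gdb_list_expr "$token")" "$module"
}

# Print the gdb command for the faulting PC from the first oops.
gdb_list_expr 'ovs_flow_to_nlattrs+0x5/0x430'
```

With the toolchain installed, something like "locate_fault datapath/linux/openvswitch.ko ovs_flow_to_nlattrs+0x5/0x430" should print the source lines around the faulting instruction.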
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
