Both of these functions are pretty innocuous, don't work with shared data, and shouldn't be architecture-specific. Furthermore, the fact that the problem remains essentially the same but moves around between versions indicates to me that the issue isn't with the code itself.
It sounds to me like there is a larger issue with corruption, either by something else in memory, running off the end of the stack, etc. That is obviously difficult to track down, but it might explain why the problem appears to be specific to your environment (I've never heard a report of this before).

On Thu, Oct 3, 2013 at 1:38 AM, Michele Bozier <mboz...@airspan.com> wrote:
> Jesse,
>
> Many thanks for your suggestions.
> For the openvswitch.ko module built from the Open vSwitch git repository, the
> line of code causing the kernel oops appears to be the following in method
> ovs_flow_to_nlattrs():
>     if (nla_put_u32(skb, OVS_KEY_ATTR_PRIORITY, output->phy.priority))
>         goto nla_put_failure;
> This is totally repeatable - happens every time.
>
> For the openvswitch.ko module built from the kernel 3.3 sources, the problem
> is different, but again totally repeatable.
> The Kernel oops is as follows:
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
> pgd = de3e8000
> [00000000] *pgd=9e3c7831, *pte=00000000, *ppte=00000000
> Internal error: Oops: 817 [#1] PREEMPT
> Modules linked in:
> CPU: 0 Not tainted (3.3.0 #1)
> PC is at ovs_flow_tbl_alloc+0x4e/0x94
> LR is at ovs_flow_tbl_alloc+0x4b/0x94
> pc : [<c027f5d6>] lr : [<c027f5d3>] psr: 80000033
> sp : de277c50 ip : 6c6c6c6c fp : 00000000
> r10: 00000004 r9 : de33bf10 r8 : de32f280
> r7 : 00000000 r6 : de342000 r5 : 00000400 r4 : 00000002
> r3 : de342000 r2 : 00000000 r1 : 00000002 r0 : 00000000
> Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment user
> Control: 50c5387d Table: 9e3e8019 DAC: 00000015
> Process ovs-vswitchd (pid: 454, stack limit = 0xde2762e8)
> Stack: (0xde277c50 to 0xde278000)
> …
> [<c027f5d6>] (ovs_flow_tbl_alloc+0x4e/0x94) from [<c027eda5>] (ovs_dp_cmd_new+0x51/0x130)
> [<c027eda5>] (ovs_dp_cmd_new+0x51/0x130) from [<c01c9347>] (genl_rcv_msg+0x15f/0x17c)
> [<c01c9347>] (genl_rcv_msg+0x15f/0x17c) from [<c01c8c39>] (netlink_rcv_skb+0x65/0x70)
> [<c01c8c39>] (netlink_rcv_skb+0x65/0x70) from [<c01c91df>] (genl_rcv+0x17/0x20)
> [<c01c91df>] (genl_rcv+0x17/0x20) from [<c01c888f>] (netlink_unicast+0x117/0x150)
> [<c01c888f>] (netlink_unicast+0x117/0x150) from [<c01c8ab1>] (netlink_sendmsg+0x185/0x1cc)
> [<c01c8ab1>] (netlink_sendmsg+0x185/0x1cc) from [<c018f08b>] (sock_sendmsg+0x5f/0x74)
>
> In this case, the line of code causing the problem in ovs_flow_tbl_alloc() is
>     table->buckets = alloc_buckets(new_size);
>
> When I tried to put a printk to dump the new_size property in this second
> scenario then the problem moved again.
> What else can I try?
> Regards
> Michele Bozier
>
>
> -----Original Message-----
> From: Jesse Gross [mailto:je...@nicira.com]
> Sent: 01 October 2013 20:35
> To: Michele Bozier
> Cc: discuss@openvswitch.org
> Subject: Re: [ovs-discuss] Kernel oops running Open vSwitch on 3.3 Kernel (ARM)
>
> On Tue, Oct 1, 2013 at 2:25 AM, Michele Bozier <mboz...@airspan.com> wrote:
>> I am having trouble running Open vSwitch on the ARM platform after
>> cross-compiling on an i686 platform. I am using the latest code from
>> master from the Open vSwitch git repository - commit Sept 26th
>> (6a8a8528acb05d6d0a520e09ad1ec67e62b99e5e) and the Arago Kernel 3.3.
>>
>> The problem I am seeing when running on the target and trying to
>> create a switch is as follows:
>>
>> insmod ./openvswitch.ko
>>
>> The module seems to install fine - on the console I get
>>
>> openvswitch: Open vSwitch switching datapath 2.0.90, built Sep 30 2013 11:33:05
>>
>> ./ovsdb-tool create /usr/local/etc/openvswitch/conf.db ./vswitch.ovsschema
>> ./ovsdb-server --remote=ptcp:6634 --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile=/home/opf/server.pid --detach
>> ./ovs-vsctl --db=tcp:127.0.0.1:6634 --no-wait init
>> ./ovs-vswitchd tcp:127.0.0.1:6634 --pidfile=/home/opf/switch.pid --log-file=/home/opf/switch.log --detach
>>
>> On the console I see the following:
>>
>> 1970-01-01T00:01:15Z|00001|vlog|INFO|opened log file /home/opf/switch.log
>> 1970-01-01T00:01:15Z|00002|reconnect|INFO|tcp:127.0.0.1:6634: connecting...
>> 1970-01-01T00:01:15Z|00003|reconnect|INFO|tcp:127.0.0.1:6634: connected
>>
>> I then enter the command to create a switch
>> ./ovs-vsctl --db=tcp:127.0.0.1:6634 add-br opfbr
>>
>> I get the following output to the console
>>
>> device: 'ovs-system': device_add
>> device ovs-system entered promiscuous mode
>> device: 'opfbr0': device_add
>> device opfbr0 entered promiscuous mode
>>
>> Followed shortly afterwards by a kernel oops.
>>
>> [root@synergy opf]# Unable to handle kernel paging request at virtual address 8d10051d
>> pgd = dd840000
>> [8d10051d] *pgd=00000000
>> Internal error: Oops: 5 [#1] PREEMPT
>> Modules linked in: openvswitch(O)
>> CPU: 0 Tainted: G O (3.3.0 #7)
>> PC is at ovs_flow_to_nlattrs+0x5/0x430 [openvswitch]
>> LR is at ovs_flow_cmd_fill_info+0x114/0x208 [openvswitch]
>> pc : [<bf80524e>] lr : [<bf801669>] psr: 80000033
>> sp : de273c30 ip : 00000058 fp : 00000018
>> r10: de36e540 r9 : 0001fffb r8 : dd8b8000
>> r7 : 00000013 r6 : 000001cd r5 : dd8b8088 r4 : 00000070
>> r3 : 00000000 r2 : de36e540 r1 : 8d100505 r0 : 0002001b
>> Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment user
>> Control: 50c5387d Table: 9d840019 DAC: 00000015
>> Process ovs-vswitchd (pid: 461, stack limit = 0xde2722e8)
>> Stack: (0xde273c30 to 0xde274000)
>> ...
>> [<bf80524e>] (ovs_flow_to_nlattrs+0x5/0x430 [openvswitch]) from [<bf801669>] (ovs_flow_cmd_fill_info+0x114/0x208 [openvswitch])
>> [<bf801669>] (ovs_flow_cmd_fill_info+0x114/0x208 [openvswitch]) from [<bf80179f>] (ovs_flow_cmd_dump+0x42/0x7c [openvswitch])
>> [<bf80179f>] (ovs_flow_cmd_dump+0x42/0x7c [openvswitch]) from [<c01c90fb>] (netlink_dump+0x3b/0x130)
>> [<c01c90fb>] (netlink_dump+0x3b/0x130) from [<c01c9983>] (netlink_dump_start+0xc7/0x108)
>> [<c01c9983>] (netlink_dump_start+0xc7/0x108) from [<c01cb069>] (genl_rcv_msg+0xc1/0x17c)
>> [<c01cb069>] (genl_rcv_msg+0xc1/0x17c) from [<c01ca9f9>] (netlink_rcv_skb+0x65/0x70)
>> [<c01ca9f9>] (netlink_rcv_skb+0x65/0x70) from [<c01caf9f>] (genl_rcv+0x17/0x20)
>> [<c01caf9f>] (genl_rcv+0x17/0x20) from [<c01ca64f>] (netlink_unicast+0x117/0x150)
>> [<c01ca64f>] (netlink_unicast+0x117/0x150) from [<c01ca871>] (netlink_sendmsg+0x185/0x1cc)
>> [<c01ca871>] (netlink_sendmsg+0x185/0x1cc) from [<c0190e4b>] (sock_sendmsg+0x5f/0x74)
>> [<c0190e4b>] (sock_sendmsg+0x5f/0x74) from [<c01921c1>] (sys_sendto+0x6d/0x80)
>> [<c01921c1>] (sys_sendto+0x6d/0x80) from [<c01921e3>] (sys_send+0xf/0x14)
>> [<c01921e3>] (sys_send+0xf/0x14) from [<c000c521>] (ret_fast_syscall+0x1/0x46)
>> Code: bf00 e92d 47f0 b086 (698f) ab06
>> ---[ end trace c6309ab77c3d706d ]---
>>
>> The process I followed to cross-compile the code base is as follows:
>>
>> ./boot.sh
>>
>> ./configure CC=arm-none-linux-gnueabi-gcc --host=arm-none-linux-gnueabi --target=arm-none-linux-gnueabi --build=i686-linux --with-linux=/home/mbozier/synergy/kernel/ti KARCH=arm --disable-ssl CPPFLAGS=-I/home/mbozier/tirootfs/usr/inc-L/home/mbozier/tirootfs/usr/lib
>>
>> make CROSS_COMPILE="arm-none-linux-gnueabi-" ARCH="arm" KCC="arm-none-linux-gnueabi-gcc" GCC="arm-none-linux-gnueabi-gcc"
>>
>> The kernel used on the target is built without Open vSwitch support
>> and the 802.1d bridging support is configured to be loaded as a module.
>>
>> I also tried running the OpenvSwitch kernel module built from the
>> sources distributed with the 3.3 kernel but with no success either.
>
> Is it the exact same problem on this kernel or is a different one?
>
> Probably the place to start is to use GDB to find exactly where it is
> faulting, based on the address in the stack trace. Is the problem
> reproducible?
> _______________________________________________
> discuss mailing list
> discuss@openvswitch.org
> http://openvswitch.org/mailman/listinfo/discuss
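
As a rough sanity check on the "running off the end of the stack" idea, the numbers already printed in the oops can be decoded. The following is only a sketch in Python, not part of Open vSwitch or the kernel: the function name summarize_oops and the dictionary keys are made up for illustration, and the input is the excerpt from the 3.3-sources oops quoted above. It pulls out the faulting symbol and offset in the form gdb's "list" command accepts, and compares sp against the reported stack limit and the top of the stack dump.

```python
import re

# Excerpt copied from the oops for the module built from the 3.3 sources.
OOPS = """\
PC is at ovs_flow_tbl_alloc+0x4e/0x94
sp : de277c50 ip : 6c6c6c6c fp : 00000000
Process ovs-vswitchd (pid: 454, stack limit = 0xde2762e8)
Stack: (0xde277c50 to 0xde278000)
"""


def summarize_oops(text):
    """Hypothetical helper: decode fault location and stack usage from an ARM oops."""
    pc = re.search(r"PC is at (\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)", text)
    sp = re.search(r"\bsp : ([0-9a-f]{8})", text)
    limit = re.search(r"stack limit = (0x[0-9a-f]+)", text)
    top = re.search(r"Stack: \(0x[0-9a-f]+ to (0x[0-9a-f]+)\)", text)
    if not (pc and sp and limit and top):
        raise ValueError("oops text is missing an expected field")

    sp_val = int(sp.group(1), 16)
    return {
        # Expression to hand to gdb, e.g. "list *(symbol+0xoffset)".
        "gdb_location": f"*({pc.group(1)}+{pc.group(2)})",
        # Bytes between the top of the kernel stack and sp at fault time.
        "stack_used": int(top.group(1), 16) - sp_val,
        # Bytes between sp and the reported stack limit.
        "stack_headroom": sp_val - int(limit.group(1), 16),
    }


if __name__ == "__main__":
    info = summarize_oops(OOPS)
    print("gdb: list", info["gdb_location"])
    print("stack used at fault:", info["stack_used"], "bytes")
    print("headroom above stack limit:", info["stack_headroom"], "bytes")
```

For this excerpt the sketch reports 944 bytes in use and 6504 bytes of headroom, so sp was well inside the stack when the fault was taken. That only describes the moment of the oops; it does not rule out an earlier overrun or corruption coming from somewhere else in memory.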