Hi,

You need this fix:
https://sourceforge.net/p/tipc/mailman/message/34768934/

But it won't apply cleanly, so you need this entire series to fix all issues
related to the topology server:
https://sourceforge.net/p/tipc/mailman/message/34768927/

They were too intrusive to be pushed to net, so they were pushed to net-next
and were merged in 4.7.

/Partha

On 09/01/2016 08:43 PM, Jon Maloy wrote:
> Hi Jonas,
> I don’t think there is any such thing as a “long-term” kernel from the
> community viewpoint. But distros such as SLES or Ubuntu use this term, so I
> suspect that is what you mean. I believe the latest versions of both of those
> are based on 4.4.
> I honestly don’t know how often and on which criteria those distros pick
> upgrades from the upstream kernel, but if this is a serious problem we
> certainly have to push them to adopt a fix for this.
>
> I believe Partha will recognize this bug, and can tell whether there is a fix
> to it or not. If so he can also tell what has happened to it. If this is a
> distro specific problem we need to know which one you are using.
>
> Regards
> ///jon
>
> From: Arndt, Jonas [mailto:jonas.ar...@hpe.com]
> Sent: Thursday, 01 September, 2016 14:11
> To: Jon Maloy <jon.ma...@ericsson.com>
> Subject: Fwd: [tipc-discussion] [Kernel oops in 4.4.18]
>
> Jon,
>
> Sorry for reaching out to you directly. I have posted to the mailing list
> multiple times and I don't understand why it is getting stuck. I am a
> subscriber and got an email indicating that I can post.
>
> Cheers,
>
> // Jonas
>
> -------- Forwarded Message --------
> Subject: [tipc-discussion] [Kernel oops in 4.4.18]
> Date: Wed, 31 Aug 2016 09:11:42 -0600
> From: Jonas Arndt <jonas.ar...@hpe.com>
> To: tipc-discussion@lists.sourceforge.net
>
> Resending as it appears it didn't show up on the mailing list. Sorry for any
> duplicates....
>
> Hi Guys,
>
> My apologies if this has been covered before.
>
> I am getting this kernel null pointer when trying TIPC with 4.4.18 kernel
> (running OpenSAF). It works fine with 4.5.x. There seems to have been a
> number of patches applied to net/tipc between the versions. Why is it not
> back-ported to 4.4.x? Isn't that a longterm kernel?
>
> Thanks,
>
> // Jonas
>
> ================================================================================
> 2016-08-17T09:19:49.656792-06:00 rack13-ctrl2 kernel: [ 302.348407] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> 2016-08-17T09:19:49.656808-06:00 rack13-ctrl2 kernel: [ 302.348474] IP: [<ffffffffa0702749>] tipc_nametbl_subscribe+0x19/0x180 [tipc]
> 2016-08-17T09:19:49.656810-06:00 rack13-ctrl2 kernel: [ 302.348540] PGD 0
> 2016-08-17T09:19:49.656812-06:00 rack13-ctrl2 kernel: [ 302.348559] Oops: 0000 1 SMP
> 2016-08-17T09:19:49.656814-06:00 rack13-ctrl2 kernel: [ 302.348585] Modules linked in: tipc rpcsec_gss_krb5 nfsv4 dns_resolver ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables openvswitch nf_defrag_ipv6 nf_conntrack libcrc32c crc32c_generic nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass mgag200 ttm crc32_pclmul drm_kms_helper drm hmac drbg fb_sys_fops ansi_cprng syscopyarea aesni_intel sysfillrect aes_x86_64 sysimgblt lrw gf128mul glue_helper ablk_helper cryptd ipmi_si iTCO_wdt hpilo evdev pcspkr wmi ipmi_msghandler iTCO_vendor_support hpwdt acpi_power_meter button sb_edac ioatdma lpc_ich edac_core pcc_cpufreq mfd_core acpi_cpufreq processor autofs4 ext4 crc16 mbcache jbd2 dm_mod sg sd_mod ata_generic pata_acpi crc32c_intel psmouse ata_piix libata uhci_hcd ehci_pci ehci_hcd igb scsi_mod i2c_algo_bit i2c_core usbcore usb_common ixgbe dca mdio ptp pps_core thermal
> 2016-08-17T09:19:49.656817-06:00 rack13-ctrl2 kernel: [ 302.349237] CPU: 16 PID: 98 Comm: kworker/u130:0 Not tainted 4.4.18-tipc #1
> 2016-08-17T09:19:49.656843-06:00 rack13-ctrl2 kernel: [ 302.349278] Hardware name: HP ProLiant SL210t Gen8/, BIOS P83 11/01/2014
> 2016-08-17T09:19:49.656846-06:00 rack13-ctrl2 kernel: [ 302.349321] Workqueue: tipc_rcv tipc_recv_work [tipc]
> 2016-08-17T09:19:49.656848-06:00 rack13-ctrl2 kernel: [ 302.349354] task: ffff881ff93a5640 ti: ffff881ff93b0000 task.ti: ffff881ff93b0000
> 2016-08-17T09:19:49.656850-06:00 rack13-ctrl2 kernel: [ 302.349395] RIP: 0010:[<ffffffffa0702749>] [<ffffffffa0702749>] tipc_nametbl_subscribe+0x19/0x180 [tipc]
> 2016-08-17T09:19:49.656852-06:00 rack13-ctrl2 kernel: [ 302.349464] RSP: 0018:ffff881ff93b3cc0 EFLAGS: 00010286
> 2016-08-17T09:19:49.656853-06:00 rack13-ctrl2 kernel: [ 302.349494] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000180200017
> 2016-08-17T09:19:49.656855-06:00 rack13-ctrl2 kernel: [ 302.349534] RDX: 0000000180200018 RSI: 0000000000000200 RDI: 0000000000000000
> 2016-08-17T09:19:49.656857-06:00 rack13-ctrl2 kernel: [ 302.349573] RBP: ffff881ff93b3d00 R08: 00000000f7970601 R09: 0000000180200017
> 2016-08-17T09:19:49.656858-06:00 rack13-ctrl2 kernel: [ 302.349613] R10: ffffea003fde5c00 R11: ffff880ff7970600 R12: 0000000000000000
> 2016-08-17T09:19:49.656859-06:00 rack13-ctrl2 kernel: [ 302.349652] R13: ffff881ff54ac0a0 R14: ffff880fee6edd00 R15: ffff880ff7970200
> 2016-08-17T09:19:49.656860-06:00 rack13-ctrl2 kernel: [ 302.349692] FS: 0000000000000000(0000) GS:ffff88203f880000(0000) knlGS:0000000000000000
> 2016-08-17T09:19:49.656860-06:00 rack13-ctrl2 kernel: [ 302.349736] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 2016-08-17T09:19:49.656861-06:00 rack13-ctrl2 kernel: [ 302.349785] CR2: 0000000000000018 CR3: 0000000001a09000 CR4: 00000000001406e0
> 2016-08-17T09:19:49.656863-06:00 rack13-ctrl2 kernel: [ 302.349833] Stack:
> 2016-08-17T09:19:49.656865-06:00 rack13-ctrl2 kernel: [ 302.349853] ffffffff811a5d1b ffff880ff7970200 ffff880ff80f6000 0000000000000000
> 2016-08-17T09:19:49.656865-06:00 rack13-ctrl2 kernel: [ 302.349915] ffff880ff87898c0 ffff881ff54ac0a0 ffff880fee6edd00 ffff880ff7970200
> 2016-08-17T09:19:49.656866-06:00 rack13-ctrl2 kernel: [ 302.349976] ffff881ff93b3d48 ffffffffa070143a ffff881ff93b3d48 ffff880ff87898c8
> 2016-08-17T09:19:49.656867-06:00 rack13-ctrl2 kernel: [ 302.350037] Call Trace:
> 2016-08-17T09:19:49.656868-06:00 rack13-ctrl2 kernel: [ 302.350069] [<ffffffff811a5d1b>] ? kfree+0x13b/0x150
> 2016-08-17T09:19:49.656870-06:00 rack13-ctrl2 kernel: [ 302.350114] [<ffffffffa070143a>] tipc_subscrb_rcv_cb+0xfa/0x370 [tipc]
> 2016-08-17T09:19:49.656872-06:00 rack13-ctrl2 kernel: [ 302.350165] [<ffffffffa070d43f>] tipc_receive_from_sock+0xaf/0x100 [tipc]
> 2016-08-17T09:19:49.656874-06:00 rack13-ctrl2 kernel: [ 302.350219] [<ffffffffa070d61b>] tipc_recv_work+0x2b/0x60 [tipc]
> 2016-08-17T09:19:49.656874-06:00 rack13-ctrl2 kernel: [ 302.350266] [<ffffffff8107bad8>] process_one_work+0x158/0x420
> 2016-08-17T09:19:49.656875-06:00 rack13-ctrl2 kernel: [ 302.350310] [<ffffffff8107c529>] worker_thread+0x69/0x480
> 2016-08-17T09:19:49.656876-06:00 rack13-ctrl2 kernel: [ 302.350351] [<ffffffff8107c4c0>] ? rescuer_thread+0x310/0x310
> 2016-08-17T09:19:49.656877-06:00 rack13-ctrl2 kernel: [ 302.350395] [<ffffffff810818cb>] kthread+0xdb/0x100
> 2016-08-17T09:19:49.656879-06:00 rack13-ctrl2 kernel: [ 302.350434] [<ffffffff810817f0>] ? kthread_park+0x60/0x60
> 2016-08-17T09:19:49.656880-06:00 rack13-ctrl2 kernel: [ 302.350487] [<ffffffff815575cf>] ret_from_fork+0x3f/0x70
> 2016-08-17T09:19:49.656881-06:00 rack13-ctrl2 kernel: [ 302.350528] [<ffffffff810817f0>] ? kthread_park+0x60/0x60
> 2016-08-17T09:19:49.656882-06:00 rack13-ctrl2 kernel: [ 302.350567] Code: 41 5c 41 5d 41 5e 41 5f 5d c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 18 <48> 8b 47 18 8b 5f 08 48 8b 90 e0 12 00 00 8b 05 27 ff 00 00 83
> 2016-08-17T09:19:49.656883-06:00 rack13-ctrl2 kernel: [ 302.350870] RIP [<ffffffffa0702749>] tipc_nametbl_subscribe+0x19/0x180 [tipc]
> 2016-08-17T09:19:49.656884-06:00 rack13-ctrl2 kernel: [ 302.352594] RSP <ffff881ff93b3cc0>
> 2016-08-17T09:19:49.656886-06:00 rack13-ctrl2 kernel: [ 302.354220] CR2: 0000000000000018
> 2016-08-17T09:19:49.656888-06:00 rack13-ctrl2 kernel: [ 302.355816] --[ end trace 3bc92e0fb0a9c178 ]--
> 2016-08-17T09:19:49.656894-06:00 rack13-ctrl2 kernel: [ 302.362309] BUG: unable to handle kernel paging request at ffffffffffffffd8
> 2016-08-17T09:19:49.670952-06:00 rack13-ctrl2 osafntfd[1776]: Started
> 2016-08-17T09:19:57.670994-06:00 rack13-ctrl2 osafntfd[1776]: MDTM:TIPC Failed to connect to topology server in mdtm_check_for_endianness err :Connection timed out
> 2016-08-17T09:19:57.671340-06:00 rack13-ctrl2 osafntfd[1776]: ER ncs_core_agents_startup FAILED
> 2016-08-17T09:19:57.671695-06:00 rack13-ctrl2 osafntfd[1776]: ncs_sel_obj_rmv_ind: recv failed - Socket operation on non-socket, raise_obj: 0 rmv_obj: 0
> 2016-08-17T09:19:57.671935-06:00 rack13-ctrl2 osafntfd[1776]: osaf_abort(-1) called from 0x7f3fca8d8938 with errno=88
> 2016-08-17T09:19:57.693637-06:00 rack13-ctrl2 osafclmd[1783]: Started
> 2016-08-17T09:20:05.695009-06:00 rack13-ctrl2 osafclmd[1783]: MDTM:TIPC Failed to connect to topology server in mdtm_check_for_endianness err :Connection timed out
> 2016-08-17T09:20:05.695408-06:00 rack13-ctrl2 osafclmd[1783]: ER clms_init failed
> 2016-08-17T09:20:05.695678-06:00 rack13-ctrl2 osafclmd[1783]: ER Failed, exiting...
> 2016-08-17T09:20:05.695932-06:00 rack13-ctrl2 opensafd[1699]: ER Failed #012 DESC:CLMD
> 2016-08-17T09:20:05.696303-06:00 rack13-ctrl2 opensafd[1699]: ER Going for recovery
> ====================================================================================