Is the assert actually being hit?  I thought the contents were messed 
up?  If the assert is not being hit, then removing it won't help.  If it 
is, then removing the assert may be the right thing to do.

Daniel
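
For illustration, here is a minimal sketch of what "removing the assert"
could look like: the release path logs the bad owner and returns an error
instead of aborting. Everything in it (struct owner, table_for(),
release_owner(), the STRICT_OWNER_CHECKS switch) is a hypothetical
stand-in and not the actual nfs-ganesha code. It only avoids the abort; if
the owner record really is corrupted, the corruption is still there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are not the nfs-ganesha definitions. */
enum owner_type { OWNER_LOCK = 1, OWNER_OPEN = 2 };

struct owner {
	int type;      /* plays the role of so_type */
	int refcount;  /* plays the role of so_refcount */
};

struct table { const char *name; };

static struct table lock_table = { "lock owners" };
static struct table open_table = { "open owners" };

/* Map an owner to its table; NULL when the type field is not recognised. */
static struct table *table_for(const struct owner *o)
{
	switch (o->type) {
	case OWNER_LOCK:
		return &lock_table;
	case OWNER_OPEN:
		return &open_table;
	default:
		return NULL;
	}
}

/* Drop one reference. Returns 0 normally, -1 if the owner looks corrupted. */
static int release_owner(struct owner *o)
{
	struct table *t;

	if (--o->refcount != 0)
		return 0;	/* other references remain */

	t = table_for(o);
	if (t == NULL) {
		fprintf(stderr, "Unexpected owner %p (type %d)\n",
			(void *)o, o->type);
#ifdef STRICT_OWNER_CHECKS
		assert(t != NULL);	/* opt-in abort for debug builds */
#endif
		return -1;	/* log and bail out instead of aborting */
	}

	/* A real implementation would remove the owner from t and free it. */
	return 0;
}
```

Without STRICT_OWNER_CHECKS defined, an owner with an unknown type makes
release_owner() return -1 and the process keeps running.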

On 07/07/2016 03:09 AM, Swen Schillig wrote:
> Any news, anyone?
>
> As I suggested during our last call, how about removing the assert() as
> a quick "fix" ?
> At least that's how the same situation is handled in get_state_owner().
>
> Cheers Swen
>
> On Di, 2016-07-05 at 16:59 +0200, Swen Schillig wrote:
>> On Di, 2016-07-05 at 09:58 -0400, Daniel Gryniewicz wrote:
>>>
>>> This sounds like an extra ref is being released.  You can check this
>>> by looking at the refcount in dec_state_owner_ref() and seeing if it's
>>> negative.  I did a quick once-over of the callpath involved, and I
>>> didn't see any obvious refcount issues, but this is probably a
>>> use-after-free due to refcounting.
>>>
>>> Daniel
>> I think it's a use-after-free as well, but seemingly the refcount is
>> not protecting us from it: it can't be negative, because the preceding
>> check should have caught that.
>>
>>
>> .....
>>
>>      refcount = atomic_dec_int32_t(&owner->so_refcount);
>>
>>      if (refcount != 0) {
>>              if (str_valid)
>>                      LogFullDebug(COMPONENT_STATE,
>>                                   "Decrement refcount now=%" PRId32 " {%s}",
>>                                   refcount, str);
>>
>>              assert(refcount > 0);
>>
>>              return;
>>      }
>>
>>      ht_owner = get_state_owner_hash_table(owner);
>>
>>      if (ht_owner == NULL) {
>>              if (!str_valid)
>>                      display_printf(&dspbuf, "Invalid owner %p", owner);
>>
>>              LogCrit(COMPONENT_STATE, "Unexpected owner {%s}", str);
>>
>>              assert(ht_owner);
>>
>>              return;
>>      }
>>
>>
>>
>>
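
To make the refcount reasoning above concrete, here is a small sketch of
the three outcomes a decrement can have. The names (struct owner,
owner_unref(), classify_unref()) are assumptions made up for this example
and C11 atomics stand in for atomic_dec_int32_t(); it is not the
nfs-ganesha implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical owner record; only the reference count matters here. */
struct owner {
	_Atomic int32_t refcount;
};

/* Atomically drop one reference and return the NEW count. */
static int32_t owner_unref(struct owner *o)
{
	/* atomic_fetch_sub() returns the old value, so subtract 1 again. */
	return atomic_fetch_sub(&o->refcount, 1) - 1;
}

/*
 * Classify the value returned by owner_unref():
 *    1  other references remain, nothing more to do
 *    0  last reference dropped, caller may tear the owner down
 *   -1  count went negative, i.e. one release too many
 */
static int classify_unref(int32_t new_count)
{
	if (new_count > 0)
		return 1;
	if (new_count == 0)
		return 0;
	return -1;
}
```

A check like this only catches an extra release while the memory still
holds the counter. Once the record has been freed and the memory reused,
the value read back is arbitrary and may well look valid.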
>>>
>>>
>>> On 07/04/2016 04:15 AM, Swen Schillig wrote:
>>>>
>>>>
>>>> I'm struggling with an abort triggered by assert(ht_owner) in
>>>> function dec_state_owner_ref().
>>>>
>>>> The call path is pretty short but I'm still not getting to the
>>>> root cause of the issue.
>>>>
>>>> The call path is
>>>>
>>>>     nfs4_op_release_lockowner()
>>>>         -> create_nfs4_owner()
>>>>             -> get_state_owner()
>>>>                 -> get_state_owner_hash_table()
>>>>                     -> compute()
>>>>                     # we're here so our "assembled" owner record must be "OK"
>>>>
>>>>         -> release_lock_owner(..the one which was created one line above..)
>>>>         -> dec_state_owner_ref(...still the owner from above..)
>>>>             -> get_state_owner_hash_table() =>>>>>> BANG !!! because the owner->so_type is totally off
>>>>
>>>> It seems the owner record fetched from the hash-table is corrupted,
>>>> but I'm not really sure how this hash-table / owner stuff is
>>>> supposed to work.
>>>>
>>>> Could someone please shed some light on this?
>>>>
>>>> Any support in this direction would be greatly appreciated.
>>>>
>>>> Cheers Swen
>>>>
>>>> P.S.: Here's the BT
>>>>
>>>>
>>>> Program terminated with signal 6, Aborted.
>>>> #0  0x00007fac3456b5f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>>>> 56        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
>>>> Missing separate debuginfos, use: debuginfo-install gpfs.smb-4.3.0_gpfs_9-6.el7.x86_64
>>>> (gdb) bt full
>>>> #0  0x00007fac3456b5f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>>>>         resultvar = 0
>>>>         pid = 4895
>>>>         selftid = 5042
>>>> #1  0x00007fac3456cce8 in __GI_abort () at abort.c:90
>>>>         save_stage = 2
>>>>         act = {__sigaction_handler = {sa_handler = 0x7fff4b1aee87, sa_sigaction = 0x7fff4b1aee87}, sa_mask = {
>>>>             __val = {140377590535088, 5805400, 1033, 8617040, 140377589178051, 4, 140376427389568, 4011254720,
>>>>               5466682, 140376427389688, 0, 0, 0, 21474836480, 140377629331456, 140377590547080}},
>>>>           sa_flags = 5807401, sa_restorer = 0x589fa0 <__PRETTY_FUNCTION__.21651>}
>>>>         sigs = {__val = {32, 0 <repeats 15 times>}}
>>>> #2  0x00007fac34564566 in __assert_fail_base (fmt=0x7fac346b4288 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
>>>>     assertion=assertion@entry=0x589d29 "ht_owner",
>>>>     file=file@entry=0x589558 "/home/ppsbld/bmd1.160602.151918/bmd1.ganesha-rpmdir/BUILD/nfs-ganesha-2.3.2-ibm17-0.1.1-Source/SAL/state_misc.c", line=line@entry=1033,
>>>>     function=function@entry=0x589fa0 <__PRETTY_FUNCTION__.21651> "dec_state_owner_ref") at assert.c:92
>>>>         str = 0x7fa9e4066fe0 "x"
>>>>         total = 4096
>>>> #3  0x00007fac34564612 in __GI___assert_fail (assertion=0x589d29 "ht_owner",
>>>>     file=0x589558 "/home/ppsbld/bmd1.160602.151918/bmd1.ganesha-rpmdir/BUILD/nfs-ganesha-2.3.2-ibm17-0.1.1-Source/SAL/state_misc.c", line=1033, function=0x589fa0 <__PRETTY_FUNCTION__.21651> "dec_state_owner_ref") at assert.c:101
>>>> No locals.
>>>> #4  0x00000000004be5b4 in dec_state_owner_ref (owner=0x7faa7c03d6d0)
>>>>     at /usr/src/debug/nfs-ganesha-2.3.2-ibm17-0.1.1-Source/SAL/state_misc.c:1033
>>>>         str = "INVALID STATE OWNER TYPE
>>>> powner=0x7faa7c03d6d0\000\000\320\350\026\357\253\177\000\000v\03
>>>> 3Q
>>>> \000\000\000\000\000\340\350\026\357\253\177\000\000\220}\t\344\2
>>>> 51
>>>> \177\000\000\200`\000\024\252\177\000\000\240\355\026\357\253\177
>>>> \0
>>>> 00\000\065\321C\000\000\000\000\000\247\r\264\064@\000\000\000\25
>>>> 0\
>>>> 300\a\344\251\177\000\000\000\253\005\344\251\177\000\000
>>>> \355\026\357\253\177\000\000\371\003M\000\000\000\000\000\060\345
>>>> \0
>>>> 26\357\253\177\000\000p\365\006\254\251\177\000\000\065\321C\000\
>>>> 00
>>>> 0\000\000\000\000\253\005\344@\000\000\000\250\300\a\344\251\177\
>>>> 00
>>>> 0\000\000\253\005\344\251\177\000\000\210\300\a\344\251\177\000\0
>>>> 00
>>>> \000"...
>>>>         dspbuf = {b_size = 2048, b_current = 0x7fabef16e48e "",
>>>>           b_start = 0x7fabef16e460 "INVALID STATE OWNER TYPE powner=0x7faa7c03d6d0"}
>>>>         str_valid = false
>>>>         latch = {locator = 0x7fabef16e880, rbt_hash = 140367651405856, index = 4011256960}
>>>>         rc = HASHTABLE_SUCCESS
>>>>         buffkey = {addr = 0x800, len = 140366712337776}
>>>>         old_value = {addr = 0x7fabef16ec80, len = 5022057}
>>>>         old_key = {addr = 0x7fabef16ec50, len = 140370201794264}
>>>>         refcount = 0
>>>>         ht_owner = 0x0
>>>>         __func__ = "dec_state_owner_ref"
>>>>         __PRETTY_FUNCTION__ = "dec_state_owner_ref"
>>>> #5  0x000000000047cd07 in nfs4_op_release_lockowner (op=0x7fab905f8fb0, data=0x7fabef16ed90, resp=0x7fa9e403e270)
>>>>     at /usr/src/debug/nfs-ganesha-2.3.2-ibm17-0.1.1-Source/Protocols/NFS/nfs4_op_release_lockowner.c:129
>>>>         arg_RELEASE_LOCKOWNER4 = 0x7fab905f8fb8
>>>>         res_RELEASE_LOCKOWNER4 = 0x7fa9e403e278
>>>>         nfs_client_id = 0x7fa9ac06f570
>>>>         lock_owner = 0x7faa7c03d6d0
>>>>         owner_name = {son_owner_len = 20, son_owner_val = 0x7fab903cfba0 "lock id:"}
>>>>         rc = 0
>>>>         __func__ = "nfs4_op_release_lockowner"
>>>> #6  0x0000000000460467 in nfs4_Compound (arg=0x7fab9056cf50, req=0x7fab9056cd90, res=0x7fa9e4058400)
>>>>     at /usr/src/debug/nfs-ganesha-2.3.2-ibm17-0.1.1-Source/Protocols/NFS/nfs4_Compound.c:710
>>>>         i = 0
>>>>         status = 0
>>>>         data = {currentFH = {nfs_fh4_len = 0, nfs_fh4_val = 0x0}, savedFH = {nfs_fh4_len = 0, nfs_fh4_val = 0x0},
>>>>           current_stateid = {seqid = 0, other = '\000' <repeats 11 times>}, current_stateid_valid = false,
>>>>           saved_stateid = {seqid = 0, other = '\000' <repeats 11 times>}, saved_stateid_valid = false,
>>>>           minorversion = 0, current_entry = 0x0, saved_entry = 0x0, current_ds = 0x0, saved_ds = 0x0,
>>>>           current_filetype = NO_FILE_TYPE, saved_filetype = NO_FILE_TYPE, saved_export = 0x0, saved_export_perms = {
>>>>             anonymous_uid = 0, anonymous_gid = 0, options = 0, set = 0}, req = 0x7fab9056cd90, credential = {
>>>>             flavor = 1, length = 36, auth_union = {auth_unix = {aup_time = 26637285, aup_machname = 0x0,
>>>>                 aup_uid = 0, aup_gid = 0, aup_len = 0, aup_gids = 0x0}, auth_gss = {svc = 26637285, qop = 0,
>>>>                 gss_context_id = 0x0}}}, preserved_clientid = 0x0, cached_res = 0x0, use_drc = false, oppos = 0,
>>>>           session = 0x0, sequence = 0, slot = 0}
>>>>         opcode = NFS4_OP_RELEASE_LOCKOWNER
>>>>         compound4_minor = 0
>>>>         argarray_len = 1
>>>>         argarray = 0x7fab905f8fb0
>>>>         resarray = 0x7fa9e403e270
>>>>         op_start_time = 50162982679195
>>>>         ts = {tv_sec = 1466665256, tv_nsec = 618484482}
>>>>         perm_flags = 0
>>>>         tagname = 0x0
>>>>         __func__ = "nfs4_Compound"
>>>> #7  0x0000000000444dbf in nfs_rpc_execute (reqdata=0x7fab9056cd60)
>>>>     at /usr/src/debug/nfs-ganesha-2.3.2-ibm17-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1288
>>>>         client_ip = 0x7fab6c0009a0 "::ffff:10.31.39.10"
>>>>         progname = 0x556e03 "NFS"
>>>>         reqdesc = 0x555f70 <nfs4_func_desc+48>
>>>>         arg_nfs = 0x7fab9056cf50
>>>>         xprt = 0x7fab90244ca0
>>>>         res_nfs = 0x7fa9e4058400
>>>>         export_perms = {anonymous_uid = 4294967294, anonymous_gid = 4294967294, options = 0, set = 0}
>>>>         user_credentials = {caller_uid = 4294967294, caller_gid = 4294967294, caller_glen = 0, caller_garray = 0x0}
>>>>         req_ctx = {creds = 0x7fabef16f3f0, original_creds = {caller_uid = 0, caller_gid = 0, caller_glen = 0,
>>>>             caller_garray = 0x0}, caller_gdata = 0x0, caller_garray_copy = 0x0, managed_garray_copy = 0x0,
>>>>           cred_flags = 0, caller_addr = 0x7fab90244d98, clientid = 0x0, nfs_vers = 4, nfs_minorvers = 0,
>>>>           req_type = 2, client = 0x7fab6c000908, export = 0x0, fsal_export = 0x0, export_perms = 0x7fabef16f410,
>>>>           start_time = 50162982670723, queue_wait = 46214, fsal_private = 0x0, fsal_module = 0x0,
>>>>           fsal_pnfs_ds = 0x0}
>>>>         dpq_status = DUPREQ_SUCCESS
>>>>         timer_start = {tv_sec = 1466665256, tv_nsec = 618476010}
>>>>         auth_rc = AUTH_OK
>>>>         port = 968
>>>>         protocol_options = 2097152
>>>>         rc = 0
>>>>         exportid = -1
>>>>         slocked = false
>>>>         __func__ = "nfs_rpc_execute"
>>>> #8  0x0000000000445701 in worker_run (ctx=0x4f254e0)
>>>>     at /usr/src/debug/nfs-ganesha-2.3.2-ibm17-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1548
>>>>         worker_data = 0x4f254e0
>>>>         reqdata = 0x7fab9056cd60
>>>>         __func__ = "worker_run"
>>>> #9  0x000000000051f2d2 in fridgethr_start_routine (arg=0x4f254e0)
>>>>     at /usr/src/debug/nfs-ganesha-2.3.2-ibm17-0.1.1-Source/support/fridgethr.c:561
>>>>         fe = 0x4f254e0
>>>>         fr = 0x4efdf30
>>>>         reschedule = false
>>>>         rc = 0
>>>>         old_type = 0
>>>>         old_state = 0
>>>>         __PRETTY_FUNCTION__ = "fridgethr_start_routine"
>>>>         __func__ = "fridgethr_start_routine"
>>>> #10 0x00007fac34f6cdc5 in start_thread (arg=0x7fabef170700) at pthread_create.c:308
>>>>         __res = <optimized out>
>>>>         pd = 0x7fabef170700
>>>>         now = <optimized out>
>>>>         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140376427398912, 3136990464326405555, 0, 140376427399616,
>>>>                 140376427398912, 0, -3107754114923721293, -3111778601013572173}, mask_was_saved = 0}}, priv = {
>>>>             pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
>>>>         not_first_call = <optimized out>
>>>>         pagesize_m1 = <optimized out>
>>>>         sp = <optimized out>
>>>>         freesize = <optimized out>
>>>> #11 0x00007fac3462c28d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
>>>> Francisco, CA to explore cutting-edge tech and listen to tech luminaries
>>>> present their vision of the future. This family event has something for
>>>> everyone, including kids. Get more information and register today.
>>>> http://sdm.link/attshape
>>>> _______________________________________________
>>>> Nfs-ganesha-devel mailing list
>>>> Nfs-ganesha-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>>>>


