On Fri, Aug 5, 2016 at 6:12 PM, Chris Mason <c...@fb.com> wrote:
>
> On 08/05/2016 07:08 AM, Nikolay Borisov wrote:
>> Hello,
>>
>> Any ideas how come btrfs_path can be all zero, the one in
>> the first slot comes from the increment in btrfs_next_old_item.
>
> Thanks for all the extra details. It really must be this:
>
>         if (ret > 0) {
>                 btrfs_release_path(path);
>                 ret = btrfs_uuid_iter_rem(root, uuid, key.type,
>                                           subid_cpu);
>                 if (ret == 0) {
>                         /*
>                          * this might look inefficient, but the
>                          * justification is that it is an
>                          * exception that check_func returns 1,
>                          * and that in the regular case only one
>                          * entry per UUID exists.
>                          */
>                         goto again_search_slot;
>                 }
>                 if (ret < 0 && ret != -ENOENT)
>                         goto out;
>         }
>         item_size -= sizeof(subid_le);
>         offset += sizeof(subid_le);
>
>
> We've released the path, which would explain why its full of NULL. ret
> was ENOENT, so it kept on going, and we fell through to
> btrfs_next_item()
>
> Once the path is released, we should either be searching again or
> exiting. A goto again_search_slot would probably fix it, but I'd want
> to also bump the key so we don't just process the same item over and
> over again.
>
> Can you reproduce this reliably? I'd hate to patch it now and make more
> problems later just because we didn't fully understand the items we were
> tripping over.
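
To make the suspected fall-through concrete, here is a small userspace model
of that branch. It is only a sketch, not kernel code: struct model_path,
model_search_slot(), model_release_path(), model_next_item() and
model_iteration() are made-up stand-ins for btrfs_path, the slot search,
btrfs_release_path(), btrfs_next_item() and one pass of the loop. The -EINVAL
return is an assumption that just marks the spot where the kernel would touch
a released path. The "fixed" flag models the idea of bumping the key and
searching again; how the real key should be bumped is not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_LEVEL 8

/* Hypothetical stand-in for struct btrfs_path: node pointers and slots only. */
struct model_path {
	const void *nodes[MODEL_MAX_LEVEL];
	int slots[MODEL_MAX_LEVEL];
};

static const int model_leaf = 0;

/* Models a successful slot search: the path now points at a leaf. */
static void model_search_slot(struct model_path *p, unsigned long long key)
{
	(void)key;		/* the key value itself is not modelled */
	p->nodes[0] = &model_leaf;
	p->slots[0] = 0;
}

/* Models btrfs_release_path(): every node pointer ends up NULL. */
static void model_release_path(struct model_path *p)
{
	assert(p != NULL);
	memset(p, 0, sizeof(*p));
}

/*
 * Models btrfs_next_item(). Returning -EINVAL is an assumption of this
 * sketch; it stands for "a released path was used".
 */
static int model_next_item(struct model_path *p)
{
	if (p->nodes[0] == NULL)
		return -EINVAL;
	p->slots[0]++;
	return 0;
}

/*
 * One pass through the branch taken when check_func returned > 0.
 * rem_ret is the value the removal helper is assumed to return.
 * fixed == 0 models the quoted code, fixed != 0 models "bump the key
 * and search again" before continuing.
 */
static int model_iteration(int rem_ret, int fixed, unsigned long long *key)
{
	struct model_path path;

	model_search_slot(&path, *key);
	model_release_path(&path);

	if (rem_ret == 0) {
		/* goto again_search_slot: the path is valid again */
		model_search_slot(&path, *key);
		return model_next_item(&path);
	}
	if (rem_ret != -ENOENT)
		return rem_ret;	/* goto out */

	/* rem_ret == -ENOENT: the quoted code just keeps going */
	if (fixed) {
		(*key)++;
		model_search_slot(&path, *key);
	}
	return model_next_item(&path);
}
```

With rem_ret = -ENOENT and fixed = 0 the model reaches model_next_item() with
an empty path; with fixed = 1 it searches again with the next key first, so
the path is valid by the time it is used.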

Well, there are 2 things I can do:

a) Dig more into the crash dump to see whether ret has been saved to the
stack and extract the return value. If your theory is correct I should
see the value of -ENOENT.

b) Patch the code to print a warning when btrfs_uuid_iter_rem returns
-ENOENT; that way we will at least know that this is happening.

In either case this would take me until at least next week, at which
time I should be able to give more information.

>
> -chris