hi,
      I think this is the RCA for the issue:
Basically, with distributed-ec as the cold tier and distributed-replicate as the hot tier: tier sends a lookup which fails on ec (by this time the dict already contains the ec xattrs). After this, the lookup_everywhere code path is hit in tier, which triggers a lookup on each of distribute's hashed subvolumes; these fail as well, leading to lookup_everywhere in both the cold and hot dhts in two parallel epoll threads. When ec's thread
tries to set trusted.ec.version/dirty/size in the dictionary, the older
values against the same keys get erased. While this erasing is going on, if the thread doing the lookup on afr's subvolume accesses these members, either in dict_copy_with_ref or in the client xlator trying to serialize the dict, that can lead to either a crash or a hang, depending on when the spin/mutex lock is taken on the now-invalid memory.

For the moment I have sent http://review.gluster.org/13680 (I am pressed for time because I need to provide a build with a fix for our customer), which avoids the parallel accesses of elements that step on each other.

Raghavendra G and I discussed this problem, and the right way to fix it is for dict_foreach to take a copy of the dictionary inside a lock (without using dict_foreach for the copy) and then loop over that local copy. I am worried about the performance implications of this, so I am wondering if anyone has a better idea.

Also including Xavi, who earlier said we need to change dict.c but that it is a bigger change. Maybe the time has come? I would love to gather all your inputs and implement a better version of dict if we need one.

Pranith
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel