Ilya,

I think there is a bigger problem here that your vdev_mirror() change would cover up. The problems is that we should never have a bp with a dva that is out of range. So we need to understand how that is occurring.

As for you question about metaslab_allocate_dva(), I'm assuming you're referring to this code path:

        } else if (d != 0) {
                vd = vdev_lookup_top(spa, DVA_GET_VDEV(&dva[d - 1]));
                mg = vd->vdev_mg->mg_next;

This vdev can't be NULL since the configuration should not have changed while we're in the middle of doing allocations.

Even for gang blocks we should never have a NULL vdev. I'm assuming you're referring to this logic:

                if (vd != NULL) {
                        mg = vd->vdev_mg;

                        if (flags & METASLAB_HINTBP_AVOID &&
                            mg->mg_next != NULL)
                                mg = mg->mg_next;
                } else {
                        mg = mc->mc_rotor;
                }

This is meant to handle device logs which can be removed since the ZIL may have a reference to an old log block on a device that no longer exists.

I think we need to find the root cause of where this out of range dva is coming from. If you have old core files that you could share please point me at them.

Thanks,
George

On 12/18/13 1:33 PM, Ilya Usvyatsky wrote:
I am looking at a rare but nasty case of corruption in which a block pointer has one of ditto dva's out of range.
This causes vdev_lookup_top() to return NULL for a vdev pointer.
Going through the callers of vdev_lookup_top(), I noticed that in some cases NULL vdev are handled properly, while in others it is not. In particular, in case of an import, ditto blocks are handled by a mirror vdev code path that does not have proper handling for NULL and would panic the system.

My question, though, is not about the mirror (for which I have a fix that I will upstream eventually),but about metaslab allocator. In particular, I am looking at metaslab_allocate() and metaslab_allocate_dva(). In the latter function NULL vdev is properly handled in the case of a gang block (since a hint may be stale and the device may be gone). However, that very same function does not handle NULL vdev in case of a regular block pointer with ditto blocks. The dilemma I am facing here is that I can either just use rotor (i.e.. mimic gang block behavior), or return an error immediately. In the latter case, the caller, metaslab_allocate() would handle it properly. I am inclined to go with the second option, but I would greatly appreciate an insight on this from someone who is familiar with the internal logic and the theory behind metaslab allocator.

Thank you very much,
Ilya.



_______________________________________________
developer mailing list
developer@open-zfs.org
http://lists.open-zfs.org/mailman/listinfo/developer

_______________________________________________
developer mailing list
developer@open-zfs.org
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to