On 06/12/2014 11:13 PM, Anand Avati wrote:
On Thu, Jun 12, 2014 at 10:33 AM, Vijay Bellur <vbel...@redhat.com> wrote:

    On 06/12/2014 06:52 PM, Ravishankar N wrote:

        Hi Vijay,

        Since glusterfs 3.5, posix_lookup() sends ESTALE instead of
        ENOENT [1] when a parent gfid (entry) is not present on the
        brick. In a replicate setup, this causes a problem because AFR
        gives more priority to ESTALE than ENOENT, causing IO to fail
        [2]. The fix is in progress at [3]; it is client-side specific
        and would make it to 3.5.2.

        But we will still hit the problem when a rolling upgrade is
        performed from 3.4 to 3.5, unless the clients are also upgraded
        to 3.5. To elaborate with an example:

        0) Create a 1x2 volume using 2 nodes and mount it from a client.
        All machines are glusterfs 3.4.
        1) Run: for i in {1..30}; do mkdir $i; tar xf glusterfs-3.5git.tar.gz -C $i & done
        2) While this is going on, kill one of the nodes in the replica
        pair and upgrade it to glusterfs 3.5 (simulating a rolling upgrade).
        3) After a while, kill all tar processes.
        4) Create a backup directory and move all 1..30 dirs inside
        'backup'.
        5) Start the untar processes in 1) again.
        6) Bring up the upgraded node. Tar fails with ESTALE errors.

        Essentially, the errors occur because [3] is a client-side fix,
        but rolling upgrades are targeted at servers while the older
        clients still need to access them without issues.

        A solution is to have a fix in the posix translator wherein
        the newer client passes its version (3.5) to posix_lookup(),
        which then sends ESTALE if the version is 3.5 or newer but
        sends ENOENT instead if it is an older client. Does this seem
        okay?
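
As a rough illustration of the proposal above, the errno choice could look something like the sketch below. This is not the actual posix translator code: the function names, the integer version encoding, and the "unknown version" value are all hypothetical, and how the client version would reach posix_lookup() (for example, as an extra key sent with each LOOKUP) is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical encoding: major * 100 + minor, so 3.5 becomes 305. */
#define CLIENT_VERSION_UNKNOWN 0   /* assumed: older clients send no version */
#define CLIENT_VERSION_3_5     305

/* True if the client is new enough to cope with ESTALE for a missing parent. */
bool client_understands_estale(int client_version)
{
    return client_version >= CLIENT_VERSION_3_5;
}

/*
 * errno a lookup would return when the parent gfid is absent on the brick:
 * ESTALE for 3.5 or newer clients, ENOENT for anything older or unknown.
 */
int missing_parent_errno(int client_version)
{
    return client_understands_estale(client_version) ? ESTALE : ENOENT;
}
```

Treating an absent version as "older client" keeps 3.4 clients on the ENOENT behaviour they already expect, which is the point of the compatibility path.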


    Cannot think of a better solution to this. Seamless rolling
    upgrades are necessary for us and the proposed fix does seem okay
    for that reason.

    Thanks,
    Vijay


I also like Justin's proposal of having fixes in 3.4.X and requiring clients to be at least 3.4.X in order to do a rolling upgrade to 3.5.Y. This way we can add the "special fix" in the 3.4.X client (just like the 3.5.2 client). Ravi's proposal "works", but all LOOKUPs will have an extra xattr, and we will be carrying forward the compat code burden for a very long time, whereas a 3.4.X client fix will remain in the 3.4 branch.

Thanks


I have sent a fix for review (http://review.gluster.org/#/c/8080/). The change is on the server side only. I reckon if we are asking users to upgrade clients to a 3.4.x release, which involves app downtime anyway, we might as well ask them to upgrade to 3.5.

The fix is sent only on 3.5; it does not need to go to master, as I understand from Pranith that we only support compatibility between the current two releases (meaning 3.6 servers require clients to be at least 3.5 and not lower).

Regards,
Ravi

_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
