Re: [Gluster-devel] Moratorium on new patch acceptance

Vijaikumar M Tue, 19 May 2015 08:24:43 -0700


On Tuesday 19 May 2015 08:36 PM, Shyam wrote:

On 05/19/2015 08:10 AM, Raghavendra G wrote:
After discussion with Vijaykumar mallikarjuna and other inputs in this
thread, we are proposing that all quota tests comply with one of the
following criteria:
* always use dd with oflag=append (to make sure there are no parallel
writes) and conv=fdatasync (to make sure errors, if any, are delivered
to the application. Turning off flush-behind is optional since
fdatasync acts as a barrier)

OR

* turn off write-behind in nfs client and glusterfs server.
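
For illustration, the first criterion might translate into a dd invocation like the one below. This is only a sketch and assumes GNU dd. The temp file stands in for a file on the quota-limited mount, and the bs/count values are arbitrary. conv=notrunc is not mentioned in the thread; it is added here because the GNU dd documentation recommends it together with oflag=append, otherwise the output file is truncated before writing starts.

```shell
#!/bin/sh
# Assumes GNU dd. The temp file is a stand-in for a file on the mount
# under test; bs and count are arbitrary illustrative values.
out=$(mktemp)

# oflag=append   : open the output with O_APPEND
# conv=notrunc   : keep existing contents (recommended with oflag=append)
# conv=fdatasync : one fdatasync() before dd exits, so a late write error
#                  shows up in dd's exit status
dd if=/dev/zero of="$out" bs=256k count=4 \
   oflag=append conv=notrunc,fdatasync 2>/dev/null
status=$?

size=$(wc -c < "$out")
rm -f "$out"
echo "dd exit status: $status, bytes written: $size"
```

A quota test would then assert on the exit status (success below the limit, failure above it) rather than on the file size alone.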

What do you people think is a better test scenario?

Also, we don't have confirmation on the RCA that parallel writes are
indeed the culprits. We are trying to reproduce the issue locally.
@Shyam, it would be helpful if you can confirm the hypothesis :).
Ummm... I thought we acknowledge that quota checks are done during the
WIND and updated during UNWIND, and we have io threads doing in flight
IOs (as well as possible IOs in io threads queue) and we have 256K
writes in the case mentioned. Put together, in my head this forms a
good RCA that we write more than needed due to the in flight IOs on
the brick. We need to control the in flight IOs as a resolution for
this from the application.

In terms of actual proof, we would need to instrument the code and
check. When you say it does not fail for you, does the file stop once
quota is reached or is a random size greater than quota? Which itself
may explain or point to the RCA.

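
The arithmetic behind this hypothesis can be sketched with a toy model. It is not GlusterFS code. Only the 256K write size comes from the thread; the limit, the starting usage, the number of in-flight writes, and the exact form of the check are assumptions chosen for illustration.

```shell
#!/bin/sh
# Toy model of "check on WIND, account on UNWIND". All values except
# the 256K write size are hypothetical.
limit=$((1024 * 1024))   # assumed quota limit: 1 MiB
chunk=$((256 * 1024))    # 256K writes, as mentioned in the thread
inflight=8               # assumed number of writes in flight together
usage=$((768 * 1024))    # assumed accounted usage when the burst arrives

admitted=0
i=0
while [ "$i" -lt "$inflight" ]; do
    # WIND path: every write in the burst is checked against the same
    # stale usage value, because nothing has been accounted yet.
    if [ $((usage + chunk)) -le "$limit" ]; then
        admitted=$((admitted + 1))
    fi
    i=$((i + 1))
done

# UNWIND path: usage is updated only once the writes complete.
usage=$((usage + admitted * chunk))
overshoot=$((usage - limit))
echo "admitted=$admitted final_usage=$usage overshoot=$overshoot"
# prints: admitted=8 final_usage=2883584 overshoot=1835008
```

With a single write in flight at a time, the second write in this model would already be rejected, which is why limiting in-flight IOs from the application side is proposed as the resolution.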
The basic thing needed from an application is,
  - Sync IOs, so that there aren't too many in flight IOs and the
    application waits for each IO to complete
  - Based on tests below if we keep block size in dd lower and use
    oflag=sync we can achieve the same, if we use higher block sizes we
    cannot

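
A small-block, synchronous dd along these lines could look like the sketch below. It assumes GNU dd. The 64k block size and the count are illustrative guesses, not values from the thread, and the temp file stands in for a file on the quota-limited mount.

```shell
#!/bin/sh
# Assumes GNU dd. bs=64k is an arbitrary "small" block size chosen for
# illustration; the thread does not state which size is small enough.
out=$(mktemp)

# oflag=sync opens the output with O_SYNC, so each write() returns only
# after that block has been written through.
dd if=/dev/zero of="$out" bs=64k count=16 oflag=sync 2>/dev/null
sync_status=$?

sync_size=$(wc -c < "$out")
rm -f "$out"
echo "dd exit status: $sync_status, bytes written: $sync_size"
```

Per the results in this mail, the same flag with a large block size (for example bs=10M) does not give the same guarantee, since the NFS client still splits the IO into 256k chunks.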
Test results:
1) noac:
  - NFS sends a COMMIT (internally translates to a flush) post each IO
    request (NFS WRITES are still with the UNSTABLE flag)
  - Ensures prior IO is complete before next IO request is sent (due
    to waiting on the COMMIT)
  - Fails if IO size is large, i.e in the test case being discussed I
    changed the dd line that was failing as
    "TEST ! dd if=/dev/zero of=$N0/$mydir/newfile_2 *bs=10M* count=1 conv=fdatasync"
    and this fails at times, as the writes here are sent as 256k chunks
    to the server and we still see the same behavior
  - noac + performance.nfs.flush-behind: off +
    performance.flush-behind: off +
    performance.nfs.strict-write-ordering: on +
    performance.strict-write-ordering: on +
    performance.nfs.write-behind: off + performance.write-behind: off
  - Still see similar failures, i.e at times 10MB file is created
    successfully in the modified dd command above

Overall, the switch works, but not always. If we are to use this
variant then we need to announce that all quota tests using dd not try
to go beyond the quota limit set in a single IO from dd.

2) oflag=sync:
  - Exactly the same behavior as above.
3) Added all (and possibly the kitchen sink) to the test case, as
attached, and still see failures,
  - Yes, I have made the test fail intentionally (of sorts) by using
    3M per dd IO and 2 IOs to go beyond the quota limit.
  - The intention is to demonstrate that we still get parallel IOs
    from NFS client
  - The test would work if we reduce the block size per IO (reliably
    is a border condition here, and we need specific rules like block
    size and how many blocks before we state quota is exceeded etc.)
  - The test would work if we just go beyond the quota, and then check
    a separate dd instance as being able to *not* exceed the quota.
    Which is why I put up that patch.

What next?

Hi Shyam,

I tried running the test with dd option 'oflag=append' and didn't see
the issue. Can you please try this option and see if it works?


Thanks,
Vijay


regards,
Raghavendra.

On Tue, May 19, 2015 at 5:27 PM, Raghavendra G <raghaven...@gluster.com
<mailto:raghaven...@gluster.com>> wrote:



    On Tue, May 19, 2015 at 4:26 PM, Jeff Darcy <jda...@redhat.com
    <mailto:jda...@redhat.com>> wrote:

        > No, my suggestion was aimed at not having parallel writes. In this case quota
        > won't even fail the writes with EDQUOT because of reasons explained above.
        > Yes, we need to disable flush-behind along with this so that errors are
        > delivered to application.

        Would conv=sync help here?  That should prevent any kind of
        write parallelism.


    An strace of dd shows that

    * fdatasync is issued only once at the end of all writes when
    conv=fdatasync
    * for some strange reason no fsync or fdatasync is issued at all
    when conv=sync

    So, using conv=fdatasync in the test cannot prevent
    write-parallelism induced by write-behind. Parallelism would've been
    prevented only if dd had issued fdatasync after each write or opened
    the file with O_SYNC.

        If it doesn't, I'd say that's a true test failure somewhere in
        our stack.  A
        similar possibility would be to invoke dd multiple times with
        oflag=append.


    Yes, appending writes curb parallelism (at least in glusterfs, but
    not sure how nfs client behaves) and hence can be used as an
    alternative solution.
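
The "invoke dd multiple times with oflag=append" alternative might be sketched as follows. This assumes GNU dd. The loop count, block size, and temp file are illustrative, and conv=notrunc is an addition not mentioned in the thread (without it each invocation would truncate the file first). Because each dd issues its fdatasync before exiting, the next write starts only after the previous one has been flushed.

```shell
#!/bin/sh
# Assumes GNU dd. One dd process per block; values are illustrative.
target=$(mktemp)
blocks=4
written=0

while [ "$written" -lt "$blocks" ]; do
    # Each invocation appends one 256k block and calls fdatasync()
    # before exiting; stop at the first failure (e.g. EDQUOT).
    dd if=/dev/zero of="$target" bs=256k count=1 \
       oflag=append conv=notrunc,fdatasync 2>/dev/null || break
    written=$((written + 1))
done

total=$(wc -c < "$target")
rm -f "$target"
echo "blocks written: $written, total bytes: $total"
```

On a quota-limited mount, the number of blocks written before the first failure would show how far past the limit the file can grow; whether the NFS client preserves this ordering is left open in the thread.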

    On a slightly unrelated note flush-behind is immaterial in this test
    since fdatasync is anyways acting as a barrier.

        _______________________________________________
        Gluster-devel mailing list
        Gluster-devel@gluster.org <mailto:Gluster-devel@gluster.org>
        http://www.gluster.org/mailman/listinfo/gluster-devel




    --
    Raghavendra G




--
Raghavendra G


