Hi, Gluster expert,

We set up a replicate volume with the following configuration:

Volume Name: test
Type: Replicate
Volume ID: 9373eba9-eb84-4618-a54c-f2837345daec
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: rcp:/trunk/brick/test1/sn0
Brick2: rcp:/trunk/brick/test1/sn1
Brick3: rcp:/trunk/brick/test1/sn2 (arbiter)

We then ran a performance test in which multiple pthreads write to the same file 
concurrently, each at a different offset. Write performance drops a lot, roughly 
60%-70% lower than on a comparable volume without an arbiter.
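
For reference, this is roughly the kind of workload we mean (a minimal sketch; 
the mount path, thread count, and sizes here are placeholders, not our actual 
test tool):

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NTHREADS   4
#define CHUNK_SIZE (1 << 20)   /* 1 MiB per write              */
#define NWRITES    256         /* writes issued by each thread */

static int fd;                 /* shared fd to the test file   */

static void *writer(void *arg)
{
        long  id  = (long)arg;
        char *buf = malloc(CHUNK_SIZE);

        memset(buf, 'a' + id, CHUNK_SIZE);
        /* Each thread writes to its own disjoint offsets: byte ranges
         * never overlap, only the file is shared. */
        for (long i = 0; i < NWRITES; i++) {
                off_t off = ((long)NTHREADS * i + id) * CHUNK_SIZE;
                if (pwrite(fd, buf, CHUNK_SIZE, off) < 0)
                        perror("pwrite");
        }
        free(buf);
        return NULL;
}

int main(void)
{
        pthread_t tid[NTHREADS];

        /* Placeholder path for a file on the mounted test volume. */
        fd = open("/mnt/test/perf.dat", O_CREAT | O_WRONLY, 0644);
        if (fd < 0) {
                perror("open");
                return 1;
        }
        for (long i = 0; i < NTHREADS; i++)
                pthread_create(&tid[i], NULL, writer, (void *)i);
        for (long i = 0; i < NTHREADS; i++)
                pthread_join(tid[i], NULL);
        close(fd);
        return 0;
}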
While studying the source code, we found the function "afr_set_transaction_flock" 
in "afr-transaction.c". It locks the entire file whenever arbiter_count is 
non-zero, and I suspect this is the root cause of the performance drop.
Now my questions are:

1)     Why is the entire file locked when an arbiter is configured? Could you 
please explain in detail why a byte-range lock would lead to split-brain only 
when an arbiter is present?

2)     If this really is the root cause, and not locking the entire file really 
would lead to split-brain, is there any solution to avoid the performance drop 
for this multi-writer case?

The source code of this function is attached below for reference:
--------------------------------------------------------------------------------------
int afr_set_transaction_flock (xlator_t *this, afr_local_t *local)
{
        afr_internal_lock_t *int_lock = NULL;
        afr_private_t       *priv     = NULL;

        int_lock = &local->internal_lock;
        priv = this->private;

        if ((priv->arbiter_count || local->transaction.eager_lock_on ||
             priv->full_lock) &&
            local->transaction.type == AFR_DATA_TRANSACTION) {
                /*Lock entire file to avoid network split brains.*/
                int_lock->flock.l_len   = 0;
                int_lock->flock.l_start = 0;
        } else {
                int_lock->flock.l_len   = local->transaction.len;
                int_lock->flock.l_start = local->transaction.start;
        }
        int_lock->flock.l_type  = F_WRLCK;

        return 0;
}
------------------------------------------------------------------------------------
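
For clarity on what the arbiter branch changes: in struct flock terms, l_len == 0 
means "from l_start to the end of the file", so the first branch above takes a 
write lock over the whole file, while the else branch locks only the byte range 
being written. The sketch below illustrates the same struct flock semantics with 
plain fcntl(2); it is only an illustration of the lock ranges, not AFR's internal 
inodelk path.

#include <fcntl.h>

/* Whole-file write lock: l_len == 0 means "lock from l_start to EOF",
 * so every writer serializes on the same lock. */
int lock_whole_file(int fd)
{
        struct flock fl = {
                .l_type   = F_WRLCK,
                .l_whence = SEEK_SET,
                .l_start  = 0,
                .l_len    = 0,   /* 0 => entire file */
        };
        return fcntl(fd, F_SETLKW, &fl);
}

/* Byte-range write lock: only [start, start + len) is locked, so writers
 * touching disjoint offsets do not block each other. */
int lock_range(int fd, off_t start, off_t len)
{
        struct flock fl = {
                .l_type   = F_WRLCK,
                .l_whence = SEEK_SET,
                .l_start  = start,
                .l_len    = len,
        };
        return fcntl(fd, F_SETLKW, &fl);
}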
Thanks & Best Regards,
George