Re: [Gluster-users] question about info and info.tmp

2016-11-14 Thread songxin


Hi Atin,


I think the root cause is in the function glusterd_import_friend_volume as 
below. 

int32_t
glusterd_import_friend_volume (dict_t *peer_data, size_t count)
{
        ...
        ret = glusterd_volinfo_find (new_volinfo->volname, &old_volinfo);
        if (0 == ret) {
                (void) gd_check_and_update_rebalance_info (old_volinfo,
                                                           new_volinfo);
                (void) glusterd_delete_stale_volume (old_volinfo, new_volinfo);
        }
        ...
        ret = glusterd_store_volinfo (new_volinfo,
                                      GLUSTERD_VOLINFO_VER_AC_NONE);
        if (ret) {
                gf_msg (this->name, GF_LOG_ERROR, 0,
                        GD_MSG_VOLINFO_STORE_FAIL, "Failed to store "
                        "volinfo for volume %s", new_volinfo->volname);
                goto out;
        }
        ...
}

glusterd_delete_stale_volume() removes the info file and bricks/*, and
glusterd_store_volinfo() then creates them again.
But if glusterd is killed before the rename, the info file is left empty.


And glusterd will fail to start the next time it is started, because the info
file is empty.
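
The window can be sketched like this. This is only a minimal illustration that
mirrors the open/rename pattern described above; it is not the actual glusterd
code:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Illustrative only -- mirrors the sequence described above. */
int store_info_sketch (void)
{
        /* 1. glusterd_delete_stale_volume() has already unlinked the old "info". */

        /* 2. a gf_store_handle_new-style open recreates "info" as an EMPTY file. */
        int fd = open ("info", O_RDWR | O_CREAT | O_APPEND, 0600);
        if (fd < 0)
                return -1;
        close (fd);

        /* 3. the new contents are written into "info.tmp" first ...              */
        FILE *tmp = fopen ("info.tmp", "w");
        if (tmp == NULL)
                return -1;
        fprintf (tmp, "version=2\nstatus=1\n");    /* illustrative contents only  */
        fclose (tmp);

        /* ... if glusterd is killed HERE, "info" exists but is empty, and the    */
        /* next start trips over the empty file.                                  */

        /* 4. only the final rename makes "info" non-empty again.                 */
        return rename ("info.tmp", "info");
}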


Any idea, Atin?


Thanks,
Xin



On 2016-11-15 12:07:05, "Atin Mukherjee" wrote:





On Tue, Nov 15, 2016 at 8:58 AM, songxin  wrote:

Hi Atin,
I have some clues about this issue.
I could reproduce this issue using the script mentioned in
https://bugzilla.redhat.com/show_bug.cgi?id=1308487 .


I really appreciate your help in trying to nail down this issue. While I am at 
your email and going through the code to figure out the possible cause for it, 
unfortunately I don't see any script in the attachment of the bug.  Could you 
please cross check?
 



After I added some debug prints, like below, in glusterd-store.c, I found that
/var/lib/glusterd/vols/xxx/info and /var/lib/glusterd/vols/xxx/bricks/* are removed.
But other files in /var/lib/glusterd/vols/xxx/ are not removed.


int32_t
glusterd_store_volinfo (glusterd_volinfo_t *volinfo, glusterd_volinfo_ver_ac_t ac)
{
        int32_t     ret = -1;
        struct stat buf = {0,};

        GF_ASSERT (volinfo);

        /* debug: check whether the info file still exists on disk */
        ret = access ("/var/lib/glusterd/vols/gv0/info", F_OK);
        if (ret < 0) {
                gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                        "info does not exist (%d)", errno);
        } else {
                ret = stat ("/var/lib/glusterd/vols/gv0/info", &buf);
                if (ret < 0) {
                        gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                                "stat info error");
                } else {
                        gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                                "info size is %lu, inode num is %lu",
                                buf.st_size, buf.st_ino);
                }
        }

        glusterd_perform_volinfo_version_action (volinfo, ac);
        ret = glusterd_store_create_volume_dir (volinfo);
        if (ret)
                goto out;

        ...
}


So it is easy to understand why the info or 10.32.1.144.-opt-lvmdir-c2-brick
file is sometimes empty.
Because the file does not exist, it is created empty by “fd = open (path,
O_RDWR | O_CREAT | O_APPEND, 0600);” in gf_store_handle_new, and it stays
empty until the rename happens.
So the info file is left empty if glusterd shuts down before the rename.
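
For what it is worth, a zero-length info file would also be easy to detect at
read time. The check below is purely illustrative (the function name is made up
and this is not the actual glusterd retrieval code); it only shows the kind of
guard that would treat an empty store file the same as a missing one:

#include <sys/stat.h>

/* Illustrative only: treat a zero-length store file like a missing one,   */
/* so an empty "info" left behind by an interrupted update is not parsed.  */
static int store_file_is_usable (const char *path)
{
        struct stat st;

        if (stat (path, &st) != 0)
                return 0;           /* missing                              */
        if (st.st_size == 0)
                return 0;           /* empty: interrupted update leftovers  */
        return 1;                   /* non-empty: safe to parse             */
}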
 



My questions are the following.
1. I did not find the point where the info file is removed. Could you tell me
where the info file and bricks/* are removed?
2. Why are the info file and bricks/* removed, but other files in
/var/lib/glusterd/vols/xxx/ are not?

AFAIK, we never delete the info file and hence this file is opened with 
O_APPEND flag. As I said I will go back and cross check the code once again.






Thanks,
Xin



On 2016-11-11 20:34:05, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 4:00 PM, songxin  wrote:

Hi Atin,



Thank you for your support.
I sincerely await your reply.


By the way, could you confirm that the issue (the info file being empty) is
caused by the rename being interrupted in the kernel?


As per my RCA on that bug, it looked to be.
 



Thanks,
Xin

On 2016-11-11 15:49:02, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 1:15 PM, songxin  wrote:

Hi Atin,
Thank you for your reply.
Actually it is very difficult to reproduce because I don't know when an ongoing
commit was happening. It is just a coincidence.
But I want to confirm the root cause.


I'll give it another try and see if this situation can be
simulated/reproduced and will keep you posted.
 



So I would be grateful if you could answer my questions below.


You said that "This issue is hit at part of the negative testing where while 
gluster volume set was executed at the same point of time glusterd in another 
instance was brought down. In the faulty node we could see 
/var/lib/glusterd/vols/info file been empty whereas the info.tmp file 
has the correct contents." in comment.

Re: [Gluster-users] question about info and info.tmp

2016-11-14 Thread songxin
Hi Atin,
Now I know that the info file and bricks/* are removed by the function
glusterd_delete_stale_volume().
But I do not yet know how to solve this issue.
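
One general way to close that window would be the usual write-temp-then-rename
pattern, sketched below. This is only an illustration of the pattern, under the
assumption that the final path is never created or truncated directly; it is
not an actual glusterd patch:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int write_file_atomically (const char *tmp_path, const char *final_path,
                           const char *contents)
{
        int fd = open (tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd < 0)
                return -1;

        size_t len = strlen (contents);
        if (write (fd, contents, len) != (ssize_t)len || fsync (fd) != 0) {
                close (fd);
                unlink (tmp_path);
                return -1;
        }
        close (fd);

        /* rename(2) is atomic within a filesystem: a crash before this line */
        /* leaves the old file untouched, a crash after it leaves the new    */
        /* complete file -- never an empty one.                              */
        return rename (tmp_path, final_path);
}

With this pattern a reader only ever sees either the old complete info file or
the new complete one.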


Thanks,
Xin






On 2016-11-15 12:07:05, "Atin Mukherjee" wrote:





On Tue, Nov 15, 2016 at 8:58 AM, songxin  wrote:

Hi Atin,
I have some clues about this issue.
I could reproduce this issue using the script mentioned in
https://bugzilla.redhat.com/show_bug.cgi?id=1308487 .


I really appreciate your help in trying to nail down this issue. While I am at 
your email and going through the code to figure out the possible cause for it, 
unfortunately I don't see any script in the attachment of the bug.  Could you 
please cross check?
 



After I added some debug prints, like below, in glusterd-store.c, I found that
/var/lib/glusterd/vols/xxx/info and /var/lib/glusterd/vols/xxx/bricks/* are removed.
But other files in /var/lib/glusterd/vols/xxx/ are not removed.


int32_t
glusterd_store_volinfo (glusterd_volinfo_t *volinfo, glusterd_volinfo_ver_ac_t ac)
{
        int32_t     ret = -1;
        struct stat buf = {0,};

        GF_ASSERT (volinfo);

        /* debug: check whether the info file still exists on disk */
        ret = access ("/var/lib/glusterd/vols/gv0/info", F_OK);
        if (ret < 0) {
                gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                        "info does not exist (%d)", errno);
        } else {
                ret = stat ("/var/lib/glusterd/vols/gv0/info", &buf);
                if (ret < 0) {
                        gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                                "stat info error");
                } else {
                        gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                                "info size is %lu, inode num is %lu",
                                buf.st_size, buf.st_ino);
                }
        }

        glusterd_perform_volinfo_version_action (volinfo, ac);
        ret = glusterd_store_create_volume_dir (volinfo);
        if (ret)
                goto out;

        ...
}


So it is easy to understand why the info or 10.32.1.144.-opt-lvmdir-c2-brick
file is sometimes empty.
Because the file does not exist, it is created empty by “fd = open (path,
O_RDWR | O_CREAT | O_APPEND, 0600);” in gf_store_handle_new, and it stays
empty until the rename happens.
So the info file is left empty if glusterd shuts down before the rename.
 



My questions are the following.
1. I did not find the point where the info file is removed. Could you tell me
where the info file and bricks/* are removed?
2. Why are the info file and bricks/* removed, but other files in
/var/lib/glusterd/vols/xxx/ are not?

AFAIK, we never delete the info file and hence this file is opened with 
O_APPEND flag. As I said I will go back and cross check the code once again.






Thanks,
Xin



On 2016-11-11 20:34:05, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 4:00 PM, songxin  wrote:

Hi Atin,



Thank you for your support.
I sincerely await your reply.


By the way, could you confirm that the issue (the info file being empty) is
caused by the rename being interrupted in the kernel?


As per my RCA on that bug, it looked to be.
 



Thanks,
Xin

On 2016-11-11 15:49:02, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 1:15 PM, songxin  wrote:

Hi Atin,
Thank you for your reply.
Actually it is very difficult to reproduce because I don't know when an ongoing
commit was happening. It is just a coincidence.
But I want to confirm the root cause.


I'll give it another try and see if this situation can be
simulated/reproduced and will keep you posted.
 



So I would be grateful if you could answer my questions below.


You said that "This issue is hit at part of the negative testing where while 
gluster volume set was executed at the same point of time glusterd in another 
instance was brought down. In the faulty node we could see 
/var/lib/glusterd/vols/info file been empty whereas the info.tmp file 
has the correct contents." in comment.
I have two questions for you.

1. Could you reproduce this issue by running gluster volume set while glusterd
was brought down?
2. Could you be certain that this issue is caused by the rename being
interrupted in the kernel?
In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both empty.
But in my view only one rename can be running at a time because of the big lock.
Why are both files empty?


Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be
running in two threads?
Thanks,
Xin




On 2016-11-11 15:27:03, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 12:38 PM, songxin  wrote:



Hi Atin,
Thank you for your reply.


As you said, the info file can only be changed sequentially in
glusterd_store_volinfo() because of the big lock.


I have found the similar issue that you mentioned, as below:
https://bugzilla.redhat.com/show_bug.cgi?id=1308487


Great, so this is what I was actually trying to refer to in my first email,
that I saw a similar issue. Have you got a chance to look at
https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ? But in your case, did
you try to bring down glusterd when there was an ongoing commit happening?

Re: [Gluster-users] Unable to stop volume because geo-replication

2016-11-14 Thread Kotresh Hiremath Ravishankar
Hi,

Could you please restart glusterd in DEBUG mode and share the glusterd logs?

* Start glusterd in DEBUG mode as follows:

#glusterd -LDEBUG

* Stop the volume:
   #gluster vol stop 

Share the glusterd logs.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Chao-Ping Chien" 
> To: gluster-users@gluster.org
> Sent: Monday, November 14, 2016 10:18:16 PM
> Subject: [Gluster-users] Unable to stop volume because geo-replication
> 
> 
> 
> Hi,
> 
> 
> 
> Hope someone can point me how to do this.
> 
> 
> 
> I want to delete a volume but am not able to do so because glusterfs keeps
> reporting there is a geo-replication setup which seems not to exist at the
> moment when I issue the stop command.
> 
> 
> 
> On a Redhat 7.2 kernel: 3.10.0-327.36.3.el7.x86_64
> 
> [root@eqappsrvp01 mule1]# rpm -qa |grep gluster
> 
> glusterfs-3.7.14-1.el7.x86_64
> 
> glusterfs-fuse-3.7.14-1.el7.x86_64
> 
> glusterfs-server-3.7.14-1.el7.x86_64
> 
> glusterfs-libs-3.7.14-1.el7.x86_64
> 
> glusterfs-api-3.7.14-1.el7.x86_64
> 
> glusterfs-geo-replication-3.7.14-1.el7.x86_64
> 
> glusterfs-cli-3.7.14-1.el7.x86_64
> 
> glusterfs-client-xlators-3.7.14-1.el7.x86_64
> 
> 
> 
> 
> 
> [root@eqappsrvp01 mule1]# gluster volume stop mule1 Stopping volume will make
> its data inaccessible. Do you want to continue? (y/n) y volume stop: mule1:
> failed: geo-replication sessions are active for the volume mule1.
> 
> Stop geo-replication sessions involved in this volume. Use 'volume
> geo-replication status' command for more info.
> 
> [root@eqappsrvp01 mule1]# gluster volume geo-replication status
> 
> 
> 
> MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE SLAVE NODE STATUS CRAWL
> STATUS LAST_SYNCED
> 
> --
> 
> eqappsrvp01 gitlab_data /data/gitlab_data root ssh://eqappsrvd02::gitlab_data
> N/A Stopped N/A N/A
> 
> eqappsrvp02 gitlab_data /data/gitlab_data root ssh://eqappsrvd02::gitlab_data
> N/A Stopped N/A N/A
> 
> [root@eqappsrvp01 mule1]# uname -a
> 
> Linux eqappsrvp01 3.10.0-327.36.3.el7.x86_64 #1 SMP Thu Oct 20 04:56:07 EDT
> 2016 x86_64 x86_64 x86_64 GNU/Linux
> 
> [root@eqappsrvp01 mule1]# cat /etc/redhat-release Red Hat Enterprise Linux
> Server release 7.2 (Maipo)
> =
> 
> 
> 
> I searched the internet and found that Red Hat Bugzilla bug 1342431 seems to
> address this problem; according to its status it should be fixed in 3.7.12,
> but in my version, 3.7.14, it still exists.
> 
> 
> 
> Thanks
> 
> 
> 
> Ping.
> 
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] question about info and info.tmp

2016-11-14 Thread songxin


Hi Atin,
I have two nodes, node a and node b, on which I create a replicate volume and
then start the volume.


I run the script below on node b.
#!/bin/bash
i=1
while (($i < 100))
do
    gluster volume set gv0 nfs.disable on
    sleep 2s
    gluster volume set gv0 nfs.disable off
    i=$(($i+1))
done


And I run the script below on node a at the same time.


#!/bin/bash
i=1
while (($i < 100))
do
    systemctl stop glusterd
    systemctl start glusterd
    gluster volume info
    i=$(($i+1))
done


The issue is very easily reproduced on a board.


Could you please tell me at which point the info file is unlinked?


Thanks,
Xin








On 2016-11-15 12:07:05, "Atin Mukherjee" wrote:





On Tue, Nov 15, 2016 at 8:58 AM, songxin  wrote:

Hi Atin,
I have some clues about this issue.
I could reproduce this issue using the script mentioned in
https://bugzilla.redhat.com/show_bug.cgi?id=1308487 .


I really appreciate your help in trying to nail down this issue. While I am at 
your email and going through the code to figure out the possible cause for it, 
unfortunately I don't see any script in the attachment of the bug.  Could you 
please cross check?
 



After I added some debug prints, like below, in glusterd-store.c, I found that
/var/lib/glusterd/vols/xxx/info and /var/lib/glusterd/vols/xxx/bricks/* are removed.
But other files in /var/lib/glusterd/vols/xxx/ are not removed.


int32_t
glusterd_store_volinfo (glusterd_volinfo_t *volinfo, glusterd_volinfo_ver_ac_t ac)
{
        int32_t     ret = -1;
        struct stat buf = {0,};

        GF_ASSERT (volinfo);

        /* debug: check whether the info file still exists on disk */
        ret = access ("/var/lib/glusterd/vols/gv0/info", F_OK);
        if (ret < 0) {
                gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                        "info does not exist (%d)", errno);
        } else {
                ret = stat ("/var/lib/glusterd/vols/gv0/info", &buf);
                if (ret < 0) {
                        gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                                "stat info error");
                } else {
                        gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                                "info size is %lu, inode num is %lu",
                                buf.st_size, buf.st_ino);
                }
        }

        glusterd_perform_volinfo_version_action (volinfo, ac);
        ret = glusterd_store_create_volume_dir (volinfo);
        if (ret)
                goto out;

        ...
}


So it is easy to understand why the info or 10.32.1.144.-opt-lvmdir-c2-brick
file is sometimes empty.
Because the file does not exist, it is created empty by “fd = open (path,
O_RDWR | O_CREAT | O_APPEND, 0600);” in gf_store_handle_new, and it stays
empty until the rename happens.
So the info file is left empty if glusterd shuts down before the rename.
 



My questions are the following.
1. I did not find the point where the info file is removed. Could you tell me
where the info file and bricks/* are removed?
2. Why are the info file and bricks/* removed, but other files in
/var/lib/glusterd/vols/xxx/ are not?

AFAIK, we never delete the info file and hence this file is opened with 
O_APPEND flag. As I said I will go back and cross check the code once again.






Thanks,
Xin



On 2016-11-11 20:34:05, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 4:00 PM, songxin  wrote:

Hi Atin,



Thank you for your support.
I sincerely await your reply.


By the way, could you confirm that the issue (the info file being empty) is
caused by the rename being interrupted in the kernel?


As per my RCA on that bug, it looked to be.
 



Thanks,
Xin

On 2016-11-11 15:49:02, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 1:15 PM, songxin  wrote:

Hi Atin,
Thank you for your reply.
Actually it is very difficult to reproduce because I don't know when an ongoing
commit was happening. It is just a coincidence.
But I want to confirm the root cause.


I'll give it another try and see if this situation can be
simulated/reproduced and will keep you posted.
 



So I would be grateful if you could answer my questions below.


You said that "This issue is hit at part of the negative testing where while 
gluster volume set was executed at the same point of time glusterd in another 
instance was brought down. In the faulty node we could see 
/var/lib/glusterd/vols/info file been empty whereas the info.tmp file 
has the correct contents." in comment.
I have two questions for you.

1. Could you reproduce this issue by running gluster volume set while glusterd
was brought down?
2. Could you be certain that this issue is caused by the rename being
interrupted in the kernel?
In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both empty.
But in my view only one rename can be running at a time because of the big lock.

Re: [Gluster-users] question about info and info.tmp

2016-11-14 Thread Atin Mukherjee
On Tue, Nov 15, 2016 at 8:58 AM, songxin  wrote:

> Hi Atin,
> I have some clues about this issue.
> I could reproduce this issue using the script mentioned in
> https://bugzilla.redhat.com/show_bug.cgi?id=1308487 .
>

I really appreciate your help in trying to nail down this issue. While I am
at your email and going through the code to figure out the possible cause
for it, unfortunately I don't see any script in the attachment of the bug.
Could you please cross check?


>
> After I added some debug prints, like below, in glusterd-store.c, I found
> that /var/lib/glusterd/vols/xxx/info and /var/lib/glusterd/vols/xxx/bricks/*
> are removed.
> But other files in /var/lib/glusterd/vols/xxx/ are not removed.
>
> int32_t
> glusterd_store_volinfo (glusterd_volinfo_t *volinfo, glusterd_volinfo_ver_ac_t ac)
> {
>         int32_t     ret = -1;
>         struct stat buf = {0,};
>
>         GF_ASSERT (volinfo);
>
>         /* debug: check whether the info file still exists on disk */
>         ret = access ("/var/lib/glusterd/vols/gv0/info", F_OK);
>         if (ret < 0) {
>                 gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
>                         "info does not exist (%d)", errno);
>         } else {
>                 ret = stat ("/var/lib/glusterd/vols/gv0/info", &buf);
>                 if (ret < 0) {
>                         gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
>                                 "stat info error");
>                 } else {
>                         gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
>                                 "info size is %lu, inode num is %lu",
>                                 buf.st_size, buf.st_ino);
>                 }
>         }
>
>         glusterd_perform_volinfo_version_action (volinfo, ac);
>         ret = glusterd_store_create_volume_dir (volinfo);
>         if (ret)
>                 goto out;
>
>         ...
> }
>
> So it is easy to understand why the info or 10.32.1.144.-opt-lvmdir-c2-brick
> file is sometimes empty.
> Because the file does not exist, it is created empty by “fd = open (path,
> O_RDWR | O_CREAT | O_APPEND, 0600);” in gf_store_handle_new, and it stays
> empty until the rename happens.
> So the info file is left empty if glusterd shuts down before the rename.
>
>

> My questions are the following.
> 1. I did not find the point where the info file is removed. Could you tell me
> where the info file and bricks/* are removed?
> 2. Why are the info file and bricks/* removed, but other files in
> /var/lib/glusterd/vols/xxx/ are not?
>

AFAIK, we never delete the info file and hence this file is opened with
O_APPEND flag. As I said I will go back and cross check the code once again.




> Thanks,
> Xin
>
>
> On 2016-11-11 20:34:05, "Atin Mukherjee" wrote:
>
>
>
> On Fri, Nov 11, 2016 at 4:00 PM, songxin  wrote:
>
>> Hi Atin,
>>
>> Thank you for your support.
>> I sincerely await your reply.
>>
>> By the way, could you confirm that the issue (the info file being empty) is
>> caused by the rename being interrupted in the kernel?
>>
>
> As per my RCA on that bug, it looked to be.
>
>
>>
>> Thanks,
>> Xin
>>
>> On 2016-11-11 15:49:02, "Atin Mukherjee" wrote:
>>
>>
>>
>> On Fri, Nov 11, 2016 at 1:15 PM, songxin  wrote:
>>
>>> Hi Atin,
>>> Thank you for your reply.
>>> Actually it is very difficult to reproduce because I don't know when an
>>> ongoing commit was happening. It is just a coincidence.
>>> But I want to confirm the root cause.
>>>
>>
>> I'll give it another try and see if this situation can be
>> simulated/reproduced and will keep you posted.
>>
>>
>>>
>>> So I would be grateful if you could answer my questions below.
>>>
>>> You said that "This issue is hit at part of the negative testing where
>>> while gluster volume set was executed at the same point of time glusterd in
>>> another instance was brought down. In the faulty node we could see
>>> /var/lib/glusterd/vols/info file been empty whereas the
>>> info.tmp file has the correct contents." in comment.
>>>
>>> I have two questions for you.
>>>
>>> 1. Could you reproduce this issue by running gluster volume set while
>>> glusterd was brought down?
>>> 2. Could you be certain that this issue is caused by the rename being
>>> interrupted in the kernel?
>>>
>>> In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both
>>> empty.
>>> But in my view only one rename can be running at a time because of the big
>>> lock.
>>> Why are both files empty?
>>>
>>>
>>> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick")
>>> be running in two threads?
>>>
>>> Thanks,
>>> Xin
>>>
>>>
>>> On 2016-11-11 15:27:03, "Atin Mukherjee" wrote:
>>>
>>>
>>>
>>> On Fri, Nov 11, 2016 at 12:38 PM, songxin  wrote:
>>>

 Hi Atin,
 Thank you for your reply.

 As you said, the info file can only be changed sequentially in
 glusterd_store_volinfo() because of the big lock.

 I have found the similar issue that you mentioned, as below:
 https://bugzilla.redhat.com/show_bug.cgi?id=1308487

Re: [Gluster-users] question about info and info.tmp

2016-11-14 Thread songxin
Hi Atin,
I have some clues about this issue.
I could reproduce this issue using the script mentioned in
https://bugzilla.redhat.com/show_bug.cgi?id=1308487 .


After I added some debug prints, like below, in glusterd-store.c, I found that
/var/lib/glusterd/vols/xxx/info and /var/lib/glusterd/vols/xxx/bricks/* are removed.
But other files in /var/lib/glusterd/vols/xxx/ are not removed.


int32_t
glusterd_store_volinfo (glusterd_volinfo_t *volinfo, glusterd_volinfo_ver_ac_t ac)
{
        int32_t     ret = -1;
        struct stat buf = {0,};

        GF_ASSERT (volinfo);

        /* debug: check whether the info file still exists on disk */
        ret = access ("/var/lib/glusterd/vols/gv0/info", F_OK);
        if (ret < 0) {
                gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                        "info does not exist (%d)", errno);
        } else {
                ret = stat ("/var/lib/glusterd/vols/gv0/info", &buf);
                if (ret < 0) {
                        gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                                "stat info error");
                } else {
                        gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
                                "info size is %lu, inode num is %lu",
                                buf.st_size, buf.st_ino);
                }
        }

        glusterd_perform_volinfo_version_action (volinfo, ac);
        ret = glusterd_store_create_volume_dir (volinfo);
        if (ret)
                goto out;

        ...
}


So it is easy to understand why the info or 10.32.1.144.-opt-lvmdir-c2-brick
file is sometimes empty.
Because the file does not exist, it is created empty by “fd = open (path,
O_RDWR | O_CREAT | O_APPEND, 0600);” in gf_store_handle_new, and it stays
empty until the rename happens.
So the info file is left empty if glusterd shuts down before the rename.


My questions are the following.
1. I did not find the point where the info file is removed. Could you tell me
where the info file and bricks/* are removed?
2. Why are the info file and bricks/* removed, but other files in
/var/lib/glusterd/vols/xxx/ are not?


Thanks,
Xin



On 2016-11-11 20:34:05, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 4:00 PM, songxin  wrote:

Hi Atin,



Thank you for your support.
I sincerely await your reply.


By the way, could you confirm that the issue (the info file being empty) is
caused by the rename being interrupted in the kernel?


As per my RCA on that bug, it looked to be.
 



Thanks,
Xin

On 2016-11-11 15:49:02, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 1:15 PM, songxin  wrote:

Hi Atin,
Thank you for your reply.
Actually it is very difficult to reproduce because I don't know when an ongoing
commit was happening. It is just a coincidence.
But I want to confirm the root cause.


I'll give it another try and see if this situation can be
simulated/reproduced and will keep you posted.
 



So I would be grateful if you could answer my questions below.


You said that "This issue is hit at part of the negative testing where while 
gluster volume set was executed at the same point of time glusterd in another 
instance was brought down. In the faulty node we could see 
/var/lib/glusterd/vols/info file been empty whereas the info.tmp file 
has the correct contents." in comment.
I have two questions for you.

1. Could you reproduce this issue by running gluster volume set while glusterd
was brought down?
2. Could you be certain that this issue is caused by the rename being
interrupted in the kernel?
In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both empty.
But in my view only one rename can be running at a time because of the big lock.
Why are both files empty?


Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be
running in two threads?
Thanks,
Xin




On 2016-11-11 15:27:03, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 12:38 PM, songxin  wrote:



Hi Atin,
Thank you for your reply.


As you said, the info file can only be changed sequentially in
glusterd_store_volinfo() because of the big lock.


I have found the similar issue that you mentioned, as below:
https://bugzilla.redhat.com/show_bug.cgi?id=1308487


Great, so this is what I was actually trying to refer to in my first email,
that I saw a similar issue. Have you got a chance to look at
https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ? But in your case, did
you try to bring down glusterd when there was an ongoing commit happening?
 



You said that "This issue is hit at part of the negative testing where while 
gluster volume set was executed at the same point of time glusterd in another 
instance was brought down. In the faulty node we could see 
/var/lib/glusterd/vols/info file been empty whereas the info.tmp file 
has the correct contents." in comment.
I have two questions for you.

1. Could you reproduce this issue by running gluster volume set while glusterd
was brought down?
2. Could you be certain that this issue is caused by the rename being
interrupted in the kernel?

[Gluster-users] Annual Community Survey 2016 - Open until December 9th

2016-11-14 Thread Amye Scavarda
Hi all!
It's that time again, it's our annual community survey.

Please send this link out so that we can get better feedback from our users
+ overall community.

https://www.surveymonkey.com/r/gluster2016

Thanks!
- amye
-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] GlusterFS geo-replication brick KeyError

2016-11-14 Thread Shirwa Hersi
Hi,

I'm using glusterfs geo-replication on version 3.7.11. One of the bricks
becomes faulty and does not replicate to the slave bricks after I start the
geo-replication session.
Following are the logs related to the faulty brick. Can someone please
advise me on how to resolve this issue?

[2016-06-11 09:41:17.359086] E
[syncdutils(/var/glusterfs/gluster_b2/brick):276:log_raise_exception]
: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 166, in main
main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 663, in main_i
local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line
1497, in service_loop
g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 571,
in crawlwrap
self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1201, in crawl
self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line
1107, in changelogs_batch_process
self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 984,
in process
self.datas_in_batch.remove(unlinked_gfid)
KeyError: '.gfid/757b0ad8-b6f5-44da-b71a-1b1c25a72988'



Thanks
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Feature Request: Lock Volume Settings

2016-11-14 Thread Gandalf Corvotempesta
On 14 Nov 2016 7:28 PM, "Joe Julian" wrote:
>
> IMHO, if a command will result in data loss, fail it. Period.
>
> It should never be ok for a filesystem to lose data. If someone wanted to
do that with ext or xfs they would have to format.
>

Exactly. I've written something similar in another mail.
Gluster should preserve data consistency at any cost.
If you are trying to do something bad, this should be blocked or, AT
MINIMUM, a confirmation must be asked for.

Like doing fsck on a mounted FS
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread ML Wong
Though remove-brick is not a usual action we would do on a Gluster volume,
it has consistently failed, ending in a corrupted gluster volume after
sharding has been turned on. Bug 1387878 is very similar to what I
encountered in the ESXi world. Add-brick would run successfully, but
virtual-machine files would crash after rebalance in one of my
environments. That did not happen in my other environment under the same
version (3.7.16). The difference between the two was that one was changing
from Replicate to Distributed-Replicate, but both are still configured with
only 2 replicas. I will have to test 3.8.* with Ganesha to see how it goes.

On Mon, Nov 14, 2016 at 8:29 AM, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> 2016-11-14 17:01 GMT+01:00 Vijay Bellur :
> > Accessing sharded data after disabling sharding is something that we
> > did not visualize as a valid use case at any point in time. Also, you
> > could access the contents by enabling sharding again. Given these
> > factors I think this particular problem has not been prioritized by
> > us.
>
> That's not true.
> If you have VMs running on a sharded volume and you disable sharding
> with the VMs still running, everything crashes and could lead to data loss,
> as the VMs will be unable to find their filesystems, qemu corrupts the
> image, and so on.
>
> If I write to a file that was sharded (for example a log file), then when
> you disable the shard, the application would write to the existing file
> (the one that was the first shard).
> If you re-enable sharding, you lose some data
>
> Example:
>
> 128MB file. shard set to 64MB. You have 2 chunks: shard1+shard2
>
> Now you are writing to the file:
>
> 
> 
> 
> 
>
> + are placed on shard1, + are placed on shard2
>
> If you disable the shard and write some extra data, , then 
> would be placed after  in shard1 (growing more than 64MB)
> and not on shard3
>
> If you re-enable shard,  is lost, as gluster would expect it as
> shard3. and I think gluster will read only the first 64MB from shard1.
> If gluster read the whole file, you'll get something like this:
>
> 
> 
> 
> 
> 
>
> in a text file this is bad; in a VM image, this means data
> loss/corruption that is almost impossible to fix.
>
>
> > As with many other projects, we are in a stage today where the number
> > of users and testers far outweigh the number of developers
> > contributing code. With this state it becomes hard to prioritize
> > problems from a long todo list for developers.  If valuable community
> > members like you feel strongly about a bug or feature that need
> > attention of developers, please call such issues out on the mailing
> > list. We will be more than happy to help.
>
> That's why I've asked for fewer features and more stability.
> If you have to prioritize, please choose all bugs that could lead to
> data corruption or similar.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Joe Julian
Features and stability are not mutually exclusive. 

Sometimes instability is cured by adding a feature. 

Fixing a bug is not something that's solved better by having more developers 
work on it.

Sometimes fixing one bug exposes a problem elsewhere. 

Using free open source community projects with your own hardware and system 
design weights the responsibility to test more heavily on yourself. If that's 
not a risk you can afford, you might consider contracting with a 3rd party 
which has "certified" installation parameters. IMHO.

On November 14, 2016 8:29:00 AM PST, Gandalf Corvotempesta 
 wrote:
>2016-11-14 17:01 GMT+01:00 Vijay Bellur :
>> Accessing sharded data after disabling sharding is something that we
>> did not visualize as a valid use case at any point in time. Also, you
>> could access the contents by enabling sharding again. Given these
>> factors I think this particular problem has not been prioritized by
>> us.
>
>That's not true.
>If you have VMs running on a sharded volume and you disable sharding
>with the VMs still running, everything crashes and could lead to data loss,
>as the VMs will be unable to find their filesystems, qemu corrupts the
>image, and so on.
>
>If I write to a file that was sharded (for example a log file), then when
>you disable the shard, the application would write to the existing file
>(the one that was the first shard).
>If you re-enable sharding, you lose some data
>
>Example:
>
>128MB file. shard set to 64MB. You have 2 chunks: shard1+shard2
>
>Now you are writing to the file:
>
>
>
>
>
>
>+ are placed on shard1, + are placed on shard2
>
>If you disable the shard and write some extra data, , then 
>would be placed after  in shard1 (growing more than 64MB)
>and not on shard3
>
>If you re-enable shard,  is lost, as gluster would expect it as
>shard3. and I think gluster will read only the first 64MB from shard1.
>If gluster read the whole file, you'll get something like this:
>
>
>
>
>
>
>
>in a text file this is bad; in a VM image, this means data
>loss/corruption that is almost impossible to fix.
>
>
>> As with many other projects, we are in a stage today where the number
>> of users and testers far outweigh the number of developers
>> contributing code. With this state it becomes hard to prioritize
>> problems from a long todo list for developers.  If valuable community
>> members like you feel strongly about a bug or feature that need
>> attention of developers, please call such issues out on the mailing
>> list. We will be more than happy to help.
>
>That's why I've asked for fewer features and more stability.
>If you have to prioritize, please choose all bugs that could lead to
>data corruption or similar.
>___
>Gluster-users mailing list
>Gluster-users@gluster.org
>http://www.gluster.org/mailman/listinfo/gluster-users

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Feature Request: Lock Volume Settings

2016-11-14 Thread Joe Julian
IMHO, if a command will result in data loss, fail it. Period.

It should never be ok for a filesystem to lose data. If someone wanted to do 
that with ext or xfs they would have to format. 

On November 14, 2016 8:15:16 AM PST, Ravishankar N  
wrote:
>On 11/14/2016 05:57 PM, Atin Mukherjee wrote:
>> This would be a straight forward thing to implement at glusterd, 
>> anyone up for it? If not, we will take this into consideration for 
>> GlusterD 2.0.
>>
>> On Mon, Nov 14, 2016 at 10:28 AM, Mohammed Rafi K C 
>> > wrote:
>>
>> I think it is worth to implement a lock option.
>>
>> +1
>>
>>
>> Rafi KC
>>
>>
>> On 11/14/2016 06:12 AM, David Gossage wrote:
>>> On Sun, Nov 13, 2016 at 6:35 PM, Lindsay Mathieson
>>> >> > wrote:
>>>
>>> As discussed recently, it is way to easy to make destructive
>>> changes
>>> to a volume,e.g change shard size. This can corrupt the data
>>> with no
>>> warnings and its all to easy to make a typo or access the
>>> wrong volume
>>> when doing 3am maintenance ...
>>>
>>> So I'd like to suggest something like the following:
>>>
>>>   gluster volume lock 
>>>
>
>
>I don't think this is a good idea. It would make more sense to give out
>
>verbose warnings in the individual commands themselves. A volume lock 
>doesn't prevent users from unlocking and still inadvertently running 
>those commands without knowing the implications. The remove brick set
>of 
>commands provides verbose messages nicely:
>
>$gluster v remove-brick testvol 127.0.0.2:/home/ravi/bricks/brick{4..6}
>
>commit
>Removing brick(s) can result in data loss. Do you want to Continue?
>(y/n) y
>volume remove-brick commit: success
>Check the removed bricks to ensure all files are migrated.
>If files with data are found on the brick path, copy them via a gluster
>
>mount point before re-purposing the removed brick
>
>My 2 cents,
>Ravi
>
>
>>>
>>> Setting this would fail all:
>>> - setting changes
>>> - add bricks
>>> - remove bricks
>>> - delete volume
>>>
>>>   gluster volume unlock 
>>>
>>> would allow all changes to be made.
>>>
>>> Just a thought, open to alternate suggestions.
>>>
>>> Thanks
>>>
>>> +
>>> sounds handy
>>>
>>> --
>>> Lindsay
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org 
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>> 
>>>
>>>
>>>
>>>
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org 
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>> 
>> ___ Gluster-devel
>> mailing list gluster-de...@gluster.org
>> 
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>  
>>
>> -- 
>> ~ Atin (atinm)
>>
>> ___
>> Gluster-devel mailing list
>> gluster-de...@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
>
>
>
>___
>Gluster-devel mailing list
>gluster-de...@gluster.org
>http://www.gluster.org/mailman/listinfo/gluster-devel

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Unable to stop volume because geo-replication

2016-11-14 Thread Chao-Ping Chien
Hi,



Hope someone can point me how to do this.



I want to delete a volume but am not able to do so because glusterfs keeps
reporting there is a geo-replication setup which seems not to exist at the
moment when I issue the stop command.



On a Redhat 7.2 kernel: 3.10.0-327.36.3.el7.x86_64

[root@eqappsrvp01 mule1]# rpm -qa |grep gluster

glusterfs-3.7.14-1.el7.x86_64

glusterfs-fuse-3.7.14-1.el7.x86_64

glusterfs-server-3.7.14-1.el7.x86_64

glusterfs-libs-3.7.14-1.el7.x86_64

glusterfs-api-3.7.14-1.el7.x86_64

glusterfs-geo-replication-3.7.14-1.el7.x86_64

glusterfs-cli-3.7.14-1.el7.x86_64

glusterfs-client-xlators-3.7.14-1.el7.x86_64





[root@eqappsrvp01 mule1]# gluster volume stop mule1
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: mule1: failed: geo-replication sessions are active for the volume mule1.

Stop geo-replication sessions involved in this volume. Use 'volume 
geo-replication status' command for more info.

[root@eqappsrvp01 mule1]# gluster volume geo-replication status



MASTER NODE    MASTER VOL     MASTER BRICK         SLAVE USER    SLAVE                             SLAVE NODE    STATUS     CRAWL STATUS    LAST_SYNCED
--------------------------------------------------------------------------------------------------------------------------------------------------------
eqappsrvp01    gitlab_data    /data/gitlab_data    root          ssh://eqappsrvd02::gitlab_data    N/A           Stopped    N/A             N/A

eqappsrvp02    gitlab_data    /data/gitlab_data    root          ssh://eqappsrvd02::gitlab_data    N/A           Stopped    N/A             N/A

[root@eqappsrvp01 mule1]# uname -a

Linux eqappsrvp01 3.10.0-327.36.3.el7.x86_64 #1 SMP Thu Oct 20 04:56:07 EDT 
2016 x86_64 x86_64 x86_64 GNU/Linux

[root@eqappsrvp01 mule1]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.2 (Maipo)
=================================================



I searched the internet and found that Red Hat Bugzilla bug 1342431 seems to
address this problem; according to its status it should be fixed in 3.7.12, but
in my version, 3.7.14, it still exists.



Thanks



Ping.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Gandalf Corvotempesta
2016-11-14 17:01 GMT+01:00 Vijay Bellur :
> Accessing sharded data after disabling sharding is something that we
> did not visualize as a valid use case at any point in time. Also, you
> could access the contents by enabling sharding again. Given these
> factors I think this particular problem has not been prioritized by
> us.

That's not true.
If you have VMs running on a sharded volume and you disable sharding
with the VMs still running, everything crashes and could lead to data loss, as
the VMs will be unable to find their filesystems, qemu corrupts the image, and
so on.

If I write to a file that was sharded (for example a log file), then when you
disable the shard, the application would write to the existing file (the one
that was the first shard).
If you re-enable sharding, you lose some data

Example:

128MB file. shard set to 64MB. You have 2 chunks: shard1+shard2

Now you are writing to the file:






+ are placed on shard1, + are placed on shard2

If you disable the shard and write some extra data, , then 
would be placed after  in shard1 (growing more than 64MB)
and not on shard3

If you re-enable shard,  is lost, as gluster would expect it as
shard3. and I think gluster will read only the first 64MB from shard1.
If gluster read the whole file, you'll get something like this:







in a text file this is bad; in a VM image, this means data
loss/corruption that is almost impossible to fix.


> As with many other projects, we are in a stage today where the number
> of users and testers far outweigh the number of developers
> contributing code. With this state it becomes hard to prioritize
> problems from a long todo list for developers.  If valuable community
> members like you feel strongly about a bug or feature that need
> attention of developers, please call such issues out on the mailing
> list. We will be more than happy to help.

That's why I've asked for fewer features and more stability.
If you have to prioritize, please choose all bugs that could lead to
data corruption or similar.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Gandalf Corvotempesta
2016-11-14 16:55 GMT+01:00 Krutika Dhananjay :
> The only way to fix it is to have sharding be part of the graph *even* if
> disabled,
> except that in this case, its job should be confined to aggregating the
> already
> sharded files during reads but NOT shard new files that are created, since
> it is
> supposed to "act" disabled. This is a slightly bigger change and this is why
> I had
> suggested the workaround at
> https://bugzilla.redhat.com/show_bug.cgi?id=1355846#c1
> back then.

Why not keep the shard xlator always on but set to a very high value, so that
sharding never actually happens? Something like 100GB (just as a proof of concept).

> FWIW, the documentation [1] does explain how to disable sharding the right
> way and has been in existence ever since sharding was first released in
> 3.7.0.
>
> [1] -
> http://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/shard/

Ok but:
1) that's for 3.7 *beta1*. I'm using 3.8
2) "advisable" doesn't mean "you have to". It's advice, not the
only way to disable a feature
3) I'm talking about a confirmation to add in the CLI, nothing strange. All
software asks for a confirmation when bad things could happen.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Feature Request: Lock Volume Settings

2016-11-14 Thread Ravishankar N

On 11/14/2016 05:57 PM, Atin Mukherjee wrote:
This would be a straight forward thing to implement at glusterd, 
anyone up for it? If not, we will take this into consideration for 
GlusterD 2.0.


On Mon, Nov 14, 2016 at 10:28 AM, Mohammed Rafi K C 
> wrote:


I think it is worth to implement a lock option.

+1


Rafi KC


On 11/14/2016 06:12 AM, David Gossage wrote:

On Sun, Nov 13, 2016 at 6:35 PM, Lindsay Mathieson
> wrote:

As discussed recently, it is way to easy to make destructive
changes
to a volume,e.g change shard size. This can corrupt the data
with no
warnings and its all to easy to make a typo or access the
wrong volume
when doing 3am maintenance ...

So I'd like to suggest something like the following:

  gluster volume lock 




I don't think this is a good idea. It would make more sense to give out 
verbose warnings in the individual commands themselves. A volume lock 
doesn't prevent users from unlocking and still inadvertently running 
those commands without knowing the implications. The remove brick set of 
commands provides verbose messages nicely:


$gluster v remove-brick testvol 127.0.0.2:/home/ravi/bricks/brick{4..6} 
commit

Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success
Check the removed bricks to ensure all files are migrated.
If files with data are found on the brick path, copy them via a gluster 
mount point before re-purposing the removed brick


My 2 cents,
Ravi




Setting this would fail all:
- setting changes
- add bricks
- remove bricks
- delete volume

  gluster volume unlock 

would allow all changes to be made.

Just a thought, open to alternate suggestions.

Thanks

+
sounds handy

--
Lindsay
___
Gluster-users mailing list
Gluster-users@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-users





___
Gluster-users mailing list
Gluster-users@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-users


___ Gluster-devel
mailing list gluster-de...@gluster.org

http://www.gluster.org/mailman/listinfo/gluster-devel
 


--
~ Atin (atinm)

___
Gluster-devel mailing list
gluster-de...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread David Gossage
On Mon, Nov 14, 2016 at 8:54 AM, Niels de Vos  wrote:

> On Mon, Nov 14, 2016 at 04:50:44PM +0530, Pranith Kumar Karampuri wrote:
> > On Mon, Nov 14, 2016 at 4:38 PM, Gandalf Corvotempesta <
> > gandalf.corvotempe...@gmail.com> wrote:
> >
> > > 2016-11-14 11:50 GMT+01:00 Pranith Kumar Karampuri <
> pkara...@redhat.com>:
> > > > To make gluster stable for VM images we had to add all these new
> features
> > > > and then fix all the bugs Lindsay/Kevin reported. We just fixed a
> > > corruption
> > > > issue that can happen with replace-brick which will be available in
> 3.9.0
> > > > and 3.8.6. The only 2 other known issues that can lead to
> corruptions are
> > > > add-brick and the bug you filed Gandalf. Krutika just 5 minutes back
> saw
> > > > something that could possibly lead to the corruption for the
> add-brick
> > > bug.
> > > > Is that really the Root cause? We are not sure yet, we need more
> time.
> > > > Without Lindsay/Kevin/David Gossage's support this workload would
> have
> > > been
> > > > in much worse condition. These bugs are not easy to re-create thus
> not
> > > easy
> > > > to fix. At least that has been Krutika's experience.
> > >
> > > Ok, but this changes should be placed in a "test" version and not
> > > marked as stable.
> > > I don't see any development release, only stable releases here.
> > > Do you want all features ? Try the "beta/rc/unstable/alpha/dev"
> version.
> > > Do you want the stable version without known bugs but slow on VMs
> > > workload? Use the "-stable" version.
> > >
> > > If you relase as stable, users tend to upgrade their cluster and use
> > > the newer feature (that you are marking as stable).
> > > What If I upgrade a production cluster to a stable version and try to
> > > add-brick that lead to data corruption ?
> > > I have to restore terabytes worth of data? Gluster is made for
> > > scale-out, what I my cluster was made with 500TB of VMs ?
> > > Try to restore 500TB from a backup
> > >
> > > This is unacceptable. add-brick/replace-brick should be common "daily"
> > > operations. You should heavy check these for regression or bug.
> > >
> >
> > This is a very good point. Adding other maintainers.
>
> Obviously this is unacceptible for versions that have sharding as a
> functional (not experimental) feature. All supported features are
> expected to function without major problems (like corruption) for all
> standard Gluster operations. Add-brick/replace-brick are surely such
> Gluster operations.
>
> Of course it is possible that this does not always happen, and our tests
> did not catch the problem. In that case, we really need to have a bug
> report with all the details, and preferably a script that can be used to
> reproduce and detect the failure.
>

I believe this bug relates to this particular issue raised in this email
chain.

https://bugzilla.redhat.com/show_bug.cgi?id=1387878

Kevin found bug, and Lindsay filed report after she was able to recreate it.


>
> FWIW sharding has several open bugs (like any other component), but it
> is not immediately clear to me if the problem reported in this email is
> in Bugzilla yet. These are the bugs that are expected to get fixed in
> upcoming minor releases:
>   https://bugzilla.redhat.com/buglist.cgi?component=
> sharding=bug_status=version=notequals=
> notequals=GlusterFS_format=advanced=CLOSED=mainline
>
> HTH,
> Niels
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Vijay Bellur
On Mon, Nov 14, 2016 at 10:38 AM, Gandalf Corvotempesta
 wrote:
> 2016-11-14 15:54 GMT+01:00 Niels de Vos :
>> Obviously this is unacceptible for versions that have sharding as a
>> functional (not experimental) feature. All supported features are
>> expected to function without major problems (like corruption) for all
>> standard Gluster operations. Add-brick/replace-brick are surely such
>> Gluster operations.
>
> Is sharding an experimental feature even in 3.8 ?
> Because in 3.8 announcement, it's declared stable:
> http://blog.gluster.org/2016/06/glusterfs-3-8-released/
> "Sharding is now stable for VM image storage. "
>

sharding was an experimental feature in 3.7. Based on the feedback
that we received in testing, we called it out as stable in 3.8. The
add-brick related issue is something that none of us encountered in
testing and we will determine how we can avoid missing such problems
in the future.

>> FWIW sharding has several open bugs (like any other component), but it
>> is not immediately clear to me if the problem reported in this email is
>> in Bugzilla yet. These are the bugs that are expected to get fixed in
>> upcoming minor releases:
>>   
>> https://bugzilla.redhat.com/buglist.cgi?component=sharding=bug_status=version=notequals=notequals=GlusterFS_format=advanced=CLOSED=mainline
>
> My issue with sharding was reported in bugzilla on 2016-07-12
> 4 months for a IMHO, critical bug.
>
> If you disable sharding on a sharded volume with existing shared data,
> you corrupt every existing file.

Accessing sharded data after disabling sharding is something that we
did not visualize as a valid use case at any point in time. Also, you
could access the contents by enabling sharding again. Given these
factors I think this particular problem has not been prioritized by
us.

As with many other projects, we are in a stage today where the number
of users and testers far outweigh the number of developers
contributing code. With this state it becomes hard to prioritize
problems from a long todo list for developers.  If valuable community
members like you feel strongly about a bug or feature that need
attention of developers, please call such issues out on the mailing
list. We will be more than happy to help.

Having explained the developer perspective, I do apologize for any
inconvenience you might have encountered from this particular bug.

Thanks!
Vijay
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Krutika Dhananjay
Yes. I apologise for the delay.

Disabling sharding would knock the translator itself off the client stack,
and
being that sharding is the actual (and the only) translator that has the
knowledge of how to interpret sharded files, and how to aggregate them,
removing the translator from the stack will make all shards start to appear
like
isolated files with no way to interpret the correlation between the
individual pieces.

The only way to fix it is to have sharding be part of the graph *even* if
disabled,
except that in this case, its job should be confined to aggregating the
already
sharded files during reads but NOT shard new files that are created, since
it is
supposed to "act" disabled. This is a slightly bigger change and this is
why I had
suggested the workaround at
https://bugzilla.redhat.com/show_bug.cgi?id=1355846#c1
back then.
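
A rough sketch of that idea (purely illustrative, not the actual shard
translator code; every name below is invented for the example): with the
translator still in the graph, reads of already-sharded files keep being
aggregated, while new files are simply passed through when sharding is
"disabled".

enum shard_action { PASS_THROUGH, SPLIT_INTO_SHARDS, AGGREGATE_SHARDS };

typedef struct {
        int enabled;    /* the user-visible features.shard on/off setting */
} shard_conf_t;

/* already_sharded would in reality come from the file's shard metadata */
/* (xattrs); here it is simply a flag supplied by the caller.           */
enum shard_action shard_route (const shard_conf_t *conf, int already_sharded,
                               int creating_new_file)
{
        if (already_sharded)
                return AGGREGATE_SHARDS;   /* always interpret existing shards */
        if (creating_new_file && conf->enabled)
                return SPLIT_INTO_SHARDS;  /* shard only while enabled         */
        return PASS_THROUGH;               /* "act" disabled for new files     */
}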

FWIW, the documentation [1] does explain how to disable sharding the right
way and has been in existence ever since sharding was first released in
3.7.0.

[1] - http://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/shard/

-Krutika



On Mon, Nov 14, 2016 at 9:08 PM, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> 2016-11-14 15:54 GMT+01:00 Niels de Vos :
> > Obviously this is unacceptible for versions that have sharding as a
> > functional (not experimental) feature. All supported features are
> > expected to function without major problems (like corruption) for all
> > standard Gluster operations. Add-brick/replace-brick are surely such
> > Gluster operations.
>
> Is sharding an experimental feature even in 3.8 ?
> Because in 3.8 announcement, it's declared stable:
> http://blog.gluster.org/2016/06/glusterfs-3-8-released/
> "Sharding is now stable for VM image storage. "
>
> > FWIW sharding has several open bugs (like any other component), but it
> > is not immediately clear to me if the problem reported in this email is
> > in Bugzilla yet. These are the bugs that are expected to get fixed in
> > upcoming minor releases:
> >   https://bugzilla.redhat.com/buglist.cgi?component=
> sharding=bug_status=version=notequals=
> notequals=GlusterFS_format=advanced=CLOSED=mainline
>
> My issue with sharding was reported in bugzilla on 2016-07-12
> 4 months for a IMHO, critical bug.
>
> If you disable sharding on a sharded volume with existing shared data,
> you corrupt every existing file.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Volume ping-timeout parameter and client side mount timeouts

2016-11-14 Thread Martin Schlegel
Hello Gluster Community

We have 2 brick nodes running with replication for a volume gv0, for which we set
"gluster volume set gv0 ping-timeout 20".

In our tests there seemed to be an unknown delay with this ping-timeout - we see it
timing out much later, after about 35 seconds, and not at around 20 seconds (see
the test below).

Our distributed database cluster is using Gluster as a secondary file system for
backups etc. - its Pacemaker cluster manager needs to know how long to wait
before giving up on the glusterfs-mounted file system becoming available again,
or when to fail over to another node.

1. When do we know when to give up waiting on the glusterfs mount point to
become accessible again following an outage on the brick server this client was
connected to ?
2. Is there a timeout / interval setting on the client side that we could
reduce, so that it more quickly tries to switch the mount point to a different,
available brick server ?


Regards,
Martin Schlegel

__

Here is how we tested this:

As a test we blocked the entire network on one of these brick nodes:
root@glusterfs-brick-node1 $ date;iptables -A INPUT -i bond0 -j DROP ; iptables
-A OUTPUT -o bond0 -j DROP
Mon Nov 14 08:26:55 UTC 2016

From the syslog on the glusterfs-client-node
Nov 14 08:27:30 glusterfs-client-node1 pgshared1[26783]: [2016-11-14
08:27:30.275694] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired]
0-gv0-client-0: server glusterfs-brick-node1:49152 has not responded in the last
20 seconds, disconnecting.

<--- This last message "has not responded in the last 20 seconds" is confusing
to me, because the brick node was clearly blocked for 35 seconds already ! Is
there some client-side check interval that can be reduced ?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Krutika Dhananjay
On Mon, Nov 14, 2016 at 8:24 PM, Niels de Vos  wrote:

> On Mon, Nov 14, 2016 at 04:50:44PM +0530, Pranith Kumar Karampuri wrote:
> > On Mon, Nov 14, 2016 at 4:38 PM, Gandalf Corvotempesta <
> > gandalf.corvotempe...@gmail.com> wrote:
> >
> > > 2016-11-14 11:50 GMT+01:00 Pranith Kumar Karampuri <
> pkara...@redhat.com>:
> > > > To make gluster stable for VM images we had to add all these new
> features
> > > > and then fix all the bugs Lindsay/Kevin reported. We just fixed a
> > > corruption
> > > > issue that can happen with replace-brick which will be available in
> 3.9.0
> > > > and 3.8.6. The only 2 other known issues that can lead to
> corruptions are
> > > > add-brick and the bug you filed Gandalf. Krutika just 5 minutes back
> saw
> > > > something that could possibly lead to the corruption for the
> add-brick
> > > bug.
> > > > Is that really the Root cause? We are not sure yet, we need more
> time.
> > > > Without Lindsay/Kevin/David Gossage's support this workload would
> have
> > > been
> > > > in much worse condition. These bugs are not easy to re-create thus
> not
> > > easy
> > > > to fix. At least that has been Krutika's experience.
> > >
> > > Ok, but these changes should be placed in a "test" version and not
> > > marked as stable.
> > > I don't see any development release, only stable releases here.
> > > Do you want all features? Try the "beta/rc/unstable/alpha/dev" version.
> > > Do you want the stable version without known bugs but slow on VM
> > > workloads? Use the "-stable" version.
> > >
> > > If you release as stable, users tend to upgrade their cluster and use
> > > the newer features (that you are marking as stable).
> > > What if I upgrade a production cluster to a stable version and try an
> > > add-brick that leads to data corruption?
> > > I have to restore terabytes worth of data? Gluster is made for
> > > scale-out; what if my cluster was made of 500TB of VMs?
> > > Try to restore 500TB from a backup.
> > >
> > > This is unacceptable. add-brick/replace-brick should be common "daily"
> > > operations. You should heavily check these for regressions or bugs.
> > >
> >
> > This is a very good point. Adding other maintainers.
>

I think Pranith's intention here was to bring to the other maintainers'
attention the point about development releases vs stable releases, although
his inline comment may have been a bit out of place (I was part of the
discussion that took place before this reply of his, in office today, hence
taking the liberty to clarify).

-Krutika


> Obviously this is unacceptable for versions that have sharding as a
> functional (not experimental) feature. All supported features are
> expected to function without major problems (like corruption) for all
> standard Gluster operations. Add-brick/replace-brick are surely such
> Gluster operations.
>
> Of course it is possible that this does not always happen, and our tests
> did not catch the problem. In that case, we really need to have a bug
> report with all the details, and preferably a script that can be used to
> reproduce and detect the failure.
>
> FWIW sharding has several open bugs (like any other component), but it
> is not immediately clear to me if the problem reported in this email is
> in Bugzilla yet. These are the bugs that are expected to get fixed in
> upcoming minor releases:
>   https://bugzilla.redhat.com/buglist.cgi?component=
> sharding=bug_status=version=notequals=
> notequals=GlusterFS_format=advanced=CLOSED=mainline
>
> HTH,
> Niels
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Gandalf Corvotempesta
2016-11-14 15:54 GMT+01:00 Niels de Vos :
> Obviously this is unacceptable for versions that have sharding as a
> functional (not experimental) feature. All supported features are
> expected to function without major problems (like corruption) for all
> standard Gluster operations. Add-brick/replace-brick are surely such
> Gluster operations.

Is sharding an experimental feature even in 3.8?
Because in the 3.8 announcement, it's declared stable:
http://blog.gluster.org/2016/06/glusterfs-3-8-released/
"Sharding is now stable for VM image storage."

> FWIW sharding has several open bugs (like any other component), but it
> is not immediately clear to me if the problem reported in this email is
> in Bugzilla yet. These are the bugs that are expected to get fixed in
> upcoming minor releases:
>   
> https://bugzilla.redhat.com/buglist.cgi?component=sharding=bug_status=version=notequals=notequals=GlusterFS_format=advanced=CLOSED=mainline

My issue with sharding was reported in bugzilla on 2016-07-12.
4 months for what is, IMHO, a critical bug.

If you disable sharding on a sharded volume with existing sharded data,
you corrupt every existing file.
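As a purely defensive sketch (gv0 and the brick path below are examples, not
taken from this thread), one can check whether a volume already holds sharded
data before touching any shard-related option:

# Is sharding enabled, and with which block size were existing shards written?
gluster volume get gv0 features.shard
gluster volume get gv0 features.shard-block-size

# On any brick, entries under the hidden .shard directory mean data is
# already sharded -- changing shard settings now is the destructive case:
ls /bricks/gv0/brick1/.shard | head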
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Niels de Vos
On Mon, Nov 14, 2016 at 04:50:44PM +0530, Pranith Kumar Karampuri wrote:
> On Mon, Nov 14, 2016 at 4:38 PM, Gandalf Corvotempesta <
> gandalf.corvotempe...@gmail.com> wrote:
> 
> > 2016-11-14 11:50 GMT+01:00 Pranith Kumar Karampuri :
> > > To make gluster stable for VM images we had to add all these new features
> > > and then fix all the bugs Lindsay/Kevin reported. We just fixed a
> > corruption
> > > issue that can happen with replace-brick which will be available in 3.9.0
> > > and 3.8.6. The only 2 other known issues that can lead to corruptions are
> > > add-brick and the bug you filed Gandalf. Krutika just 5 minutes back saw
> > > something that could possibly lead to the corruption for the add-brick
> > bug.
> > > Is that really the Root cause? We are not sure yet, we need more time.
> > > Without Lindsay/Kevin/David Gossage's support this workload would have
> > been
> > > in much worse condition. These bugs are not easy to re-create thus not
> > easy
> > > to fix. At least that has been Krutika's experience.
> >
> > Ok, but these changes should be placed in a "test" version and not
> > marked as stable.
> > I don't see any development release, only stable releases here.
> > Do you want all features? Try the "beta/rc/unstable/alpha/dev" version.
> > Do you want the stable version without known bugs but slow on VM
> > workloads? Use the "-stable" version.
> >
> > If you release as stable, users tend to upgrade their cluster and use
> > the newer features (that you are marking as stable).
> > What if I upgrade a production cluster to a stable version and try an
> > add-brick that leads to data corruption?
> > I have to restore terabytes worth of data? Gluster is made for
> > scale-out; what if my cluster was made of 500TB of VMs?
> > Try to restore 500TB from a backup.
> >
> > This is unacceptable. add-brick/replace-brick should be common "daily"
> > operations. You should heavily check these for regressions or bugs.
> >
> 
> This is a very good point. Adding other maintainers.

Obviously this is unacceptable for versions that have sharding as a
functional (not experimental) feature. All supported features are
expected to function without major problems (like corruption) for all
standard Gluster operations. Add-brick/replace-brick are surely such
Gluster operations.

Of course it is possible that this does not always happen, and our tests
did not catch the problem. In that case, we really need to have a bug
report with all the details, and preferably a script that can be used to
reproduce and detect the failure.

FWIW sharding has several open bugs (like any other component), but it
is not immediately clear to me if the problem reported in this email is
in Bugzilla yet. These are the bugs that are expected to get fixed in
upcoming minor releases:
  
https://bugzilla.redhat.com/buglist.cgi?component=sharding=bug_status=version=notequals=notequals=GlusterFS_format=advanced=CLOSED=mainline

HTH,
Niels


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Glusterfs readonly Issue

2016-11-14 Thread Atul Yadav
Dear Team,

In the event of a failure of master1, the glusterfs home directory on master2
becomes a read-only file system.

If we manually shut down master2, there is no impact on the file system and
all I/O operations complete without any problem.

Can you please provide some guidance to isolate the problem?



# gluster peer status
Number of Peers: 2

Hostname: master1-ib.dbt.au
Uuid: a5608d66-a3c6-450e-a239-108668083ff2
State: Peer in Cluster (Connected)

Hostname: compute01-ib.dbt.au
Uuid: d2c47fc2-f673-4790-b368-d214a58c59f4
State: Peer in Cluster (Connected)



# gluster vol info home

Volume Name: home
Type: Replicate
Volume ID: 2403ddf9-c2e0-4930-bc94-734772ef099f
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp,rdma
Bricks:
Brick1: master1-ib.dbt.au:/glusterfs/home/brick1
Brick2: master2-ib.dbt.au:/glusterfs/home/brick2
Options Reconfigured:
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
network.remote-dio: enable
cluster.quorum-type: auto
nfs.disable: on
performance.readdir-ahead: on
cluster.server-quorum-type: server
config.transport: tcp,rdma
network.ping-timeout: 10
cluster.server-quorum-ratio: 51%
cluster.enable-shared-storage: disable



# gluster vol heal home info
Brick master1-ib.dbt.au:/glusterfs/home/brick1
Status: Connected
Number of entries: 0

Brick master2-ib.dbt.au:/glusterfs/home/brick2
Status: Connected
Number of entries: 0


# gluster vol heal home info heal-failed
Gathering list of heal failed entries on volume home has been unsuccessful
on bricks that are down. Please check if all brick processes are
running[root@master2
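Not an answer, but a hedged checklist of the two settings in the output above
that usually decide whether a replica 2 volume goes read-only when one
specific node dies:

# Client-side quorum: with cluster.quorum-type "auto" on a 2-brick replica,
# writes are only allowed while the first brick (or a majority) is up, so
# losing master1 (brick1) can make clients read-only by design, while losing
# master2 (brick2) does not.
gluster volume get home cluster.quorum-type

# Server-side quorum: with server-quorum-type "server" and the 51% ratio shown
# above, glusterd also needs a majority of the peers to keep its bricks running.
gluster volume get home cluster.server-quorum-type

If that matches what you see, the behaviour is the configured quorum policy
rather than a fault.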


Thank You
Atul Yadav
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Feature Request: Lock Volume Settings

2016-11-14 Thread Atin Mukherjee
On Mon, Nov 14, 2016 at 6:16 PM, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> On 14 Nov 2016 13:27, "Atin Mukherjee"  wrote:
> >
> > This would be a straightforward thing to implement at glusterd; anyone
> up for it? If not, we will take this into consideration for GlusterD 2.0.
> >
>
> I would prefer an additional parameter to the CLI or a confirmation, something
> like:
>
> --do-what-i-say or --force
>

for locking the volume? If so, what additional benefit are you going to get
from it?


> Or a question to answer yes/no (like SSH when accepting the remote key)
>



-- 

~ Atin (atinm)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Feature Request: Lock Volume Settings

2016-11-14 Thread Gandalf Corvotempesta
On 14 Nov 2016 13:27, "Atin Mukherjee"  wrote:
>
> This would be a straightforward thing to implement at glusterd; anyone
up for it? If not, we will take this into consideration for GlusterD 2.0.
>

I would prefer an additional parameter to the CLI or a confirmation, something
like:

--do-what-i-say or --force

Or a question to answer yes/no (like SSH when accepting the remote key)
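Purely as an illustration of that confirmation flow (this is not an existing
gluster feature; the wrapper, the path to the real binary and the list of
guarded subcommands are all assumptions), a shell sketch:

#!/bin/bash
# Hypothetical "confirm before destructive volume operations" wrapper.
# Install it earlier in $PATH than the real client so it is found first.
REAL=/usr/sbin/gluster    # assumption: adjust to wherever the real CLI lives
if [ "$1" = "volume" ]; then
    case "$2" in
        set|add-brick|remove-brick|delete|reset)
            read -r -p "About to run 'gluster $*'. Type the volume name '$3' to confirm: " ans
            [ "$ans" = "$3" ] || { echo "Aborted."; exit 1; }
            ;;
    esac
fi
exec "$REAL" "$@"

A real lock would of course have to live inside glusterd, as Atin says; this
only mimics the yes/no prompt on the client side.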
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Feature Request: Lock Volume Settings

2016-11-14 Thread Atin Mukherjee
This would be a straightforward thing to implement at glusterd; anyone up
for it? If not, we will take this into consideration for GlusterD 2.0.

On Mon, Nov 14, 2016 at 10:28 AM, Mohammed Rafi K C 
wrote:

> I think it is worth implementing a lock option.
>
> +1
>
>
> Rafi KC
>
> On 11/14/2016 06:12 AM, David Gossage wrote:
>
> On Sun, Nov 13, 2016 at 6:35 PM, Lindsay Mathieson <
> lindsay.mathie...@gmail.com> wrote:
>
>> As discussed recently, it is way too easy to make destructive changes
>> to a volume, e.g. change shard size. This can corrupt the data with no
>> warnings, and it's all too easy to make a typo or access the wrong volume
>> when doing 3am maintenance ...
>>
>> So I'd like to suggest something like the following:
>>
>>   gluster volume lock 
>>
>> Setting this would fail all:
>> - setting changes
>> - add bricks
>> - remove bricks
>> - delete volume
>>
>>   gluster volume unlock 
>>
>> would allow all changes to be made.
>>
>> Just a thought, open to alternate suggestions.
>>
>> Thanks
>>
>> +
> sounds handy
>
>> --
>> Lindsay
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 

~ Atin (atinm)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] problems(?) using rdiff-backup

2016-11-14 Thread lejeczek

hi guys,

should rdiff-backup struggle to back up a glusterfs mount?
I'm trying glusterfs and was hoping, even expecting, that I could keep
on rdiff-backing up data. I back up directly to
local (non-gluster) XFS storage and get this:


$ rdiff-backup --exclude-other-filesystems --exclude-symbolic-links \
    /0-ALL.DATA/USER-HOME/ \
    /__.aLocalStorages/4/0-DATA-BACKUPs/USER-HOME.RDIFF-backup \
    >> /var/log/WHALE-backups.log
Exception '[Errno 61] No data available' raised of class '':
  File "/usr/lib64/python2.7/site-packages/rdiff_backup/robust.py", line 32, in check_common_error
    try: return function(*args)
  File "/usr/lib64/python2.7/site-packages/rdiff_backup/rpath.py", line 1149, in append
    return self.__class__(self.conn, self.base, self.index + (ext,))
  File "/usr/lib64/python2.7/site-packages/rdiff_backup/rpath.py", line 884, in __init__
    else: self.setdata()
  File "/usr/lib64/python2.7/site-packages/rdiff_backup/rpath.py", line 909, in setdata
    if self.lstat(): self.conn.rpath.setdata_local(self)
  File "/usr/lib64/python2.7/site-packages/rdiff_backup/rpath.py", line 1496, in setdata_local
    if Globals.eas_conn: rpath.data['ea'] = ea_get(rpath)
  File "/usr/lib64/python2.7/site-packages/rdiff_backup/eas_acls.py", line 597, in rpath_ea_get
    ea.read_from_rp(rp)
  File "/usr/lib64/python2.7/site-packages/rdiff_backup/eas_acls.py", line 60, in read_from_rp
    attr_list = rp.conn.xattr.listxattr(rp.path, rp.issym())
  ...

and more, and rdiff-backup crashes.
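The traceback dies inside rdiff-backup's extended-attribute handling
(xattr.listxattr), so one hedged way to narrow it down is to test xattr
access on the gluster mount directly, and/or to temporarily skip EA/ACL
handling (assuming this rdiff-backup build has the --no-eas/--no-acls
switches; the sample file path is only illustrative):

# Does listing xattrs on a file on the gluster mount itself error out?
getfattr -d -m . /0-ALL.DATA/USER-HOME/somefile    # path is illustrative

# Re-run the backup with EA/ACL handling disabled to see whether the crash
# goes away (switch availability is an assumption for this version):
rdiff-backup --no-eas --no-acls --exclude-other-filesystems \
    --exclude-symbolic-links /0-ALL.DATA/USER-HOME/ \
    /__.aLocalStorages/4/0-DATA-BACKUPs/USER-HOME.RDIFF-backup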

Would there be some info about "glusterfs backup best
practices" that is a must-read?

many thanks.
L.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Comparison with other SDS

2016-11-14 Thread Gandalf Corvotempesta
2016-11-14 12:51 GMT+01:00 Lindsay Mathieson :
> Of course if you're running a replica volume (non-dispersed) you should
> only need to do lookups locally. It would be interesting to know if that's
> an optimization gluster does.

I have a replica 2 with only 2 bricks, there is nothing to "disperse" :)
Probably, Lizard is faster because replication is done by each
chunkserver and not by the client:
http://moosefs.org/tl_files/mfs_folder/write862.png
(image for MooseFS, but Lizard is a fork; it replicates in the same way)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Comparison with other SDS

2016-11-14 Thread Lindsay Mathieson

On 14/11/2016 9:00 PM, Gandalf Corvotempesta wrote:

Can someone explain to me why Lizard is 10 times faster than gluster?
This is not a flame; I would only like to know the technical
differences between these two pieces of software.


It's my understanding that with many small-file operations involving
directory lookups etc. on disperse volumes, gluster has to check each
directory on each brick set, which leads to very high latencies. This is
where a metadata server is an advantage, I guess.



Of course if you're running a replica volume (non-dispersed) you
should only need to do lookups locally. It would be interesting to know
if that's an optimization gluster does.


--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Comparison with other SDS

2016-11-14 Thread Jean R. Franco
Hi Gandalf,

Can you provide more information about your setup?
How many nodes? What disk sizes? Are they VMs or physical machines? What is the 
speed of the network?
What OS are you running Lizard on, and finally how are the disks set up?

We use MooseFS, Nexenta, Gluster and Ceph here, and in our tests we see very 
little difference in speeds.
Some of these setups have advantages with many clients writing at once.

Thanks,

- Original Message -
From: "Gandalf Corvotempesta" 
To: "gluster-users" 
Sent: Monday, November 14, 2016 9:00:00
Subject: [Gluster-users] Comparison with other SDS

I did a very simple and stupid LizardFS installation this weekend.
Same configuration as gluster, same nodes, same disks. Both set up with
replica 2 and the same ZFS filesystem on each disk/brick.

The LizardFS installation took 10 minutes on all servers (1 client that
I've also used as master and 2 chunkservers); Gluster took less than 5
minutes from 0 to a working cluster (just apt-get, gluster peer probe
and volume create).

Performance:
extracting this:
https://cdn.kernel.org/pub/linux/kernel/v4.x/testing/linux-4.9-rc5.tar.xz
took 45 minutes (forty-five minutes) on Gluster and 4 minutes (four
minutes) on LizardFS. It's not a typo: 45 minutes vs 4.

Removing the whole directory tree: in Lizard less than 4 minutes; in
gluster I stopped the process after about 20 minutes.

Both were configured with sharding (64M). LizardFS/MooseFS has this hardcoded.
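For anyone who wants to repeat the comparison, this is the test as I
understand it from the description above, run from a client with the volume
FUSE-mounted (the /mnt/gv0 mount point is an assumption):

cd /mnt/gv0
wget https://cdn.kernel.org/pub/linux/kernel/v4.x/testing/linux-4.9-rc5.tar.xz
time tar xf linux-4.9-rc5.tar.xz     # tens of thousands of small files: metadata-heavy
time rm -rf linux-4.9-rc5            # whole-tree removal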

Can this be related to the metadata server? I don't think so. Gluster
is able to know where a file is without asking the brick servers.
In fact, gluster should be faster, as there isn't any query to make to
a metadata server when reading/writing.

Failures: LizardFS properly detects a missing/corrupted (e.g. bitrot)
chunk, but I was unable to understand its recovery process. I've not
tried the bit-rot feature in gluster.

Can someone explain to me why Lizard is 10 times faster than gluster?
This is not a flame; I would only like to know the technical
differences between these two pieces of software.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Pranith Kumar Karampuri
On Mon, Nov 14, 2016 at 4:38 PM, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> 2016-11-14 11:50 GMT+01:00 Pranith Kumar Karampuri :
> > To make gluster stable for VM images we had to add all these new features
> > and then fix all the bugs Lindsay/Kevin reported. We just fixed a
> corruption
> > issue that can happen with replace-brick which will be available in 3.9.0
> > and 3.8.6. The only 2 other known issues that can lead to corruptions are
> > add-brick and the bug you filed Gandalf. Krutika just 5 minutes back saw
> > something that could possibly lead to the corruption for the add-brick
> bug.
> > Is that really the Root cause? We are not sure yet, we need more time.
> > Without Lindsay/Kevin/David Gossage's support this workload would have
> been
> > in much worse condition. These bugs are not easy to re-create thus not
> easy
> > to fix. At least that has been Krutika's experience.
>
> Ok, but these changes should be placed in a "test" version and not
> marked as stable.
> I don't see any development release, only stable releases here.
> Do you want all features? Try the "beta/rc/unstable/alpha/dev" version.
> Do you want the stable version without known bugs but slow on VM
> workloads? Use the "-stable" version.
>
> If you release as stable, users tend to upgrade their cluster and use
> the newer features (that you are marking as stable).
> What if I upgrade a production cluster to a stable version and try an
> add-brick that leads to data corruption?
> I have to restore terabytes worth of data? Gluster is made for
> scale-out; what if my cluster was made of 500TB of VMs?
> Try to restore 500TB from a backup.
>
> This is unacceptable. add-brick/replace-brick should be common "daily"
> operations. You should heavily check these for regressions or bugs.
>

This is a very good point. Adding other maintainers.


>
> > One more take away is to get the
> > documentation right. Lack of documentation led Alex to try the worst
> > possible combo for storing VMs on gluster. So we as community failed in
> some
> > way there as well.
> >
> >   Krutika will be sending out VM usecase related documentation after
> > 28th of this month. If you have any other feedback, do let us know.
>
> Yes, lack of updated docs or a reference architecture is a big issue.
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Gandalf Corvotempesta
2016-11-14 11:50 GMT+01:00 Pranith Kumar Karampuri :
> To make gluster stable for VM images we had to add all these new features
> and then fix all the bugs Lindsay/Kevin reported. We just fixed a corruption
> issue that can happen with replace-brick which will be available in 3.9.0
> and 3.8.6. The only 2 other known issues that can lead to corruptions are
> add-brick and the bug you filed Gandalf. Krutika just 5 minutes back saw
> something that could possibly lead to the corruption for the add-brick bug.
> Is that really the Root cause? We are not sure yet, we need more time.
> Without Lindsay/Kevin/David Gossage's support this workload would have been
> in much worse condition. These bugs are not easy to re-create thus not easy
> to fix. At least that has been Krutika's experience.

Ok, but these changes should be placed in a "test" version and not
marked as stable.
I don't see any development release, only stable releases here.
Do you want all features? Try the "beta/rc/unstable/alpha/dev" version.
Do you want the stable version without known bugs but slow on VM
workloads? Use the "-stable" version.

If you release as stable, users tend to upgrade their cluster and use
the newer features (that you are marking as stable).
What if I upgrade a production cluster to a stable version and try an
add-brick that leads to data corruption?
I have to restore terabytes worth of data? Gluster is made for
scale-out; what if my cluster was made of 500TB of VMs?
Try to restore 500TB from a backup.

This is unacceptable. add-brick/replace-brick should be common "daily"
operations. You should heavily check these for regressions or bugs.

> One more take away is to get the
> documentation right. Lack of documentation led Alex to try the worst
> possible combo for storing VMs on gluster. So we as community failed in some
> way there as well.
>
>   Krutika will be sending out VM usecase related documentation after
> 28th of this month. If you have any other feedback, do let us know.

Yes, lack of updated docs or a reference architecture is a big issue.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Krutika Dhananjay
Which data corruption issue is this? Could you point me to the bug report
on bugzilla?

-Krutika

On Sat, Nov 12, 2016 at 4:28 PM, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> On 12 Nov 2016 10:21, "Kevin Lemonnier"  wrote:
> > We've had a lot of problems in the past, but at least for us 3.7.12 (and
> 3.7.15)
> > seems to be working pretty well as long as you don't add bricks. We
> started doing
> > multiple little clusters and abandoned the idea of one big cluster, had
> no
> > issues since :)
> >
>
> Well, adding bricks could be useful... :)
>
> Having to create multiple clusters is not a solution and is much more
> expensive.
> And if you corrupt data from a single cluster you still have issues.
>
> I think it would be better to add fewer features and focus more on stability.
> In a software-defined storage, stability and consistency are the most
> important things.
>
> I'm also subscribed to the moosefs and lizardfs mailing lists and I don't
> recall a single data corruption/data loss event.
>
> In gluster, after some days of testing I've found a huge data corruption
> issue that is still unfixed on bugzilla.
> If you change the shard size on a populated cluster, you break all
> existing data.
> Try to do this on a cluster with working VMs and see what happens:
> a single CLI command breaks everything and it is still unfixed.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

2016-11-14 Thread Pranith Kumar Karampuri
On Sat, Nov 12, 2016 at 4:28 PM, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> On 12 Nov 2016 10:21, "Kevin Lemonnier"  wrote:
> > We've had a lot of problems in the past, but at least for us 3.7.12 (and
> 3.7.15)
> > seems to be working pretty well as long as you don't add bricks. We
> started doing
> > multiple little clusters and abandoned the idea of one big cluster, had
> no
> > issues since :)
> >
>
> Well, adding bricks could be useful... :)
>
> Having to create multiple clusters is not a solution and is much more
> expensive.
> And if you corrupt data from a single cluster you still have issues.
>
> I think it would be better to add fewer features and focus more on stability.
>
First of all, thanks to all the folks who contributed to this thread. We
value your feedback.

In gluster-users and the ovirt community we saw people trying gluster and
complaining about heal times and split-brains. So we had to fix bugs in quorum
in 3-way replication; then we started working on features like sharding for
better heal times and arbiter volumes for cost benefits.

To make gluster stable for VM images we had to add all these new features
and then fix all the bugs Lindsay/Kevin reported. We just fixed a
corruption issue that can happen with replace-brick which will be available
in 3.9.0 and 3.8.6. The only 2 other known issues that can lead to
corruptions are add-brick and the bug you filed Gandalf. Krutika just 5
minutes back saw something that could possibly lead to the corruption for
the add-brick bug. Is that really the Root cause? We are not sure yet, we
need more time. Without Lindsay/Kevin/David Gossage's support this workload
would have been in much worse condition. These bugs are not easy to
re-create thus not easy to fix. At least that has been Krutika's experience.

   The takeaway from this mail thread for me is: I think it is important
to educate users about why we are adding new features. People are coming to
the conclusion that only bug fixing corresponds to stabilization and that
features do not. It is a wrong perception. Without the work that went into
adding all those new features to gluster, most probably you guys wouldn't
have given gluster another chance, because it used to be unusable for VM
workloads before these features. One more takeaway is to get the
documentation right. Lack of documentation led Alex to try the worst
possible combo for storing VMs on gluster. So we as a community failed in
some way there as well.

  Krutika will be sending out VM usecase related documentation after
28th of this month. If you have any other feedback, do let us know.

> In a software-defined storage, stability and consistency are the most
> important things.
>
> I'm also subscribed to the moosefs and lizardfs mailing lists and I don't
> recall a single data corruption/data loss event.
>
> In gluster, after some days of testing I've found a huge data corruption
> issue that is still unfixed on bugzilla.
> If you change the shard size on a populated cluster, you break all
> existing data.
> Try to do this on a cluster with working VMs and see what happens:
> a single CLI command breaks everything and it is still unfixed.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users