Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 6:38 PM Martin Toth  wrote:

> Hi Karthik,
>
> On Thu, Apr 11, 2019 at 12:43 PM Martin Toth  wrote:
>
>> Hi Karthik,
>>
>> more over, I would like to ask if there are some recommended
>> settings/parameters for SHD in order to achieve good or fair I/O while
>> volume will be healed when I will replace Brick (this should trigger
>> healing process).
>>
> If I understand your concern correctly, you need to get fair I/O
> performance for clients while healing takes place as part of the replace
> brick operation. For this you can turn off the "data-self-heal" and
> "metadata-self-heal" options until the heal completes on the new brick.
>
>
> This is exactly what I mean. I am running VM disks on the remaining 2 (out of
> 3 - one failed as mentioned) nodes and I need to ensure there will be fair
> I/O performance available on these two nodes while the replace brick operation
> heals the volume.
> I will not run any VMs on the node where the replace brick operation will be
> running. So if I understand correctly, when I set:
>
> # gluster volume set <VOLNAME> cluster.data-self-heal off
> # gluster volume set <VOLNAME> cluster.metadata-self-heal off
>
> this will tell Gluster clients (libgfapi and FUSE mount) not to read from the
> node where the replace brick operation is in place but from the remaining two
> healthy nodes. Is this correct? Thanks for the clarification.
>
The reads will be served from one of the good bricks since the file will
either be not present on the replaced brick at the time of read or it will
be present but marked for heal if it is not already healed. If already
healed by SHD, then it could be served from the new brick as well, but
there won't be any problem in reading from there in that scenario.
By setting these two options, a read coming from a client will not trigger a
heal of the file's data/metadata. Otherwise the client would try to heal the
file (if not already healed by the SHD) when the read lands on it, hence
slowing down the client.
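
A minimal sketch of the sequence discussed above, with <VOLNAME> standing in
for the actual volume name (the heal check and the re-enable step at the end
are implied by the discussion rather than spelled out in it):

# disable client-side healing for the duration of the replace/heal
gluster volume set <VOLNAME> cluster.data-self-heal off
gluster volume set <VOLNAME> cluster.metadata-self-heal off

# ... replace the brick, then watch the pending-heal list shrink ...
gluster volume heal <VOLNAME> info

# once no entries are pending, turn client-side healing back on
gluster volume set <VOLNAME> cluster.data-self-heal on
gluster volume set <VOLNAME> cluster.metadata-self-heal on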

>
> Turning off client side healing doesn't compromise data integrity and
> consistency. During a read request from the client, the pending xattrs are
> evaluated for the replica copies and the read is only served from a correct copy.
> During writes, I/O will continue on both replicas, and the SHD will take care of
> healing the files.
> After replacing the brick, we strongly recommend that you consider upgrading
> your gluster to one of the maintained versions. We have many stability
> related fixes there, which can handle some critical issues and corner cases
> which you could hit during these kinds of scenarios.
>
>
> This will be the first priority in our infrastructure after bringing this cluster
> back to a fully functional replica 3. I will upgrade to 3.12.x and then to
> version 5 or 6.
>
Sounds good.

If you are planning to use the same name for the new brick and you get an
error like "Brick may be containing or be contained by an existing
brick" even after using the force option, try using a different name. That
should work.
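
As an illustration only (<VOLNAME> is a placeholder and the brick2 directory
name is made up; the node2.san path is the one from this thread), such a
rename-based replacement could look like:

# on node2, create the new brick directory on the rebuilt RAID, then:
gluster volume replace-brick <VOLNAME> \
    node2.san:/tank/gluster/gv0imagestore/brick1 \
    node2.san:/tank/gluster/gv0imagestore/brick2 \
    commit force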

Regards,
Karthik

>
> BR,
> Martin
>
> Regards,
> Karthik
>
>> I had some problems in the past when healing was triggered: VM disks became
>> unresponsive because healing took most of the I/O. My volume contains only
>> big files with VM disks.
>>
>> Thanks for suggestions.
>> BR,
>> Martin
>>
>> On 10 Apr 2019, at 12:38, Martin Toth  wrote:
>>
>> Thanks, this looks OK to me. I will reset the brick because I don't have any
>> data anymore on the failed node, so I can use the same path / brick name.
>>
>> Is resetting a brick a dangerous command? Should I be worried about some
>> possible failure that will impact the remaining two nodes? I am running a really
>> old 3.7.6, but it is a stable version.
>>
>> Thanks,
>> BR!
>>
>> Martin
>>
>>
>> On 10 Apr 2019, at 12:20, Karthik Subrahmanya 
>> wrote:
>>
>> Hi Martin,
>>
>> After you add the new disks and create the RAID array, you can run the
>> following command to replace the old brick with the new one:
>>
>> - If you are going to use a different name for the new brick, you can run
>> gluster volume replace-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit force
>>
>> - If you are planning to use the same name for the new brick as well, then
>> you can use
>> gluster volume reset-brick <VOLNAME> <HOSTNAME:BRICKPATH> <HOSTNAME:BRICKPATH> commit force
>> Here the old brick's and the new brick's hostname & path should be the same.
>>
>> After replacing the brick, make sure the brick comes online using volume
>> status.
>> Heal should start automatically; you can check the heal status to see that all
>> the files get replicated to the newly added brick. If it does not start
>> automatically, you can start it manually by running gluster volume heal
>> <VOLNAME>.
>>
>> HTH,
>> Karthik
>>
>> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth  wrote:
>>
>>> Hi all,
>>>
>>> I am running replica 3 gluster with 3 bricks. One of my servers failed -
>>> all disks are showing errors and raid is in fault state.
>>>
>>> Type: Replicate
>>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>>> Status: Started
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Martin Toth
Hi Karthik,

> On Thu, Apr 11, 2019 at 12:43 PM Martin Toth wrote:
> Hi Karthik,
> 
> moreover, I would like to ask if there are some recommended
> settings/parameters for SHD in order to achieve good or fair I/O while the volume
> is being healed after I replace the brick (this should trigger the healing
> process).
> If I understand your concern correctly, you need to get fair I/O performance
> for clients while healing takes place as part of the replace brick
> operation. For this you can turn off the "data-self-heal" and
> "metadata-self-heal" options until the heal completes on the new brick.

This is exactly what I mean. I am running VM disks on the remaining 2 (out of 3 -
one failed as mentioned) nodes and I need to ensure there will be fair I/O
performance available on these two nodes while the replace brick operation
heals the volume.
I will not run any VMs on the node where the replace brick operation will be running.
So if I understand correctly, when I set:

# gluster volume set <VOLNAME> cluster.data-self-heal off
# gluster volume set <VOLNAME> cluster.metadata-self-heal off

this will tell Gluster clients (libgfapi and FUSE mount) not to read from the node
where the replace brick operation is in place but from the remaining two healthy nodes.
Is this correct? Thanks for the clarification.

> Turning off client side healing doesn't compromise data integrity and
> consistency. During a read request from the client, the pending xattrs are evaluated
> for the replica copies and the read is only served from a correct copy. During writes,
> I/O will continue on both replicas, and the SHD will take care of healing the files.
> After replacing the brick, we strongly recommend that you consider upgrading
> your gluster to one of the maintained versions. We have many stability
> related fixes there, which can handle some critical issues and corner cases
> which you could hit during these kinds of scenarios.

This will be the first priority in our infrastructure after bringing this cluster back to
a fully functional replica 3. I will upgrade to 3.12.x and then to version 5 or 6.

BR, 
Martin

> Regards,
> Karthik
> I had some problems in the past when healing was triggered: VM disks became
> unresponsive because healing took most of the I/O. My volume contains only big
> files with VM disks.
> 
> Thanks for suggestions.
> BR, 
> Martin
> 
>> On 10 Apr 2019, at 12:38, Martin Toth wrote:
>> 
>> Thanks, this looks OK to me. I will reset the brick because I don't have any
>> data anymore on the failed node, so I can use the same path / brick name.
>>
>> Is resetting a brick a dangerous command? Should I be worried about some possible
>> failure that will impact the remaining two nodes? I am running a really old 3.7.6,
>> but it is a stable version.
>> 
>> Thanks,
>> BR!
>> 
>> Martin
>>  
>> 
>>> On 10 Apr 2019, at 12:20, Karthik Subrahmanya wrote:
>>> 
>>> Hi Martin,
>>> 
>>> After you add the new disks and create the RAID array, you can run the
>>> following command to replace the old brick with the new one:
>>>
>>> - If you are going to use a different name for the new brick, you can run
>>> gluster volume replace-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit force
>>>
>>> - If you are planning to use the same name for the new brick as well, then
>>> you can use
>>> gluster volume reset-brick <VOLNAME> <HOSTNAME:BRICKPATH> <HOSTNAME:BRICKPATH> commit force
>>> Here the old brick's and the new brick's hostname & path should be the same.
>>>
>>> After replacing the brick, make sure the brick comes online using volume
>>> status.
>>> Heal should start automatically; you can check the heal status to see that all
>>> the files get replicated to the newly added brick. If it does not start
>>> automatically, you can start it manually by running gluster volume heal
>>> <VOLNAME>.
>>> 
>>> HTH,
>>> Karthik
>>> 
>>> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth wrote:
>>> Hi all,
>>> 
>>> I am running replica 3 gluster with 3 bricks. One of my servers failed - 
>>> all disks are showing errors and raid is in fault state.
>>> 
>>> Type: Replicate
>>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>>> Status: Started
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
>>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>>> 
>>> So one of my bricks has totally failed (node2). It went down and all data
>>> are lost (failed RAID on node2). Now I am running only two bricks on 2
>>> servers out of 3.
>>> This is a really critical problem for us; we could lose all data. I want to add
>>> new disks to node2, create a new RAID array on them and try to replace the failed
>>> brick on this node.
>>> 
>>> What is the procedure for replacing Brick2 on node2, can someone advise? I
>>> can’t find anything relevant in the documentation.
>>> 
>>> Thanks in advance,
>>> Martin
>>> ___
>>> Gluster-users mailing list
>>> 

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 1:40 PM Strahil Nikolov wrote:

> Hi Karthik,
>
> - the volume configuration you were using?
> I used oVirt 4.2.6 Gluster Wizard, so I guess - we need to involve the
> oVirt devs here.
> - why you wanted to replace your brick?
> I have deployed the arbiter in another location as I thought I could deploy
> the Thin Arbiter (still waiting for the docs to be updated), but once I
> realized that GlusterD doesn't support Thin Arbiter, I had to build another
> machine for a local arbiter - thus a replacement was needed.
>
We are working on supporting Thin-arbiter with GlusterD. Once done, we will
update on the users list so that you can play with it and let us know your
experience.

> - which brick(s) you tried replacing?
> I was replacing the old arbiter with a new one
> - what problem(s) did you face?
> All oVirt VMs got paused due to I/O errors.
>
There could be many reasons for this. Without knowing the exact state of
the system at that time, I am afraid to make any comment on this.

>
> At the end, I have rebuild the whole setup and I never tried to replace
> the brick this way (used only reset-brick which didn't cause any issues).
>
> As I mentioned that was on v3.12, which is not the default for oVirt
> 4.3.x - so my guess is that it is OK now (current is v5.5).
>
I don't remember anyone complaining about this recently. This should work
in the latest releases.
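
For reference, on versions that do have reset-brick (3.9.0 onwards), the flow
is a two-step sequence; a rough sketch with placeholder names:

# take the brick offline while keeping its definition in the volume
gluster volume reset-brick <VOLNAME> <HOSTNAME:BRICKPATH> start

# rebuild the underlying storage, then bring the same brick back and let heal run
gluster volume reset-brick <VOLNAME> <HOSTNAME:BRICKPATH> <HOSTNAME:BRICKPATH> commit force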

>
> Just sharing my experience.
>
Highly appreciated.

Regards,
Karthik

>
> Best Regards,
> Strahil Nikolov
>
> On Thursday, April 11, 2019, 00:53:52 GMT-4, Karthik Subrahmanya <
> ksubr...@redhat.com> wrote:
>
>
> Hi Strahil,
>
> Can you give us some more insights on
> - the volume configuration you were using?
> - why you wanted to replace your brick?
> - which brick(s) you tried replacing?
> - what problem(s) did you face?
>
> Regards,
> Karthik
>
> On Thu, Apr 11, 2019 at 10:14 AM Strahil  wrote:
>
> Hi Karthik,
> I used the brick replace function only once, when I wanted to change my
> Arbiter (v3.12.15 in oVirt 4.2.7), and it was a complete disaster.
> Most probably I should have stopped the source arbiter before doing that,
> but the docs didn't mention it.
>
> Thus I always use reset-brick, as it never let me down.
>
> Best Regards,
> Strahil Nikolov
> On Apr 11, 2019 07:34, Karthik Subrahmanya  wrote:
>
> Hi Strahil,
>
> Thank you for sharing your experience with the reset-brick option.
> Since he is using gluster version 3.7.6, the reset-brick [1] option is not
> implemented there; it was introduced in 3.9.0. He has
> to go with replace-brick with the force option if he wants to use the same
> path & name for the new brick.
> Yes, it is recommended that the new brick be of the same size as
> the other bricks.
>
> [1]
> https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
>
> Regards,
> Karthik
>
> On Wed, Apr 10, 2019 at 10:31 PM Strahil  wrote:
>
> I have used reset-brick - but I have just changed the brick layout.
> You may give it a try, but I guess you need your new brick to have the same
> amount of space (or more).
>
> Maybe someone more experienced should share a more sound solution.
>
> Best Regards,
> Strahil Nikolov
>
> On Apr 10, 2019 12:42, Martin Toth wrote:
> >
> > Hi all,
> >
> > I am running replica 3 gluster with 3 bricks. One of my servers failed -
> all disks are showing errors and raid is in fault state.
> >
> > Type: Replicate
> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is
> down
> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> >
> > So one of my bricks has totally failed (node2). It went down and all data
> > are lost (failed RAID on node2). Now I am running only two bricks on 2
> > servers out of 3.
> > This is a really critical problem for us; we could lose all data. I want to
> > add new disks to node2, create a new RAID array on them and try to replace
> > the failed brick on this node.
> >
> > What is the procedure for replacing Brick2 on node2, can someone advise?
> > I can’t find anything relevant in the documentation.
> >
> > Thanks in advance,
> > Martin
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 12:43 PM Martin Toth  wrote:

> Hi Karthik,
>
> moreover, I would like to ask if there are some recommended
> settings/parameters for SHD in order to achieve good or fair I/O while the
> volume is being healed after I replace the brick (this should trigger the
> healing process).
>
If I understand your concern correctly, you need to get fair I/O performance
for clients while healing takes place as part of the replace brick
operation. For this you can turn off the "data-self-heal" and
"metadata-self-heal" options until the heal completes on the new brick.
Turning off client side healing doesn't compromise data integrity and
consistency. During a read request from the client, the pending xattrs are
evaluated for the replica copies and the read is only served from a correct copy.
During writes, I/O will continue on both replicas, and the SHD will take care of
healing the files.
After replacing the brick, we strongly recommend that you consider upgrading
your gluster to one of the maintained versions. We have many stability
related fixes there, which can handle some critical issues and corner cases
which you could hit during these kinds of scenarios.

Regards,
Karthik

> I had some problems in the past when healing was triggered: VM disks became
> unresponsive because healing took most of the I/O. My volume contains only
> big files with VM disks.
>
> Thanks for suggestions.
> BR,
> Martin
>
> On 10 Apr 2019, at 12:38, Martin Toth  wrote:
>
> Thanks, this looks OK to me. I will reset the brick because I don't have any
> data anymore on the failed node, so I can use the same path / brick name.
>
> Is resetting a brick a dangerous command? Should I be worried about some
> possible failure that will impact the remaining two nodes? I am running a really
> old 3.7.6, but it is a stable version.
>
> Thanks,
> BR!
>
> Martin
>
>
> On 10 Apr 2019, at 12:20, Karthik Subrahmanya  wrote:
>
> Hi Martin,
>
> After you add the new disks and create the RAID array, you can run the
> following command to replace the old brick with the new one:
>
> - If you are going to use a different name for the new brick, you can run
> gluster volume replace-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit force
>
> - If you are planning to use the same name for the new brick as well, then
> you can use
> gluster volume reset-brick <VOLNAME> <HOSTNAME:BRICKPATH> <HOSTNAME:BRICKPATH> commit force
> Here the old brick's and the new brick's hostname & path should be the same.
>
> After replacing the brick, make sure the brick comes online using volume
> status.
> Heal should start automatically; you can check the heal status to see that all
> the files get replicated to the newly added brick. If it does not start
> automatically, you can start it manually by running gluster volume heal
> <VOLNAME>.
>
> HTH,
> Karthik
>
> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth  wrote:
>
>> Hi all,
>>
>> I am running replica 3 gluster with 3 bricks. One of my servers failed -
>> all disks are showing errors and raid is in fault state.
>>
>> Type: Replicate
>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>>
>> So one of my bricks has totally failed (node2). It went down and all data
>> are lost (failed RAID on node2). Now I am running only two bricks on 2
>> servers out of 3.
>> This is a really critical problem for us; we could lose all data. I want to
>> add new disks to node2, create a new RAID array on them and try to replace
>> the failed brick on this node.
>>
>> What is the procedure for replacing Brick2 on node2, can someone advise? I
>> can’t find anything relevant in the documentation.
>>
>> Thanks in advance,
>> Martin
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Invitation: Gluster Community Meeting (NA/EMEA friendly hours) @ Tue Apr 23, 2019 10:30pm - 11:30pm (IST) (gluster-users@gluster.org)

2019-04-11 Thread amarts
BEGIN:VCALENDAR
PRODID:-//Google Inc//Google Calendar 70.9054//EN
VERSION:2.0
CALSCALE:GREGORIAN
METHOD:REQUEST
BEGIN:VEVENT
DTSTART:20190423T17Z
DTEND:20190423T18Z
DTSTAMP:20190411T085751Z
ORGANIZER;CN=Gluster Community Calendar:mailto:vebj5bl0knsb9d0cm9eh9pbli4@g
 roup.calendar.google.com
UID:7v55fde915d3st1ptv8rg6n...@google.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=gluster-users@gluster.org;X-NUM-GUESTS=0:mailto:gluster-users@glust
 er.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=maintain...@gluster.org;X-NUM-GUESTS=0:mailto:maintainers@gluster.o
 rg
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=gluster-de...@gluster.org;X-NUM-GUESTS=0:mailto:gluster-devel@glust
 er.org
X-MICROSOFT-CDO-OWNERAPPTID:-288290639
CREATED:20190410T163536Z
DESCRIPTION:Bridge: https://bluejeans.com/486278655\n\n\nMeeting minutes: h
 ttps://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g?both\n\nPrevious Meeting notes: htt
 p://github.com/gluster/community\n\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~
 :~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-\nPlease do not edit this s
 ection of the description.\n\nView your event at https://www.google.com/cal
 endar/event?action=VIEW=N3Y1NWZkZTkxNWQzc3QxcHR2OHJnNm4zNzYgZ2x1c3Rlci1
 1c2Vyc0BnbHVzdGVyLm9yZw=NTIjdmViajVibDBrbnNiOWQwY205ZWg5cGJsaTRAZ3JvdXA
 uY2FsZW5kYXIuZ29vZ2xlLmNvbTc2MWYzNWEwZmFiMjk5YzFlYmM3NzkyNjNhOWY5MzExYTM4NG
 YwMWQ=Asia%2FKolkata=en=1.\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~
 :~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-
LAST-MODIFIED:20190411T085749Z
LOCATION:https://bluejeans.com/486278655
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:Gluster Community Meeting (NA/EMEA friendly hours)
TRANSP:OPAQUE
END:VEVENT
END:VCALENDAR


invite.ics
Description: application/ics
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Invitation: Gluster Community Meeting (APAC friendly hours) @ Tue Apr 16, 2019 11:30am - 12:30pm (IST) (gluster-users@gluster.org)

2019-04-11 Thread amarts
BEGIN:VCALENDAR
PRODID:-//Google Inc//Google Calendar 70.9054//EN
VERSION:2.0
CALSCALE:GREGORIAN
METHOD:REQUEST
BEGIN:VEVENT
DTSTART:20190416T06Z
DTEND:20190416T07Z
DTSTAMP:20190411T085648Z
ORGANIZER;CN=Gluster Community Calendar:mailto:vebj5bl0knsb9d0cm9eh9pbli4@g
 roup.calendar.google.com
UID:256uie4423kjhk4f8btivbg...@google.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=gluster-users@gluster.org;X-NUM-GUESTS=0:mailto:gluster-users@glust
 er.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=maintain...@gluster.org;X-NUM-GUESTS=0:mailto:maintainers@gluster.o
 rg
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=gluster-de...@gluster.org;X-NUM-GUESTS=0:mailto:gluster-devel@glust
 er.org
X-MICROSOFT-CDO-OWNERAPPTID:1601644375
CREATED:20190410T163315Z
DESCRIPTION:Bridge: https://bluejeans.com/836554017\n\nMeeting minutes: htt
 ps://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g?both\n\nPrevious Meeting notes: http:
 //github.com/gluster/community\n\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~
 :~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-\nPlease do not edit this sec
 tion of the description.\n\nView your event at https://www.google.com/calen
 dar/event?action=VIEW=MjU2dWllNDQyM2tqaGs0ZjhidGl2YmdtM2YgZ2x1c3Rlci11c
 2Vyc0BnbHVzdGVyLm9yZw=NTIjdmViajVibDBrbnNiOWQwY205ZWg5cGJsaTRAZ3JvdXAuY
 2FsZW5kYXIuZ29vZ2xlLmNvbTZlODU1NTU1Mzk4NjllOTQ4NzUxODAxYTQ4M2E4Y2ExMDRhODg3
 YjY=Asia%2FKolkata=en=1.\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~
 :~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-
LAST-MODIFIED:20190411T085646Z
LOCATION:https://bluejeans.com/836554017
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:Gluster Community Meeting (APAC friendly hours)
TRANSP:OPAQUE
END:VEVENT
END:VCALENDAR


invite.ics
Description: application/ics
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Proposal: Changes in Gluster Community meetings

2019-04-11 Thread Amar Tumballi Suryanarayan
Hi All,

Below is the final details of our community meeting, and I will be sending
invites to mailing list following this email. You can add Gluster Community
Calendar so you can get notifications on the meetings.

We are starting the meetings from next week. For the first meeting, we need
1 volunteer from the users to discuss their use case / what went well, what
went badly, etc., preferably from the APAC region, with the NA/EMEA region the following week.

Draft Content: https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g

Gluster Community Meeting

Previous Meeting minutes:

   - http://github.com/gluster/community

Date/Time:
Check the community calendar

Bridge

   - APAC friendly hours
  - Bridge: https://bluejeans.com/836554017
   - NA/EMEA
  - Bridge: https://bluejeans.com/486278655

--
Attendance

   - Name, Company

Host

   - Who will host next meeting?
  - Host will need to send out the agenda 24hr - 12hrs in advance to
  mailing list, and also make sure to send the meeting minutes.
  - Host will need to reach out to one user at least who can talk about
  their usecase, their experience, and their needs.
  - Host needs to send meeting minutes as PR to
  http://github.com/gluster/community

User stories

   - Discuss 1 usecase from a user.
  - How was the architecture derived, what volume type used, options,
  etc?
  - What were the major issues faced ? How to improve them?
  - What worked good?
  - How can we all collaborate well, so it is win-win for the community
  and the user? How can we

Community

   - Any release updates?
   - Blocker issues across the project?
   - Metrics
      - Number of new bugs since previous meeting. How many are not triaged?
      - Number of emails, anything unanswered?

Conferences / Meetups

   - Any conference in the next 1 month where gluster-developers are going?
   gluster-users are going? So we can meet and discuss.

Developer focus

   - Any design specs to discuss?
   - Metrics of the week?
      - Coverity
      - Clang-Scan
      - Number of patches from new developers.
      - Did we increase test coverage?
      - [Atin] Also talk about most frequent test failures in the CI and
        carve out an AI to get them fixed.

RoundTable

   - 



Regards,
Amar

On Mon, Mar 25, 2019 at 8:53 PM Amar Tumballi Suryanarayan <
atumb...@redhat.com> wrote:

> Thanks for the feedback Darrell,
>
> The new proposal is to have one in North America 'morning' time (10AM
> PST), and another in Asia day time, which is evening 7pm/6pm in Australia,
> 9pm New Zealand, 5pm Tokyo, 4pm Beijing.
>
> For example, if we choose every other Tuesday for the meeting, and the 1st of the
> month is a Tuesday, we would have the North America time slot on the 1st, and on
> the 15th it would be the Asia/Pacific time slot.
>
> Hopefully, this way, we can cover all the timezones, and meeting minutes
> would be committed to github repo, so that way, it will be easier for
> everyone to be aware of what is happening.
>
> Regards,
> Amar
>
> On Mon, Mar 25, 2019 at 8:40 PM Darrell Budic 
> wrote:
>
>> As a user, I’d like to attend more of these, but the time slot is my 3AM.
>> Any possibility for a rolling schedule (move the meeting +6 hours each week
>> with rolling attendance from maintainers?) or an occasional regional
>> meeting offset 12 hours from the one you’re proposing?
>>
>>   -Darrell
>>
>> On Mar 25, 2019, at 4:25 AM, Amar Tumballi Suryanarayan <
>> atumb...@redhat.com> wrote:
>>
>> All,
>>
>> We currently have 3 meetings which are public:
>>
>> 1. Maintainer's Meeting
>>
>> - Runs once every 2 weeks (on Mondays); current attendance is around 3-5
>> on average, and not much is discussed.
>> - Without majority attendance, we can't make any decisions either.
>>
>> 2. Community meeting
>>
>> - Supposed to happen on #gluster-meeting every 2 weeks, and is the only
>> meeting which is for 'Community/Users'. The others are for developers as of
>> now.
>> Sadly, attendance has been getting close to 0 in recent times.
>>
>> 3. GCS meeting
>>
>> - We started it as an effort inside the Red Hat gluster team, and opened it
>> up for the community from Jan 2019, but the attendance was always from RHT
>> members, and we haven't seen any traction from the wider group.

Re: [Gluster-users] Gluster snapshot fails

2019-04-11 Thread Strahil Nikolov
Hi Rafi,

thanks for your update.
I have tested again with another gluster volume:

[root@ovirt1 glusterfs]# gluster volume info isos

Volume Name: isos
Type: Replicate
Volume ID: 9b92b5bd-79f5-427b-bd8d-af28b038ed2a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1:/gluster_bricks/isos/isos
Brick2: ovirt2:/gluster_bricks/isos/isos
Brick3: ovirt3.localdomain:/gluster_bricks/isos/isos (arbiter)
Options Reconfigured:
cluster.granular-entry-heal: enable
performance.strict-o-direct: on
network.ping-timeout: 30
storage.owner-gid: 36
storage.owner-uid: 36
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 1
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: off
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.enable-shared-storage: enable

Command run:
logrotate -f glusterfs ; logrotate -f glusterfs-georep; gluster snapshot create isos-snap-2019-04-11 isos description TEST

Logs:

[root@ovirt1 glusterfs]# cat cli.log
[2019-04-11 07:51:02.367453] I [cli.c:769:main] 0-cli: Started running gluster 
with version 5.5
[2019-04-11 07:51:02.486863] I [MSGID: 101190] 
[event-epoll.c:621:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 1
[2019-04-11 07:51:02.556813] E [cli-rpc-ops.c:11293:gf_cli_snapshot] 0-cli: 
cli_to_glusterd for snapshot failed
[2019-04-11 07:51:02.556880] I [input.c:31:cli_batch] 0-: Exiting with: -1
[root@ovirt1 glusterfs]# cat glusterd.log
[2019-04-11 07:51:02.553357] E [MSGID: 106024] 
[glusterd-snapshot.c:2547:glusterd_snapshot_create_prevalidate] 0-management: 
Snapshot is supported only for thin provisioned LV. Ensure that all bricks of 
isos are thinly provisioned LV.
[2019-04-11 07:51:02.553365] W [MSGID: 106029] 
[glusterd-snapshot.c:8613:glusterd_snapshot_prevalidate] 0-management: Snapshot 
create pre-validation failed
[2019-04-11 07:51:02.553703] W [MSGID: 106121] 
[glusterd-mgmt.c:147:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot 
Prevalidate Failed
[2019-04-11 07:51:02.553719] E [MSGID: 106121] 
[glusterd-mgmt.c:1015:glusterd_mgmt_v3_pre_validate] 0-management: Pre 
Validation failed for operation Snapshot on local node
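
One way to approximate that pre-check by hand on each node (the brick path is
taken from the volume info above and the VG/LV names match the lvs output shown
below; adjust them for your own layout), which is essentially what the lvs
output below confirms:

# find the device backing the brick mount
df --output=source /gluster_bricks/isos

# a thin LV reports its pool in pool_lv and a leading 'V' in the attr field
lvs --noheadings -o lv_name,pool_lv,lv_attr gluster_vg_md0/gluster_lv_isos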

My LVs hosting the bricks are:

[root@ovirt1 ~]# lvs gluster_vg_md0
  LV              VG             Attr       LSize   Pool            Origin Data%  Meta%  Move Log Cpy%Sync Convert
  gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool        35.97
  gluster_lv_isos gluster_vg_md0 Vwi-aot---  50.00g my_vdo_thinpool        52.11
  my_vdo_thinpool gluster_vg_md0 twi-aot---   9.86t                        2.04   11.45

[root@ovirt1 ~]# ssh ovirt2 "lvs gluster_vg_md0"
  LV              VG             Attr       LSize   Pool            Origin Data%  Meta%  Move Log Cpy%Sync Convert
  gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool        35.98
  gluster_lv_isos gluster_vg_md0 Vwi-aot---  50.00g my_vdo_thinpool        25.94
  my_vdo_thinpool gluster_vg_md0 twi-aot---  <9.77t                        1.93   11.39
[root@ovirt1 ~]# ssh ovirt3 "lvs gluster_vg_sda3"
  LV                    VG              Attr       LSize  Pool                  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  gluster_lv_data       gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3        0.17
  gluster_lv_engine     gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3        0.16
  gluster_lv_isos       gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3        0.12
  gluster_thinpool_sda3 gluster_vg_sda3 twi-aotz-- 41.00g                              0.16   1.58

As you can see, all bricks are thin LVs and space is not the issue.
Can someone hint me how to enable debug, so the gluster logs can show the reason
for that pre-check failure?
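
For what it's worth, a couple of ways glusterd's log verbosity can typically be
raised (exact paths and flags depend on the packaging, so treat these as
assumptions rather than a recipe):

# run the management daemon directly at DEBUG level
systemctl stop glusterd
glusterd --log-level DEBUG

# or, on RHEL/CentOS style packaging, set LOG_LEVEL=DEBUG in
# /etc/sysconfig/glusterd and then: systemctl restart glusterd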
Best Regards,
Strahil Nikolov


On Wednesday, April 10, 2019, 9:05:15 AM GMT-4, Rafi Kavungal Chundattu
Parambil wrote:

Hi Strahil,

The name of the device is not at all a problem here. Can you please check the log
of glusterd and see if there is any useful information about the failure? Also,
please provide the output of `lvscan` and `lvs --noheadings -o pool_lv` from
all nodes.
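
A small loop like this (hostnames taken from the volume info earlier in the
thread; lv_name added only for readability) could collect that from all three
nodes in one go:

for h in ovirt1 ovirt2 ovirt3.localdomain; do
    echo "== $h =="
    ssh "$h" 'lvscan; lvs --noheadings -o lv_name,pool_lv'
done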

Regards
Rafi KC

- Original Message -
From: "Strahil Nikolov" 
To: gluster-users@gluster.org
Sent: Wednesday, April 10, 2019 2:36:39 AM
Subject: [Gluster-users] Gluster snapshot fails

Hello Community, 

I have a problem running a snapshot of a replica 3 arbiter 1 volume. 

Error: 
[root@ovirt2 ~]# gluster snapshot create before-423 engine description "Before 
upgrade of engine from 4.2.2 to 4.2.3" 
snapshot create: failed: Snapshot is supported only for thin provisioned LV. 
Ensure that all bricks of engine are thinly provisioned LV.

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Martin Toth
Hi Karthik,

moreover, I would like to ask if there are some recommended
settings/parameters for SHD in order to achieve good or fair I/O while the volume
is being healed after I replace the brick (this should trigger the healing process).
I had some problems in the past when healing was triggered: VM disks became
unresponsive because healing took most of the I/O. My volume contains only big
files with VM disks.

Thanks for suggestions.
BR, 
Martin

> On 10 Apr 2019, at 12:38, Martin Toth  wrote:
> 
> Thanks, this looks OK to me. I will reset the brick because I don't have any data
> anymore on the failed node, so I can use the same path / brick name.
>
> Is resetting a brick a dangerous command? Should I be worried about some possible
> failure that will impact the remaining two nodes? I am running a really old 3.7.6,
> but it is a stable version.
> 
> Thanks,
> BR!
> 
> Martin
>  
> 
>> On 10 Apr 2019, at 12:20, Karthik Subrahmanya wrote:
>> 
>> Hi Martin,
>> 
>> After you add the new disks and create the RAID array, you can run the
>> following command to replace the old brick with the new one:
>>
>> - If you are going to use a different name for the new brick, you can run
>> gluster volume replace-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit force
>>
>> - If you are planning to use the same name for the new brick as well, then
>> you can use
>> gluster volume reset-brick <VOLNAME> <HOSTNAME:BRICKPATH> <HOSTNAME:BRICKPATH> commit force
>> Here the old brick's and the new brick's hostname & path should be the same.
>>
>> After replacing the brick, make sure the brick comes online using volume
>> status.
>> Heal should start automatically; you can check the heal status to see that all
>> the files get replicated to the newly added brick. If it does not start
>> automatically, you can start it manually by running gluster volume heal
>> <VOLNAME>.
>> 
>> HTH,
>> Karthik
>> 
>> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth wrote:
>> Hi all,
>> 
>> I am running replica 3 gluster with 3 bricks. One of my servers failed - all 
>> disks are showing errors and raid is in fault state.
>> 
>> Type: Replicate
>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>> 
>> So one of my bricks has totally failed (node2). It went down and all data are
>> lost (failed RAID on node2). Now I am running only two bricks on 2 servers
>> out of 3.
>> This is a really critical problem for us; we could lose all data. I want to add
>> new disks to node2, create a new RAID array on them and try to replace the failed
>> brick on this node.
>>
>> What is the procedure for replacing Brick2 on node2, can someone advise? I
>> can’t find anything relevant in the documentation.
>> 
>> Thanks in advance,
>> Martin
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org 
>> https://lists.gluster.org/mailman/listinfo/gluster-users 
>> 

___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] SEGFAULT in FUSE layer

2019-04-11 Thread Amar Tumballi Suryanarayan
Thanks for the report Florian. We will look into this.

On Wed, Apr 10, 2019 at 9:03 PM Florian Manschwetus <
manschwe...@cs-software-gmbh.de> wrote:

> Hi All,
>
> I’d like to bring this bug report, I just opened, to your attention.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1697971
>
>
>
>
>
>
>
> --
>
> Mit freundlichen Grüßen / With kind regards
>
> Florian Manschwetus
>
>
>
> CS Software Concepts and Solutions GmbH
>
> Geschäftsführer / Managing director: Dr. Werner Alexi
>
> Amtsgericht Wiesbaden HRB 10004 (Commercial registry)
>
> Schiersteiner Straße 31
>
> D-65187 Wiesbaden
>
> Germany
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Amar Tumballi (amarts)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users