Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-20 Thread Martin Toth
Just for other users who may find this useful.

I finally started the Gluster server process on the failed node that lost its brick, and all
went OK.
The server is available as a peer again and the failed brick is not running, so I can
continue with the replace-brick / reset-brick operation.
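
For reference, a minimal sketch of how this state can be verified before continuing
(the volume name is a placeholder; these are standard gluster CLI calls, nothing
specific to my cluster):

# gluster peer status                  # recovered node should show "Peer in Cluster (Connected)"
# gluster volume status <VOLNAME>      # the failed brick should still be listed as offline (Online: N)
# gluster volume heal <VOLNAME> info   # shows the entries pending heal on the surviving bricks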

> On 16 Apr 2019, at 17:44, Martin Toth  wrote:
> 
> Thanks for the clarification, one more question.
> 
> When I recover (boot) the failed node and this peer becomes available again to
> the remaining two nodes, how do I tell Gluster to mark this brick as failed?
> 
> I mean, I’ve booted the failed node back up without networking. The disk partition (ZFS
> pool on other disks) where the brick lived before the failure is lost.
> Can I start Gluster even though I don't have the ZFS pool where the failed brick
> was before?
> 
> This won't be a problem when I connect this node back to the cluster?
> (before the brick replace/reset command is issued)
> 
> Thanks. BR!
> Martin
> 
>> On 11 Apr 2019, at 15:40, Karthik Subrahmanya wrote:
>> 
>> 
>> 
>> On Thu, Apr 11, 2019 at 6:38 PM Martin Toth wrote:
>> Hi Karthik,
>> 
>>> On Thu, Apr 11, 2019 at 12:43 PM Martin Toth wrote:
>>> Hi Karthik,
>>> 
>>> Moreover, I would like to ask if there are some recommended
>>> settings/parameters for SHD in order to achieve good or fair I/O while the
>>> volume is being healed after I replace the brick (this should trigger the
>>> healing process).
>>> If I understand your concern correctly, you need to get fair I/O performance
>>> for clients while healing takes place as part of the replace brick
>>> operation. For this you can turn off the "data-self-heal" and
>>> "metadata-self-heal" options until the heal completes on the new brick.
>> 
>> This is exactly what I mean. I am running VM disks on the remaining 2 (out of 3
>> - one failed as mentioned) nodes and I need to ensure there will be fair I/O
>> performance available on these two nodes while the replace brick operation
>> heals the volume.
>> I will not run any VMs on the node where the replace brick operation will be
>> running. So if I understand correctly, when I set:
>> 
>> # gluster volume set <VOLNAME> cluster.data-self-heal off
>> # gluster volume set <VOLNAME> cluster.metadata-self-heal off
>> 
>> this will tell Gluster clients (libgfapi and FUSE mount) not to read from the
>> node where the replace brick operation is in place but from the remaining two
>> healthy nodes. Is this correct? Thanks for the clarification.
>> The reads will be served from one of the good bricks, since the file will
>> either be absent from the replaced brick at the time of the read, or it will
>> be present but marked for heal if it is not already healed. If it is already
>> healed by SHD, the read could be served from the new brick as well, and there
>> won't be any problem in reading from there in that scenario.
>> By setting these two options, whenever a read comes from a client it will not
>> try to heal the file for data/metadata. Otherwise it would try to heal (if
>> not already healed by SHD) when the read lands on it, hence slowing down
>> the client.
>> 
>>> Turning off client-side healing doesn't compromise data integrity and
>>> consistency. During a read request from the client, the pending xattrs are
>>> evaluated for the replica copies and the read is only served from a correct copy.
>>> During writes, I/O will continue on both replicas, and SHD will take care of
>>> healing the files.
>>> After replacing the brick, we strongly recommend that you consider upgrading
>>> your gluster to one of the maintained versions. We have many stability
>>> related fixes there, which can handle some critical issues and corner cases
>>> which you could hit during these kinds of scenarios.
>> 
>> This will be the first priority for our infrastructure after getting this cluster
>> back to a fully functional replica 3. I will upgrade to 3.12.x and then to
>> version 5 or 6.
>> Sounds good.
>> 
>> If you are planning to have the same name for the new brick and if you get 
>> the error like "Brick may be containing or be contained by an existing 
>> brick" even after using the force option, try  using a different name. That 
>> should work.
>> 
>> Regards,
>> Karthik 
>> 
>> BR, 
>> Martin
>> 
>>> Regards,
>>> Karthik
>>> I had some problems in the past when healing was triggered: VM disks became
>>> unresponsive because healing took most of the I/O. My volume contains only
>>> big files with VM disks.
>>> 
>>> Thanks for suggestions.
>>> BR, 
>>> Martin
>>> 
 On 10 Apr 2019, at 12:38, Martin Toth wrote:
 
 Thanks, this looks OK to me. I will reset the brick because I don't have any
 data on the failed node anymore, so I can use the same path / brick name.
 
 Is resetting a brick a dangerous command? Should I be worried about some
 possible failure that will impact the remaining two nodes? I am running a really
 old, but stable, version: 3.7.6.
 
 Thanks,
 

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-16 Thread Martin Toth
Thanks for the clarification, one more question.

When I recover (boot) the failed node and this peer becomes available again to
the remaining two nodes, how do I tell Gluster to mark this brick as failed?

I mean, I’ve booted the failed node back up without networking. The disk partition (ZFS
pool on other disks) where the brick lived before the failure is lost.
Can I start Gluster even though I don't have the ZFS pool where the failed brick was
before?

This won't be a problem when I connect this node back to the cluster? (before
the brick replace/reset command is issued)

Thanks. BR!
Martin

> On 11 Apr 2019, at 15:40, Karthik Subrahmanya  wrote:
> 
> 
> 
> On Thu, Apr 11, 2019 at 6:38 PM Martin Toth wrote:
> Hi Karthik,
> 
>> On Thu, Apr 11, 2019 at 12:43 PM Martin Toth wrote:
>> Hi Karthik,
>> 
>> more over, I would like to ask if there are some recommended 
>> settings/parameters for SHD in order to achieve good or fair I/O while 
>> volume will be healed when I will replace Brick (this should trigger healing 
>> process). 
>> If I understand your concern correctly, you need to get fair I/O performance 
>> for clients while healing takes place as part of  the replace brick 
>> operation. For this you can turn off the "data-self-heal" and 
>> "metadata-self-heal" options until the heal completes on the new brick.
> 
> This is exactly what I mean. I am running VM disks on remaining 2 (out of 3 - 
> one failed as mentioned) nodes and I need to ensure there will be fair I/O 
> performance available on these two nodes while replace brick operation will 
> heal volume.
> I will not run any VMs on node where replace brick operation will be running. 
> So if I understand correctly, when I will set :
> 
> # gluster volume set <VOLNAME> cluster.data-self-heal off
> # gluster volume set <VOLNAME> cluster.metadata-self-heal off
> 
> this will tell Gluster clients (libgfapi and FUSE mount) not to read from 
> node “where replace brick operation” is in place but from the remaining two healthy 
> nodes. Is this correct ? Thanks for clarification.
> The reads will be served from one of the good bricks since the file will 
> either be not present on the replaced brick at the time of read or it will be 
> present but marked for heal if it is not already healed. If already healed by 
> SHD, then it could be served from the new brick as well, but there won't be 
> any problem in reading from there in that scenario.
> By setting these two options whenever a read comes from client it will not 
> try to heal the file for data/metadata. Otherwise it would try to heal (if 
> not already healed by SHD) when the read comes on this, hence slowing down 
> the client.
> 
>> Turning off client side healing doesn't compromise data integrity and 
>> consistency. During the read request from client, pending xattr is evaluated 
>> for replica copies and read is only served from correct copy. During writes, 
>> IO will continue on both the replicas, SHD will take care of healing files.
>> After replacing the brick, we strongly recommend you to consider upgrading 
>> your gluster to one of the maintained versions. We have many stability 
>> related fixes there, which can handle some critical issues and corner cases 
>> which you could hit during these kind of scenarios.
> 
> This will be first priority in infrastructure after fixing this cluster back 
> to fully functional replica3. I will upgrade to 3.12.x and then to version 5 
> or 6.
> Sounds good.
> 
> If you are planning to have the same name for the new brick and if you get 
> the error like "Brick may be containing or be contained by an existing brick" 
> even after using the force option, try  using a different name. That should 
> work.
> 
> Regards,
> Karthik 
> 
> BR, 
> Martin
> 
>> Regards,
>> Karthik
>> I had some problems in past when healing was triggered, VM disks became 
>> unresponsive because healing took most of I/O. My volume containing only big 
>> files with VM disks.
>> 
>> Thanks for suggestions.
>> BR, 
>> Martin
>> 
 On 10 Apr 2019, at 12:38, Martin Toth wrote:
>>> 
>>> Thanks, this looks ok to me, I will reset brick because I don't have any 
>>> data anymore on failed node so I can use same path / brick name.
>>> 
>>> Is reseting brick dangerous command? Should I be worried about some 
>>> possible failure that will impact remaining two nodes? I am running really 
>>> old 3.7.6 but stable version.
>>> 
>>> Thanks,
>>> BR!
>>> 
>>> Martin
>>>  
>>> 
 On 10 Apr 2019, at 12:20, Karthik Subrahmanya wrote:
 
 Hi Martin,
 
 After you add the new disks and creating raid array, you can run the 
 following command to replace the old brick with new one:
 
 - If you are going to use a different name to the new brick you can run
 gluster volume replace-brick <VOLNAME> <old-brick> <new-brick> commit force
 
 - If you are planning to use the same name for the new 

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 6:38 PM Martin Toth  wrote:

> Hi Karthik,
>
> On Thu, Apr 11, 2019 at 12:43 PM Martin Toth  wrote:
>
>> Hi Karthik,
>>
>> more over, I would like to ask if there are some recommended
>> settings/parameters for SHD in order to achieve good or fair I/O while
>> volume will be healed when I will replace Brick (this should trigger
>> healing process).
>>
> If I understand your concern correctly, you need to get fair I/O
> performance for clients while healing takes place as part of  the replace
> brick operation. For this you can turn off the "data-self-heal" and
> "metadata-self-heal" options until the heal completes on the new brick.
>
>
> This is exactly what I mean. I am running VM disks on remaining 2 (out of
> 3 - one failed as mentioned) nodes and I need to ensure there will be fair
> I/O performance available on these two nodes while replace brick operation
> will heal volume.
> I will not run any VMs on node where replace brick operation will be
> running. So if I understand correctly, when I will set :
>
> # gluster volume set <VOLNAME> cluster.data-self-heal off
> # gluster volume set <VOLNAME> cluster.metadata-self-heal off
>
> this will tell Gluster clients (libgfapi and FUSE mount) not to read from
> node “where replace brick operation” is in place but from the remaining two
> healthy nodes. Is this correct ? Thanks for clarification.
>
The reads will be served from one of the good bricks, since the file will
either be absent from the replaced brick at the time of the read, or it will
be present but marked for heal if it is not already healed. If it is already
healed by SHD, the read could be served from the new brick as well, and
there won't be any problem in reading from there in that scenario.
By setting these two options, whenever a read comes from a client it will not
try to heal the file for data/metadata. Otherwise it would try to heal (if
not already healed by SHD) when the read lands on it, hence slowing down
the client.
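
To put that advice in one place, a minimal sketch of the option toggling around the
heal; the volume name is a placeholder, and switching the options back on after the
heal completes is an assumption of usual practice rather than something stated above:

# gluster volume set <VOLNAME> cluster.data-self-heal off
# gluster volume set <VOLNAME> cluster.metadata-self-heal off
#   ... run the replace-brick / reset-brick operation here ...
# gluster volume heal <VOLNAME> info   # repeat until every brick reports "Number of entries: 0"
# gluster volume set <VOLNAME> cluster.data-self-heal on
# gluster volume set <VOLNAME> cluster.metadata-self-heal on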

>
> Turning off client side healing doesn't compromise data integrity and
> consistency. During the read request from client, pending xattr is
> evaluated for replica copies and read is only served from correct copy.
> During writes, IO will continue on both the replicas, SHD will take care of
> healing files.
> After replacing the brick, we strongly recommend you to consider upgrading
> your gluster to one of the maintained versions. We have many stability
> related fixes there, which can handle some critical issues and corner cases
> which you could hit during these kind of scenarios.
>
>
> This will be first priority in infrastructure after fixing this cluster
> back to fully functional replica3. I will upgrade to 3.12.x and then to
> version 5 or 6.
>
Sounds good.

If you are planning to use the same name for the new brick and you get an
error like "Brick may be containing or be contained by an existing brick"
even after using the force option, try using a different name. That should
work.
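
As an illustration only, a replace with a different brick name could look like the
sketch below; the brick1_new directory is hypothetical and the volume name is a
placeholder:

# mkdir -p /tank/gluster/gv0imagestore/brick1_new     # new, empty directory on node2's rebuilt raid
# gluster volume replace-brick <VOLNAME> \
    node2.san:/tank/gluster/gv0imagestore/brick1 \
    node2.san:/tank/gluster/gv0imagestore/brick1_new \
    commit force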

Regards,
Karthik

>
> BR,
> Martin
>
> Regards,
> Karthik
>
>> I had some problems in past when healing was triggered, VM disks became
>> unresponsive because healing took most of I/O. My volume containing only
>> big files with VM disks.
>>
>> Thanks for suggestions.
>> BR,
>> Martin
>>
>> On 10 Apr 2019, at 12:38, Martin Toth  wrote:
>>
>> Thanks, this looks ok to me, I will reset brick because I don't have any
>> data anymore on failed node so I can use same path / brick name.
>>
>> Is reseting brick dangerous command? Should I be worried about some
>> possible failure that will impact remaining two nodes? I am running really
>> old 3.7.6 but stable version.
>>
>> Thanks,
>> BR!
>>
>> Martin
>>
>>
>> On 10 Apr 2019, at 12:20, Karthik Subrahmanya 
>> wrote:
>>
>> Hi Martin,
>>
>> After you add the new disks and creating raid array, you can run the
>> following command to replace the old brick with new one:
>>
>> - If you are going to use a different name to the new brick you can run
>> gluster volume replace-brick <VOLNAME> <old-brick> <new-brick> commit force
>>
>> - If you are planning to use the same name for the new brick as well then
>> you can use
>> gluster volume reset-brick <VOLNAME> <old-brick> <new-brick> commit force
>> Here old-brick & new-brick's hostname &  path should be same.
>>
>> After replacing the brick, make sure the brick comes online using volume
>> status.
>> Heal should automatically start, you can check the heal status to see all
>> the files gets replicated to the newly added brick. If it does not start
>> automatically, you can manually start that by running gluster volume heal <VOLNAME>.
>>
>> HTH,
>> Karthik
>>
>> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth  wrote:
>>
>>> Hi all,
>>>
>>> I am running replica 3 gluster with 3 bricks. One of my servers failed -
>>> all disks are showing errors and raid is in fault state.
>>>
>>> Type: Replicate
>>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>>> Status: Started
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Martin Toth
Hi Karthik,

> On Thu, Apr 11, 2019 at 12:43 PM Martin Toth wrote:
> Hi Karthik,
> 
> more over, I would like to ask if there are some recommended 
> settings/parameters for SHD in order to achieve good or fair I/O while volume 
> will be healed when I will replace Brick (this should trigger healing 
> process). 
> If I understand your concern correctly, you need to get fair I/O performance 
> for clients while healing takes place as part of  the replace brick 
> operation. For this you can turn off the "data-self-heal" and 
> "metadata-self-heal" options until the heal completes on the new brick.

This is exactly what I mean. I am running VM disks on the remaining 2 (out of 3 -
one failed as mentioned) nodes and I need to ensure there will be fair I/O
performance available on these two nodes while the replace brick operation heals
the volume.
I will not run any VMs on the node where the replace brick operation will be running.
So if I understand correctly, when I set:

# gluster volume set <VOLNAME> cluster.data-self-heal off
# gluster volume set <VOLNAME> cluster.metadata-self-heal off

this will tell Gluster clients (libgfapi and FUSE mount) not to read from the node
where the replace brick operation is in place but from the remaining two healthy
nodes. Is this correct? Thanks for the clarification.

> Turning off client side healing doesn't compromise data integrity and 
> consistency. During the read request from client, pending xattr is evaluated 
> for replica copies and read is only served from correct copy. During writes, 
> IO will continue on both the replicas, SHD will take care of healing files.
> After replacing the brick, we strongly recommend you to consider upgrading 
> your gluster to one of the maintained versions. We have many stability 
> related fixes there, which can handle some critical issues and corner cases 
> which you could hit during these kind of scenarios.

This will be the first priority for our infrastructure after getting this cluster back to
a fully functional replica 3. I will upgrade to 3.12.x and then to version 5 or 6.

BR, 
Martin

> Regards,
> Karthik
> I had some problems in past when healing was triggered, VM disks became 
> unresponsive because healing took most of I/O. My volume containing only big 
> files with VM disks.
> 
> Thanks for suggestions.
> BR, 
> Martin
> 
>> On 10 Apr 2019, at 12:38, Martin Toth wrote:
>> 
>> Thanks, this looks ok to me, I will reset brick because I don't have any 
>> data anymore on failed node so I can use same path / brick name.
>> 
>> Is reseting brick dangerous command? Should I be worried about some possible 
>> failure that will impact remaining two nodes? I am running really old 3.7.6 
>> but stable version.
>> 
>> Thanks,
>> BR!
>> 
>> Martin
>>  
>> 
>>> On 10 Apr 2019, at 12:20, Karthik Subrahmanya wrote:
>>> 
>>> Hi Martin,
>>> 
>>> After you add the new disks and creating raid array, you can run the 
>>> following command to replace the old brick with new one:
>>> 
>>> - If you are going to use a different name to the new brick you can run
>>> gluster volume replace-brick <VOLNAME> <old-brick> <new-brick> commit force
>>> 
>>> - If you are planning to use the same name for the new brick as well then 
>>> you can use
>>> gluster volume reset-brick <VOLNAME> <old-brick> <new-brick> commit force
>>> Here old-brick & new-brick's hostname &  path should be same.
>>> 
>>> After replacing the brick, make sure the brick comes online using volume 
>>> status.
>>> Heal should automatically start, you can check the heal status to see all 
>>> the files gets replicated to the newly added brick. If it does not start 
>>> automatically, you can manually start that by running gluster volume heal <VOLNAME>.
>>> 
>>> HTH,
>>> Karthik
>>> 
>>> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth wrote:
>>> Hi all,
>>> 
>>> I am running replica 3 gluster with 3 bricks. One of my servers failed - 
>>> all disks are showing errors and raid is in fault state.
>>> 
>>> Type: Replicate
>>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>>> Status: Started
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
>>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>>> 
>>> So one of my bricks is totally failed (node2). It went down and all data 
>>> are lost (failed raid on node2). Now I am running only two bricks on 2 
>>> servers out from 3.
>>> This is really critical problem for us, we can lost all data. I want to add 
>>> new disks to node2, create new raid array on them and try to replace failed 
>>> brick on this node. 
>>> 
>>> What is the procedure of replacing Brick2 on node2, can someone advice? I 
>>> can’t find anything relevant in documentation.
>>> 
>>> Thanks in advance,
>>> Martin
>>> ___
>>> Gluster-users mailing list
>>> 

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 1:40 PM Strahil Nikolov 
wrote:

> Hi Karthik,
>
> - the volume configuration you were using?
> I used oVirt 4.2.6 Gluster Wizard, so I guess - we need to involve the
> oVirt devs here.
> - why you wanted to replace your brick?
> I had deployed the arbiter at another location as I thought I could deploy
> the Thin Arbiter (still waiting for the docs to be updated), but once I
> realized that GlusterD doesn't support Thin Arbiter, I had to build another
> machine for a local arbiter - thus a replacement was needed.
>
We are working on supporting Thin-arbiter with GlusterD. Once done, we will
update on the users list so that you can play with it and let us know your
experience.

> - which brick(s) you tried replacing?
> I was replacing the old arbiter with a new one
> - what problem(s) did you face?
> All oVirt VMs got paused due to I/O errors.
>
There could be many reasons for this. Without knowing the exact state of
the system at that time, I am afraid to make any comment on this.

>
> In the end, I rebuilt the whole setup and I never tried to replace
> the brick this way (I used only reset-brick, which didn't cause any issues).
>
> As I mentioned that was on v3.12, which is not the default for oVirt
> 4.3.x - so my guess is that it is OK now (current is v5.5).
>
I don't remember anyone complaining about this recently. This should work
in the latest releases.

>
> Just sharing my experience.
>
Highly appreciated.

Regards,
Karthik

>
> Best Regards,
> Strahil Nikolov
>
> On Thursday, 11 April 2019 at 0:53:52 GMT-4, Karthik Subrahmanya <
> ksubr...@redhat.com> wrote:
>
>
> Hi Strahil,
>
> Can you give us some more insights on
> - the volume configuration you were using?
> - why you wanted to replace your brick?
> - which brick(s) you tried replacing?
> - what problem(s) did you face?
>
> Regards,
> Karthik
>
> On Thu, Apr 11, 2019 at 10:14 AM Strahil  wrote:
>
> Hi Karthnik,
> I used only once the brick replace function when I wanted to change my
> Arbiter (v3.12.15 in oVirt 4.2.7)  and it was a complete disaster.
> Most probably I should have stopped the source arbiter before doing that,
> but the docs didn't mention it.
>
> Thus I always use reset-brick, as it never let me down.
>
> Best Regards,
> Strahil Nikolov
> On Apr 11, 2019 07:34, Karthik Subrahmanya  wrote:
>
> Hi Strahil,
>
> Thank you for sharing your experience with reset-brick option.
> Since he is using the gluster version 3.7.6, we do not have the
> reset-brick [1] option implemented there. It is introduced in 3.9.0. He has
> to go with replace-brick with the force option if he wants to use the same
> path & name for the new brick.
> Yes, it is recommended to have the new brick to be of the same size as
> that of the other bricks.
>
> [1]
> https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
>
> Regards,
> Karthik
>
> On Wed, Apr 10, 2019 at 10:31 PM Strahil  wrote:
>
> I have used reset-brick - but I have just changed the brick layout.
> You may give it a try, but I guess you need your new brick to have same
> amount of space (or more).
>
> Maybe someone more experienced should share a more sound solution.
>
> Best Regards,
> Strahil Nikolov
>
> On Apr 10, 2019 12:42, Martin Toth wrote:
> >
> > Hi all,
> >
> > I am running replica 3 gluster with 3 bricks. One of my servers failed -
> all disks are showing errors and raid is in fault state.
> >
> > Type: Replicate
> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is
> down
> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> >
> > So one of my bricks is totally failed (node2). It went down and all data
> are lost (failed raid on node2). Now I am running only two bricks on 2
> servers out from 3.
> > This is really critical problem for us, we can lost all data. I want to
> add new disks to node2, create new raid array on them and try to replace
> failed brick on this node.
> >
> > What is the procedure of replacing Brick2 on node2, can someone advice?
> I can’t find anything relevant in documentation.
> >
> > Thanks in advance,
> > Martin
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 12:43 PM Martin Toth  wrote:

> Hi Karthik,
>
> more over, I would like to ask if there are some recommended
> settings/parameters for SHD in order to achieve good or fair I/O while
> volume will be healed when I will replace Brick (this should trigger
> healing process).
>
If I understand your concern correctly, you need to get fair I/O performance
for clients while healing takes place as part of the replace brick
operation. For this you can turn off the "data-self-heal" and
"metadata-self-heal" options until the heal completes on the new brick.
Turning off client-side healing doesn't compromise data integrity and
consistency. During a read request from the client, the pending xattrs are
evaluated for the replica copies and the read is only served from a correct copy.
During writes, I/O will continue on both replicas, and SHD will take care of
healing the files.
After replacing the brick, we strongly recommend that you consider upgrading
your gluster to one of the maintained versions. We have many stability
related fixes there, which can handle some critical issues and corner cases
which you could hit during these kinds of scenarios.

Regards,
Karthik

> I had some problems in past when healing was triggered, VM disks became
> unresponsive because healing took most of I/O. My volume containing only
> big files with VM disks.
>
> Thanks for suggestions.
> BR,
> Martin
>
> On 10 Apr 2019, at 12:38, Martin Toth  wrote:
>
> Thanks, this looks ok to me, I will reset brick because I don't have any
> data anymore on failed node so I can use same path / brick name.
>
> Is reseting brick dangerous command? Should I be worried about some
> possible failure that will impact remaining two nodes? I am running really
> old 3.7.6 but stable version.
>
> Thanks,
> BR!
>
> Martin
>
>
> On 10 Apr 2019, at 12:20, Karthik Subrahmanya  wrote:
>
> Hi Martin,
>
> After you add the new disks and creating raid array, you can run the
> following command to replace the old brick with new one:
>
> - If you are going to use a different name to the new brick you can run
> gluster volume replace-brick <VOLNAME> <old-brick> <new-brick> commit force
>
> - If you are planning to use the same name for the new brick as well then
> you can use
> gluster volume reset-brick <VOLNAME> <old-brick> <new-brick> commit force
> Here old-brick & new-brick's hostname &  path should be same.
>
> After replacing the brick, make sure the brick comes online using volume
> status.
> Heal should automatically start, you can check the heal status to see all
> the files gets replicated to the newly added brick. If it does not start
> automatically, you can manually start that by running gluster volume heal <VOLNAME>.
>
> HTH,
> Karthik
>
> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth  wrote:
>
>> Hi all,
>>
>> I am running replica 3 gluster with 3 bricks. One of my servers failed -
>> all disks are showing errors and raid is in fault state.
>>
>> Type: Replicate
>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>>
>> So one of my bricks is totally failed (node2). It went down and all data
>> are lost (failed raid on node2). Now I am running only two bricks on 2
>> servers out from 3.
>> This is really critical problem for us, we can lost all data. I want to
>> add new disks to node2, create new raid array on them and try to replace
>> failed brick on this node.
>>
>> What is the procedure of replacing Brick2 on node2, can someone advice? I
>> can’t find anything relevant in documentation.
>>
>> Thanks in advance,
>> Martin
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Martin Toth
Hi Karthik,

Moreover, I would like to ask if there are some recommended
settings/parameters for SHD in order to achieve good or fair I/O while the volume
is being healed after I replace the brick (this should trigger the healing process).
I had some problems in the past when healing was triggered: VM disks became
unresponsive because healing took most of the I/O. My volume contains only big
files with VM disks.

Thanks for suggestions.
BR, 
Martin

> On 10 Apr 2019, at 12:38, Martin Toth  wrote:
> 
> Thanks, this looks ok to me, I will reset brick because I don't have any data 
> anymore on failed node so I can use same path / brick name.
> 
> Is reseting brick dangerous command? Should I be worried about some possible 
> failure that will impact remaining two nodes? I am running really old 3.7.6 
> but stable version.
> 
> Thanks,
> BR!
> 
> Martin
>  
> 
>> On 10 Apr 2019, at 12:20, Karthik Subrahmanya wrote:
>> 
>> Hi Martin,
>> 
>> After you add the new disks and creating raid array, you can run the 
>> following command to replace the old brick with new one:
>> 
>> - If you are going to use a different name to the new brick you can run
>> gluster volume replace-brick <VOLNAME> <old-brick> <new-brick> commit force
>> 
>> - If you are planning to use the same name for the new brick as well then 
>> you can use
>> gluster volume reset-brick <VOLNAME> <old-brick> <new-brick> commit force
>> Here old-brick & new-brick's hostname &  path should be same.
>> 
>> After replacing the brick, make sure the brick comes online using volume 
>> status.
>> Heal should automatically start, you can check the heal status to see all 
>> the files gets replicated to the newly added brick. If it does not start 
>> automatically, you can manually start that by running gluster volume heal <VOLNAME>.
>> 
>> HTH,
>> Karthik
>> 
>> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth wrote:
>> Hi all,
>> 
>> I am running replica 3 gluster with 3 bricks. One of my servers failed - all 
>> disks are showing errors and raid is in fault state.
>> 
>> Type: Replicate
>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>> 
>> So one of my bricks is totally failed (node2). It went down and all data are 
>> lost (failed raid on node2). Now I am running only two bricks on 2 servers 
>> out from 3.
>> This is really critical problem for us, we can lost all data. I want to add 
>> new disks to node2, create new raid array on them and try to replace failed 
>> brick on this node. 
>> 
>> What is the procedure of replacing Brick2 on node2, can someone advice? I 
>> can’t find anything relevant in documentation.
>> 
>> Thanks in advance,
>> Martin
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org 
>> https://lists.gluster.org/mailman/listinfo/gluster-users 
>> 

___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 10:23 AM Karthik Subrahmanya 
wrote:

> Hi Strahil,
>
> Can you give us some more insights on
> - the volume configuration you were using?
> - why you wanted to replace your brick?
> - which brick(s) you tried replacing?
>
- if you remember the commands/steps that you followed, please give that as
well.

> - what problem(s) did you face?
>

> Regards,
> Karthik
>
> On Thu, Apr 11, 2019 at 10:14 AM Strahil  wrote:
>
>> Hi Karthnik,
>> I used only once the brick replace function when I wanted to change my
>> Arbiter (v3.12.15 in oVirt 4.2.7)  and it was a complete disaster.
>> Most probably I should have stopped the source arbiter before doing that,
>> but the docs didn't mention it.
>>
>> Thus I always use reset-brick, as it never let me down.
>>
>> Best Regards,
>> Strahil Nikolov
>> On Apr 11, 2019 07:34, Karthik Subrahmanya  wrote:
>>
>> Hi Strahil,
>>
>> Thank you for sharing your experience with reset-brick option.
>> Since he is using the gluster version 3.7.6, we do not have the
>> reset-brick [1] option implemented there. It is introduced in 3.9.0. He has
>> to go with replace-brick with the force option if he wants to use the same
>> path & name for the new brick.
>> Yes, it is recommended to have the new brick to be of the same size as
>> that of the other bricks.
>>
>> [1]
>> https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
>>
>> Regards,
>> Karthik
>>
>> On Wed, Apr 10, 2019 at 10:31 PM Strahil  wrote:
>>
>> I have used reset-brick - but I have just changed the brick layout.
>> You may give it a try, but I guess you need your new brick to have same
>> amount of space (or more).
>>
>> Maybe someone more experienced should share a more sound solution.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Apr 10, 2019 12:42, Martin Toth wrote:
>> >
>> > Hi all,
>> >
>> > I am running replica 3 gluster with 3 bricks. One of my servers failed
>> - all disks are showing errors and raid is in fault state.
>> >
>> > Type: Replicate
>> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>> > Status: Started
>> > Number of Bricks: 1 x 3 = 3
>> > Transport-type: tcp
>> > Bricks:
>> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is
>> down
>> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>> >
>> > So one of my bricks is totally failed (node2). It went down and all
>> data are lost (failed raid on node2). Now I am running only two bricks on 2
>> servers out from 3.
>> > This is really critical problem for us, we can lost all data. I want to
>> add new disks to node2, create new raid array on them and try to replace
>> failed brick on this node.
>> >
>> > What is the procedure of replacing Brick2 on node2, can someone advice?
>> I can’t find anything relevant in documentation.
>> >
>> > Thanks in advance,
>> > Martin
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > https://lists.gluster.org/mailman/listinfo/gluster-users
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Karthik Subrahmanya
Hi Strahil,

Can you give us some more insights on
- the volume configuration you were using?
- why you wanted to replace your brick?
- which brick(s) you tried replacing?
- what problem(s) did you face?

Regards,
Karthik

On Thu, Apr 11, 2019 at 10:14 AM Strahil  wrote:

> Hi Karthnik,
> I used only once the brick replace function when I wanted to change my
> Arbiter (v3.12.15 in oVirt 4.2.7)  and it was a complete disaster.
> Most probably I should have stopped the source arbiter before doing that,
> but the docs didn't mention it.
>
> Thus I always use reset-brick, as it never let me down.
>
> Best Regards,
> Strahil Nikolov
> On Apr 11, 2019 07:34, Karthik Subrahmanya  wrote:
>
> Hi Strahil,
>
> Thank you for sharing your experience with reset-brick option.
> Since he is using the gluster version 3.7.6, we do not have the
> reset-brick [1] option implemented there. It is introduced in 3.9.0. He has
> to go with replace-brick with the force option if he wants to use the same
> path & name for the new brick.
> Yes, it is recommended to have the new brick to be of the same size as
> that of the other bricks.
>
> [1]
> https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
>
> Regards,
> Karthik
>
> On Wed, Apr 10, 2019 at 10:31 PM Strahil  wrote:
>
> I have used reset-brick - but I have just changed the brick layout.
> You may give it a try, but I guess you need your new brick to have same
> amount of space (or more).
>
> Maybe someone more experienced should share a more sound solution.
>
> Best Regards,
> Strahil Nikolov
>
> On Apr 10, 2019 12:42, Martin Toth wrote:
> >
> > Hi all,
> >
> > I am running replica 3 gluster with 3 bricks. One of my servers failed -
> all disks are showing errors and raid is in fault state.
> >
> > Type: Replicate
> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is
> down
> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> >
> > So one of my bricks is totally failed (node2). It went down and all data
> are lost (failed raid on node2). Now I am running only two bricks on 2
> servers out from 3.
> > This is really critical problem for us, we can lost all data. I want to
> add new disks to node2, create new raid array on them and try to replace
> failed brick on this node.
> >
> > What is the procedure of replacing Brick2 on node2, can someone advice?
> I can’t find anything relevant in documentation.
> >
> > Thanks in advance,
> > Martin
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Strahil
Hi Karthik,
I have used the brick replace function only once, when I wanted to change my Arbiter
(v3.12.15 in oVirt 4.2.7), and it was a complete disaster.
Most probably I should have stopped the source arbiter before doing that, but
the docs didn't mention it.

Thus I always use reset-brick, as it has never let me down.

Best Regards,
Strahil Nikolov

On Apr 11, 2019 07:34, Karthik Subrahmanya wrote:
>
> Hi Strahil,
>
> Thank you for sharing your experience with reset-brick option.
> Since he is using the gluster version 3.7.6, we do not have the reset-brick 
> [1] option implemented there. It is introduced in 3.9.0. He has to go with 
> replace-brick with the force option if he wants to use the same path & name 
> for the new brick. 
> Yes, it is recommended to have the new brick to be of the same size as that 
> of the other bricks.
>
> [1] 
> https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
>
> Regards,
> Karthik
>
> On Wed, Apr 10, 2019 at 10:31 PM Strahil  wrote:
>>
>> I have used reset-brick - but I have just changed the brick layout.
>> You may give it a try, but I guess you need your new brick to have same 
>> amount of space (or more).
>>
>> Maybe someone more experienced should share a more sound solution.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Apr 10, 2019 12:42, Martin Toth wrote:
>> >
>> > Hi all,
>> >
>> > I am running replica 3 gluster with 3 bricks. One of my servers failed - 
>> > all disks are showing errors and raid is in fault state.
>> >
>> > Type: Replicate
>> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>> > Status: Started
>> > Number of Bricks: 1 x 3 = 3
>> > Transport-type: tcp
>> > Bricks:
>> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
>> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>> >
>> > So one of my bricks is totally failed (node2). It went down and all data 
>> > are lost (failed raid on node2). Now I am running only two bricks on 2 
>> > servers out from 3.
>> > This is really critical problem for us, we can lost all data. I want to 
>> > add new disks to node2, create new raid array on them and try to replace 
>> > failed brick on this node.
>> >
>> > What is the procedure of replacing Brick2 on node2, can someone advice? I 
>> > can’t find anything relevant in documentation.
>> >
>> > Thanks in advance,
>> > Martin
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > https://lists.gluster.org/mailman/listinfo/gluster-users
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Karthik Subrahmanya
Hi Strahil,

Thank you for sharing your experience with the reset-brick option.
Since he is using gluster version 3.7.6, the reset-brick [1] option is not
implemented there; it was introduced in 3.9.0. He has to go with
replace-brick with the force option if he wants to use the same path & name
for the new brick.
Yes, it is recommended that the new brick be of the same size as that
of the other bricks.

[1]
https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
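
As an illustration of that fallback on 3.7.x, a hedged sketch reusing the brick path
from Martin's volume info; the volume name is a placeholder, preparing a fresh, empty
brick directory first is my assumption, and note the caveat elsewhere in this thread
that a different brick name may be needed if the "contained by an existing brick"
error appears even with force:

# mkdir -p /tank/gluster/gv0imagestore/brick1      # empty directory on node2's rebuilt raid
# gluster volume replace-brick <VOLNAME> \
    node2.san:/tank/gluster/gv0imagestore/brick1 \
    node2.san:/tank/gluster/gv0imagestore/brick1 \
    commit force                                   # same path & name, so force is required
# gluster volume status <VOLNAME>                  # the brick on node2 should come back online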

Regards,
Karthik

On Wed, Apr 10, 2019 at 10:31 PM Strahil  wrote:

> I have used reset-brick - but I have just changed the brick layout.
> You may give it a try, but I guess you need your new brick to have same
> amount of space (or more).
>
> Maybe someone more experienced should share a more sound solution.
>
> Best Regards,
> Strahil Nikolov
>
> On Apr 10, 2019 12:42, Martin Toth wrote:
> >
> > Hi all,
> >
> > I am running replica 3 gluster with 3 bricks. One of my servers failed -
> all disks are showing errors and raid is in fault state.
> >
> > Type: Replicate
> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is
> down
> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> >
> > So one of my bricks is totally failed (node2). It went down and all data
> are lost (failed raid on node2). Now I am running only two bricks on 2
> servers out from 3.
> > This is really critical problem for us, we can lost all data. I want to
> add new disks to node2, create new raid array on them and try to replace
> failed brick on this node.
> >
> > What is the procedure of replacing Brick2 on node2, can someone advice?
> I can’t find anything relevant in documentation.
> >
> > Thanks in advance,
> > Martin
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Strahil
I have used reset-brick, but only to change the brick layout.
You may give it a try, but I guess you need your new brick to have the same amount
of space (or more).

Maybe someone more experienced should share a more sound solution.

Best Regards,
Strahil Nikolov

On Apr 10, 2019 12:42, Martin Toth wrote:
>
> Hi all,
>
> I am running replica 3 gluster with 3 bricks. One of my servers failed - all 
> disks are showing errors and raid is in fault state.
>
> Type: Replicate
> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>
> So one of my bricks is totally failed (node2). It went down and all data are 
> lost (failed raid on node2). Now I am running only two bricks on 2 servers 
> out from 3.
> This is really critical problem for us, we can lost all data. I want to add 
> new disks to node2, create new raid array on them and try to replace failed 
> brick on this node.
>
> What is the procedure of replacing Brick2 on node2, can someone advice? I 
> can’t find anything relevant in documentation.
>
> Thanks in advance,
> Martin
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Martin Toth
Thanks, this looks OK to me. I will reset the brick because I don't have any data
on the failed node anymore, so I can use the same path / brick name.

Is resetting a brick a dangerous command? Should I be worried about some possible
failure that will impact the remaining two nodes? I am running a really old, but
stable, version: 3.7.6.

Thanks,
BR!

Martin
 

> On 10 Apr 2019, at 12:20, Karthik Subrahmanya  wrote:
> 
> Hi Martin,
> 
> After you add the new disks and creating raid array, you can run the 
> following command to replace the old brick with new one:
> 
> - If you are going to use a different name to the new brick you can run
> gluster volume replace-brick <VOLNAME> <old-brick> <new-brick> commit force
> 
> - If you are planning to use the same name for the new brick as well then you 
> can use
> gluster volume reset-brick <VOLNAME> <old-brick> <new-brick> commit force
> Here old-brick & new-brick's hostname &  path should be same.
> 
> After replacing the brick, make sure the brick comes online using volume 
> status.
> Heal should automatically start, you can check the heal status to see all the 
> files gets replicated to the newly added brick. If it does not start 
> automatically, you can manually start that by running gluster volume heal <VOLNAME>.
> 
> HTH,
> Karthik
> 
> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth wrote:
> Hi all,
> 
> I am running replica 3 gluster with 3 bricks. One of my servers failed - all 
> disks are showing errors and raid is in fault state.
> 
> Type: Replicate
> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> 
> So one of my bricks is totally failed (node2). It went down and all data are 
> lost (failed raid on node2). Now I am running only two bricks on 2 servers 
> out from 3.
> This is really critical problem for us, we can lost all data. I want to add 
> new disks to node2, create new raid array on them and try to replace failed 
> brick on this node. 
> 
> What is the procedure of replacing Brick2 on node2, can someone advice? I 
> can’t find anything relevant in documentation.
> 
> Thanks in advance,
> Martin
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org 
> https://lists.gluster.org/mailman/listinfo/gluster-users 
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Karthik Subrahmanya
Hi Martin,

After you add the new disks and create the raid array, you can run the
following command to replace the old brick with the new one:

- If you are going to use a different name for the new brick you can run
gluster volume replace-brick <VOLNAME> <old-brick> <new-brick> commit force

- If you are planning to use the same name for the new brick as well then
you can use
gluster volume reset-brick <VOLNAME> <old-brick> <new-brick> commit force
Here the old-brick & new-brick's hostname & path should be the same.

After replacing the brick, make sure the brick comes online using volume
status.
Heal should start automatically; you can check the heal status to see that all
the files get replicated to the newly added brick. If it does not start
automatically, you can manually start it by running gluster volume heal
<VOLNAME>.
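
A short sketch of those checks (the volume name is a placeholder):

# gluster volume status <VOLNAME>      # the new brick should show Online: Y
# gluster volume heal <VOLNAME> info   # lists the files still pending heal on each brick
# gluster volume heal <VOLNAME>        # triggers the heal manually if it did not start on its own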

HTH,
Karthik

On Wed, Apr 10, 2019 at 3:13 PM Martin Toth  wrote:

> Hi all,
>
> I am running replica 3 gluster with 3 bricks. One of my servers failed -
> all disks are showing errors and raid is in fault state.
>
> Type: Replicate
> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>
> So one of my bricks is totally failed (node2). It went down and all data
> are lost (failed raid on node2). Now I am running only two bricks on 2
> servers out from 3.
> This is really critical problem for us, we can lost all data. I want to
> add new disks to node2, create new raid array on them and try to replace
> failed brick on this node.
>
> What is the procedure of replacing Brick2 on node2, can someone advice? I
> can’t find anything relevant in documentation.
>
> Thanks in advance,
> Martin
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread David Spisla
Hello Martin,

look here:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/pdf/administration_guide/Red_Hat_Gluster_Storage-3.4-Administration_Guide-en-US.pdf
on page 324. There is a manual on how to replace a brick in case of a hardware
failure.

Regards
David Spisla

Am Mi., 10. Apr. 2019 um 11:42 Uhr schrieb Martin Toth :

> Hi all,
>
> I am running replica 3 gluster with 3 bricks. One of my servers failed -
> all disks are showing errors and raid is in fault state.
>
> Type: Replicate
> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>
> So one of my bricks is totally failed (node2). It went down and all data
> are lost (failed raid on node2). Now I am running only two bricks on 2
> servers out from 3.
> This is really critical problem for us, we can lost all data. I want to
> add new disks to node2, create new raid array on them and try to replace
> failed brick on this node.
>
> What is the procedure of replacing Brick2 on node2, can someone advice? I
> can’t find anything relevant in documentation.
>
> Thanks in advance,
> Martin
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Martin Toth
Hi all,

I am running replica 3 gluster with 3 bricks. One of my servers failed - all 
disks are showing errors and raid is in fault state.

Type: Replicate
Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
Brick3: node3.san:/tank/gluster/gv0imagestore/brick1

So one of my bricks has totally failed (node2). It went down and all data is
lost (failed raid on node2). Now I am running only two bricks on 2 of the 3
servers.
This is a really critical problem for us; we could lose all data. I want to add new
disks to node2, create a new raid array on them and try to replace the failed brick
on this node.

What is the procedure for replacing Brick2 on node2, can someone advise? I can’t
find anything relevant in the documentation.

Thanks in advance,
Martin
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users