Hey Tamal,
Sorry for the delay. See my comments inline.
On Tue, Feb 28, 2017 at 10:29 AM, Tamal Saha wrote:
> Hi,
> I am running a GlusterFS cluster in Kubernetes. This has a single 1x2
> volume. I am dealing with a split-brain situation. During debugging I
> noticed that,
Hi,
Comments inline.
On Tue, Apr 18, 2017 at 1:11 AM, Mahdi Adnan
wrote:
> Hi,
>
>
> We have a replica 2 volume and we have an issue with setting a proper quorum.
>
> The volumes used as datastore for vmware/ovirt, the current settings for
> the quorum are:
>
>
>
Hi,
If I have not misunderstood, you are saying that WORM is not allowing you to
create hard links for the files.
I am answering based on that assumption.
If the volume-level or file-level WORM feature is enabled and the file is
in the WORM/WORM-Retained state,
then those files should be immutable, and hence hard links cannot be created
for them.
On Mon, Jul 10, 2017 at 5:55 AM, 최두일 wrote:
> A read-only file system does not produce a hard link in GlusterFS
> WORM mode. Is it impossible?
>
It is not possible to create a hard link in WORM mode.
If a file is in WORM mode then even through its hard links you cannot modify it.
Hey Niklaus,
Sorry for the delay. The *reset-brick* should do the trick for you.
You can have a look at [1] for more details.
[1] https://gluster.readthedocs.io/en/latest/release-notes/3.9.0/
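As a sketch, the reset-brick sequence looks like the following. The volume and brick names are placeholders, and the commands are only echoed here rather than executed, so substitute your own values before running anything:

```shell
# Placeholder names for illustration only; substitute your own.
VOLNAME=myvol
BRICK=server1:/bricks/brick1

# Step 1: take the brick offline (reset-brick exists in gluster >= 3.9.0):
START_CMD="gluster volume reset-brick $VOLNAME $BRICK start"
# Step 2: after re-creating/re-formatting the backend, bring the same
# brick back and let it resync from the healthy replica:
COMMIT_CMD="gluster volume reset-brick $VOLNAME $BRICK $BRICK commit force"

echo "$START_CMD"
echo "$COMMIT_CMD"
```

Note that the commit form repeats the brick path because reset-brick takes a source and a destination brick, which are the same when the path is reused.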
HTH,
Karthik
On Thu, Jun 1, 2017 at 12:28 PM, Niklaus Hofer <
niklaus.ho...@stepping-stone.ch> wrote:
Hi Ludwig,
There is no way to resolve gfid split-brains with type mismatch. You have
to do it manually by following the steps in [1].
In case of a type mismatch it is recommended to resolve it manually. But for
a gfid-only mismatch, in 3.11 we have a way to
resolve it by using the split-brain resolution CLI.
Hi Matt,
The files might be in split brain. Could you please send the outputs of
these?
gluster volume info
gluster volume heal <volname> info
And also the getfattr output of the files which are in the heal info output
from all the bricks of that replica pair.
getfattr -d -e hex -m .
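The collection steps above can be sketched as follows; the volume name, brick path, and file name are hypothetical placeholders, and the commands are echoed instead of run:

```shell
VOLNAME=gv0
BRICK_ROOT=/exp/b1/gv0   # hypothetical brick path; use your own
FILE=dir/afile           # a file from heal info, relative to the brick root

INFO_CMD="gluster volume info $VOLNAME"
HEAL_CMD="gluster volume heal $VOLNAME info"
# Run on every brick of the replica pair, for each file in heal info:
XATTR_CMD="getfattr -d -e hex -m . $BRICK_ROOT/$FILE"

printf '%s\n' "$INFO_CMD" "$HEAL_CMD" "$XATTR_CMD"
```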
Thanks & Regards
> Brick tpc-arbiter1-100617:/exp/b3/gv0
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
Hey Ludwig,
Yes this configuration is fine. You can add them and do the rebalance after
that.
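The add-brick plus rebalance sequence mentioned above can be sketched like this, with placeholder volume and brick names and the commands only echoed:

```shell
# Placeholder names; replace with your own volume and bricks.
VOLNAME=myvol
NEW_BRICKS="server3:/bricks/b1 server4:/bricks/b1"

ADD_CMD="gluster volume add-brick $VOLNAME $NEW_BRICKS"
REBAL_START="gluster volume rebalance $VOLNAME start"
REBAL_STATUS="gluster volume rebalance $VOLNAME status"

printf '%s\n' "$ADD_CMD" "$REBAL_START" "$REBAL_STATUS"
```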
FYI: Replica 2 volumes are prone to split-brain. Replica 3 or arbiter will
greatly reduce the possibility of ending up in split-brain.
If possible consider using one of those configurations. For more
>
Ok.
>
>
> Best regards,
> *Cynthia **(周琳)*
>
> MBB SM HETRAN SW3 MATRIX
>
> Storage
> Mobile: +86 (0)18657188311
> *From:* Karthik Subrahmanya [mailto:ksubr...@redhat.com]
> *Sent:* Thursday, September 28, 2017 2:0
Hi,
To resolve the gfid split-brain you can follow the steps at [1].
Since we don't have the pending markers set on the files, it is not showing
in the heal info.
To debug this issue, need some more data from you. Could you provide these
things?
1. volume info
2. mount log
3. brick logs
4. shd log
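Gathering the items above can be sketched as follows; the volume name is a placeholder and the log paths are the usual defaults, which may differ on your installation:

```shell
VOLNAME=myvol
INFO_CMD="gluster volume info $VOLNAME"
# Default log locations on most installations (verify on your system):
SHD_LOG=/var/log/glusterfs/glustershd.log
BRICK_LOG_DIR=/var/log/glusterfs/bricks
# Mount (client) logs are named after the mount point,
# e.g. /var/log/glusterfs/mnt-myvol.log

printf '%s\n' "$INFO_CMD" "$SHD_LOG" "$BRICK_LOG_DIR"
```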
The newer releases have many new features, bug fixes & performance
improvements. If you can try to reproduce the issue on one of those, it
would be very helpful.
Regards,
Karthik
Hi,
There is no way to isolate the healing peer. Healing happens from the good
brick to the bad brick.
I guess your replica bricks are on different peers. If you try to isolate
the healing peer, it will stop the healing process itself.
What is the error you are getting while writing? It would
617:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b4/gv0
> Status: Connected
> Number of entries in split-b
filename or /filename and if so relative to
> where? /, , ?
>
> On Mon, 2017-10-23 at 18:54 +, Matt Waymack wrote:
>
> In my case I was able to delete the hard links in the .glusterfs folders
> of the bricks and it seems to have done the trick, thanks!
>
Hi,
Can you provide the
- volume info
- shd log
- mount log
of the volumes which are showing pending entries, to debug the issue.
Thanks & Regards,
Karthik
On Wed, Dec 20, 2017 at 3:11 AM, Matt Waymack wrote:
> Mine also has a list of files that seemingly never heal. They
Hey Richard,
Could you share the following information please?
1. gluster volume info
2. getfattr output of that file from all the bricks
getfattr -d -e hex -m .
3. glustershd & glfsheal logs
Regards,
Karthik
On Thu, Oct 26, 2017 at 10:21 AM, Amar Tumballi wrote:
>
me-client-4=0x00010001
> trusted.bit-rot.version=0x020059df11cd000548ec
> trusted.gfid=0xea8ecfd195fd4e48b994fd0a2da226f9
> trusted.glusterfs.quota.48e9eea6-cda6-4e53-bb4a-72059debf4c2.contri.1=
> 0x9a01
> trusted.pgfid.48e9eea6-cda6-4e5
> I've only tested with one GFID but the file it referenced _IS_ on the down
> machine even though it has no GFID in the .glusterfs structure.
>
> On Tue, 2017-10-24 at 12:35 +0530, Karthik Subrahmanya wrote:
>
> Hi Jim,
>
> Can you check whether the same hardlinks are present
Hey,
Can you give us the volume info output for this volume?
Why are you not able to get the xattrs from the arbiter brick? You can get
them the same way as on the data bricks.
The changelog xattrs are named trusted.afr.virt_images-client-{1,2,3} in
the getxattr outputs you have provided.
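The hex value of a trusted.afr changelog xattr packs three big-endian 32-bit pending counters (data, metadata, entry). A small bash sketch of decoding one, using a made-up example value rather than real getfattr output:

```shell
# Hypothetical changelog value; real ones come from getfattr output.
VAL=0x000000020000000100000000
HEX=${VAL#0x}
# The 12 bytes encode three big-endian 32-bit pending counters:
DATA=$((16#${HEX:0:8}))      # data operations pending
META=$((16#${HEX:8:8}))      # metadata operations pending
ENTRY=$((16#${HEX:16:8}))    # entry (create/delete) operations pending
echo "data=$DATA metadata=$META entry=$ENTRY"
```

A non-zero counter means this brick is blaming the brick the xattr is named after; when bricks blame each other for the same file, that is the split-brain condition.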
Did you do a
Hi,
With replica 2 volumes one can easily end up in split-brains if there are
frequent disconnects and high IOs going on.
If you use replica 3 or arbiter volumes, it will guard you by using the
quorum mechanism giving you both consistency and availability.
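Converting an existing replica 2 volume to arbiter can be sketched with the command below; the volume and brick names are placeholders and the command is only echoed here:

```shell
# Placeholder names; the arbiter brick can be smaller than data bricks.
VOLNAME=myvol
ARBITER_BRICK=arbiter-node:/bricks/arb1

CMD="gluster volume add-brick $VOLNAME replica 3 arbiter 1 $ARBITER_BRICK"
echo "$CMD"
```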
But in replica 2 volumes, quorum does
Hi,
Which version of gluster are you using?
You can find which file that is using the following command
find -samefile //
Please provide the getfattr output of the file which is in split-brain.
The steps to recover from split-brain can be found here,
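The find -samefile approach above can be sketched like this; the brick path and gfid are hypothetical, and the command is echoed rather than run. The .glusterfs directory stores the gfid hard link under the first two and next two characters of the gfid string:

```shell
# Hypothetical brick path and gfid for illustration.
BRICK=/exp/b1/gv0
GFID=ea8ecfd1-95fd-4e48-b994-fd0a2da226f9
# .glusterfs stores the gfid hard link under <first-2>/<next-2>/<gfid>:
GFID_PATH="$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"
CMD="find $BRICK -samefile $GFID_PATH"
echo "$CMD"
```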
: 0
>
> Brick tpc-cent-glus2-081017:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
> Status:
Hi Henrik,
Thanks for providing the required outputs. See my replies inline.
On Thu, Dec 21, 2017 at 10:42 PM, Henrik Juul Pedersen <h...@liab.dk> wrote:
> Hi Karthik and Ben,
>
> I'll try and reply to you inline.
>
> On 21 December 2017 at 07:18, Karthik Subrahmanya
Hi,
I am not aware of any command which clears the historic heal statistics.
You can use the command "gluster volume start <volname> force", which will
restart the SHD and clear the statistics.
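A sketch of the restart and a follow-up statistics check, with a placeholder volume name and the commands only echoed:

```shell
VOLNAME=myvol
# Restarting the volume with force restarts the SHD as well:
RESTART_CMD="gluster volume start $VOLNAME force"
# View the (now reset) heal statistics afterwards:
STATS_CMD="gluster volume heal $VOLNAME statistics"

printf '%s\n' "$RESTART_CMD" "$STATS_CMD"
```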
Regards,
Karthik
On Mon, Jan 8, 2018 at 3:23 AM, Gino Lisignoli wrote:
> Is there any
Hi,
I am wondering why the other brick is not showing any entry in split brain
in the heal info split-brain output.
Can you give the output of stat & getfattr -d -m . -e hex
from both the bricks.
Regards,
Karthik
On Mon, Feb 5, 2018 at 5:03 PM, Alex K wrote:
> After
start.
Now I am setting up the engine from scratch...
In case I see this kind of split brain again I will get back before I start
deleting :)
Sure. Thanks for the update.
Regards,
Karthik
Alex
On Mon, Feb 5, 2018 at 2:34 PM, Karthik Subrahmanya <ksubr...@redhat.com>
wrote:
> H
f a file and a directory which
belongs to the second replica sub volume from all the 3 bricks
Brick4: gv2:/data/glusterfs
Brick5: gv3:/data/glusterfs
Brick6: gv1:/data/gv23-arbiter (arbiter)
to see the direction of pending markers being set.
Regards,
Karthik
Hey,
Did the heal complete, and do you still have some entries pending heal?
If yes, can you provide the following information to debug the issue?
1. Which version of gluster you are running
2. gluster volume heal <volname> info summary or gluster volume heal
<volname> info
3. getfattr -d -e hex -m . output of
February 9, 2018 2:01 PM, "Karthik Subrahmanya" <ksubr...@redhat.com> wrote:
On Fri, Feb 9, 2018 at 3:23 PM, Seva Gluschenko <g...@webkontrol.ru> wrote:
Hi Karthik,
Thank you for your reply. The heal is
On Fri, Feb 9, 2018 at 11:46 AM, Karthik Subrahmanya <ksubr...@redhat.com>
wrote:
> Hey,
>
> Did the heal completed and you still have some entries pending heal?
> If yes then can you provide the following informations to debug the issue.
> 1. Which version of gluster you ar
Hi,
From the information you provided, I am guessing that you have a replica 3
volume configured.
In that case you can run "gluster volume heal <volname>", which should do
the trick for you.
Regards,
Karthik
On Thu, Feb 8, 2018 at 6:16 AM, Frizz wrote:
> I have a setup with
need to be able to guarantee data
> integrity and availability to the users.
> 7) Is glusterfs "production ready"? Because I find it hard to monitor
> and thus trust in these setups. Also performance with small / many
> files seems horrible at best - but that's for another discus
On Mon, Feb 26, 2018 at 6:14 PM, Dave Sherohman <d...@sherohman.org> wrote:
> On Mon, Feb 26, 2018 at 05:45:27PM +0530, Karthik Subrahmanya wrote:
> > > "In a replica 2 volume... If we set the client-quorum option to
> > > auto, then the first brick
Hi,
Yes, you can do that.
- Make sure "gluster volume heal <volname> info" shows zero entries.
- Remove the arbiter brick using the command "gluster volume remove-brick
<volname> replica 2 <arbiter-brick> force"
- Add a new brick of the same size as the other 2 data bricks using the
command "gluster volume add-brick <volname> replica 3 <new-brick>"
- heal info should
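The steps above can be sketched as the following command sequence; all names are placeholders and the commands are echoed, not executed:

```shell
# Placeholder names; substitute your own volume and bricks.
VOLNAME=myvol
ARBITER_BRICK=node3:/bricks/arbiter
NEW_BRICK=node3:/bricks/data

CHECK_CMD="gluster volume heal $VOLNAME info"
REMOVE_CMD="gluster volume remove-brick $VOLNAME replica 2 $ARBITER_BRICK force"
ADD_CMD="gluster volume add-brick $VOLNAME replica 3 $NEW_BRICK"

printf '%s\n' "$CHECK_CMD" "$REMOVE_CMD" "$ADD_CMD"
```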
Hi,
From the logs you have pasted it looks like those files are in GFID
split-brain.
They should have the GFIDs assigned on both the data bricks but they will
be different.
Can you please paste the getfattr output of those files and their parent
from all the bricks again?
Which version of gluster are you running?
Hey,
From the getfattr output you have provided, the directory is clearly not in
split brain.
If all the bricks are being blamed by the others, then it is called split-brain.
In your case only client-13, that is Brick-14 in the volume info output, had
a pending entry heal on the directory.
That is the
Hi Anatoliy,
The heal command is basically used to heal any mismatching contents between
replica copies of the files.
For the command "gluster volume heal <volname>" to succeed, you should have
the self-heal-daemon running,
which is true only if your volume is of type replicate/disperse.
In your case you
On Wed, Mar 14, 2018 at 4:33 AM, Laura Bailey <lbai...@redhat.com> wrote:
> Can we add a smarter error message for this situation by checking volume
> type first?
Yes we can. I will do that.
Thanks,
Karthik
>
> Cheers,
> Laura B
>
>
> On Wednesday, March 14, 20
be reduced to, for
> example, 5%? Could you point to any best practice document(s)?
>
Yes, you can decrease it to any value. There won't be any side effects.
Regards,
Karthik
>
> Regards,
>
> Anatoliy
> On 2018-03-13 16:46, Karthik Subrahmanya wrote:
On Wed, Mar 14, 2018 at 5:42 PM, Karthik Subrahmanya <ksubr...@redhat.com>
wrote:
>
>
> On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <to...@tolid.eu.org>
> wrote:
>
>> Hi Karthik,
>>
>>
>> Thanks a lot for the explanation.
>>
Hi Jose,
By switching to a pure distribute volume you will lose availability if
something goes bad.
I am guessing you have a nX2 volume.
If you want to preserve one copy of the data in all the distributes, you
can do that by decreasing the replica count in the remove-brick operation.
If you have
>
> --
> There are no active volume tasks
>
> -
> Jose Sanchez
> Systems/Network Analyst 1
> Center of Advanced Research Computing
> 1601 Central A
>
> Thanks
>
> Jose
On Tue, Feb 27, 2018 at 1:40 PM, Dave Sherohman <d...@sherohman.org> wrote:
> On Tue, Feb 27, 2018 at 12:00:29PM +0530, Karthik Subrahmanya wrote:
> > I will try to explain how you can end up in split-brain even with cluster
> > wide quorum:
>
> Yep, the explan
Hi David,
Yes, it is a good-to-have feature, but AFAIK it is currently not in the
priority/focus list.
Anyone from the community who is interested in implementing it is most
welcome.
Otherwise you need to wait for some more time until it comes into focus.
Thanks & Regards,
Karthik
On Tue, Feb 27,
On Tue, Feb 27, 2018 at 5:35 PM, Dave Sherohman <d...@sherohman.org> wrote:
> On Tue, Feb 27, 2018 at 04:59:36PM +0530, Karthik Subrahmanya wrote:
> > > > Since arbiter bricks need not be of same size as the data bricks, if
> you
> > > > can configure thr
On Tue, Feb 27, 2018 at 4:18 PM, Dave Sherohman <d...@sherohman.org> wrote:
> On Tue, Feb 27, 2018 at 03:20:25PM +0530, Karthik Subrahmanya wrote:
> > If you want to use the first two bricks as arbiter, then you need to be
> > aware of the following things:
> >
Hi Dave,
On Mon, Feb 26, 2018 at 4:45 PM, Dave Sherohman wrote:
> I've configured 6 bricks as distributed-replicated with replica 2,
> expecting that all active bricks would be usable so long as a quorum of
> at least 4 live bricks is maintained.
>
The client quorum is
to at this time.
>
> It does change which host it does this on.
>
>
>
> Thanks.
>
>
>
> *From: *Atin Mukherjee
> *Date: *Friday, August 31, 2018 at 1:03 PM
> *To: *"Johnson, Tim"
> *Cc: *Karthik Subrahmanya , Ravishankar N <
> ravishan...@redh
On Mon, Sep 3, 2018 at 11:17 AM Karthik Subrahmanya
wrote:
> Hey,
>
> We need some more information to debug this.
> I think you missed to send the output of 'gluster volume info '.
> Can you also provide the bricks, shd and glfsheal logs as well?
> In the setup how many peer
Hey,
Please provide the glustershd log from all the nodes and client logs on the
node from where you did the lookup on the file to resolve this issue.
Regards,
Karthik
On Fri, Sep 28, 2018 at 5:27 PM Ravishankar N
wrote:
> + gluster-users.
>
> Adding Karthik to see if he has some cycles to
+Sanju Rakonde & +Atin Mukherjee
adding glusterd folks who can help here.
On Wed, Mar 27, 2019 at 3:24 PM Riccardo Murri
wrote:
> I managed to put the reinstalled server back into connected state with
> this procedure:
>
> 1. Run `for other_server in ...; do gluster peer probe $other_server;
>
> Disclaimer: The information and files contained in this message
> are confidential and intended solely for the use of the individual or
> entity to whom they are addressed. If you have received this message in
> error, please notify me and
>
> On 21 Mar 2019, at 12:36, Karthik Subrahmanya wrote:
>
> Hi Milos,
>
> Thanks
Hi,
Note: I guess the volume you are talking about is of type replica-2 (1x2).
Usually replica 2 volumes are prone to split-brain. If you can consider
converting them to arbiter or replica-3, they will handle most of the cases
which can lead to split-brains. For more information see [1].
0100
> Modify: 2019-03-20 11:28:10.834584374 +0100
> Change: 2019-03-20 14:06:07.940849268 +0100
> Birth: -
> ————————
>
>
> The file is from brick 2 that I upgraded and st
>
> On 21 Mar 2019, at 10:27, Karthik Subrahmanya wrote:
>
> Hi,
>
> Note: I guess the volume you
Hi Strahil,
Thank you for sharing your experience with reset-brick option.
Since he is using gluster version 3.7.6, we do not have the reset-brick
[1] option there; it was introduced in 3.9.0. He has to go with
replace-brick with the force option if he wants to use the same path &
Hi,
I guess you missed Ravishankar's reply [1] for this query, on your previous
thread.
[1] https://lists.gluster.org/pipermail/gluster-users/2019-April/036247.html
Regards,
Karthik
On Wed, Apr 10, 2019 at 8:59 PM Ingo Fischer wrote:
> Hi All,
>
> I had a replica 2 cluster to host my VM
On Thu, Apr 11, 2019 at 10:23 AM Karthik Subrahmanya
wrote:
> Hi Strahil,
>
> Can you give us some more insights on
> - the volume configuration you were using?
> - why you wanted to replace your brick?
> - which brick(s) you tried replacing?
>
- if you remember the co
it never let me down.
>
> Best Regards,
> Strahil Nikolov
> On Apr 11, 2019 07:34, Karthik Subrahmanya wrote:
>
> Hi Strahil,
>
> Thank you for sharing your experience with reset-brick option.
> Since he is using the gluster version 3.7.6, we do not have the
> reset-b
ng my experience.
>
Highly appreciated.
Regards,
Karthik
>
> Best Regards,
> Strahil Nikolov
>
> On Thursday, April 11, 2019, 0:53:52 GMT-4, Karthik Subrahmanya <
> ksubr...@redhat.com> wrote:
>
>
> Hi Strahil,
>
> Can you give us some more insights o
odes? I am running really
> old 3.7.6 but stable version.
>
> Thanks,
> BR!
>
> Martin
>
>
> On 10 Apr 2019, at 12:20, Karthik Subrahmanya wrote:
>
> Hi Martin,
>
> After you add the new disks and creating raid array, you can run the
> following command to
Hi Martin,
After you add the new disks and create the raid array, you can run the
following command to replace the old brick with the new one:
- If you are going to use a different name for the new brick, you can run
gluster volume replace-brick <volname> <old-brick> <new-brick> commit force
- If you are planning to use the same
> Thanks, this looks ok to me, I will reset brick because I don't have any
>> data anymore on failed node so I can use same path / brick name.
>>
>> Is reseting brick dangerous command? Should I be worried about some
>> possible failure that will impact remaining two nod
brick
- After it succeeds set back the auth.allow option to the previous value.
Regards,
Karthik
On Tue, Apr 16, 2019 at 5:20 PM Boris Goldowsky wrote:
> OK, log files attached.
>
>
>
> Boris
>
> *From: *Karthik Subrahmanya
> *Date: *Tuesday, April
You're welcome!
On Tue 16 Apr, 2019, 7:12 PM Boris Goldowsky, wrote:
> That worked! Thank you SO much!
>
>
>
> Boris
>
> *From: *Karthik Subrahmanya
> *Date: *Tuesday, April 16, 2019 at 8:20 AM
> *To: *Boris Goldowsky
> *Cc: *Atin M
Hi,
Currently we do not have support for converting an existing volume to a
thin-arbiter volume. It is also not supported to replace the thin-arbiter
brick with a new one.
You can create a fresh thin arbiter volume using the GD2 framework and play
around with it. Feel free to share your experience with
Hi Martin,
The reset-brick command was introduced in 3.9.0 and is not present in 3.7.6.
You can try using the same replace-brick command with the force option even
if you want to use the same name for the brick being replaced.
3.7.6 was EOLed long back, and glusterfs-6 is the latest version with lots
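Using replace-brick with the same source and destination path can be sketched as follows; the names are placeholders and the command is echoed, not executed:

```shell
# Placeholder names for illustration.
VOLNAME=myvol
BRICK=server1:/bricks/brick1

# Same source and destination path replaces the brick in place:
CMD="gluster volume replace-brick $VOLNAME $BRICK $BRICK commit force"
echo "$CMD"
```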
On Mon, Apr 15, 2019 at 9:43 PM Atin Mukherjee
wrote:
> +Karthik Subrahmanya
>
> Didn't we we fix this problem recently? Failed to set extended attribute
> indicates that temp mount is failing and we don't have quorum number of
> bricks up.
>
We had two fixes which handl
Hi Dmitry,
Answers inline.
On Fri, Nov 29, 2019 at 6:26 PM Dmitry Antipov wrote:
> I'm trying to manually garbage data on bricks
First of all, changing data directly on the backend is not recommended and
is not supported. All the operations need to be done from the client
mount point.
Only
/active/-client-*/private | egrep -i
> 'connected') on the clients revealed that a few were not connected to all
> bricks.
> After restarting them, everything went back to normal.
>
> Regards,
> Ulrich
> Am 06.02.20 um 12:51 schrieb Karthik Subrahmanya:
>
> Hi Ulrich,
Hi Ulrich,
From the problem statement, it seems like the client(s) have lost connection
with the brick. Can you give the following information?
- How many clients are there for this volume and which version they are in?
- gluster volume info & gluster volume status outputs
- Check whether all the
Hi Chris,
By looking at the data provided (hope the other entry is also a file and
not the parent of the file for which the stat & getfattrs are provided) it
seems like the parent(s) of these entries are missing the entry pending
markers on the good bricks, which is necessary to create these
wrote:
> > No Luck. Same problem.
> >
> > I stopped the volume.
> >
> > I ran the remove-brick command. It warned about not being able to
> > migrate files from removed bricks and asked if I want to continue.
> >
> > when I say 'yes'
> >
>
Hi,
Since your two nodes are scrapped and there is no chance that they
will come back in later time, you can try reducing the replica count
to 1 by removing the down bricks from the volume and then mounting the
volume back to access the data which is available on the only up
brick.
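The replica-count reduction described above can be sketched with the command below; the volume and dead brick names are placeholders and the command is only echoed:

```shell
# Placeholder names; the two bricks listed are the permanently lost ones.
VOLNAME=myvol
DEAD_BRICK1=node2:/bricks/b1
DEAD_BRICK2=node3:/bricks/b1

CMD="gluster volume remove-brick $VOLNAME replica 1 $DEAD_BRICK1 $DEAD_BRICK2 force"
echo "$CMD"
```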
The remove
Hi,
I am assuming that you are using one of the maintained versions of gluster.
GFID split-brains can be resolved using one of the methods in the
split-brain resolution CLI as explained in the section "3. Resolution of
split-brain using gluster CLI" of
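The split-brain resolution CLI referred to above offers a few policies (bigger-file, latest-mtime, source-brick). A sketch with placeholder names, commands echoed only:

```shell
VOLNAME=myvol
FILE=/dir/afile            # path as seen from the mount point
SOURCE=node1:/bricks/b1    # the brick chosen as the good copy

LATEST_CMD="gluster volume heal $VOLNAME split-brain latest-mtime $FILE"
BIGGER_CMD="gluster volume heal $VOLNAME split-brain bigger-file $FILE"
SOURCE_CMD="gluster volume heal $VOLNAME split-brain source-brick $SOURCE $FILE"

printf '%s\n' "$LATEST_CMD" "$BIGGER_CMD" "$SOURCE_CMD"
```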
Hi,
Please provide the following information to understand the setup and debug
this further:
- Which version of gluster are you using?
- 'gluster volume status atlassian' to confirm both bricks and shds are up
or not
- Complete output of 'gluster volume profile atlassian info' before running
'du'
Hi Ahemad,
Sorry for a lot of back and forth on this. But we might need a few more
details to find the actual cause here.
What version of gluster are you running on the server and client nodes?
Also provide the statedump [1] of the bricks and the client process when
the hang is seen.
[1]
run the index heal command "gluster volume heal <volname>" to trigger the
heal manually, and you can see the entries needing heal and the progress of
the heal by running "gluster volume heal <volname> info".
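The two commands above, with a hypothetical volume name and echoed only, look like this:

```shell
VOLNAME=myvol
TRIGGER_CMD="gluster volume heal $VOLNAME"        # trigger index heal
PROGRESS_CMD="gluster volume heal $VOLNAME info"  # entries still pending

printf '%s\n' "$TRIGGER_CMD" "$PROGRESS_CMD"
```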
HTH,
Karthik
>
> Any documentation on that end will be helpful.
>
> Thanks,
so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (-->
> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ) 0-glustervol-client-2:
> forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at
> 2020-06-16 05:16:52.258940 (xid=0xb5)
> [2020-06-16 05:16:59.732060] E
> kindly suggest on how to make the volume high available.
>
> Thanks,
> Ahemad
>
>
>
> On Tuesday, 16 June, 2020, 12:09:10 pm IST, Karthik Subrahmanya <
> ksubr...@redhat.com> wrote:
>
>
> Hi,
>
> Thanks for the clarification.
> In that case can
Hi Ahemad,
Please provide the following info:
1. gluster peer status
2. gluster volume info glustervol
3. gluster volume status glustervol
4. client log from node4 when you saw unavailability
Regards,
Karthik
On Mon, Jun 15, 2020 at 11:07 PM ahemad shaik
wrote:
> Hi There,
>
> I have created
Hey,
I think [1] should help you.
If you can't find anything matching your situation or can't resolve it with
any of the methods listed there, please open an issue for this at [2], with
the following information.
- volume info, volume status, heal info and shd logs from node-1 & arbiter.
- Output
Hi Andre,
Striped volumes were deprecated long back; see [1] & [2]. It seems like you
are using a very old version. May I know which version of gluster you are
running, and could you share the gluster volume info output, please?
Release schedule and the maintained branches can be found at [3].
[1]
es itself, or a
new node is required.
[1]
https://docs.gluster.org/en/latest/Administrator-Guide/arbiter-volumes-and-quorum/
Regards,
Karthik
> ---
> Gilberto Nunes Ferreira
> (47) 99676-7530 - Whatsapp / Telegram
> On Tue, Feb 8, 2022 at 07:17, Kart