Hi,
On Sat, Sep 9, 2017 at 2:35 AM, WK wrote:
> Pavel.
>
> Is there a difference between native client (fuse) and libgfapi in regards
> to the crashing/read-only behaviour?
I switched to FUSE now and the VM crashed (read-only remount)
immediately after one node started rebooting.
I tried to mou
I've always wondered what the scenarios for these situations are (aside
from the doc description of nodes coming up and down).
Aren't Gluster writes atomic for all nodes? I seem to recall Jeff Darcy
stating that years ago.
So a clean shutdown for maintenance shouldn't be a problem at all. If
Pavel.
Is there a difference between native client (fuse) and libgfapi in
regards to the crashing/read-only behaviour?
We use Rep2 + Arb and can shut down a node cleanly, without issue, on our
VMs. We do it all the time for upgrades and maintenance.
However we are still on native client as we
The following changes resolved the perf issue:
Added this option to /etc/glusterfs/glusterd.vol:
option rpc-auth-allow-insecure on
and restarted glusterd.
Then set the volume option:
gluster volume set vms server.allow-insecure on
I am reaching now the max network bandwidth and performance of VMs is quite
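For reference, the steps above can be collected into a short sketch (the volume name vms comes from the command shown; everything else here is an assumption about a typical setup):

```shell
# On each server node, allow client connections from unprivileged
# (>1024) ports by adding this line to the volume-management section
# of /etc/glusterfs/glusterd.vol:
#     option rpc-auth-allow-insecure on
# then restart the management daemon:
systemctl restart glusterd

# Finally, enable the matching per-volume option (run once, on any node):
gluster volume set vms server.allow-insecure on
```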
On 09/08/2017 01:32 PM, Serkan Çoban wrote:
Any suggestions?
On Thu, Sep 7, 2017 at 4:35 PM, Serkan Çoban wrote:
Hi,
Is it safe to use 3.10.5 client with 3.7.11 server with read-only data
move operation?
The normal, and hence tested, upgrade procedure is older clients and a
newer server, IOW
Any suggestions?
On Thu, Sep 7, 2017 at 4:35 PM, Serkan Çoban wrote:
> Hi,
>
> Is it safe to use 3.10.5 client with 3.7.11 server with read-only data
> move operation?
> Client will have 3.10.5 glusterfs-client packages. It will mount one
> volume from 3.7.11 cluster and one from 3.10.5 cluster.
You were right John. After you mentioned the file names, I checked
the listing again and yes, uid 1000 does belong to the 'git' user present
on the GitLab container. Actually the long listing I mentioned in my first
mail had all contents mapped from GitLab, Redis and PostgreSQL in one
single
Getting this answer back on the list in case anyone else is trying to share
storage.
Thanks for the docs pointer, Tanner.
-John
On Thu, Sep 7, 2017 at 6:50 PM, Tanner Bruce
wrote:
> You can set a security context on your pod to set the guid as needed:
> https://kubernetes.io/docs/tasks/configu
On Wed, Sep 06, 2017 at 05:45:05PM -0400, Shyam Ranganathan wrote:
> On 09/05/2017 02:07 PM, Serkan Çoban wrote:
> > For rpm packages you can use [1], just installed without any problems.
> > It is taking time packages to land in Centos storage SIG repo...
>
> Thank you for reporting this. The SIG
Well, I really do not like its non-deterministic characteristic.
However, the server crash never occurred in my production environment
- only upgrades and reboots ;-)
-ps
On Fri, Sep 8, 2017 at 2:13 PM, Gandalf Corvotempesta
wrote:
> 2017-09-08 14:11 GMT+02:00 Pavel Szalbot :
>> Gandalf, SI
Btw after a few more seconds in the SIGTERM scenario, the VM kind of
revived and seems to be fine... And after a few more restarts of the
fio job, I got an I/O error.
-ps
On Fri, Sep 8, 2017 at 2:11 PM, Pavel Szalbot wrote:
> Gandalf, SIGKILL (killall -9 glusterfsd) did not stop I/O after few
> minutes. SIGTERM on
2017-09-08 14:11 GMT+02:00 Pavel Szalbot :
> Gandalf, SIGKILL (killall -9 glusterfsd) did not stop I/O after few
> minutes. SIGTERM on the other hand causes crash, but this time it is
> not read-only remount, but around 10 IOPS tops and 2 IOPS on average.
> -ps
So, seems to be reliable to server c
Gandalf, SIGKILL (killall -9 glusterfsd) did not stop I/O after a few
minutes. SIGTERM on the other hand causes a crash, but this time it is
not a read-only remount, but around 10 IOPS tops and 2 IOPS on average.
-ps
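For clarity, the two cases being compared above differ only in the signal sent to the brick processes (a sketch, to be run on a server node):

```shell
killall glusterfsd      # SIGTERM (default): graceful exit -- the case
                        # that degraded the VM to ~10 IOPS
killall -9 glusterfsd   # SIGKILL: immediate kill -- the case that did
                        # not stop I/O on the VM
```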
On Fri, Sep 8, 2017 at 1:56 PM, Diego Remolina wrote:
> I currently only have a Windo
I currently only have a Windows 2012 R2 server VM in testing on top of
the gluster storage, so I will have to take some time to provision a
couple of Linux VMs with both ext4 and XFS to see what happens on those.
The Windows server VM is OK with killall glusterfsd, but when the 42
second timeout goes
I added a firewall rule to block all traffic from the Gluster VLAN on
one of the nodes.
Approximately 3 minutes in and no crash so far. Errors about the missing
node are present in the qemu instance log, but this is normal.
-ps
On Fri, Sep 8, 2017 at 1:53 PM, Gandalf Corvotempesta
wrote:
> 2017-09-08 13:44 G
2017-09-08 13:44 GMT+02:00 Pavel Szalbot :
> I did not test SIGKILL because I suppose if graceful exit is bad, SIGKILL
> will be as well. This assumption might be wrong. So I will test it. It would
> be interesting to see client to work in case of crash (SIGKILL) and not in
> case of graceful exit
On Sep 8, 2017 13:36, "Gandalf Corvotempesta" <
gandalf.corvotempe...@gmail.com> wrote:
2017-09-08 13:21 GMT+02:00 Pavel Szalbot :
> Gandalf, isn't possible server hard-crash too much? I mean if reboot
> reliably kills the VM, there is no doubt network crash or poweroff
> will as well.
IIUP, the
2017-09-08 13:21 GMT+02:00 Pavel Szalbot :
> Gandalf, isn't possible server hard-crash too much? I mean if reboot
> reliably kills the VM, there is no doubt network crash or poweroff
> will as well.
IIUP, the only way to keep I/O running is to gracefully exit glusterfsd.
killall should send sig
So even the killall scenario eventually kills the VM (I/O errors).
Gandalf, isn't testing a server hard-crash too much? I mean, if a reboot
reliably kills the VM, there is no doubt a network crash or poweroff
will as well.
I am tempted to test this setup on DigitalOcean to eliminate
possibility of my hardware
2017-09-08 13:07 GMT+02:00 Pavel Szalbot :
> OK, so killall seems to be ok after several attempts i.e. iops do not stop
> on VM. Reboot caused I/O errors after maybe 20 seconds since issuing the
> command. I will check the servers console during reboot to see if the VM
> errors appear just after t
OK, so killall seems to be OK after several attempts, i.e. IOPS do not
stop on the VM. Reboot caused I/O errors maybe 20 seconds after issuing
the command. I will check the servers' console during reboot to see if the
VM errors appear just after the power cycle and will try to crash the VM after
ki
I would prefer the behavior were different from the current one of I/O stopping.
The argument I heard for the long 42-second timeout was that MTBF on a
server was high, and that the client reconnection operation was *costly*.
Those were arguments to *not* change the ping timeout value down from 42
seconds
Btw now I am experiencing "Transport endpoint disconnects" because of
the 1s ping-timeout even though the nodes are up. This sucks. The network
is not overloaded at all, the switches are used only by the gluster
network, and the network consists only of three gluster nodes, one VM hypervisor and the
Cinder controller (n
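The timeout under discussion is the per-volume network.ping-timeout option; a minimal sketch of inspecting and changing it (the volume name vms is an assumption):

```shell
# Show the current value (the default is 42 seconds)
gluster volume get vms network.ping-timeout

# Set it back toward the default after the 1s experiment
gluster volume set vms network.ping-timeout 42
```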
On Fri, Sep 8, 2017 at 12:48 PM, Gandalf Corvotempesta
wrote:
> I think this should be considered a bug.
> If you have a server crash, the glusterfsd process obviously doesn't
> exit properly and thus this could lead to I/O stopping?
I agree with you completely in this.
__
On Fri, Sep 8, 2017 at 12:43 PM, Diego Remolina wrote:
> This is exactly the problem,
>
> Systemctl stop glusterd does *not* kill the brick processes.
Yes, I know.
> On CentOS with gluster 3.10.x there is also a service, meant to only stop
> glusterfsd (brick processes). I think the reboot proces
On Fri, Sep 8, 2017 at 12:38 PM, Diego Remolina wrote:
> If your VMs use ext4 also check this:
>
> https://joejulian.name/blog/keeping-your-vms-from-going-read-only-when-encountering-a-ping-timeout-in-glusterfs/
I know about this post, but as I pointed out - ping-timeout does not
seem to prevent
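The linked post concerns ext4's on-error behavior inside the guest; a hedged sketch of checking and changing it (the device name /dev/vda1 is an assumption):

```shell
# See how the filesystem reacts to I/O errors:
# continue, remount-ro (the default behind the read-only remounts), or panic
tune2fs -l /dev/vda1 | grep -i 'errors behavior'

# Switch to continuing on errors instead of remounting read-only
tune2fs -e continue /dev/vda1
```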
Hi,
I am using glusterfs 3.10.1 with 30 nodes each with 36 bricks and 10 nodes
each with 16 bricks in a single cluster.
By default I have paused the scrub process so I can run it manually. The
first time I tried to run scrub-on-demand, it was running fine,
but after some time, I decided t
Hi Diego,
indeed glusterfsd processes are running, and that is the reason I
reboot the server instead of running systemctl stop glusterd. Is killall
different from reboot in the way glusterfsd processes are terminated in
CentOS (init 1?)?
However I will try this and let you know.
-ps
On Fri, Sep 8, 2017 at
This is the qemu log of instance:
[2017-09-08 09:31:48.381077] C
[rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired]
0-gv_openstack_1-client-1: server 10.0.1.202:49152 has not responded
in the last 1 seconds, disconnecting.
[2017-09-08 09:31:48.382411] E [rpc-clnt.c:365:saved_frames_unwind]
(--> /li
On Fri, Sep 8, 2017 at 11:42 AM, wrote:
> Oh, you really don't want to go below 30s, I was told.
> I'm using 30 seconds for the timeout, and indeed when a node goes down
> the VM freez for 30 seconds, but I've never seen them go read only for
> that.
>
> I _only_ use virtio though, maybe it's tha
Oh, you really don't want to go below 30s, I was told.
I'm using 30 seconds for the timeout, and indeed when a node goes down
the VMs freeze for 30 seconds, but I've never seen them go read-only
because of that.
I _only_ use virtio though, maybe it's that. What are you using ?
On Fri, Sep 08, 2017 at 11:
Back to replica 3 w/o arbiter. Two fio jobs running (direct=1 and
direct=0), rebooting one node... and VM dmesg looks like:
[ 483.862664] blk_update_request: I/O error, dev vda, sector 23125016
[ 483.898034] blk_update_request: I/O error, dev vda, sector 2161832
[ 483.901103] blk_update_request
2017-07-19 11:22 GMT+02:00 yayo (j) :
> running "gluster volume heal engine" doesn't solve the problem...
>
> Some extra info:
>
> We have recently changed the gluster from 2 (full replicated) + 1
> arbiter to a 3 full replicated cluster, but I don't know if this is the
> problem...
>
>
Hi,
I'm sorr
FYI I set up replica 3 (no arbiter this time) and did the same thing -
rebooted one node during heavy file I/O on the VM, and I/O stopped.
As I mentioned either here or in another thread, this behavior is
caused by high default of network.ping-timeout. My main problem used
to be that setting it to low val
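For anyone reproducing this, the replica 3 volume being tested could be created along these lines (the volume name, hostnames, and brick paths are all assumptions):

```shell
gluster volume create gv_test replica 3 \
  node1:/bricks/gv_test node2:/bricks/gv_test node3:/bricks/gv_test
gluster volume start gv_test
```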