Re: [Gluster-users] gluster brick daemon segfaulted in pairs

2016-10-24 Thread Atin Mukherjee
On Monday 24 October 2016, Jackie Tung  wrote:

> Hi,
>
> We are running a distributed replicated volume: 16 pairs of bricks (rep
> count 2), 2 nodes.
>
> On Friday, 2 pairs of brick daemons seg-faulted within minutes of each
> other, leading to 2 subvolumes down (no replicas left).  We tried to bring
> them up again by doing a "volume start force", which worked, but roughly 4
> hours later it happened again, this time to two other pairs of bricks.
>
> There is nothing of note in the brick logs for the downed bricks; they
> just suddenly stop logging.  In the other logs (nfs, glustershd, etc.),
> we simply start seeing "All subvolumes down" errors for those replica
> pairs.
>
> We are running GlusterFS 3.8.2 on Ubuntu 16.04.
>
> I do have a couple of core dumps preserved by apport.  Any ideas?  Should
> I file this straight into bugzilla?


Filing a bug with the coredumps attached would be ideal. Have you had a
chance to look at the backtraces of these coredumps? If not, please provide
the backtraces too; sometimes developers can identify the issue straight
away just by looking at a backtrace.
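
For reference, a rough sketch of pulling a backtrace out of an apport crash
report on Ubuntu (the report path and file name are illustrative, and you may
need the matching glusterfs debug symbol packages for readable frames):

    # unpack the apport report to get at the raw core dump
    apport-unpack /var/crash/_usr_sbin_glusterfsd.0.crash /tmp/brick-crash

    # dump backtraces for all threads into a file you can attach to the bug
    gdb -batch -ex 'thread apply all bt full' \
        /usr/sbin/glusterfsd /tmp/brick-crash/CoreDump > backtrace.txt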


> Thanks,
> Jackie
> --
>
>
> The information in this email is confidential and may be legally
> privileged. It is intended solely for the addressee. Access to this email
> by anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it, is prohibited and may be unlawful.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org 
> http://www.gluster.org/mailman/listinfo/gluster-users



-- 
--Atin
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] gluster brick daemon segfaulted in pairs

2016-10-24 Thread Jackie Tung
Hi,

We are running a distributed replicated volume: 16 pairs of bricks (rep count 
2), 2 nodes.

On Friday, 2 pairs of brick daemons seg-faulted within minutes of each other, 
leading to 2 subvolumes down (no replicas left).  We tried to bring them up 
again by doing a "volume start force", which worked, but roughly 4 hours later 
it happened again, this time to two other pairs of bricks.
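
A minimal sketch of the recovery steps, assuming a hypothetical volume name
"gv0":

    gluster volume start gv0 force    # restart only the bricks that are down
    gluster volume status gv0         # confirm every brick shows online again
    gluster volume heal gv0 info      # list entries still pending self-heal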

There is nothing of note in the brick logs for the downed bricks; they just 
suddenly stop logging.  In the other logs (nfs, glustershd, etc.), we simply 
start seeing "All subvolumes down" errors for those replica pairs.

We are running GlusterFS 3.8.2 on Ubuntu 16.04.

I do have a couple of core dumps preserved by apport.  Any ideas?  Should I 
file this straight into bugzilla?

Thanks,
Jackie
-- 
 

The information in this email is confidential and may be legally 
privileged. It is intended solely for the addressee. Access to this email 
by anyone else is unauthorized. If you are not the intended recipient, any 
disclosure, copying, distribution or any action taken or omitted to be 
taken in reliance on it, is prohibited and may be unlawful.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Gluster Developer Summit 2016 Notes

2016-10-24 Thread Amye Scavarda
Notes from our Gluster Developer Summit 2016 in Berlin!

Videos
Slides
Flickr Group
Public Etherpad
Bootstrapping Challenge

All of the videos from Gluster Developer Summit are now live on our YouTube
channel, and slides are available in our Slideshare accounts. We've also
created a Flickr group; please add your photos of the event!

https://www.youtube.com/user/GlusterCommunity
http://www.slideshare.net/GlusterCommunity
https://www.flickr.com/groups/glusterdevelopersummit2016/

We've also got a public etherpad for our comments from the event:
https://public.pad.fsfe.org/p/gluster-developer-summit-2016

Please feel free to add to this and help keep our momentum from this event!
I'm looking for the community maintainers to take a strong hand here and
tell us what they're focusing on from this event over the next three
months.

One thing I wanted to get to but we didn't was a Community Bootstrap
Challenge, so let's do it as a hangout after the Community Meeting on
November 2nd. I'll send out a separate email describing the event,
and we'll all join in at 1pm UTC.

Anything I missed?

Happy to take suggestions and comments about what else we'd want to see in
a Gluster Developer Summit!

-- amye

-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Bad sectors on brick

2016-10-24 Thread Gandalf Corvotempesta
2016-10-24 16:13 GMT+02:00 Niels de Vos :
> Yes, correct. But note that different filesystems can handle bad sectors
> differently; a read-only filesystem is the most common default, though.

Yes, I know.
Anyway, even the first bit-rot scrub should detect the failed sector
and trigger the error, right?
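
A minimal sketch of enabling and checking bit-rot detection, assuming a
hypothetical volume "gv0" (the exact subcommands may vary between 3.7/3.8
releases, so check the gluster CLI help on your version):

    gluster volume bitrot gv0 enable                 # start bitd/scrubber for the volume
    gluster volume bitrot gv0 scrub-frequency daily  # how often files get checksum-verified
    gluster volume bitrot gv0 scrub status           # see what the scrubber has flagged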

> In 'man 8 mount' the option "errors=" describes the different values for
> ext2/3/4. Configuring it to "continue" will most likely cause data
> corruption or other bad problems and is definitely not advised ;-)

That's for sure :)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Bad sectors on brick

2016-10-24 Thread Niels de Vos
On Mon, Oct 24, 2016 at 03:35:06PM +0200, Gandalf Corvotempesta wrote:
> 2016-10-24 11:29 GMT+02:00 Niels de Vos :
> > The filesystem on the brick will detect the problem, and most likely
> > aborts itself. Depending on the configuration (mount options, tune2fs)
> > the kernel will panic or mount the filesystem in read-only mode.
> >
> > When the filesystem becomes read-only, the brick process will log a
> > warning and exit.
> 
> So gluster is able to handle a single bad sector: the kernel puts the FS
> in read-only mode and the brick process exits.
> When the brick process exits, gluster detects the brick as missing, and
> so on.

Yes, correct. But note that different filesystems can handle bad sectors
differently; a read-only filesystem is the most common default, though.

In 'man 8 mount' the option "errors=" describes the different values for
ext2/3/4. Configuring it to "continue" will most likely cause data
corruption or other bad problems and is definitely not advised ;-)
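
A minimal sketch for an ext4 brick, with an illustrative device and mount
point:

    # /etc/fstab: remount read-only on errors rather than continuing
    /dev/sdb1  /data/brick1  ext4  defaults,errors=remount-ro  0 2

    # or bake the behaviour into the superblock itself
    tune2fs -e remount-ro /dev/sdb1
    tune2fs -l /dev/sdb1 | grep -i 'errors behavior'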

Niels


signature.asc
Description: PGP signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Bad sectors on brick

2016-10-24 Thread Gandalf Corvotempesta
2016-10-24 11:29 GMT+02:00 Niels de Vos :
> The filesystem on the brick will detect the problem, and most likely
> aborts itself. Depending on the configuration (mount options, tune2fs)
> the kernel will panic or mount the filesystem in read-only mode.
>
> When the filesystem becomes read-only, the brick process will log a
> warning and exit.

So gluster is able to handle a single bad sector: the kernel puts the FS
in read-only mode and the brick process exits.
When the brick process exits, gluster detects the brick as missing, and
so on.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Reliably mounting a gluster volume

2016-10-24 Thread Kevin Lemonnier
> 
> I rather like the x-systemd.automount solution, because it works equally 

Well, I don't know about that; we take care of removing systemd as soon as
possible. AutoFS works everywhere and doesn't depend on the init system, so
I'd use that. But I guess if you are using systemd everywhere it makes sense
to use it instead: one less thing to maintain.
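
A minimal AutoFS sketch, assuming an illustrative server "server1", volume
"gv0" and mount point /gluster:

    # /etc/auto.master -- use a direct map
    /-    /etc/auto.gluster

    # /etc/auto.gluster
    /gluster  -fstype=glusterfs  server1:/gv0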

-- 
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Reliably mounting a gluster volume

2016-10-24 Thread Paul Boven

Hi Kevin, everyone,

On 10/21/2016 03:19 PM, Kevin Lemonnier wrote:

> As we were discussing in the "gluster volume not mounted on boot" thread,
> you should probably just go with AutoFS; not ideal, but I don't see any
> other reliable solutions.


Apologies, I had checked about half a year of Gluster mailing list 
postings before I started working on this, but hadn't re-checked the 
last few days.


I rather like the x-systemd.automount solution, because it works just as 
well on a gluster server as on a gluster client. I can confirm that it 
works perfectly in our case. The virtual machines also get properly 
started on boot once the /gluster filesystem is there.
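
A sketch of such an fstab entry, with illustrative server and volume names:

    # /etc/fstab -- systemd creates an automount unit and mounts on first access
    server1:/gv0  /gluster  glusterfs  defaults,_netdev,noauto,x-systemd.automount  0 0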


Regards, Paul Boven.
--
Paul Boven  +31 (0)521-596547
Unix/Linux/Networking specialist
Joint Institute for VLBI in Europe - www.jive.eu
VLBI - It's a fringe science
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Bad sectors on brick

2016-10-24 Thread Ravishankar N

On 10/24/2016 02:59 PM, Niels de Vos wrote:

> On Mon, Oct 24, 2016 at 02:24:48PM +0530, Ravishankar N wrote:
> > On 10/24/2016 02:02 PM, Gandalf Corvotempesta wrote:
> > > Hi,
> > > how does gluster manage a single bad sector on a disk/brick? Will it
> > > kick out the whole brick?
> > Gluster does not manage sectors; it will just propagate the error
> > returned by the on-disk file system for that syscall to the application.
> > > What if the single bad sector makes a single file partially corrupted?
> > Ditto.
>
> The filesystem on the brick will detect the problem, and most likely
> aborts itself. Depending on the configuration (mount options, tune2fs)
> the kernel will panic or mount the filesystem in read-only mode.
>
> When the filesystem becomes read-only, the brick process will log a
> warning and exit.

Ah, I see that the health-checker thread kills the brick process if I/O
on the health_check file fails. Thanks for correcting me.
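
The probing interval is tunable per volume; a sketch assuming a hypothetical
volume "gv0" (the option should be storage.health-check-interval on recent
releases, in seconds, with 0 disabling the check):

    gluster volume set gv0 storage.health-check-interval 30
    gluster volume get gv0 storage.health-check-interval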


> Niels



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Bad sectors on brick

2016-10-24 Thread Niels de Vos
On Mon, Oct 24, 2016 at 02:24:48PM +0530, Ravishankar N wrote:
> On 10/24/2016 02:02 PM, Gandalf Corvotempesta wrote:
> > Hi,
> > how does gluster manage a single bad sector on a disk/brick? Will it kick
> > out the whole brick?
> Gluster does not manage sectors; it will just propagate the error
> returned by the on-disk file system for that syscall to the application.
> > What if the single bad sector makes a single file partially corrupted?
> Ditto.

The filesystem on the brick will detect the problem, and most likely
aborts itself. Depending on the configuration (mount options, tune2fs)
the kernel will panic or mount the filesystem in read-only mode.

When the filesystem becomes read-only, the brick process will log a
warning and exit.

Niels


signature.asc
Description: PGP signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Bad sectors on brick

2016-10-24 Thread Ravishankar N

On 10/24/2016 02:02 PM, Gandalf Corvotempesta wrote:

> Hi,
> how does gluster manage a single bad sector on a disk/brick? Will it kick
> out the whole brick?

Gluster does not manage sectors; it will just propagate the error
returned by the on-disk file system for that syscall to the application.

> What if the single bad sector makes a single file partially corrupted?

Ditto.


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users





[Gluster-users] Bad sectors on brick

2016-10-24 Thread Gandalf Corvotempesta
Hi,
how does gluster manage a single bad sector on a disk/brick? Will it kick
out the whole brick?
What if the single bad sector makes a single file partially corrupted?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Query for Gluster 3.7.x and Proxmox Users

2016-10-24 Thread Lindsay Mathieson
Hi all, I've got a technical query from the Proxmox list.

I upgraded to the latest Proxmox (no-subscription repo) last night and
ran into an issue with online migration failing for VMs hosted on
gluster (gfapi). FUSE mounts are OK.

This is with Gluster 3.8.4; the Proxmox devs were wanting to know whether
it is also an issue with 3.7.x.

NB: qemu was updated to version 2.7, which has support for SEEK_HOLE, a
Gluster 3.8 feature.
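
If it helps narrow things down, a quick gfapi sanity check from the
hypervisor (names are illustrative; it needs a qemu-img built with the
gluster block driver):

    qemu-img info gluster://server1/gv0/images/100/vm-100-disk-1.qcow2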

-- 
Lindsay
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users