Re: [Gluster-users] Reboot Issue with 6.5 on Ubuntu 18.04

2019-09-08 Thread Amar Tumballi
On Mon, Sep 9, 2019 at 12:23 AM Alexander Iliev wrote:

> Hi all,
>
> I am running GlusterFS server 6.3 on three Ubuntu 18.04 nodes
> installed from the https://launchpad.net/~gluster PPA.
>
> I tried upgrading to 6.5 today and ran into an issue with the first (and
> only) node that has been upgraded so far. When I rebooted the node the
> underlying brick filesystems failed to mount because of a `pvscan`
> process timing out on boot.
>
> I did some experimenting and the issue seems to be that on reboot the
> glusterfsd processes (which expose the bricks, as far as I understand) are
> not being shut down, which leads to the underlying filesystems showing up
> as busy and not getting properly unmounted.
>
> Then I found out that `systemctl stop glusterd.service` doesn't stop the
> brick processes by design and it also seems that for Fedora/RHEL this
> has been worked around by having a separate `glusterfsd.service` unit
> that only acts on shutdown.
>
> This however does not seem to be the case on Ubuntu and I can't figure
> out what is the expected flow there.
>
> So I guess my question is - is this normal/expected behaviour on Ubuntu?
> How is one supposed to set things up so that bricks get properly
> unmounted on reboot and properly mounted at startup?
>
> I am also considering migrating from Ubuntu to CentOS now as the
> upstream support seems much better there. If I decide to switch can I
> re-use the existing bricks or do I need to spin up a clean node, join
> the cluster and get the data synced to it?
>
I can only answer this part for now. If your bricks can be accessed
directly on CentOS (xfs/ext4 or anything else) after installing the new OS,
it should work fine with GlusterFS after the migration. The challenge will
be the content of /var/lib/glusterd (and IP addresses etc.), which you will
need to handle properly.
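
A very rough sketch of the idea (not a tested procedure; it assumes default
paths, the same GlusterFS version on the new OS, and that the node keeps its
hostname and IP - and copy the backups off the node before reinstalling):

# systemctl stop glusterd
# tar czf /root/glusterd-backup.tar.gz /var/lib/glusterd   # glusterd state: node UUID, peer and volume info
# cp /etc/fstab /root/fstab.old                            # keep the brick mount entries for reference
  ... reinstall the node with CentOS and the same GlusterFS version ...
# vi /etc/fstab                                            # re-add the brick mount entries from fstab.old
# mount -a                                                 # brick filesystems should come back with data intact
# tar xzf /root/glusterd-backup.tar.gz -C /                # restore /var/lib/glusterd
# systemctl start glusterd
# gluster peer status                                      # verify the node rejoined the pool

If the hostname or IP does change, the peer information under
/var/lib/glusterd/peers on the other nodes would also need to be adjusted,
which is the part that needs the most careful handling.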

Regards,
Amar



> Thanks!
>
> Best regards,
> --
> alexander iliev
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Rebalancing newly added bricks

2019-09-08 Thread Nithya Balachandran
On Sat, 7 Sep 2019 at 00:03, Strahil Nikolov  wrote:

> As it was mentioned, you might have to run rebalance on the other node -
> but it is better to wait until this node is done.
>
>
Hi Strahil,

Rebalance does not need to be run on the other node - the operation is a
volume-wide one. Only a single node per replica set would migrate files in
the version used in this case.
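
For example, the status command can be run on any single node and reports
all the nodes in the volume (the Rebalanced-files column shows which node
is actually migrating files):

# gluster volume rebalance tank status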

Regards,
Nithya

> Best Regards,
> Strahil Nikolov
>
> On Friday, September 6, 2019 at 15:29:20 GMT+3, Herb Burnswell <
> herbert.burnsw...@gmail.com> wrote:
>
>
>
>
> On Thu, Sep 5, 2019 at 9:56 PM Nithya Balachandran 
> wrote:
>
>
>
> On Thu, 5 Sep 2019 at 02:41, Herb Burnswell 
> wrote:
>
> Thanks for the replies.  The rebalance is running and the brick
> percentages are not adjusting as expected:
>
> # df -hP |grep data
> /dev/mapper/gluster_vg-gluster_lv1_data   60T   49T   11T  83%
> /gluster_bricks/data1
> /dev/mapper/gluster_vg-gluster_lv2_data   60T   49T   11T  83%
> /gluster_bricks/data2
> /dev/mapper/gluster_vg-gluster_lv3_data   60T  4.6T   55T   8%
> /gluster_bricks/data3
> /dev/mapper/gluster_vg-gluster_lv4_data   60T  4.6T   55T   8%
> /gluster_bricks/data4
> /dev/mapper/gluster_vg-gluster_lv5_data   60T  4.6T   55T   8%
> /gluster_bricks/data5
> /dev/mapper/gluster_vg-gluster_lv6_data   60T  4.6T   55T   8%
> /gluster_bricks/data6
>
> At the current pace it looks like this will continue to run for another
> 5-6 days.
>
> I appreciate the guidance..
>
>
> What is the output of the rebalance status command?
> Can you check if there are any errors in the rebalance logs on the node
> on which you see rebalance activity?
> If there are a lot of small files on the volume, the rebalance is expected
> to take time.
>
> Regards,
> Nithya
>
>
> My apologies, that was a typo.  I meant to say:
>
> "The rebalance is running and the brick percentages are NOW adjusting as
> expected"
>
> I did expect the rebalance to take several days.  The rebalance log is not
> showing any errors.  Status output:
>
> # gluster vol rebalance tank status
> Node        Rebalanced-files   size     scanned   failures   skipped   status        run time in h:m:s
> ---------   ----------------   ------   -------   --------   -------   -----------   ------------------
> localhost   1251320            35.5TB   2079527   0          0         in progress   139:9:46
> serverB     0                  0Bytes   7         0          0         completed     63:47:55
> volume rebalance: tank: success
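>
> (A quick way to scan the rebalance log for problems, assuming the default
> log location for this volume, would be something like:)
>
> # grep ' E ' /var/log/glusterfs/tank-rebalance.log      # error-level entries, if any
> # grep -c ' W ' /var/log/glusterfs/tank-rebalance.log   # count of warning entries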
>
> Thanks again for the guidance.
>
> HB
>
>
>
>
>
> On Mon, Sep 2, 2019 at 9:08 PM Nithya Balachandran 
> wrote:
>
>
>
> On Sat, 31 Aug 2019 at 22:59, Herb Burnswell 
> wrote:
>
> Thank you for the reply.
>
> I started a rebalance with force on serverA as suggested.  Now I see
> 'activity' on that node:
>
> # gluster vol rebalance tank status
> Node        Rebalanced-files   size     scanned   failures   skipped   status        run time in h:m:s
> ---------   ----------------   ------   -------   --------   -------   -----------   ------------------
> localhost   6143               6.1GB    9542      0          0         in progress   0:4:5
> serverB     0                  0Bytes   7         0          0         in progress   0:4:5
> volume rebalance: tank: success
>
> But I am not seeing any activity on serverB.  Is this expected?  Does the
> rebalance need to run on each node even though it says both nodes are 'in
> progress'?
>
>
> It looks like this is a replicate volume. If that is the case then yes,
> you are running an old version of Gluster for which this was the default
> behaviour.
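>
> A quick way to confirm both the volume type and the running Gluster
> version, for example:
>
> # gluster volume info tank | grep -E 'Type|Number of Bricks'
> # glusterfs --version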
>
> Regards,
> Nithya
>
> Thanks,
>
> HB
>
> On Sat, Aug 31, 2019 at 4:18 AM Strahil  wrote:
>
> The rebalance status shows 0 Bytes.
>
> Maybe you should try with 'gluster volume rebalance <volname> start
> force'?
>
> Best Regards,
> Strahil Nikolov
>
> Source:
> https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#rebalancing-volumes
> On Aug 30, 2019 20:04, Herb Burnswell  wrote:
>
> All,
>
> RHEL 7.5
> Gluster 3.8.15
> 2 Nodes: serverA & serverB
>
> I am not deeply knowledgeable about Gluster and its administration, but we
> have a 2 node cluster that's been running for about a year and a half.  All
> has worked fine to date.  Our main volume has consisted of two 60TB bricks
> on each of the cluster nodes.  As we reached capacity on the volume we
> needed to expand.  So, we've added four new 60TB bricks to each of the
> cluster nodes.  The bricks are now seen, and the total size of the volume
> is as expected:
>
> # gluster vol status tank

[Gluster-users] Reboot Issue with 6.5 on Ubuntu 18.04

2019-09-08 Thread Alexander Iliev

Hi all,

I am running GlusterFS server 6.3 on three Ubuntu 18.04 nodes
installed from the https://launchpad.net/~gluster PPA.


I tried upgrading to 6.5 today and ran into an issue with the first (and 
only) node that has been upgraded so far. When I rebooted the node the 
underlying brick filesystems failed to mount because of a `pvscan` 
process timing out on boot.


I did some experimenting and the issue seems to be that on reboot the
glusterfsd processes (which expose the bricks, as far as I understand) are
not being shut down, which leads to the underlying filesystems showing up
as busy and not getting properly unmounted.


Then I found out that `systemctl stop glusterd.service` doesn't stop the 
brick processes by design and it also seems that for Fedora/RHEL this 
has been worked around by having a separate `glusterfsd.service` unit 
that only acts on shutdown.


This however does not seem to be the case on Ubuntu and I can't figure 
out what is the expected flow there.
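
For reference, my understanding is that the Fedora/RHEL workaround amounts
to a unit along these lines (a sketch only, untested; the unit actually
shipped there may differ, and the brick mount point is a placeholder):

# /etc/systemd/system/glusterfsd.service (illustrative sketch)
[Unit]
Description=Stop GlusterFS brick processes at shutdown
# Ordering after the brick mounts means this unit is stopped (and the
# brick processes killed) before the brick filesystems are unmounted.
RequiresMountsFor=/gluster_bricks
Conflicts=shutdown.target
Before=shutdown.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop=/usr/bin/killall --wait glusterfsd

[Install]
WantedBy=multi-user.target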


So I guess my question is - is this normal/expected behaviour on Ubuntu? 
How is one supposed to set things up so that bricks get properly 
unmounted on reboot and properly mounted at startup?


I am also considering migrating from Ubuntu to CentOS now as the 
upstream support seems much better there. If I decide to switch can I 
re-use the existing bricks or do I need to spin up a clean node, join 
the cluster and get the data synced to it?


Thanks!

Best regards,
--
alexander iliev
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Issues with Geo-replication (GlusterFS 6.3 on Ubuntu 18.04)

2019-09-08 Thread Alexander Iliev

Hi all,

Sunny, thank you for the update.

I have applied the patch locally on my slave system and now the 
mountbroker setup is successful.


I am facing another issue though - when I try to create a replication 
session between the two sites I am getting:


# gluster volume geo-replication store1 glustergeorep@::store1 create push-pem

Error : Request timed out
geo-replication command failed

It is still unclear to me if my setup is expected to work at all.

Reading the geo-replication documentation at [1] I see this paragraph:

> A password-less SSH connection is also required for gsyncd between
> every node in the master to every node in the slave. The gluster
> system:: execute gsec_create command creates secret-pem files on all the
> nodes in the master, and is used to implement the password-less SSH
> connection. The push-pem option in the geo-replication create command
> pushes these keys to all the nodes in the slave.
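
(The documented sequence corresponds to roughly the following, run from a
master node; the slave user and host below are placeholders:)

# gluster system:: execute gsec_create
# gluster volume geo-replication store1 <slave-user>@<slave-host>::store1 create push-pem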


It is not clear to me whether connectivity from each master node to each 
slave node is a requirement in terms of networking. In my setup the 
slave nodes form the Gluster pool over a private network which is not 
reachable from the master site.
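
(If full mesh connectivity is indeed the requirement, then something like
the following would have to succeed from every master node for every slave
node - the hostnames below are placeholders - which is currently not
possible in my setup:)

for slave in slave-node1 slave-node2 slave-node3; do
    ssh -o BatchMode=yes glustergeorep@"$slave" true \
        && echo "$slave: reachable" || echo "$slave: NOT reachable"
done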


Any ideas how to proceed from here will be greatly appreciated.

Thanks!

Links:
[1] 
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/sect-preparing_to_deploy_geo-replication


Best regards,
--
alexander iliev

On 9/3/19 2:50 PM, Sunny Kumar wrote:

Thank you for the explanation Kaleb.

Alexander,

This fix will be available with the next release for all supported versions.

/sunny

On Mon, Sep 2, 2019 at 6:47 PM Kaleb Keithley  wrote:


Fixes on master (before or after the release-7 branch was taken) almost 
certainly warrant a backport IMO to at least release-6, and probably release-5 
as well.

We used to have a "tracker" BZ for each minor release (e.g. 6.6) to keep track 
of backports by cloning the original BZ and changing the Version, and adding that BZ to 
the tracker. I'm not sure what happened to that practice. The last ones I can find are 
for 6.3 and 5.7;  https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-6.3 and 
https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.7

It isn't enough to just backport recent fixes on master to release-7. We are 
supposedly continuing to maintain release-6 and release-5 after release-7 GAs. 
If that has changed, I haven't seen an announcement to that effect. I don't 
know why our developers don't automatically backport to all the actively 
maintained releases.

Even if there isn't a tracker BZ, you can always create a backport BZ by
cloning the original BZ and changing the release to 6. That'd be a good
place to start.

On Sun, Sep 1, 2019 at 8:45 AM Alexander Iliev  wrote:


Hi Strahil,

Yes, this might be right, but I would still expect fixes like this to be
released for all supported major versions (which should include 6). At
least that's how I understand https://www.gluster.org/release-schedule/.

Anyway, let's wait for Sunny to clarify.

Best regards,
alexander iliev

On 9/1/19 2:07 PM, Strahil Nikolov wrote:

Hi Alex,

I'm not very deep into bugzilla stuff, but for me NEXTRELEASE means v7.

Sunny,
Am I understanding it correctly ?

Best Regards,
Strahil Nikolov

On Sunday, September 1, 2019 at 14:27:32 GMT+3, Alexander Iliev
wrote:


Hi Sunny,

Thank you for the quick response.

It's not clear to me however if the fix has been already released or not.

The bug status is CLOSED NEXTRELEASE and according to [1] the
NEXTRELEASE resolution means that the fix will be included in the next
supported release. The bug is logged against the mainline version
though, so I'm not sure what this means exactly.

From the 6.4[2] and 6.5[3] release notes it seems it hasn't been
released yet.

Ideally I would not like to patch my systems locally, so if you have an
ETA on when this will be out officially I would really appreciate it.

Links:
[1] https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_status
[2] https://docs.gluster.org/en/latest/release-notes/6.4/
[3] https://docs.gluster.org/en/latest/release-notes/6.5/

Thank you!

Best regards,

alexander iliev

On 8/30/19 9:22 AM, Sunny Kumar wrote:
  > Hi Alexander,
  >
  > Thanks for pointing that out!
  >
  > But this issue is fixed now; you can see the links below for the BZ and the patch.
  >
  > BZ - https://bugzilla.redhat.com/show_bug.cgi?id=1709248
  >
  > Patch - https://review.gluster.org/#/c/glusterfs/+/22716/
  >
  > Hope this helps.
  >
  > /sunny
  >
  > On Fri, Aug 30, 2019 at 2:30 AM Alexander Iliev
  > <glus...@mamul.org> wrote:
  >>
  >> Hello dear GlusterFS users list,
  >>
  >> I have been trying to set up geo-replication between two clusters for
  >> some time now. The desired state is (Cluster #1) being replicated to
  >> (Cluster #2).
  >>
  >> Here are some details about the setup:
  >>
  >> Cluster #1: three nodes connected via a local network