Re: [Gluster-users] Exception in Geo-Replication

2019-07-02 Thread Kotresh Hiremath Ravishankar
You should be looking into the other log file (changes-<brick-path>.log)
for the actual failure. In your case that is
"changes-home-sas-gluster-data-code-misc.log".
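
If it helps, here is a minimal sketch for pulling the error lines out of that
log. The glob below is only a guess at a typical location; point it at
wherever your geo-rep session actually writes its changes-*.log files:

#!/usr/bin/env python3
# Minimal sketch: print the error ("E") lines from geo-replication
# changes-*.log files. The glob below is an assumed typical location;
# adjust it to your session's log directory.
import glob

LOG_GLOB = "/var/log/glusterfs/geo-replication/*/changes-*.log"

for path in sorted(glob.glob(LOG_GLOB)):
    with open(path, errors="replace") as fh:
        for line in fh:
            if "] E [" in line:  # glusterfs log lines carry a severity column
                print(path + ": " + line.rstrip())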

On Tue, Jul 2, 2019 at 12:33 PM deepu srinivasan wrote:

> Any update on this issue?
>
> On Mon, Jul 1, 2019 at 4:19 PM deepu srinivasan wrote:
>
>> Hi
>> I am getting this exception while starting geo-replication. Please help.
>>
>>  [2019-07-01 10:48:02.445475] E [repce(agent /home/sas/gluster/data/code-misc):122:worker] : call failed:
>>  Traceback (most recent call last):
>>    File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in worker
>>      res = getattr(self.obj, rmeth)(*in_data[2:])
>>    File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 41, in register
>>      return Changes.cl_register(cl_brick, cl_dir, cl_log, cl_level, retries)
>>    File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 45, in cl_register
>>      cls.raise_changelog_err()
>>    File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 29, in raise_changelog_err
>>      raise ChangelogException(errn, os.strerror(errn))
>>  ChangelogException: [Errno 21] Is a directory
>>
>>  [2019-07-01 10:48:02.446341] E [repce(worker /home/sas/gluster/data/code-misc):214:__call__] RepceClient: call failed call=31023:140523296659264:1561978082.44 method=register error=ChangelogException
>>
>>  [2019-07-01 10:48:02.446654] E [resource(worker /home/sas/gluster/data/code-misc):1268:service_loop] GLUSTER: Changelog register failed error=[Errno 21] Is a directory
>>
>
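
For reference, the ChangelogException above is just the raw errno wrapped by
the changelog bindings, and errno 21 is EISDIR, i.e. an operation that expects
a regular file was handed a directory instead. A quick, gluster-independent
check:

import errno
import os

# Errno 21 on Linux is EISDIR ("Is a directory"): an operation that expects
# a regular file was given a directory.
print(errno.EISDIR)               # -> 21 (on Linux)
print(os.strerror(errno.EISDIR))  # -> "Is a directory"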

-- 
Thanks and Regards,
Kotresh H R
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Removing subvolume from dist/rep volume

2019-07-02 Thread Nithya Balachandran
Hi Dave,

Yes, files in split-brain are not migrated, as we cannot figure out which is
the good copy. Adding Ravi to look at this and see what can be done.
Also adding Krutika, as this is a sharded volume.

The files with the "-T" permissions are internal files and can be
ignored. Ravi and Krutika, please take a look at the other files.
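
In case it is useful while checking the removed bricks, here is a rough
sketch of how one might list those "-T" internal link files on a brick so
they can be skipped. It assumes the usual DHT behaviour (0-byte file,
sticky-bit-only mode, trusted.glusterfs.dht.linkto xattr); please verify on
your own bricks, and run it as root directly on the brick:

#!/usr/bin/env python3
# Rough sketch: list probable DHT link files ("---------T") on a brick root.
import os
import stat
import sys

def looks_like_linkto(path):
    st = os.lstat(path)
    if not stat.S_ISREG(st.st_mode) or st.st_size != 0:
        return False
    # Sticky bit set and no other permission bits -> "---------T" in ls -l.
    if stat.S_IMODE(st.st_mode) != stat.S_ISVTX:
        return False
    try:
        # Assumed xattr carried by DHT link files; needs root to read.
        os.getxattr(path, "trusted.glusterfs.dht.linkto")
        return True
    except OSError:
        return False

brick = sys.argv[1]  # e.g. /var/local/brick0/data
for root, dirs, files in os.walk(brick):
    dirs[:] = [d for d in dirs if d != ".glusterfs"]
    for name in files:
        p = os.path.join(root, name)
        if looks_like_linkto(p):
            print(p)

Anything this prints should be ignorable; the regular files left behind on a
removed brick are the ones worth copying back in through the mount point.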

Regards,
Nithya


On Fri, 28 Jun 2019 at 19:56, Dave Sherohman wrote:

> On Thu, Jun 27, 2019 at 12:17:10PM +0530, Nithya Balachandran wrote:
> > There are some edge cases that may prevent a file from being migrated
> > during a remove-brick. Please do the following after this:
> >
> >    1. Check the remove-brick status for any failures. If there are any,
> >    check the rebalance log file for errors.
> >    2. Even if there are no failures, check the removed bricks to see if
> >    any files have not been migrated. If there are any, please check that
> >    they are valid files on the brick and copy them to the volume from the
> >    brick to the mount point.
>
> Well, looks like I hit one of those edge cases.  Probably because of
> some issues around a reboot last September which left a handful of files
> in a state where self-heal identified them as needing to be healed, but
> incapable of actually healing them.  (Check the list archives for
> "Kicking a stuck heal", posted on Sept 4, if you want more details.)
>
> So I'm getting 9 failures on the arbiter (merlin), 8 on one data brick
> (gandalf), and 3 on the other (saruman).  Looking in
> /var/log/gluster/palantir-rebalance.log, I see those numbers of
>
> migrate file failed: /.shard/291e9749-2d1b-47af-ad53-3a09ad4e64c6.229: failed to lock file on palantir-replicate-1 [Stale file handle]
>
> errors.
>
> Also, merlin has four errors, and gandalf has one, of the form:
>
> Gfid mismatch detected for /0f500288-ff62-4f0b-9574-53f510b4159f.2898>,
> 9f00c0fe-58c3-457e-a2e6-f6a006d1cfc6 on palantir-client-7 and
> 08bb7cdc-172b-4c21-916a-2a244c095a3e on palantir-client-1.
>
> There are no gfid mismatches recorded on saruman.  All of the gfid
> mismatches are for  and (on
> saruman) appear to correspond to 0-byte files (e.g.,
> .shard/0f500288-ff62-4f0b-9574-53f510b4159f.2898, in the case of the
> gfid mismatch quoted above).
>
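> A rough check I could run on each of the servers, if it helps: read the
> trusted.gfid xattr of the same shard on every brick that holds it and
> compare the values. (If I understand the on-brick layout right, trusted.gfid
> holds the gfid as 16 raw bytes; reading trusted.* xattrs needs root on the
> brick, and the brick path is whatever that host's brick root is.)
>
> #!/usr/bin/env python3
> # Print the trusted.gfid of a shard on a local brick; run on each server
> # that holds a copy and compare the output.
> import os
> import sys
> import uuid
>
> brick = sys.argv[1]   # e.g. /var/local/brick0/data (this host's brick root)
> shard = sys.argv[2]   # e.g. .shard/0f500288-ff62-4f0b-9574-53f510b4159f.2898
>
> raw = os.getxattr(os.path.join(brick, shard), "trusted.gfid")  # 16 raw bytes
> print(uuid.UUID(bytes=raw))
>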
> For both types of errors, all affected files are in .shard/ and have
> UUID-style names, so I have no idea which actual files they belong to.
> File sizes are generally either 0 bytes or 4M (exactly), although one of
> them has a size slightly larger than 3M.  So I'm assuming they're chunks
> of larger files (which would be almost all the files on the volume -
> it's primarily holding disk image files for kvm servers).
>
> Web searches generally seem to consider gfid mismatches to be a form of
> split-brain, but `gluster volume heal palantir info split-brain` shows
> "Number of entries in split-brain: 0" for all bricks, including those
> bricks which are reporting gfid mismatches.
>
>
> Given all that, how do I proceed with cleaning up the stale handle
> issues?  I would guess that this will involve somehow converting the
> shard filename to a "real" filename, then shutting down the
> corresponding VM and maybe doing some additional cleanup.
>
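> My rough attempt at that conversion, on the assumption that a shard is named
> <gfid-of-the-base-file>.<block-number> and that the brick keeps a
> .glusterfs/<aa>/<bb>/<gfid> hard link to the base file (please correct me if
> that layout assumption is wrong):
>
> #!/usr/bin/env python3
> # Map a shard name back to its base file on a brick by resolving the gfid
> # through the .glusterfs hard-link tree and matching inode numbers.
> import os
> import sys
>
> def base_file_for_shard(brick, shard_name):
>     gfid = shard_name.split(".")[0]
>     gfid_link = os.path.join(brick, ".glusterfs", gfid[0:2], gfid[2:4], gfid)
>     target_ino = os.stat(gfid_link).st_ino
>     for root, dirs, files in os.walk(brick):
>         dirs[:] = [d for d in dirs if d not in (".glusterfs", ".shard")]
>         for name in files:
>             p = os.path.join(root, name)
>             if os.lstat(p).st_ino == target_ino:
>                 return p
>     return None
>
> # e.g.: base_file_for_shard("/var/local/brick0/data",
> #                           "0f500288-ff62-4f0b-9574-53f510b4159f.2898")
> print(base_file_for_shard(sys.argv[1], sys.argv[2]))
>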
> And then there's the gfid mismatches.  Since they're for 0-byte files,
> is it safe to just ignore them on the assumption that they only hold
> metadata?  Or do I need to do some kind of split-brain resolution on
> them (even though gluster says no files are in split-brain)?
>
>
> Finally, a listing of /var/local/brick0/data/.shard on saruman, in case
> any of the information it contains (like file sizes/permissions) might
> provide clues to resolving the errors:
>
> --- cut here ---
> root@saruman:/var/local/brick0/data/.shard# ls -l
> total 63996
> -rw-rw---- 2 root libvirt-qemu       0 Sep 17  2018 0f500288-ff62-4f0b-9574-53f510b4159f.2864
> -rw-rw---- 2 root libvirt-qemu       0 Sep 17  2018 0f500288-ff62-4f0b-9574-53f510b4159f.2868
> -rw-rw---- 2 root libvirt-qemu       0 Sep 17  2018 0f500288-ff62-4f0b-9574-53f510b4159f.2879
> -rw-rw---- 2 root libvirt-qemu       0 Sep 17  2018 0f500288-ff62-4f0b-9574-53f510b4159f.2898
> -rw------- 2 root libvirt-qemu 4194304 May 17 14:42 291e9749-2d1b-47af-ad53-3a09ad4e64c6.229
> -rw------- 2 root libvirt-qemu 4194304 Jun 24 09:10 291e9749-2d1b-47af-ad53-3a09ad4e64c6.925
> -rw-rw-r-- 2 root libvirt-qemu 4194304 Jun 26 12:54 2df12cb0-6cf4-44ae-8b0a-4a554791187e.266
> -rw-rw-r-- 2 root libvirt-qemu 4194304 Jun 26 16:30 2df12cb0-6cf4-44ae-8b0a-4a554791187e.820
> -rw-r--r-- 2 root libvirt-qemu 4194304 Jun 17 20:22 323186b1-6296-4cbe-8275-b940cc9d65cf.27466
> -rw-r--r-- 2 root libvirt-qemu 4194304 Jun 27 05:01 323186b1-6296-4cbe-8275-b940cc9d65cf.32575
> -rw-r--r-- 2 root libvirt-qemu 3145728 Jun 11 13:23 323186b1-6296-4cbe-8275-b940cc9d65cf.3448
> ---------T 2 root libvirt-qemu       0 Jun 28 14:26 4cd094f4-0344-4660-98b0-83249d5bd659.22998
> -rw--- 2 root libvirt-qemu 4194304