Re: [Gluster-users] Fwd: Troubleshooting glusterfs

2018-02-04 Thread Nithya Balachandran
Hi,

Please provide the log for the mount process from the node on which you
have mounted the volume. This should be in /var/log/glusterfs, and the name
of the file will be the hyphenated path of the mount point. For example, if
the volume is mounted at /mnt/glustervol, the log file will be
/var/log/glusterfs/mnt-glustervol.log
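
For instance (a hypothetical mount; the log file name is just the mount
path with '/' replaced by '-'):

# mount -t glusterfs server1:/gv0 /mnt/glustervol
# tail -f /var/log/glusterfs/mnt-glustervol.log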

Regards,
Nithya

On 4 February 2018 at 21:09, Nikita Yeryomin  wrote:

> Please help troubleshooting glusterfs with the following setup:
> Distributed volume without replication. Sharding enabled.
>
> # cat /etc/centos-release
>
> CentOS release 6.9 (Final)
>
> # glusterfs --version
>
> glusterfs 3.12.3
>
> [root@master-5f81bad0054a11e8bf7d0671029ed6b8 uploads]# gluster volume info
>
>
>
> Volume Name: gv0
>
> Type: Distribute
>
> Volume ID: 1a7e05f6-4aa8-48d3-b8e3-300637031925
>
> Status: Started
>
> Snapshot Count: 0
>
> Number of Bricks: 27
>
> Transport-type: tcp
>
> Bricks:
>
> Brick1: gluster3.qencode.com:/var/storage/brick/gv0
>
> Brick2: encoder-376cac0405f311e884700671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick3: encoder-ee6761c0091c11e891ba0671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick4: encoder-ee68b8ea091c11e89c2d0671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick5: encoder-ee663700091c11e8b48f0671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick6: encoder-efcf113e091c11e899520671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick7: encoder-efcd5a24091c11e8963a0671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick8: encoder-099f557e091d11e882f70671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick9: encoder-099bdda4091d11e881090671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick10: encoder-099dca56091d11e8b3410671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick11: encoder-09a1ba4e091d11e8a3c20671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick12: encoder-099a826a091d11e895940671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick13: encoder-0998aa8a091d11e8a8160671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick14: encoder-0b582724091d11e8b3b40671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick15: encoder-0dff527c091d11e896f20671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick16: encoder-0e0d5c14091d11e886cf0671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick17: encoder-7f1bf3d4093b11e8a3580671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick18: encoder-7f70378c093b11e885260671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick19: encoder-7f19528c093b11e88f100671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick20: encoder-7f76c048093b11e8a7470671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick21: encoder-7f7fc90e093b11e8a74e0671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick22: encoder-7f6bc382093b11e8b8a30671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick23: encoder-7f7b44d8093b11e8906f0671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick24: encoder-7f72aa30093b11e89a8e0671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick25: encoder-7f7d735c093b11e8b4650671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick26: encoder-7f1a5006093b11e89bcb0671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Brick27: encoder-95791076093b11e8af170671029ed6b8.qencode.com:/var/storage/brick/gv0
>
> Options Reconfigured:
>
> cluster.min-free-disk: 10%
>
> performance.cache-max-file-size: 1048576
>
> nfs.disable: on
>
> transport.address-family: inet
>
> features.shard: on
>
> performance.client-io-threads: on
>
> Each brick is 15GB in size.
>
> After using the volume for several hours with intensive read/write operations
> (~300GB written and then deleted), an attempt to write to the volume results
> in an Input/Output error:
>
> # wget https://speed.hetzner.de/1GB.bin
>
> --2018-02-04 12:02:34--  https://speed.hetzner.de/1GB.bin
>
> Resolving speed.hetzner.de... 88.198.248.254, 2a01:4f8:0:59ed::2
>
> Connecting to speed.hetzner.de|88.198.248.254|:443... connected.
>
> HTTP request sent, awaiting response... 200 OK
>
> Length: 1048576000 (1000M) [application/octet-stream]
>
> Saving to: `1GB.bin'
>
>
> 38% [=>                                  ] 403,619,518 27.8M/s   in 15s
>
> Cannot write to `1GB.bin' (Input/output error).
>
> I don't see anything written to glusterd.log, or any other logs in
> /var/log/glusterfs/* when this error occurs.
>
> Deleting partially downloaded file works without error.
>
> Thanks,
> Nikita Yeryomin
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [ovirt-users] VM paused due unknown storage error

2018-02-04 Thread Sahina Bose
Adding gluster-users.


On Wed, Jan 31, 2018 at 3:55 PM, Misak Khachatryan  wrote:

> Hi,
>
> here is the output from virt3 - problematic host:
>
> [root@virt3 ~]# gluster volume status
> Status of volume: data
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick virt1:/gluster/brick2/data            49152     0          Y       3536
> Brick virt2:/gluster/brick2/data            49152     0          Y       3557
> Brick virt3:/gluster/brick2/data            49152     0          Y       3523
> Self-heal Daemon on localhost               N/A       N/A        Y       32056
> Self-heal Daemon on virt2                   N/A       N/A        Y       29977
> Self-heal Daemon on virt1                   N/A       N/A        Y       1788
>
> Task Status of Volume data
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> Status of volume: engine
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick virt1:/gluster/brick1/engine          49153     0          Y       3561
> Brick virt2:/gluster/brick1/engine          49153     0          Y       3570
> Brick virt3:/gluster/brick1/engine          49153     0          Y       3534
> Self-heal Daemon on localhost               N/A       N/A        Y       32056
> Self-heal Daemon on virt2                   N/A       N/A        Y       29977
> Self-heal Daemon on virt1                   N/A       N/A        Y       1788
>
> Task Status of Volume engine
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> Status of volume: iso
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick virt1:/gluster/brick4/iso             49154     0          Y       3585
> Brick virt2:/gluster/brick4/iso             49154     0          Y       3592
> Brick virt3:/gluster/brick4/iso             49154     0          Y       3543
> Self-heal Daemon on localhost               N/A       N/A        Y       32056
> Self-heal Daemon on virt1                   N/A       N/A        Y       1788
> Self-heal Daemon on virt2                   N/A       N/A        Y       29977
>
> Task Status of Volume iso
> ------------------------------------------------------------------------------
> There are no active volume tasks
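>
> (A heal-status check along these lines is one way to see whether a
> near-empty brick is simply pending self-heal - a sketch, using the volume
> names from the status output above:)
>
> # gluster volume heal data info
> # gluster volume heal engine info
> # df -h /gluster/brick2/data   # compare per-brick usage on each host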
>
> and one of the logs.
>
> Thanks in advance
>
> Best regards,
> Misak Khachatryan
>
>
> On Wed, Jan 31, 2018 at 9:17 AM, Sahina Bose  wrote:
> > Could you provide the output of "gluster volume status" and the gluster
> > mount logs to check further?
> > Are all the hosts shown as active in the engine (that is, is the
> > monitoring working?)
> >
> > On Wed, Jan 31, 2018 at 1:07 AM, Misak Khachatryan  wrote:
> >>
> >> Hi,
> >>
> >> After upgrading to 4.2 I'm getting "VM paused due to unknown storage
> >> error". While upgrading I had a gluster problem with one of
> >> the hosts, which I fixed by re-adding it to the gluster peers. Now I see
> >> something weird in the brick configuration, see attachment - one of the
> >> bricks uses 0% of space.
> >>
> >> How can I diagnose this? I can't see anything wrong in the logs.
> >>
> >>
> >>
> >>
> >> Best regards,
> >> Misak Khachatryan
> >>
> >> ___
> >> Users mailing list
> >> us...@ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/users
> >>
> >
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Error - Disk Full - No Space Left

2018-02-04 Thread Nithya Balachandran
Hi,

This might be because of:
https://github.com/gluster/glusterfs/blob/release-3.13/doc/release-notes/3.13.0.md#ability-to-reserve-back-end-storage-space

Please try running the following and see if it solves the problem:

gluster volume set <volname> storage.reserve 0
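
For reference, a minimal check-and-set sequence (a sketch; <volname> is a
placeholder for your volume name, and storage.reserve defaults to 1% in 3.13):

# gluster volume get <volname> storage.reserve
# gluster volume set <volname> storage.reserve 0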

Regards,
Nithya

On 4 February 2018 at 03:35, Taste-Of-IT  wrote:

> Hello Community and Devs,
>
> I have the following problem with a 3-brick distributed GlusterFS volume.
> After upgrading to the latest 3.13.x I can't create any directory on the
> volume, which I access via NFS. I have over 170GB free on the smallest brick
> and also enough free inodes on all 3 bricks. If I try to create a directory,
> the system says that there is not enough disk space left, even though there
> is. I ran a fix-layout, but without success.
>
> Any idea?
>
> thx
> Taste
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first

2018-02-04 Thread Amar Tumballi
Thanks for the report, Artem.

Looks like the issue is about cache warming up. Specifically, I suspect rsync
is doing a 'readdir(), stat(), file operations' loop, whereas when a find or
ls is issued, we get a 'readdirp()' request, which returns the stat
information along with the entries and also makes sure the cache is
up-to-date (at the md-cache layer).

Note that this is just an off-the-top-of-my-head hypothesis; we surely need
to analyse and debug more thoroughly for a proper explanation. Someone on my
team will look at it soon.
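
In the meantime, a workaround sketch based on the observation in your report
(paths taken from your logs): issue a readdirp-producing scan of the target
directory to warm the cache just before rsync runs.

# find /mnt/gluster/uploads/2017/08 > /dev/null
# rsync -aPr /srv/www/htdocs/uploads/2017/08/ /mnt/gluster/uploads/2017/08/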

Regards,
Amar

On Mon, Feb 5, 2018 at 7:25 AM, Vlad Kopylov  wrote:

> Are you mounting it to the local bricks?
>
> I'm struggling with the same performance issues.
> Try using this volume setting:
> http://lists.gluster.org/pipermail/gluster-users/2018-January/033397.html
> performance.stat-prefetch: on might be it.
>
> It seems like once it gets into the cache it is fast - those stat fetches,
> which seem to come from .gluster, are slow.
>
> On Sun, Feb 4, 2018 at 3:45 AM, Artem Russakovskii  wrote:
> > An update, and a very interesting one!
> >
> > After I started stracing rsync, all I could see was lstat calls, quite
> > slow ones, over and over, which is expected.
> >
> > For example: lstat("uploads/2016/10/nexus2cee_DSC05339_thumb-161x107.jpg",
> > {st_mode=S_IFREG|0664, st_size=4043, ...}) = 0
> >
> > I googled around and found
> > https://gist.github.com/nh2/1836415489e2132cf85ed3832105fcc1, which is
> > seeing this exact issue with gluster, rsync and xfs.
> >
> > Here's the craziest finding so far. If while rsync is running (or right
> > before), I run /bin/ls or find on the same gluster dirs, it immediately
> > speeds up rsync by a factor of 100 or maybe even 1000. It's absolutely
> > insane.
> >
> > I'm stracing the rsync run, and the slow lstat calls flood in at an
> > incredible speed as soon as ls or find run. Several hundred files per
> > minute (excruciatingly slow) becomes thousands or even tens of thousands
> > of files a second.
> >
> > What do you make of this?
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>



-- 
Amar Tumballi (amarts)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first

2018-02-04 Thread Vlad Kopylov
Are you mounting it to the local bricks?

I'm struggling with the same performance issues.
Try using this volume setting:
http://lists.gluster.org/pipermail/gluster-users/2018-January/033397.html
performance.stat-prefetch: on might be it.

It seems like once it gets into the cache it is fast - those stat fetches,
which seem to come from .gluster, are slow.
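
A sketch of applying that option (the volume name "gluster" is taken from
Artem's config later in this digest; adjust to your volume):

# gluster volume set gluster performance.stat-prefetch on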

On Sun, Feb 4, 2018 at 3:45 AM, Artem Russakovskii  wrote:
> An update, and a very interesting one!
>
> After I started stracing rsync, all I could see was lstat calls, quite slow
> ones, over and over, which is expected.
>
> For example: lstat("uploads/2016/10/nexus2cee_DSC05339_thumb-161x107.jpg",
> {st_mode=S_IFREG|0664, st_size=4043, ...}) = 0
>
> I googled around and found
> https://gist.github.com/nh2/1836415489e2132cf85ed3832105fcc1, which is
> seeing this exact issue with gluster, rsync and xfs.
>
> Here's the craziest finding so far. If while rsync is running (or right
> before), I run /bin/ls or find on the same gluster dirs, it immediately
> speeds up rsync by a factor of 100 or maybe even 1000. It's absolutely
> insane.
>
> I'm stracing the rsync run, and the slow lstat calls flood in at an
> incredible speed as soon as ls or find run. Several hundred files per
> minute (excruciatingly slow) becomes thousands or even tens of thousands of
> files a second.
>
> What do you make of this?
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Fwd: Troubleshooting glusterfs

2018-02-04 Thread Nikita Yeryomin
Please help troubleshooting glusterfs with the following setup:
Distributed volume without replication. Sharding enabled.

# cat /etc/centos-release

CentOS release 6.9 (Final)

# glusterfs --version

glusterfs 3.12.3

[root@master-5f81bad0054a11e8bf7d0671029ed6b8 uploads]# gluster volume info



Volume Name: gv0

Type: Distribute

Volume ID: 1a7e05f6-4aa8-48d3-b8e3-300637031925

Status: Started

Snapshot Count: 0

Number of Bricks: 27

Transport-type: tcp

Bricks:

Brick1: gluster3.qencode.com:/var/storage/brick/gv0

Brick2: encoder-376cac0405f311e884700671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick3: encoder-ee6761c0091c11e891ba0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick4: encoder-ee68b8ea091c11e89c2d0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick5: encoder-ee663700091c11e8b48f0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick6: encoder-efcf113e091c11e899520671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick7: encoder-efcd5a24091c11e8963a0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick8: encoder-099f557e091d11e882f70671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick9: encoder-099bdda4091d11e881090671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick10: encoder-099dca56091d11e8b3410671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick11: encoder-09a1ba4e091d11e8a3c20671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick12: encoder-099a826a091d11e895940671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick13: encoder-0998aa8a091d11e8a8160671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick14: encoder-0b582724091d11e8b3b40671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick15: encoder-0dff527c091d11e896f20671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick16: encoder-0e0d5c14091d11e886cf0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick17: encoder-7f1bf3d4093b11e8a3580671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick18: encoder-7f70378c093b11e885260671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick19: encoder-7f19528c093b11e88f100671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick20: encoder-7f76c048093b11e8a7470671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick21: encoder-7f7fc90e093b11e8a74e0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick22: encoder-7f6bc382093b11e8b8a30671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick23: encoder-7f7b44d8093b11e8906f0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick24: encoder-7f72aa30093b11e89a8e0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick25: encoder-7f7d735c093b11e8b4650671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick26: encoder-7f1a5006093b11e89bcb0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick27: encoder-95791076093b11e8af170671029ed6b8.qencode.com:/var/storage/brick/gv0

Options Reconfigured:

cluster.min-free-disk: 10%

performance.cache-max-file-size: 1048576

nfs.disable: on

transport.address-family: inet

features.shard: on

performance.client-io-threads: on

Each brick is 15GB in size.

After using the volume for several hours with intensive read/write operations
(~300GB written and then deleted), an attempt to write to the volume results
in an Input/Output error:

# wget https://speed.hetzner.de/1GB.bin

--2018-02-04 12:02:34--  https://speed.hetzner.de/1GB.bin

Resolving speed.hetzner.de... 88.198.248.254, 2a01:4f8:0:59ed::2

Connecting to speed.hetzner.de|88.198.248.254|:443... connected.

HTTP request sent, awaiting response... 200 OK

Length: 1048576000 (1000M) [application/octet-stream]

Saving to: `1GB.bin'


38% [=>                                  ] 403,619,518 27.8M/s   in 15s

Cannot write to `1GB.bin' (Input/output error).

I don't see anything written to glusterd.log, or any other logs in
/var/log/glusterfs/* when this error occurs.

Deleting partially downloaded file works without error.

Thanks,
Nikita Yeryomin
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Troubleshooting glusterfs

2018-02-04 Thread Nikita Yeryomin
Please help troubleshooting glusterfs with the following setup:
Distributed volume without replication. Sharding enabled.

[root@master-5f81bad0054a11e8bf7d0671029ed6b8 uploads]# gluster volume info



Volume Name: gv0

Type: Distribute

Volume ID: 1a7e05f6-4aa8-48d3-b8e3-300637031925

Status: Started

Snapshot Count: 0

Number of Bricks: 27

Transport-type: tcp

Bricks:

Brick1: gluster3.qencode.com:/var/storage/brick/gv0

Brick2: encoder-376cac0405f311e884700671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick3: encoder-ee6761c0091c11e891ba0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick4: encoder-ee68b8ea091c11e89c2d0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick5: encoder-ee663700091c11e8b48f0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick6: encoder-efcf113e091c11e899520671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick7: encoder-efcd5a24091c11e8963a0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick8: encoder-099f557e091d11e882f70671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick9: encoder-099bdda4091d11e881090671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick10: encoder-099dca56091d11e8b3410671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick11: encoder-09a1ba4e091d11e8a3c20671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick12: encoder-099a826a091d11e895940671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick13: encoder-0998aa8a091d11e8a8160671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick14: encoder-0b582724091d11e8b3b40671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick15: encoder-0dff527c091d11e896f20671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick16: encoder-0e0d5c14091d11e886cf0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick17: encoder-7f1bf3d4093b11e8a3580671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick18: encoder-7f70378c093b11e885260671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick19: encoder-7f19528c093b11e88f100671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick20: encoder-7f76c048093b11e8a7470671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick21: encoder-7f7fc90e093b11e8a74e0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick22: encoder-7f6bc382093b11e8b8a30671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick23: encoder-7f7b44d8093b11e8906f0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick24: encoder-7f72aa30093b11e89a8e0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick25: encoder-7f7d735c093b11e8b4650671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick26: encoder-7f1a5006093b11e89bcb0671029ed6b8.qencode.com:/var/storage/brick/gv0

Brick27: encoder-95791076093b11e8af170671029ed6b8.qencode.com:/var/storage/brick/gv0

Options Reconfigured:

cluster.min-free-disk: 10%

performance.cache-max-file-size: 1048576

nfs.disable: on

transport.address-family: inet

features.shard: on

performance.client-io-threads: on

Each brick is 15GB in size.

After using the volume for several hours with intensive read/write operations
(~300GB written and then deleted), an attempt to write to the volume results
in an Input/Output error:

# wget https://speed.hetzner.de/1GB.bin

--2018-02-04 12:02:34--  https://speed.hetzner.de/1GB.bin

Resolving speed.hetzner.de... 88.198.248.254, 2a01:4f8:0:59ed::2

Connecting to speed.hetzner.de|88.198.248.254|:443... connected.

HTTP request sent, awaiting response... 200 OK

Length: 1048576000 (1000M) [application/octet-stream]

Saving to: `1GB.bin'


38% [=>                                  ] 403,619,518 27.8M/s   in 15s

Cannot write to `1GB.bin' (Input/output error).

I don't see anything written to glusterd.log, or any other logs in
/var/log/glusterfs/* when this error occurs.

Deleting partially downloaded file works without error.
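
For what it's worth, a per-brick free-space check is a reasonable first step
for this kind of intermittent write error on a sharded distribute volume - a
sketch, run against the gv0 bricks above (cluster.min-free-disk is set to
10%, so bricks running near full are worth ruling out):

# df -h /var/storage/brick/gv0        # on each brick host
# gluster volume status gv0 detail    # per-brick free space and inode counts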

Thanks,
Nikita Yeryomin
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] halo not work as desired!!!

2018-02-04 Thread atris adam
I have 2 data centers in two different regions, each DC has 3 servers. I
have created a glusterfs volume with 4 replicas; this is the gluster volume
info output:


Volume Name: test-halo
Type: Replicate
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: 10.0.0.1:/mnt/test1
Brick2: 10.0.0.3:/mnt/test2
Brick3: 10.0.0.5:/mnt/test3
Brick4: 10.0.0.6:/mnt/test4
Options Reconfigured:
cluster.halo-shd-max-latency: 5
cluster.halo-max-latency: 10
cluster.quorum-count: 2
cluster.quorum-type: fixed
cluster.halo-enabled: yes
transport.address-family: inet
nfs.disable: on

Bricks with IPs 10.0.0.1 & 10.0.0.3 are in region A, and bricks with IPs
10.0.0.5 & 10.0.0.6 are in region B.


When I mount the volume in region A, I expect the data to be stored first on
brick1 & brick2, and then copied asynchronously to region B, on brick3 &
brick4.

Am I right? Is this what halo claims?

If yes, unfortunately, this does not happen for me. Whether I mount the
volume in region A or in region B, all the data is copied to brick3 & brick4
and no data is copied to brick1 & brick2.

Ping times to the brick IPs from region A are as follows:
ping 10.0.0.1 & 10.0.0.3: below time=0.500 ms
ping 10.0.0.5 & 10.0.0.6: more than time=20 ms
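
For reference, a quick per-brick RTT check like the sketch below can be
compared against cluster.halo-max-latency (10 ms in the config above);
bricks whose average RTT stays above that threshold are the ones halo
should treat as remote:

for ip in 10.0.0.1 10.0.0.3 10.0.0.5 10.0.0.6; do
  ping -c 3 -q "$ip" | tail -1   # prints the rtt min/avg/max/mdev summary
done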

What is the logic halo uses to select the bricks to write to? If it is the
access latency, then when I mount the volume in region A the ping time to
brick1 & brick2 is below 0.5 ms, yet halo still selects brick3 & brick4.

glusterfs version is:
glusterfs 3.12.4

I really need to work with the halo feature, but I have not been able to get
this case running. Can anyone help me soon?


Thanks a lot
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first

2018-02-04 Thread Artem Russakovskii
An update, and a very interesting one!

After I started stracing rsync, all I could see was lstat calls, quite slow
ones, over and over, which is expected.

For example: lstat("uploads/2016/10/nexus2cee_DSC05339_thumb-161x107.jpg",
{st_mode=S_IFREG|0664, st_size=4043, ...}) = 0
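
For reference, a hypothetical strace invocation along these lines makes the
per-call latency visible (-T prints the time spent in each syscall; SRC/ and
DST/ are placeholders for the rsync source and destination):

# strace -f -T -e trace=lstat rsync -aPr SRC/ DST/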

I googled around and found
https://gist.github.com/nh2/1836415489e2132cf85ed3832105fcc1, which is seeing
this exact issue with gluster, rsync and xfs.

Here's the craziest finding so far. If while rsync is running (or right
before), I run /bin/ls or find on the same gluster dirs, it immediately
speeds up rsync by a factor of 100 or maybe even 1000. It's absolutely
insane.

I'm stracing the rsync run, and the slow lstat calls flood in at an
incredible speed as soon as ls or find run. Several hundred files per
minute (excruciatingly slow) becomes thousands or even tens of thousands of
files a second.

What do you make of this?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first

2018-02-04 Thread Artem Russakovskii
Hi,

I have been working on setting up a 4-replica gluster volume with over a
million files (~250GB total), and I've seen some really weird stuff happen,
even after trying to optimize for small files. I've set up a 4-brick
replicate volume (gluster 3.13.2).

It took almost 2 days to rsync the data from the local drive to the gluster
volume, and now I'm running a 2nd rsync that just looks for changes in case
more files have been written. I'd like to concentrate this email on a very
specific and odd issue.


The dir structure is:

YYYY/
  MM/
    10k+ files in each month folder


rsyncing each month folder cold can take 2+ minutes.

However, if I ls the destination folder first, or use find (both of which
finish within 5 seconds), the rsync is almost instant.


Here's a log with time calls that shows what happens:

box:/mnt/gluster/uploads/2017 # time rsync -aPr /srv/www/htdocs/uploads/2017/08/ 08/
sending incremental file list
^Crsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(637) [sender=3.1.0]

real    1m39.848s
user    0m0.010s
sys     0m0.030s
box:/mnt/gluster/uploads/2017 # time find 08 | wc -l
14254

real    0m0.726s
user    0m0.013s
sys     0m0.033s
box:/mnt/gluster/uploads/2017 # time rsync -aPr /srv/www/htdocs/uploads/2017/08/ 08/
sending incremental file list

real    0m0.562s
user    0m0.057s
sys     0m0.137s
box:/mnt/gluster/uploads/2017 # time find 07 | wc -l
10103

real    0m4.550s
user    0m0.010s
sys     0m0.033s
box:/mnt/gluster/uploads/2017 # time rsync -aPr /srv/www/htdocs/uploads/2017/07/ 07/
sending incremental file list

real    0m0.428s
user    0m0.030s
sys     0m0.083s
box:/mnt/gluster/uploads/2017 # time ls 06 | wc -l
11890

real    0m1.850s
user    0m0.077s
sys     0m0.040s
box:/mnt/gluster/uploads/2017 # time rsync -aPr /srv/www/htdocs/uploads/2017/06/ 06/
sending incremental file list

real    0m0.627s
user    0m0.073s
sys     0m0.107s
box:/mnt/gluster/uploads/2017 # time rsync -aPr /srv/www/htdocs/uploads/2017/05/ 05/
sending incremental file list

real    2m24.382s
user    0m0.127s
sys     0m0.357s


Note how if I precede the rsync call with ls or find, the rsync completes
in less than a second (finding no files to sync because they've already
been synced). Otherwise, it takes over 2 minutes (I interrupted the first
call before the 2 minutes because it was already taking too long).

What could be causing rsync to work so slowly unless the dir is primed?

Volume config:

Volume Name: gluster
Type: Replicate
Volume ID: X
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: server1:/mnt/server1_block4/gluster
Brick2: server2:/mnt/server2_block4/gluster
Brick3: server3:/mnt/server3_block4/gluster
Brick4: server4:/mnt/server4_block4/gluster
Options Reconfigured:
performance.parallel-readdir: off
transport.address-family: inet
nfs.disable: on
cluster.self-heal-daemon: enable
performance.cache-size: 1GB
network.ping-timeout: 5
cluster.quorum-type: fixed
cluster.quorum-count: 1
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 50
performance.rda-cache-limit: 256MB
performance.read-ahead: off
client.event-threads: 4
server.event-threads: 4



Thank you for any insight.

Sincerely,
Artem
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users