Re: [Gluster-users] iobuf/iobref error

2016-08-22 Thread Poornima Gurusiddaiah
Hi, 

The error that you see in the log file is fixed as part of patch
http://review.gluster.org/#/c/10206/ (released in 3.8.0).
But these errors are not responsible for the "Transport endpoint is not
connected" issue. Can you check if there are any other errors reported in the log?

Regards, 
Poornima 

- Original Message -

> From: "ngsflow" 
> To: "gluster-users" 
> Sent: Sunday, August 21, 2016 7:53:38 PM
> Subject: [Gluster-users] iobuf/iobref error

> Hi:

> I've been experiencing an intermittent issue with GlusterFS in a 30-node
> cluster which makes the mounted file system unavailable through the
> GlusterFS client.

> The symptom is:

> $ ls /gluster
> ls: cannot access /gluster: Transport endpoint is not connected

> the client log reports the following error:

> [2016-08-09 23:25:36.012877] E [iobuf.c:759:iobuf_unref] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x371a220580] (-->
> /usr/li
> b64/glusterfs/3.6.7/xlator/performance/quick-read.so(qr_readv_cached+0xb7)[0x7ff57f318ea7]
> (--> /usr/lib64/glusterfs/3.6.7/xlator/performance/
> quick-read.so(qr_readv+0x62)[0x7ff57f3194c2] (-->
> /usr/lib64/libglusterfs.so.0(default_readv_resume+0x14d)[0x371a22a75d] (-->
> /usr/lib64/libgl
> usterfs.so.0(call_resume+0x3d6)[0x371a2424b6] ) 0-iobuf: invalid
> argument: iobuf
> [2016-08-09 23:25:36.013192] E [iobuf.c:865:iobref_unref] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x371a220580] (-->
> /usr/l
> ib64/glusterfs/3.6.7/xlator/performance/quick-read.so(qr_readv_cached+0xc1)[0x7ff57f318eb1]
> (--> /usr/lib64/glusterfs/3.6.7/xlator/performance
> /quick-read.so(qr_readv+0x62)[0x7ff57f3194c2] (-->
> /usr/lib64/libglusterfs.so.0(default_readv_resume+0x14d)[0x371a22a75d] (-->
> /usr/lib64/libg
> lusterfs.so.0(call_resume+0x3d6)[0x371a2424b6] ) 0-iobuf: invalid
> argument: iobref

> It seems to me that it's an out-of-memory issue.

> info: glusterfs is configured as follows

> performance.io-thread-count: 4
> performance.cache-max-file-size: 0
> performance.write-behind-window-size: 64MB
> performance.cache-size: 4GB
> cluster.consistent-metadata: on

> and each node in the cluster is deployed as both a glusterfs client and a server.

> Is there any way to ease the above issue by modifying the configuration, such
> as increasing cache-size or some other parameters?

> thx.

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Upgrade 3.7.6 -> 3.7.13 one gluster server disconnected 1 of 3 volumes

2016-08-22 Thread Atin Mukherjee
On Tue, Aug 23, 2016 at 4:17 AM, Steve Dainard  wrote:

> About 5 hours after upgrading gluster 3.7.6 -> 3.7.13 on Centos 7, one of
> my gluster servers disconnected its volume. The other two volumes this host
> serves were not affected.
>
> # gluster volume status storage
> Status of volume: storage
> Gluster process                               TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick 10.0.231.50:/mnt/raid6-storage/storage  49159     0          Y       30743
> Brick 10.0.231.51:/mnt/raid6-storage/storage  49159     0          Y       676
> Brick 10.0.231.52:/mnt/raid6-storage/storage  N/A       N/A        N       N/A
> Brick 10.0.231.53:/mnt/raid6-storage/storage  49154     0          Y       10253
> Brick 10.0.231.54:/mnt/raid6-storage/storage  49153     0          Y       2792
> Brick 10.0.231.55:/mnt/raid6-storage/storage  49153     0          Y       13590
> Brick 10.0.231.56:/mnt/raid6-storage/storage  49152     0          Y       9281
> NFS Server on localhost                       2049      0          Y       30775
> Quota Daemon on localhost                     N/A       N/A        Y       30781
> NFS Server on 10.0.231.54                     2049      0          Y       2817
> Quota Daemon on 10.0.231.54                   N/A       N/A        Y       2824
> NFS Server on 10.0.231.51                     2049      0          Y       710
> Quota Daemon on 10.0.231.51                   N/A       N/A        Y       719
> NFS Server on 10.0.231.52                     2049      0          Y       9090
> Quota Daemon on 10.0.231.52                   N/A       N/A        Y       9098
> NFS Server on 10.0.231.55                     2049      0          Y       13611
> Quota Daemon on 10.0.231.55                   N/A       N/A        Y       13619
> NFS Server on 10.0.231.56                     2049      0          Y       9303
> Quota Daemon on 10.0.231.56                   N/A       N/A        Y       9310
> NFS Server on 10.0.231.53                     2049      0          Y       26304
> Quota Daemon on 10.0.231.53                   N/A       N/A        Y       26320
>
> Task Status of Volume storage
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> I see lots of logs related to trashcan (failed [file exists]), set xattrs
> (failed [no such file or directory]), quota (invalid arguments) in the
> brick logs, which I enabled as a feature after the upgrade this morning.
>

Could you let us know the time (in UTC) around which this issue was seen,
so that we can look at the logs around that time and see if something
went wrong?
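
If it helps with pinning down the window, something along these lines on the
affected node should be enough (log paths are the usual defaults, the timestamp
pattern is only a placeholder, and Gluster log timestamps are normally already
in UTC):

date -u    # current time in UTC, for reference
# Pull glusterd and brick log entries from around the suspected window
grep '2016-08-22 1[5-9]:' /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
grep '2016-08-22 1[5-9]:' /var/log/glusterfs/bricks/mnt-raid6-storage-storage.log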


> After restarting glusterd on that host, the volume came back online.
>
> I've attached logs from that host if someone can take a look.
>
> # gluster volume info storage
>
> Volume Name: storage
> Type: Distribute
> Volume ID: 6f95525a-94d7-4174-bac4-e1a18fe010a2
> Status: Started
> Number of Bricks: 7
> Transport-type: tcp
> Bricks:
> Brick1: 10.0.231.50:/mnt/raid6-storage/storage
> Brick2: 10.0.231.51:/mnt/raid6-storage/storage
> Brick3: 10.0.231.52:/mnt/raid6-storage/storage
> Brick4: 10.0.231.53:/mnt/raid6-storage/storage
> Brick5: 10.0.231.54:/mnt/raid6-storage/storage
> Brick6: 10.0.231.55:/mnt/raid6-storage/storage
> Brick7: 10.0.231.56:/mnt/raid6-storage/storage
> Options Reconfigured:
> nfs.disable: no
> features.trash-max-filesize: 1GB
> features.trash: on
> features.quota-deem-statfs: on
> features.inode-quota: on
> features.quota: on
> performance.readdir-ahead: on
>
> # rpm -qa  |grep glusterfs
> glusterfs-fuse-3.7.13-1.el7.x86_64
> glusterfs-cli-3.7.13-1.el7.x86_64
> glusterfs-3.7.13-1.el7.x86_64
> glusterfs-server-3.7.13-1.el7.x86_64
> glusterfs-api-3.7.13-1.el7.x86_64
> glusterfs-libs-3.7.13-1.el7.x86_64
> glusterfs-client-xlators-3.7.13-1.el7.x86_64
>
>
> Thanks,
> Steve
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 

--Atin
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] gluster: symbol lookup error: gluster: undefined symbol: use_spinlocks

2016-08-22 Thread jayakrishnan mm
Glusterfs ver 3.8.3
Source build
Host : Ubuntu 14.04 (32 bit)

Error:
gluster: symbol lookup error: gluster: undefined symbol: use_spinlocks

Was previously using 3.7.6 without any issues.

Some dependencies are not updated properly. Can someone help?
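
A common cause after a source build is that the gluster binary still resolves
against an older libglusterfs left over from the previous 3.7.6 install. A quick
check along these lines may help (paths are typical for a default source install
and may differ on your system):

# See which libglusterfs the CLI actually loads
ldd $(which gluster) | grep libglusterfs
# Refresh the dynamic linker cache after 'make install', then retry
sudo ldconfig
# If an old copy is still found first in the search path, remove or move it so
# the freshly built library is picked up.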


Best Regards
JK
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] FreeBSD: I can't replace-bricks - Distributed-Replicate

2016-08-22 Thread Jan Michael Martirez
I can't use replace-bricks. 

I followed this tutorial: 
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-brick
 


Volume Name: dr
Type: Distributed-Replicate
Volume ID: 0ce3038c-55c6-4a4e-9b97-22269bce9d11
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gluster01:/glu1
Brick2: gluster02:/glu2
Brick3: gluster03:/glu3
Brick4: gluster04:/glu4
Options Reconfigured:
features.shard-block-size: 4MB
features.shard: on
performance.readdir-ahead: on

I'm stuck with setfattr. I'm using FreeBSD, so I use setextattr instead. 

root@gluster01:/mnt/fuse # setextattr system wheel abc /mnt/fuse
setextattr: /mnt/fuse: failed: Operation not supported
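
For context, if I recall that guide correctly, the extended-attribute step is
only there to mark pending heals on the mount; the brick replacement itself is
driven through the gluster CLI. A rough sketch of that step, with a hypothetical
new brick (gluster05:/glu5) standing in for whichever brick is being swapped in:

# Replace the old brick with the new one, then trigger a full self-heal
# (volume name and brick paths are placeholders based on the volume info above)
gluster volume replace-brick dr gluster02:/glu2 gluster05:/glu5 commit force
gluster volume heal dr full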

root@gluster01:/mnt/fuse # glusterd --version
glusterfs 3.7.6 built on Jul 13 2016 20:32:46
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Memory leak with a replica 3 arbiter 1 configuration

2016-08-22 Thread Pranith Kumar Karampuri
Could you collect a statedump of the brick process by following:
https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump

That should help us identify which datatype is causing leaks and fix it.
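
For reference, a minimal way to trigger and collect the dumps looks roughly like
this (the output directory is the usual default and may differ on your build):

# Generate statedumps for all brick processes of the volume
gluster volume statedump <volname>
# The dumps normally land under /var/run/gluster/ named after the brick path and
# PID; attach the one for the arbiter brick process.
ls -lt /var/run/gluster/*.dump.* | head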

Thanks!

On Tue, Aug 23, 2016 at 2:22 AM, Benjamin Edgar  wrote:

> Hi,
>
> I appear to have a memory leak with a replica 3 arbiter 1 configuration of
> gluster. I have a data brick and an arbiter brick on one server, and
> another server with the last data brick. The more I write files to gluster
> in this configuration, the more memory the arbiter brick process takes up.
>
> I am able to reproduce this issue by first setting up a replica 3 arbiter
> 1 configuration and then using the following bash script to create 10,000
> 200kB files, delete those files, and run forever:
>
> while true ; do
>   for i in {1..10000} ; do
> dd if=/dev/urandom bs=200K count=1 of=$TEST_FILES_DIR/file$i
>   done
>   rm -rf $TEST_FILES_DIR/*
> done
>
> $TEST_FILES_DIR is a location on my gluster mount.
>
> After about 3 days of this script running on one of my clusters, this is
> what the output of "top" looks like:
>   PID USER      PR  NI    VIRT     RES    SHR S  %CPU %MEM      TIME+  COMMAND
> 16039 root      20   0 1397220   77720   3948 S  20.6  1.0  860:01.53  glusterfsd
> 13174 root      20   0 1395824  112728   3692 S  19.6  1.5  806:07.17  glusterfs
> 19961 root      20   0 2967204  *2.145g*  3896 S  17.3 29.0  752:10.70  glusterfsd
>
> As you can see one of the brick processes is using over 2 gigabytes of
> memory.
>
> One work-around for this is to kill the arbiter brick process and restart
> the gluster daemon. This restarts the arbiter brick process and its memory
> usage goes back down to a reasonable level. However, I would rather not kill
> the arbiter brick every week in production environments.
>
> Has anyone seen this issue before and is there a known work-around/fix?
>
> Thanks,
> Ben
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Memory leak with a replica 3 arbiter 1 configuration

2016-08-22 Thread Benjamin Edgar
Hi,

I appear to have a memory leak with a replica 3 arbiter 1 configuration of
gluster. I have a data brick and an arbiter brick on one server, and
another server with the last data brick. The more I write files to gluster
in this configuration, the more memory the arbiter brick process takes up.

I am able to reproduce this issue by first setting up a replica 3 arbiter 1
configuration and then using the following bash script to create 10,000
200kB files, delete those files, and run forever:

while true ; do
  for i in {1..10000} ; do
dd if=/dev/urandom bs=200K count=1 of=$TEST_FILES_DIR/file$i
  done
  rm -rf $TEST_FILES_DIR/*
done

$TEST_FILES_DIR is a location on my gluster mount.

After about 3 days of this script running on one of my clusters, this is
what the output of "top" looks like:
  PID USER      PR  NI    VIRT     RES    SHR S  %CPU %MEM      TIME+  COMMAND
16039 root      20   0 1397220   77720   3948 S  20.6  1.0  860:01.53  glusterfsd
13174 root      20   0 1395824  112728   3692 S  19.6  1.5  806:07.17  glusterfs
19961 root      20   0 2967204  *2.145g*  3896 S  17.3 29.0  752:10.70  glusterfsd

As you can see one of the brick processes is using over 2 gigabytes of
memory.

One work-around for this is to kill the arbiter brick process and restart
the gluster daemon. This restarts the arbiter brick process and its memory
usage goes back down to a reasonable level. However, I would rather not kill
the arbiter brick every week in production environments.
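
A sketch of that workaround, assuming systemd manages glusterd and using the
volume status output to find the arbiter brick PID (volume name and PID are
placeholders):

# Find the PID of the arbiter brick process for the volume
gluster volume status <volname>
# Kill just that brick process (PID taken from the output above)
kill <arbiter-brick-pid>
# Restarting glusterd respawns the brick and its memory usage resets
systemctl restart glusterd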

Has anyone seen this issue before and is there a known work-around/fix?

Thanks,
Ben
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] CFP for Gluster Developer Summit

2016-08-22 Thread Jonathan Holloway
On Fri, Aug 12 , 2016 at 03:48:49PM -0400, Vijay Bellur wrote: 
.. 
> If you have a talk/discussion proposal that can be part of these themes, 
> please send out your proposal(s) by replying to this thread. Please clearly 
> mention the theme for which your proposal is relevant when you do so. We 
> will be ending the CFP by 12 midnight PDT on August 31st, 2016 . 

I'm putting my name on Niels' request for a presenter on Practical Glusto 
Example here on the main thread to get it in the hat. 

"Practical Glusto example 
- show how to install Glusto and dependencies 
- write a simple new test-case from scratch (copy/paste example?) 
- run the new test-case (in the development environment?)" 

I would classify the theme as "5. Process & infrastructure." 

If there is additional interest in discussion on other specific features of 
Glusto, maybe also separate lightning talk(s) or similar? 

Cheers, 
Jonathan 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] memory leak in glusterd 3.7.x

2016-08-22 Thread Zdenek Styblik
On Sat, Aug 20, 2016 at 11:48 AM, Pranith Kumar Karampuri
 wrote:
> hi Zdenek,
> I recently found this issue and there has been a discussion on
> gluster-devel about how to fix this. It is a bit involved, so it is taking
> more time to fix than I would like; this bug has been there for almost 3-4
> years, I guess.
>
> You can find the discussion here:
> http://www.gluster.org/pipermail/gluster-devel/2016-July/050085.html
>

Hello Pranith,

thank you for the reply and the heads up. I had only read through mails with
the subject "memory leak" and didn't know that you, or somebody else, had
already found this issue.

Best regards,
Z.
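
For reference, an illustrative sketch of the repeated 'gluster volume set'
pattern described in the quoted report below, together with a simple way to
watch glusterd's resident memory grow while it runs (the volume name is a
placeholder):

# Re-apply the same volume options in a loop, as the buggy Puppet module did
while true; do
    gluster volume set <volname> nfs.addr-namelookup false
    gluster volume set <volname> nfs.disable true
    # Print glusterd's resident set size (kB) after each round
    ps -o rss= -C glusterd
    sleep 1
done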

> On Fri, Aug 19, 2016 at 7:10 PM, Zdenek Styblik 
> wrote:
>>
>> Hello,
>>
>> we've found a memory leak in glusterd v3.7.x(currently at v3.7.14, but
>> we are users of v3.7.x from the beginning).
>> It seems, and we've empirically verified, that continuous execution of
>> % gluster volume set <volname> <option> <value> ; leads to memory leaks in
>> glusterd and OOM, although not necessarily OOM of glusterd itself.
>> Settings which were being set over and over again are
>> `nfs.addr-namelookup false` and `nfs.disable true`. There might have
>> been other settings, but I was able to find these in recent logs.
>> Unfortunately, we don't have the capacity to debug this issue
>> further (statedumps are quite overwhelming :] ).
>> Repeated execution was caused by a bug in the Puppet module we're
>> using (and we were able to address this issue). Therefore, it's safe to
>> say that the number of affected users or the likelihood of somebody else
>> having this problem is probably low. It's still a memory leak and,
>> well, rather a serious one if you happen to stumble upon it. Also, it
>> must be noted that this gets amplified if you have more than one volume.
>>
>> If there is anything I can help with, let me know.
>>
>> Please, keep me on CC as I'm not subscribed to the mailing list.
>>
>> Best regards,
>> Zdenek Styblik
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> --
> Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] iobuf/iobref error

2016-08-22 Thread ngsflow
Hi:


I've been experiencing an intermittent issue with GlusterFS in a 30-node cluster
which makes the mounted file system unavailable through the GlusterFS client.


The symptom is:


$ ls /gluster
ls: cannot access /gluster: Transport endpoint is not connected


the client log reports the following error:


[2016-08-09 23:25:36.012877] E [iobuf.c:759:iobuf_unref] (--> 
/usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x371a220580] (--> /usr/li
b64/glusterfs/3.6.7/xlator/performance/quick-read.so(qr_readv_cached+0xb7)[0x7ff57f318ea7]
 (--> /usr/lib64/glusterfs/3.6.7/xlator/performance/
quick-read.so(qr_readv+0x62)[0x7ff57f3194c2] (--> 
/usr/lib64/libglusterfs.so.0(default_readv_resume+0x14d)[0x371a22a75d] (--> 
/usr/lib64/libgl
usterfs.so.0(call_resume+0x3d6)[0x371a2424b6] ) 0-iobuf: invalid argument: 
iobuf
[2016-08-09 23:25:36.013192] E [iobuf.c:865:iobref_unref] (--> 
/usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x371a220580] (--> /usr/l
ib64/glusterfs/3.6.7/xlator/performance/quick-read.so(qr_readv_cached+0xc1)[0x7ff57f318eb1]
 (--> /usr/lib64/glusterfs/3.6.7/xlator/performance
/quick-read.so(qr_readv+0x62)[0x7ff57f3194c2] (--> 
/usr/lib64/libglusterfs.so.0(default_readv_resume+0x14d)[0x371a22a75d] (--> 
/usr/lib64/libg
lusterfs.so.0(call_resume+0x3d6)[0x371a2424b6] ) 0-iobuf: invalid argument: 
iobref


It seems to me that it's an out-of-memory issue.




info: glusterfs is configured as follows


performance.io-thread-count: 4
performance.cache-max-file-size: 0
performance.write-behind-window-size: 64MB
performance.cache-size: 4GB
cluster.consistent-metadata: on


and each node in the cluster is deployed as both a glusterfs client and a server.




Is there any way to ease the above issue by modifying the configuration, such as
increasing cache-size or some other parameters?
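
For reference, cache-related options are changed with 'gluster volume set'; a
sketch with purely illustrative values and a placeholder volume name (note that
the backtraces above point at the quick-read translator, so toggling it can also
help confirm whether it is involved):

# Illustrative only -- size the cache to the memory actually available per node
gluster volume set <volname> performance.cache-size 2GB
gluster volume set <volname> performance.write-behind-window-size 8MB
# Temporarily disable quick-read to see whether the errors go away
gluster volume set <volname> performance.quick-read off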



thx.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] [Error] 0-iobuf: invalid argument: iobuf (or iobref)

2016-08-22 Thread ngsflow
Hi: 


I've been experiencing an intermittent issue with GlusterFS in a 30-node cluster
which makes the mounted file system unavailable through the GlusterFS client.


The symptom is:


$ ls /gluster
ls: cannot access /gluster: Transport endpoint is not connected


the client log reports the following error: 


[2016-08-09 23:25:36.012877] E [iobuf.c:759:iobuf_unref] (--> 
/usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x371a220580] (--> /usr/li
b64/glusterfs/3.6.7/xlator/performance/quick-read.so(qr_readv_cached+0xb7)[0x7ff57f318ea7]
 (--> /usr/lib64/glusterfs/3.6.7/xlator/performance/
quick-read.so(qr_readv+0x62)[0x7ff57f3194c2] (--> 
/usr/lib64/libglusterfs.so.0(default_readv_resume+0x14d)[0x371a22a75d] (--> 
/usr/lib64/libgl
usterfs.so.0(call_resume+0x3d6)[0x371a2424b6] ) 0-iobuf: invalid argument: 
iobuf
[2016-08-09 23:25:36.013192] E [iobuf.c:865:iobref_unref] (--> 
/usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x371a220580] (--> /usr/l
ib64/glusterfs/3.6.7/xlator/performance/quick-read.so(qr_readv_cached+0xc1)[0x7ff57f318eb1]
 (--> /usr/lib64/glusterfs/3.6.7/xlator/performance
/quick-read.so(qr_readv+0x62)[0x7ff57f3194c2] (--> 
/usr/lib64/libglusterfs.so.0(default_readv_resume+0x14d)[0x371a22a75d] (--> 
/usr/lib64/libg
lusterfs.so.0(call_resume+0x3d6)[0x371a2424b6] ) 0-iobuf: invalid argument: 
iobref


It seems to me that it's an out-of-memory issue.




info: glusterfs is configured as follows


performance.io-thread-count: 4
performance.cache-max-file-size: 0
performance.write-behind-window-size: 64MB
performance.cache-size: 4GB
cluster.consistent-metadata: on


and each node in the cluster is deployed as both a glusterfs client and a server.




Is there any way to ease the above issue by modifying the configuration, such as
increasing cache-size or some other parameters?



thx.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] CFP for Gluster Developer Summit

2016-08-22 Thread Shreyas Siravara
Here's my proposal:

Title: GFProxy: Scaling the GlusterFS FUSE Client
Theme: Experience / (Process & Infrastructure)

I plan to cover the following topics:

- Discuss the benefits of the FUSE client vs. NFS & how we use it @ Facebook 
today
- Discuss scalability challenges with the FUSE client, operational overhead, 
etc.
- Introduce GFProxy:
 - Splitting the core parts of the fuse client (DHT + AFR) into a separate 
daemon, which effectively acts as a proxy between FUSE clients and bricks
 - Managing failover in the GFProxy FUSE Client with the AHA (Advanced High 
Availability) xlator
 - Current deployment & code status (hope to get this upstream soon!)


> On Aug 12, 2016, at 12:48 PM, Vijay Bellur  wrote:
> 
> Hey All,
> 
> Gluster Developer Summit 2016 is fast approaching [1] on us. We are looking 
> to have talks and discussions related to the following themes in the summit:
> 
> 1. Gluster.Next - focusing on features shaping the future of Gluster
> 
> 2. Experience - Description of real world experience and feedback from:
>   a> Devops and Users deploying Gluster in production
>   b> Developers integrating Gluster with other ecosystems
> 
> 3. Use cases  - focusing on key use cases that drive Gluster.today and 
> Gluster.Next
> 
> 4. Stability & Performance - focusing on current improvements to reduce our 
> technical debt backlog
> 
> 5. Process & infrastructure  - focusing on improving current workflow, 
> infrastructure to make life easier for all of us!
> 
> If you have a talk/discussion proposal that can be part of these themes, 
> please send out your proposal(s) by replying to this thread. Please clearly 
> mention the theme for which your proposal is relevant when you do so. We will 
> be ending the CFP by 12 midnight PDT on August 31st, 2016.
> 
> If you have other topics that do not fit in the themes listed, please feel 
> free to propose and we might be able to accommodate some of them as 
> lightening talks or something similar.
> 
> Please do reach out to me or Amye if you have any questions.
> 
> Thanks!
> Vijay
> 
> [1] 
> https://www.gluster.org/events/summit2016/
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] CFP for Gluster Developer Summit

2016-08-22 Thread Krutika Dhananjay
Here's one from me:

Sharding in GlusterFS - Past, Present and Future

I intend to cover the following in this talk:

* What sharding is, what are its benefits over striping and in general..
* Current design
* Use cases - VM image store/HC/ROBO
* Challenges - atomicity, synchronization across multiple clients,
performance etc
* Future directions - sharding for general purpose use-cases [WIP]
  (optionally inter-op with other features like file snapshots etc.,
   if I find time to think of some solution by Oct).

-Krutika


On Sat, Aug 13, 2016 at 1:18 AM, Vijay Bellur  wrote:

> Hey All,
>
> Gluster Developer Summit 2016 is fast approaching [1] on us. We are
> looking to have talks and discussions related to the following themes in
> the summit:
>
> 1. Gluster.Next - focusing on features shaping the future of Gluster
>
> 2. Experience - Description of real world experience and feedback from:
>a> Devops and Users deploying Gluster in production
>b> Developers integrating Gluster with other ecosystems
>
> 3. Use cases  - focusing on key use cases that drive Gluster.today and
> Gluster.Next
>
> 4. Stability & Performance - focusing on current improvements to reduce
> our technical debt backlog
>
> 5. Process & infrastructure  - focusing on improving current workflow,
> infrastructure to make life easier for all of us!
>
> If you have a talk/discussion proposal that can be part of these themes,
> please send out your proposal(s) by replying to this thread. Please clearly
> mention the theme for which your proposal is relevant when you do so. We
> will be ending the CFP by 12 midnight PDT on August 31st, 2016.
>
> If you have other topics that do not fit in the themes listed, please feel
> free to propose and we might be able to accommodate some of them as
> lightening talks or something similar.
>
> Please do reach out to me or Amye if you have any questions.
>
> Thanks!
> Vijay
>
> [1] https://www.gluster.org/events/summit2016/
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] memory leak in glusterd 3.7.x

2016-08-22 Thread Pranith Kumar Karampuri
On Mon, Aug 22, 2016 at 9:03 PM, Zdenek Styblik 
wrote:

> On Sat, Aug 20, 2016 at 11:48 AM, Pranith Kumar Karampuri
>  wrote:
> > hi Zdenek,
> > I recently found this issue and there has been a discussion on
> > gluster-devel about how to fix this. It is a bit involved, so it is taking
> > more time to fix than I would like; this bug has been there for almost 3-4
> > years, I guess.
> >
> > You can find the discussion here:
> > http://www.gluster.org/pipermail/gluster-devel/2016-July/050085.html
> >
>
> Hello Pranith,
>
> thank you for the reply and the heads up. I had only read through mails with
> the subject "memory leak" and didn't know that you, or somebody else, had
> already found this issue.
>

Hey,
   Please err on the side of posting these kinds of mails in future as well
:-). In the worst case it will be something we already know, but in the best
case it will be something new :-).


>
> Best regards,
> Z.
>
> > On Fri, Aug 19, 2016 at 7:10 PM, Zdenek Styblik <
> zdenek.styb...@showmax.com>
> > wrote:
> >>
> >> Hello,
> >>
> >> we've found a memory leak in glusterd v3.7.x(currently at v3.7.14, but
> >> we are users of v3.7.x from the beginning).
> >> It seems, and we've empirically verified, that continuous execution of
> >> % gluster volume set <volname> <option> <value> ; leads to memory leaks in
> >> glusterd and OOM, although not necessarily OOM of glusterd itself.
> >> Settings which were being set over and over again are
> >> `nfs.addr-namelookup false` and `nfs.disable true`. There might have
> >> been other settings, but I was able to find these in recent logs.
> >> Unfortunately, we don't have the capacity to debug this issue
> >> further (statedumps are quite overwhelming :] ).
> >> Repeated execution was caused by a bug in the Puppet module we're
> >> using (and we were able to address this issue). Therefore, it's safe to
> >> say that the number of affected users or the likelihood of somebody else
> >> having this problem is probably low. It's still a memory leak and,
> >> well, rather a serious one if you happen to stumble upon it. Also, it
> >> must be noted that this gets amplified if you have more than one volume.
> >>
> >> If there is anything I can help with, let me know.
> >>
> >> Please, keep me on CC as I'm not subscribed to the mailing list.
> >>
> >> Best regards,
> >> Zdenek Styblik
> >> ___
> >> Gluster-users mailing list
> >> Gluster-users@gluster.org
> >> http://www.gluster.org/mailman/listinfo/gluster-users
> >
> >
> >
> >
> > --
> > Pranith
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] CFP for Gluster Developer Summit

2016-08-22 Thread Pranith Kumar Karampuri
On Mon, Aug 22, 2016 at 5:15 PM, Jeff Darcy  wrote:

> Two proposals, both pretty developer-focused.
>
> (1) Gluster: The Ugly Parts
> Like any code base of its size and age, Gluster has accumulated its share of
> dead, redundant, or simply inelegant code.  This code makes us more
> vulnerable to bugs, and slows our entire development process for any
> feature.  In this interactive discussion, we'll identify translators or
> other modules that can be removed or significantly streamlined, and develop
> a plan for doing so within the next year or so.  Bring your favorite gripes
> and pet peeves (about the code).
>
> (2) Gluster Debugging
> Every developer has their own "bag of tricks" for debugging Gluster code -
> things to look for in logs, options to turn on, obscure test-script
> features, gdb macros, and so on.  In this session we'll share many of these
> tricks, and hopefully collect more, along with a plan to document them so
> that newcomers can get up to speed more quickly.
>
>
> I could extend #2 to cover more user/support level problem diagnosis, but
> I think I'd need a co-presenter for that because it's not an area in which
> I feel like an expert myself.
>

I can help here. We can chat offline about what exactly you had in mind and
take it from there.


> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Directory c+mtime changed after add-brick and fix-layout

2016-08-22 Thread Hans Henrik Happe

Hi,

We see a lot of directories that got their c+mtime updated after we 
added bricks and ran a rebalance fix-layout.


On the new bricks the dirs have the current time from when fix-layout
ran. On the old ones they seem to be as they are supposed to. On clients, some
dirs are showing the new time.


Is this a known issue, and is there anything we can do about it?
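
For anyone trying to reproduce or narrow this down, a rough way to compare what
the client and the individual bricks report for one directory (hostnames and
paths are placeholders for this setup):

# On a client: what the volume reports for the directory
stat /mnt/glustervol/some/dir
# On each server: what the bricks themselves report
for h in server1 server2 server3; do
    ssh "$h" stat /bricks/brick1/some/dir
done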

Version: 3.7.13

Cheers,
Hans Henrik Happe


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] One client can effectively hang entire gluster array

2016-08-22 Thread Glomski, Patrick
Not a bad idea for a workaround, but that would require significant
investment with our current setup. All of our compute nodes are stateless /
have no disks. All storage is network storage. It's probably still not
feasible if we added disks because some simulations produce terabytes of
data. We would need some kind of periodic check-and-sync mechanism.

I still owe the gluster devs a test of that patch.

On Fri, Aug 19, 2016 at 3:22 PM, Steve Dainard  wrote:

> As a potential solution on the compute node side, can you have users copy
> relevant data from the gluster volume to a local disk (ie $TMDIR), operate
> on that disk, write output files to that disk, and then write the results
> back to persistent storage once the job is complete?
>
> There are lots of factors to consider, but this is how we operate in a
> small compute environment trying to avoid over-loading gluster storage
> nodes.
>
> On Fri, Jul 8, 2016 at 6:29 AM, Glomski, Patrick <
> patrick.glom...@corvidtec.com> wrote:
>
>> Hello, users and devs.
>>
>> TL;DR: One gluster client can essentially cause denial of service /
>> availability loss to the entire gluster array. There's no way to stop it and
>> almost no way to find the bad client. Probably all (at least 3.6 and 3.7)
>> versions are affected.
>>
>> We have two large replicate gluster arrays (3.6.6 and 3.7.11) that are
>> used in a high-performance computing environment. Two file access cases
>> cause severe issues with glusterfs: Some of our scientific codes write
>> hundreds of files (~400-500) simultaneously (one file or more per processor
>> core, so lots of small or large writes) and others read thousands of files
>> (2000-3000) simultaneously to grab metadata from each file (lots of small
>> reads).
>>
>> In either of these situations, one glusterfsd process on whatever peer
>> the client is currently talking to will skyrocket to *nproc* cpu usage
>> (800%, 1600%) and the storage cluster is essentially useless; all other
>> clients will eventually try to read or write data to the overloaded peer
>> and, when that happens, their connection will hang. Heals between peers
>> hang because the load on the peer is around 1.5x the number of cores or
>> more. This occurs in either gluster 3.6 or 3.7, is very repeatable, and
>> happens much too frequently.
>>
>> Even worse, there seems to be no definitive way to diagnose which client
>> is causing the issues. Getting 'gluster volume status <volname> clients'
>> doesn't help because it reports the total number of bytes read/written by
>> each client.
>> (a) The metadata in question is tiny compared to the multi-gigabyte output
>> files being dealt with and (b) the byte-count is cumulative for the clients
>> and the compute nodes are always up with the filesystems mounted, so the
>> byte transfer counts are astronomical. The best solution I've come up with
>> is to blackhole-route traffic from clients one at a time (effectively push
>> the traffic over to the other peer), wait a few minutes for all of the
>> backlogged traffic to dissipate (if it's going to), see if the load on
>> glusterfsd drops, and repeat until I find the client causing the issue. I
>> would *love* any ideas on a better way to find rogue clients.
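
An illustrative sketch of the blackhole-route-and-watch loop described above
(the client IP, wait time, and process name filter are placeholders; this uses
the standard iproute2 blackhole route on the overloaded peer):

# Temporarily blackhole one suspect client on the overloaded peer
ip route add blackhole 10.1.2.3/32
# Give backlogged traffic a few minutes to drain, then check glusterfsd load
sleep 300
top -b -n 1 | grep glusterfsd
# Restore normal routing for that client once checked
ip route del blackhole 10.1.2.3/32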
>>
>> More importantly, though, there must be some feature enforced to stop one
>> user from having the capability to render the entire filesystem unavailable
>> for all other users. In the worst case, I would even prefer a gluster
>> volume option that simply disconnects clients making over some threshold of
>> file open requests. That's WAY more preferable than a complete availability
>> loss reminiscent of a DDoS attack...
>>
>> Apologies for the essay and looking forward to any help you can provide.
>>
>> Thanks,
>> Patrick
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] CFP for Gluster Developer Summit

2016-08-22 Thread Jeff Darcy
Two proposals, both pretty developer-focused.

(1) Gluster: The Ugly Parts
Like any code base of its size and age, Gluster has accumulated its share of dead,
redundant, or simply inelegant code.  This code makes us more vulnerable to 
bugs, and slows our entire development process for any feature.  In this 
interactive discussion, we'll identify translators or other modules that can be 
removed or significantly streamlined, and develop a plan for doing so within 
the next year or so.  Bring your favorite gripes and pet peeves (about the 
code).

(2) Gluster Debugging
Every developer has their own "bag of tricks" for debugging Gluster code - 
things to look for in logs, options to turn on, obscure test-script features, 
gdb macros, and so on.  In this session we'll share many of these tricks, and 
hopefully collect more, along with a plan to document them so that newcomers 
can get up to speed more quickly.


I could extend #2 to cover more user/support level problem diagnosis, but I 
think I'd need a co-presenter for that because it's not an area in which I feel 
like an expert myself.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] CFP for Gluster Developer Summit

2016-08-22 Thread Niels de Vos
On Fri, Aug 12, 2016 at 03:48:49PM -0400, Vijay Bellur wrote:
..
> If you have a talk/discussion proposal that can be part of these themes,
> please send out your proposal(s) by replying to this thread. Please clearly
> mention the theme for which your proposal is relevant when you do so. We
> will be ending the CFP by 12 midnight PDT on August 31st, 2016.


I'd like to propose a discussion, mainly to get opinions and ideas of
others about a design I'm thinking of applying for certain features.

Thanks,
Niels


Title:
  Client initiated server-side processing - A.k.a. FOP-Bouncing

Summary:
  Certain operations a client wants to do can be offloaded to the
storage servers. One of these is server-side copy, where a client wants
to copy a file from one Gluster volume to another. It is inefficient to
read (transfer) the file to the client and then write it to the server
again. It would be nice for a client to send the copy operation to one
of the storage servers and have the storage server take care of the
actual reading+writing of the data.

A possible way to accomplish this, is by using the upcall framework, and
have a special (per server) process listening for specific upcall
events. If a client sends the server-side-copy, the brick receiving it
can create an upcall event that is handled only by a server-side
process to offload the copy.

I might call this design FOP-Bouncing, unless someone comes up with a
more suitable name.


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Unable to start volume : libgfdb.so missing

2016-08-22 Thread Niels de Vos
On Mon, Aug 22, 2016 at 03:33:18PM +0800, jayakrishnan mm wrote:
> Glusterfs 3.7.6
> Host: x86_64-linux (both client & Server)
> 
> Volume : Disperse
> 
> Creating the volume succeeds, but I am unable to start the volume.
> 
> 
> The brick log says libgfdb.so.0 can't be opened. How can I install it?
> There is no mention of such a library in the build requirements
> (
> https://gluster.readthedocs.io/en/latest/Developer-guide/Building-GlusterFS/
> )

This library is part of Gluster. It should have gotten installed when
you did a 'make install'. Tiering (with changetimerecorder as one
component) uses libgfdb. My RPMs built from the master branch have this
dependency:

# ldd /usr/lib64/glusterfs/3.9dev/xlator/features/changetimerecorder.so
...
libgfdb.so.0 => /lib64/libgfdb.so.0 (0x7f60a90ba000)
...
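
If the library did not make it into the runtime linker path after 'make
install', something like the following may help confirm and fix it (paths assume
the default /usr/local prefix visible in the logs below):

# Check whether libgfdb was installed at all
ls -l /usr/local/lib/libgfdb*
# Check what the changetimerecorder xlator resolves it to
ldd /usr/local/lib/glusterfs/3.7.6/xlator/features/changetimerecorder.so | grep libgfdb
# Make sure /usr/local/lib is in the linker cache, then retry starting the volume
sudo ldconfig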

Could you explain why you are building an old version like 3.7.6 from
the sources, even though we have (regularly updated) packages for many
different distributions available?

HTH,
Niels


> 
> I could start it by force, but gluster vol status shows:
> 
> 
> Status of volume: dsi4-vol
> Gluster process                                      TCP Port  RDMA Port  Online  Pid
> --------------------------------------------------------------------------------------
> Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick1  N/A       N/A        N       N/A
> Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick2  N/A       N/A        N       N/A
> Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick3  N/A       N/A        N       N/A
> Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick4  N/A       N/A        N       N/A
> Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick5  N/A       N/A        N       N/A
> Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick6  N/A       N/A        N       N/A
> Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick7  N/A       N/A        N       N/A
> Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick8  N/A       N/A        N       N/A
> NFS Server on localhost                              N/A       N/A        N       N/A
> 
> Task Status of Volume dsi4-vol
> --
> There are no active volume tasks
> 
> 
> 
> Pls help.
> 
> best regards
> JK
> 
> 
> 
> *usr-local-etc-glusterfs-glusterd.vol.log *
> **
> 
> [2016-08-20 10:31:34.712494] I [MSGID: 100030] [glusterfsd.c:2319:main]
> 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
> version 3.7.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
> [2016-08-20 10:31:34.718198] I [MSGID: 106478] [glusterd.c:1350:init]
> 0-management: Maximum allowed open file descriptors set to 65536
> [2016-08-20 10:31:34.718246] I [MSGID: 106479] [glusterd.c:1399:init]
> 0-management: Using /var/lib/glusterd as working directory
> [2016-08-20 10:31:34.723445] W [MSGID: 103071]
> [rdma.c:4592:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
> channel creation failed [No such device]
> [2016-08-20 10:31:34.723479] W [MSGID: 103055] [rdma.c:4899:init]
> 0-rdma.management: Failed to initialize IB Device
> [2016-08-20 10:31:34.723490] W [rpc-transport.c:359:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2016-08-20 10:31:34.723558] W [rpcsvc.c:1597:rpcsvc_transport_create]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2016-08-20 10:31:34.723573] E [MSGID: 106243] [glusterd.c:1623:init]
> 0-management: creation of 1 listeners failed, continuing with succeeded
> transport
> [2016-08-20 10:31:37.028385] I [MSGID: 106513]
> [glusterd-store.c:2071:glusterd_restore_op_version] 0-glusterd: retrieved
> op-version: 30706
> [2016-08-20 10:31:37.028524] I [MSGID: 106194]
> [glusterd-store.c:3543:glusterd_store_retrieve_missed_snaps_list]
> 0-management: No missed snaps list.
> Final graph:
> +--+
>   1: volume management
>   2: type mgmt/glusterd
>   3: option rpc-auth.auth-glusterfs on
>   4: option rpc-auth.auth-unix on
>   5: option rpc-auth.auth-null on
>   6: option rpc-auth-allow-insecure on
>   7: option transport.socket.listen-backlog 128
>   8: option ping-timeout 30
>   9: option transport.socket.read-fail-log off
>  10: option transport.socket.keepalive-interval 2
>  11: option transport.socket.keepalive-time 10
>  12: option transport-type rdma
> [2016-08-20 10:31:37.030150] I [MSGID: 101190]
> [event-epoll.c:633:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 2
>  13: option working-directory /var/lib/glusterd
>  14: end-volume
>  15:
> 

[Gluster-users] Unable to start volume : libgfdb.so missing

2016-08-22 Thread jayakrishnan mm
Glusterfs 3.7.6
Host: x86_64-linux (both client & Server)

Volume : Disperse

Creating the volume succeeds, but I am unable to start the volume.


The brick log says libgfdb.so.0 can't be opened. How can I install it?
There is no mention of such a library in the build requirements
(
https://gluster.readthedocs.io/en/latest/Developer-guide/Building-GlusterFS/
)

I could start it by force, but gluster vol status shows:


Status of volume: dsi4-vol
Gluster process                                      TCP Port  RDMA Port  Online  Pid
--------------------------------------------------------------------------------------
Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick1  N/A       N/A        N       N/A
Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick2  N/A       N/A        N       N/A
Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick3  N/A       N/A        N       N/A
Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick4  N/A       N/A        N       N/A
Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick5  N/A       N/A        N       N/A
Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick6  N/A       N/A        N       N/A
Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick7  N/A       N/A        N       N/A
Brick 192.168.36.200:/home/jaya/gluster/dsi4-brick8  N/A       N/A        N       N/A
NFS Server on localhost                              N/A       N/A        N       N/A

Task Status of Volume dsi4-vol
--
There are no active volume tasks



Pls help.

best regards
JK



*usr-local-etc-glusterfs-glusterd.vol.log *
**

[2016-08-20 10:31:34.712494] I [MSGID: 100030] [glusterfsd.c:2319:main]
0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
version 3.7.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
[2016-08-20 10:31:34.718198] I [MSGID: 106478] [glusterd.c:1350:init]
0-management: Maximum allowed open file descriptors set to 65536
[2016-08-20 10:31:34.718246] I [MSGID: 106479] [glusterd.c:1399:init]
0-management: Using /var/lib/glusterd as working directory
[2016-08-20 10:31:34.723445] W [MSGID: 103071]
[rdma.c:4592:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
channel creation failed [No such device]
[2016-08-20 10:31:34.723479] W [MSGID: 103055] [rdma.c:4899:init]
0-rdma.management: Failed to initialize IB Device
[2016-08-20 10:31:34.723490] W [rpc-transport.c:359:rpc_transport_load]
0-rpc-transport: 'rdma' initialization failed
[2016-08-20 10:31:34.723558] W [rpcsvc.c:1597:rpcsvc_transport_create]
0-rpc-service: cannot create listener, initing the transport failed
[2016-08-20 10:31:34.723573] E [MSGID: 106243] [glusterd.c:1623:init]
0-management: creation of 1 listeners failed, continuing with succeeded
transport
[2016-08-20 10:31:37.028385] I [MSGID: 106513]
[glusterd-store.c:2071:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 30706
[2016-08-20 10:31:37.028524] I [MSGID: 106194]
[glusterd-store.c:3543:glusterd_store_retrieve_missed_snaps_list]
0-management: No missed snaps list.
Final graph:
+--+
  1: volume management
  2: type mgmt/glusterd
  3: option rpc-auth.auth-glusterfs on
  4: option rpc-auth.auth-unix on
  5: option rpc-auth.auth-null on
  6: option rpc-auth-allow-insecure on
  7: option transport.socket.listen-backlog 128
  8: option ping-timeout 30
  9: option transport.socket.read-fail-log off
 10: option transport.socket.keepalive-interval 2
 11: option transport.socket.keepalive-time 10
 12: option transport-type rdma
[2016-08-20 10:31:37.030150] I [MSGID: 101190]
[event-epoll.c:633:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 2
 13: option working-directory /var/lib/glusterd
 14: end-volume
 15:
+--+
[2016-08-20 10:31:37.031436] I [MSGID: 101190]
[event-epoll.c:633:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 1
[2016-08-20 10:31:37.031508] I [MSGID: 101190]
[event-epoll.c:633:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 2


[2016-08-20 10:31:37.139926] I [MSGID: 106544]
[glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID:
543ab2b3-de06-4655-b228-a1f085643613

2016-08-20 10:31:37.439380] W [common-utils.c:1685:gf_string2boolean]
(-->/usr/local/lib/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_op_commit_perform+0x7ab)
[0x7f4fc382ad4b]
-->/usr/local/lib/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_op_start_volume+0x2c0)
[0x7f4fc38a9be0]
-->/usr/local/lib/libglusterfs.so.0(gf_string2boolean+0x15a)
[0x7f4fc85fa0ca] ) 0-management: argument invalid [Invalid