Re: [Gluster-users] ec heal questions

2016-08-10 Thread Pranith Kumar Karampuri
I don't think these will help. We need to trigger parallel heals; I gave
the command in a reply to one of your earlier threads. Sorry again for the
delay :-(.

On Tue, Aug 9, 2016 at 3:53 PM, Serkan Çoban  wrote:

> Does increasing any of the values below help ec heal speed?
>
> performance.io-thread-count 16
> performance.high-prio-threads 16
> performance.normal-prio-threads 16
> performance.low-prio-threads 16
> performance.least-prio-threads 1
> client.event-threads 8
> server.event-threads 8
>
>
> On Mon, Aug 8, 2016 at 2:48 PM, Ashish Pandey  wrote:
> > Serkan,
> >
> > Heal of 2 different files can happen in parallel, but not heal of different
> > chunks of a single file.
> > I think you are referring to your previous mail in which you had to remove
> > one complete disk.
> >
> > In this case heal starts automatically, but it scans through each and every
> > file/dir to decide whether it needs heal or not. No doubt it is a more
> > time-consuming process compared to index heal.
> > If the data is 900GB then it might take a lot of time.
> >
> > Which configuration to choose depends a lot on your storage requirements,
> > hardware capability and the probability of disk and network failure.
> >
> > For example: a small configuration like 4+2 could help you in this
> > scenario. You can have a distributed disperse volume with a 4+2 config.
> > In this case each subvolume holds comparatively less data. If a brick fails
> > in that subvolume, it will have to heal only that much data, and that too
> > by reading from only 4 bricks.
> >
> > dist-disp-vol
> >
> >   subvol-1      subvol-2      subvol-3
> >     4+2           4+2           4+2
> >     4GB           4GB           4GB
> >      ^
> > If a brick in subvol-1 fails, the heal is local to that subvolume only and
> > will require only 4GB of data to be healed, which means reading from only
> > 4 disks.
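> > As a rough illustration (a sketch only; server names and brick paths are
> > placeholders, and bash expands the braces), a distributed disperse 4+2
> > volume of this shape could be created with:
> >
> >     gluster volume create dist-disp-vol disperse 6 redundancy 2 \
> >         server{1..6}:/bricks/b1 server{1..6}:/bricks/b2
> >
> > Every consecutive group of 6 bricks becomes one 4+2 subvolume, so adding
> > bricks in further multiples of 6 extends the distribution.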
> >
> > I am keeping Pranith in CC to take his input too.
> >
> > Ashish
> >
> >
> > 
> > From: "Serkan Çoban" 
> > To: "Ashish Pandey" 
> > Cc: "Gluster Users" 
> > Sent: Monday, August 8, 2016 4:47:02 PM
> > Subject: Re: [Gluster-users] ec heal questions
> >
> >
> > Is reading the good copies to construct the bad chunk a parallel or
> > sequential operation?
> > Should I revert my 16+4 ec cluster to 8+2, because it takes nearly 7
> > days to heal just one broken 8TB disk which has only 800GB of data?
> >
> > On Mon, Aug 8, 2016 at 1:56 PM, Ashish Pandey 
> wrote:
> >>
> >> Hi,
> >>
> >> Considering all other factors the same for both configurations, yes, the
> >> small configuration would take less time: there are fewer good copies to
> >> read.
> >>
> >> I think multi-threaded shd is the only enhancement in the near future.
> >>
> >> Ashish
> >>
> >> 
> >> From: "Serkan Çoban" 
> >> To: "Gluster Users" 
> >> Sent: Monday, August 8, 2016 4:02:22 PM
> >> Subject: [Gluster-users] ec heal questions
> >>
> >>
> >> Hi,
> >>
> >> Assume we have 8+2 and 16+4 ec configurations and we just replaced a
> >> broken disk in each configuration, each holding 100GB of data. In which
> >> case does heal complete faster? Does heal speed have anything to do with
> >> the ec configuration?
> >>
> >> Assume we are in the 16+4 ec configuration. When heal starts, it reads 16
> >> chunks from the other bricks, recomputes the missing chunk, and writes it
> >> to the just-replaced disk. Am I correct?
> >>
> >> If the above assumption is true, then small ec configurations heal faster,
> >> right?
> >>
> >> Are there any improvements in 3.7.14+ that make ec heal faster (other
> >> than multi-threaded shd for ec)?
> >>
> >> Thanks,
> >> Serkan
> >> ___
> >> Gluster-users mailing list
> >> Gluster-users@gluster.org
> >> http://www.gluster.org/mailman/listinfo/gluster-users
> >>
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-users
> >
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster Infiniband/RDMA Help

2016-08-10 Thread Pranith Kumar Karampuri
Added Rafi, Raghavendra who work on RDMA

On Mon, Aug 8, 2016 at 7:58 AM, Dan Lavu  wrote:

> Hello,
>
> I'm having some major problems with Gluster and oVirt, I've been ripping
> my hair out with this, so if anybody can provide insight, that will be
> fantastic. I've tried both transports TCP and RDMA... both are having
> instability problems.
>
> So the first thing I'm running into: intermittently, one specific node
> will get spammed with the following message;
>
> "[2016-08-08 00:42:50.837992] E [rpc-clnt.c:357:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fb728b0f293] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7fb7288d73d1] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb7288d74ee] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7fb7288d8d0e]
> (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7fb7288d9528] )
> 0-vmdata1-client-0: forced unwinding frame type(GlusterFS 3.3)
> op(WRITE(13)) called at 2016-08-08 00:42:43.620710 (xid=0x6800b)"
>
> Then the infiniband device will get bounced and VMs will get stuck.
>
> Another problem I'm seeing: once a day, or every two days, an oVirt node
> will hang on the gluster mounts. Issuing a df to check the mounts will just
> stall; this occurs hourly if RDMA is used. I can log into the hypervisor and
> remount the gluster volumes most of the time.
>
> This is on Fedora 23 with Gluster 3.8.1-1. The Infiniband gear is 40Gb/s QDR
> QLogic, using the ib_qib module; this configuration was working with our
> old InfiniHost III. I couldn't get OFED to compile, so all the Infiniband
> modules are the ones Fedora installed.
>
> So a volume looks like the following (please say if there is anything I need
> to adjust; the settings were pulled from several examples):
>
> Volume Name: vmdata_ha
> Type: Replicate
> Volume ID: 325a5fda-a491-4c40-8502-f89776a3c642
> Status: Started
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp,rdma
> Bricks:
> Brick1: deadpool.ib.runlevelone.lan:/gluster/vmdata_ha
> Brick2: spidey.ib.runlevelone.lan:/gluster/vmdata_ha
> Brick3: groot.ib.runlevelone.lan:/gluster/vmdata_ha (arbiter)
> Options Reconfigured:
> performance.least-prio-threads: 4
> performance.low-prio-threads: 16
> performance.normal-prio-threads: 24
> performance.high-prio-threads: 24
> cluster.self-heal-window-size: 32
> cluster.self-heal-daemon: on
> performance.md-cache-timeout: 1
> performance.cache-max-file-size: 2MB
> performance.io-thread-count: 32
> network.ping-timeout: 5
> performance.write-behind-window-size: 4MB
> performance.cache-size: 256MB
> performance.cache-refresh-timeout: 10
> server.allow-insecure: on
> network.remote-dio: enable
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> storage.owner-gid: 36
> storage.owner-uid: 36
> performance.readdir-ahead: on
> nfs.disable: on
> config.transport: tcp,rdma
> performance.stat-prefetch: off
> cluster.eager-lock: enable
>
> Volume Name: vmdata1
> Type: Distribute
> Volume ID: 3afefcb3-887c-4315-b9dc-f4e890f786eb
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp,rdma
> Bricks:
> Brick1: spidey.ib.runlevelone.lan:/gluster/vmdata1
> Brick2: deadpool.ib.runlevelone.lan:/gluster/vmdata1
> Options Reconfigured:
> config.transport: tcp,rdma
> network.remote-dio: enable
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> nfs.disable: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> performance.readdir-ahead: on
> server.allow-insecure: on
> performance.stat-prefetch: off
> performance.cache-refresh-timeout: 10
> performance.cache-size: 256MB
> performance.write-behind-window-size: 4MB
> network.ping-timeout: 5
> performance.io-thread-count: 32
> performance.cache-max-file-size: 2MB
> performance.md-cache-timeout: 1
> performance.high-prio-threads: 24
> performance.normal-prio-threads: 24
> performance.low-prio-threads: 16
> performance.least-prio-threads: 4
>
>
> /etc/glusterfs/glusterd.vol
> volume management
> type mgmt/glusterd
> option working-directory /var/lib/glusterd
> option transport-type socket,tcp
> option transport.socket.keepalive-time 10
> option transport.socket.keepalive-interval 2
> option transport.socket.read-fail-log off
> option ping-timeout 0
> option event-threads 1
> #option rpc-auth-allow-insecure on
> option transport.socket.bind-address 0.0.0.0
> #   option transport.address-family inet6
> #   option base-port 49152
> end-volume
>
> I think that's a good start, thank you so much for taking the time to look
> at this. You can find me on freenode, nick side_control if you want to
> chat, I'm GMT -5.
>
> Cheers,
>
> Dan
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Fwd: disperse heal speed up

2016-08-10 Thread Pranith Kumar Karampuri
On Fri, Aug 5, 2016 at 8:37 PM, Serkan Çoban  wrote:

> Hi again,
>
> I am seeing the above situation in a production environment now.
> One disk on one of my servers broke. I killed the brick process,
> replaced the disk, mounted it, and then did a gluster v start force.
>
> For a 24-hour period after replacing the disk, the gluster v heal info
> count below kept increasing, up to about 200,000:
>
> gluster v heal v0 info | grep "Number of entries" | grep -v "Number of
> entries: 0"
> Number of entries: 205117
> Number of entries: 205231
> ...
> ...
> ...
>
> Over about 72 hours it decreased to 40K, and it is going very slowly right
> now.
> What I am observing is very, very slow heal speed. There are no errors
> in the brick logs.
> There was 900GB of data on the broken disk, and I see 200GB healed
> 96 hours after replacing the disk.
> There are the warnings below in glustershd.log, but I think they are harmless.
>
> W [ec_combine.c:866:ec_combine_check] 0-v0-disperse-56: Mismatching
> xdata in answers of LOOKUP
> W [ec_common.c:116:ec_check_status] 0-v0-disperse-56: Operation failed
> on some subvolumes (up=F, mask=F, remaining=0, good=7,
> bad=8)
> W [ec_common.c:71:ec_heal_report] 0-v0-disperse-56: Heal failed
> [invalid argument]
>
> I tried turning on performance.client-io-threads but it did not
> change anything.
> For 900GB of data it will take nearly 8 days to heal. What can I do?
>

Sorry for the delay in response; do you still have this problem?
You can trigger heals using the following command:

find  -d -exec getfattr -h -n trusted.ec.heal {} \;

If you have 10 top-level directories, maybe you can spawn 10 such processes
(a sketch follows below).
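As a minimal sketch (assuming the volume is mounted at /mnt/v0; adjust the
path to your actual mount point), one background crawl per top-level
directory could look like:

    # spawn one heal-triggering crawl per top-level directory
    for d in /mnt/v0/*/ ; do
        find "$d" -d -exec getfattr -h -n trusted.ec.heal {} \; > /dev/null &
    done
    wait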



>
> Serkan
>
>
>
> On Fri, Apr 15, 2016 at 1:28 PM, Serkan Çoban 
> wrote:
> > The 100TB is newly created files written while the brick was down. I
> > rethought the situation and realized that I reformatted all the bricks in
> > case 1, so the write speed limit is 26 x 100MB/s (one per disk).
> > In case 2 I just reformatted one brick, so the write speed is limited to
> > 100MB/s for that disk... I will repeat the tests using one brick in both
> > cases, once with a reformat and once with just killing the brick process...
> > Thanks for the reply.
> >
> > On Fri, Apr 15, 2016 at 9:27 AM, Xavier Hernandez 
> wrote:
> >> Hi Serkan,
> >>
> >> sorry for the delay, I'm a bit busy lately.
> >>
> >> On 13/04/16 13:59, Serkan Çoban wrote:
> >>>
> >>> Hi Xavier,
> >>>
> >>> Can you help me with the below issue? How can I increase the disperse
> >>> heal speed?
> >>
> >>
> >> It seems weird. Is there any related message in the logs ?
> >>
> >> In this particular test, are the 100TB modified files or newly created
> >> files while the brick was down?
> >>
> >> How many files have been modified ?
> >>
> >>> Also I would be grateful if you have detailed documentation about
> disperse
> >>> heal,
> >>> why heal happens on disperse volume, how it is triggered? Which nodes
> >>> participate in heal process? Any client interaction?
> >>
> >>
> >> The heal process is basically the same as the one used for replicate.
> >> There are two ways to trigger a self-heal:
> >>
> >> * when an inconsistency is detected, the client initiates a background
> >> self-heal of the inode
> >>
> >> * the self-heal daemon scans the lists of modified files created by the
> >> index xlator when a modification is made while some node is down. All
> >> these files are self-healed (see the note below).
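> >> As a side note (a rough check, with /bricks/brick1 as a placeholder for
> >> the actual brick path): the index xlator keeps these pending entries
> >> under each brick's .glusterfs/indices/xattrop directory, so a quick way
> >> to see how much work the self-heal daemon still has queued on a brick is
> >> something like:
> >>
> >>     ls /bricks/brick1/.glusterfs/indices/xattrop | wc -l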
> >>
> >> Xavi
> >>
> >>
> >>>
> >>> Serkan
> >>>
> >>>
> >>> -- Forwarded message --
> >>> From: Serkan Çoban 
> >>> Date: Fri, Apr 8, 2016 at 5:46 PM
> >>> Subject: disperse heal speed up
> >>> To: Gluster Users 
> >>>
> >>>
> >>> Hi,
> >>>
> >>> I am testing heal speed of disperse volume and what I see is 5-10MB/s
> per
> >>> node.
> >>> I increased disperse.background-heals to 32 and
> >>> disperse.heal-wait-qlength to 256, but still no difference.
> >>> One thing I noticed is that, when I kill a brick process, reformat it,
> >>> and restart it, heal speed is nearly 20x (200MB/s per node).
> >>>
> >>> But when I kill the brick, then write 100TB of data, and start the brick
> >>> afterwards, heal is slow (5-10MB/s per node).
> >>>
> >>> What is the difference between the two scenarios? Why is one heal slow
> >>> and the other fast? How can I increase disperse heal speed? Should I
> >>> increase the thread count to 128 or 256? I am on a 78x(16+4) disperse
> >>> volume and my servers are pretty strong (2x14 cores with 512GB RAM; each
> >>> node has 26x8TB disks).
> >>>
> >>> Gluster version is 3.7.10.
> >>>
> >>> Thanks,
> >>> Serkan
> >>>
> >>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Weekly Community meeting - 10/Aug/2016

2016-08-10 Thread Kaushal M
Hi All,

The meeting minutes and logs for this week's meeting are available at
the links below.
- Minutes: 
https://meetbot.fedoraproject.org/gluster-meeting/2016-08-10/weekly_community_meeting_10aug2016.2016-08-10-12.00.html
- Minutes (text):
https://meetbot.fedoraproject.org/gluster-meeting/2016-08-10/weekly_community_meeting_10aug2016.2016-08-10-12.00.txt
- Log: 
https://meetbot.fedoraproject.org/gluster-meeting/2016-08-10/weekly_community_meeting_10aug2016.2016-08-10-12.00.log.html

We had a very lively meeting this time, and had good participation.
Hope next week's meeting is also the same. The next meeting is, as
always, at 1200 UTC next Wednesday in #gluster-meeting. See you all
there, and thank you for attending today's meeting.

~kaushal

Meeting summary
---
* Roll call  (kshlm, 12:00:48)

* Next weeks meeting host  (kshlm, 12:04:07)
  * rafi hosts the next meeting  (kshlm, 12:06:23)

* GlusterFS-4.0  (kshlm, 12:06:42)

* GlusterFS-3.9  (kshlm, 12:14:50)
  * ACTION: pranithk/aravindavk/dblack to send out a reminder about the
feature deadline for 3.9  (kshlm, 12:20:33)

* GlusterFS 3.8  (kshlm, 12:21:36)

* GlusterFS-3.7  (kshlm, 12:30:01)

* GlusterFS-3.6  (kshlm, 12:38:27)

* NFS-Ganesha  (kshlm, 12:44:12)

* Samba  (kshlm, 12:49:45)

* Community Infrastructure  (kshlm, 12:54:05)

* Open Floor  (kshlm, 13:02:22)

* Glusto - libraries have been ported by the QE Automation Team and just
  need your +1s on Glusto to begin configuring upstream and make
  available  (kshlm, 13:02:47)

* Need some more reviews for
  https://github.com/gluster/glusterdocs/pull/139  (kshlm, 13:17:05)

Meeting ended at 13:19:53 UTC.




Action Items

* pranithk/aravindavk/dblack to send out a reminder about the feature
  deadline for 3.9




Action Items, by person
---
* **UNASSIGNED**
  * pranithk/aravindavk/dblack to send out a reminder about the feature
deadline for 3.9




People Present (lines said)
---
* kshlm (129)
* ndevos (46)
* obnox (34)
* kkeithley (31)
* nigelb (22)
* ira (20)
* post-factum (17)
* loadtheacc (13)
* msvbhat (10)
* skoduri (8)
* rafi (6)
* anoopcs (3)
* glusterbot (3)
* zodbot (3)
* ankitraj (1)
* misc (1)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Using LIO with Gluster

2016-08-10 Thread Prasanna Kalever
On Fri, Aug 5, 2016 at 2:53 PM, luca giacomelli
 wrote:
> Hi, I'm trying to implement something similar to
> http://blog.gluster.org/2016/04/using-lio-with-gluster/ and
> https://pkalever.wordpress.com/2016/06/29/non-shared-persistent-gluster-storage-with-kubernetes/
>
> Gluster 3.8.1 and Fedora 24
>
> Gluster is up and running. The initiator discovers the target but is not able
> to find the disk. I successfully tested targetcli fileio with the same
> target and initiator.

Hey Luca,


Can you attach your /etc/target/saveconfig.json here?
I suspect a problem in the targetcli configuration.


Before that, some quick checks you can do:
* mount the gluster volume and check that the target file exists
* targetcli /backstores/user:glfs/glfsLUN info

If you notice the correct signs above, try restarting
tcmu-runner.service and target.service.

I remember a similar situation on my setup where nothing above worked;
I rebooted the target nodes and flushed the iptables, and then it
worked.
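A minimal sketch of those checks (the volume name, mount point, and backing
file are placeholders; glfsLUN follows the blog post's naming):

    mount -t glusterfs localhost:/block-store /mnt/block-store
    ls -l /mnt/block-store/              # the LUN's backing file should be here
    targetcli /backstores/user:glfs/glfsLUN info
    systemctl restart tcmu-runner.service target.service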


Cheers,
--
Prasanna

>
> I tried also with tcmu-runner 1.1.0
>
> Any help would be appreciated.
>
> Thanks, Luca
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] SELinux is preventing /usr/sbin/glusterfsd from getattr access on the chr_file

2016-08-10 Thread Pranith Kumar Karampuri
I included Prashanth and Rejy who know about SELinux in detail.
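In the meantime, a rough way to inspect the denial (a sketch only; these are
the standard SELinux tools, nothing gluster-specific):

    ausearch -m avc -ts recent | audit2allow -w   # explain why access was denied
    ausearch -m avc -ts recent | audit2allow      # show the rule/boolean it suggests

For the boot-time mount dropping to emergency mode, network filesystems in
/etc/fstab are usually given the _netdev option so they are mounted only after
the network is up, e.g. (an assumption, adjust to your setup):

    localhost:/syslog-volume  /mnt  glusterfs  defaults,_netdev  0 0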

On Fri, Aug 5, 2016 at 4:05 PM, tecforte jason 
wrote:

> Hi,
>
> I have my glusterfs configured in distributed replicated mode; the total
> node count is 3.
>
> I get the error below when issuing "service glusterd status" on node 2.
>
>  *  Plugin catchall_boolean (89.3 confidence) suggests
> **...
> Aug 05 14:15:04 localhost.localdomain python[1875]: SELinux is preventing
> /usr/sbin/glusterfsd from getattr access on the chr_file .
>
>
> I am also having a problem mounting on boot for node 2, and CentOS goes to
> emergency mode. Is the above error related to this?
> The other 2 nodes don't have the above error and are able to mount on boot.
>
> I have this in all my /etc/fstab files:
> localhost:/syslog-volume  /mnt glusterfs defaults 1 2
>
> I'd appreciate any guidance.
>
> Thanks.
>
> Jason
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Using LIO with Gluster

2016-08-10 Thread Pranith Kumar Karampuri
+Prasanna, author of the blog post.

On Fri, Aug 5, 2016 at 2:53 PM, luca giacomelli 
wrote:

> Hi, I'm trying to implement something similar to
> http://blog.gluster.org/2016/04/using-lio-with-gluster/ and
> https://pkalever.wordpress.com/2016/06/29/non-shared-persistent-gluster-storage-with-kubernetes/
>
> Gluster 3.8.1 and Fedora 24
>
> Gluster is up and running. The initiator discovers the target but is not
> able to find the disk. I successfully tested targetcli fileio with the same
> target and initiator.
>
> I tried also with tcmu-runner 1.1.0
>
> Any help would be appreciated.
>
> Thanks, Luca
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS-3.7.14 released

2016-08-10 Thread Pranith Kumar Karampuri
On Wed, Aug 10, 2016 at 1:58 PM, Serkan Çoban  wrote:

> Hi,
>
> Any progress on the patch?
>

hi Serkan,
   While testing the patch myself, I am seeing that it is taking
more than one crawl to complete heals even when there are no directory
hierarchies. It is faster than before, but it shouldn't take more than 1
crawl to complete the heal because all the files exist already. I am
investigating why that is the case now. If you want to test things out
without this fix, I will give you rpms today. Otherwise we need to wait
until we find the RCA for this crawl problem. Let me know your decision. If you
are okay with testing progressive versions of this feature, that would be
great. We can compare how each patch improves the performance.
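For reference, a sketch of how the option from this patch
(disperse.shd-max-threads, quoted further down in this thread) would be
exercised; v0 is the volume name from your other thread, and 8 is just an
example within the documented 1-64 range:

    gluster volume set v0 disperse.shd-max-threads 8
    gluster volume heal v0 info | grep "Number of entries" | grep -v ": 0"

Timing the second command periodically for different thread counts should
show how each setting affects heal progress.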

Pranith


>
> On Thu, Aug 4, 2016 at 10:16 AM, Pranith Kumar Karampuri
>  wrote:
> >
> >
> > On Thu, Aug 4, 2016 at 11:30 AM, Serkan Çoban 
> wrote:
> >>
>> Thanks Pranith,
>> I am waiting for the RPMs to show up; I will do the tests as soon as possible
>> and inform you.
> >
> >
> > I guess on 3.7.x the RPMs are not automatically built. Let me find out how it
> > can be done. I will inform you after finding that out. Give me a day.
> >
> >>
> >>
> >> On Wed, Aug 3, 2016 at 11:19 PM, Pranith Kumar Karampuri
> >>  wrote:
> >> >
> >> >
> >> > On Thu, Aug 4, 2016 at 1:47 AM, Pranith Kumar Karampuri
> >> >  wrote:
> >> >>
> >> >>
> >> >>
> >> >> On Thu, Aug 4, 2016 at 12:51 AM, Serkan Çoban  >
> >> >> wrote:
> >> >>>
> >> >>> I use rpms for installation. Redhat/Centos 6.8.
> >> >>
> >> >>
> >> >> http://review.gluster.org/#/c/15084 is the patch. In some time the
> rpms
> >> >> will be built actually.
> >> >
> >> >
> >> > In the same URL above it will actually post the rpms for
> fedora/el6/el7
> >> > at
> >> > the end of the page.
> >> >
> >> >>
> >> >>
>> >> Use gluster volume set <volname> disperse.shd-max-threads <count (range: 1-64)>
> >> >>
>> >> While testing this I thought of ways to decrease the number of crawls as
>> >> well, but they are a bit involved. Try to create the same set of data and
>> >> see how long it takes to complete heals as you increase the number of
>> >> parallel heal threads from 1 to 64.
> >> >>
> >> >>>
> >> >>> On Wed, Aug 3, 2016 at 10:16 PM, Pranith Kumar Karampuri
> >> >>>  wrote:
> >> >>> >
> >> >>> >
> >> >>> > On Thu, Aug 4, 2016 at 12:45 AM, Serkan Çoban
> >> >>> > 
> >> >>> > wrote:
> >> >>> >>
> >> >>> >> I prefer 3.7 if it is ok for you. Can you also provide build
> >> >>> >> instructions?
> >> >>> >
> >> >>> >
> >> >>> > 3.7 should be fine. Do you use rpms/debs/anything-else?
> >> >>> >
> >> >>> >>
> >> >>> >>
> >> >>> >> On Wed, Aug 3, 2016 at 10:12 PM, Pranith Kumar Karampuri
> >> >>> >>  wrote:
> >> >>> >> >
> >> >>> >> >
> >> >>> >> > On Thu, Aug 4, 2016 at 12:37 AM, Serkan Çoban
> >> >>> >> > 
> >> >>> >> > wrote:
> >> >>> >> >>
> >> >>> >> >> Yes, but I can create 2+1(or 8+2) ec using two servers right?
> I
> >> >>> >> >> have
> >> >>> >> >> 26 disks on each server.
> >> >>> >> >
> >> >>> >> >
> >> >>> >> > On which release-branch do you want the patch? I am testing it
> on
> >> >>> >> > master-branch now.
> >> >>> >> >
> >> >>> >> >>
> >> >>> >> >>
> >> >>> >> >> On Wed, Aug 3, 2016 at 9:59 PM, Pranith Kumar Karampuri
> >> >>> >> >>  wrote:
> >> >>> >> >> >
> >> >>> >> >> >
> >> >>> >> >> > On Thu, Aug 4, 2016 at 12:23 AM, Serkan Çoban
> >> >>> >> >> > 
> >> >>> >> >> > wrote:
> >> >>> >> >> >>
> >> >>> >> >> >> I have two of my storage servers free, I think I can use
> them
> >> >>> >> >> >> for
> >> >>> >> >> >> testing. Is two server testing environment ok for you?
> >> >>> >> >> >
> >> >>> >> >> >
> >> >>> >> >> > I think it would be better if you have at least 3. You can
> >> >>> >> >> > test
> >> >>> >> >> > it
> >> >>> >> >> > with
> >> >>> >> >> > 2+1
> >> >>> >> >> > ec configuration.
> >> >>> >> >> >
> >> >>> >> >> >>
> >> >>> >> >> >>
> >> >>> >> >> >> On Wed, Aug 3, 2016 at 9:44 PM, Pranith Kumar Karampuri
> >> >>> >> >> >>  wrote:
> >> >>> >> >> >> >
> >> >>> >> >> >> >
> >> >>> >> >> >> > On Wed, Aug 3, 2016 at 6:01 PM, Serkan Çoban
> >> >>> >> >> >> > 
> >> >>> >> >> >> > wrote:
> >> >>> >> >> >> >>
> >> >>> >> >> >> >> Hi,
> >> >>> >> >> >> >>
> >> >>> >> >> >> >> May I ask if multi-threaded self heal for distributed
> >> >>> >> >> >> >> disperse
> >> >>> >> >> >> >> volumes
> >> >>> >> >> >> >> implemented in this release?
> >> >>> >> >> >> >
> >> >>> >> >> >> >
> >> >>> >> >> >> > Serkan,
> >> >>> >> >> >> > At the moment I am a bit busy with different
> work,
> >> >>> >> >> >> > Is
> >> >>> >> >> >> > it
> >> >>> >> >> >> > possible
> >> >>> >> >> >> > for you to help test the feature if I provide a patch?
> >> >>> >> >> >> > Actually
> >> >>> >> >> >> > the
> >> >>> >> >> >> > patch
> >> >>> >> >> >> > should be small. Testing is where lot of time will be
> spent
> >> >>> >> >> >> > on.
> >> >>> >> >> >> >
> >> >>> >> >> >> >>
> >> >>> >> >> >> >>
> >>

Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts

2016-08-10 Thread Deepak Naidu
To be more precise, the hang is clearly seen when there is some I/O (writes) to
the mount point. Even rm -rf takes time to clear the files.

Below, the time command shows the delay. Typically it should take less than a
second, but glusterfs takes more than 5 seconds just to list 32 files of 2GB each.

[root@client-host ~]# time ls -l /mnt/gluster/
total 34575680
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.0.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.1.0
-rw-r--r--. 1 root root 2147454976 Aug 10 12:23 rand.10.0
-rw-r--r--. 1 root root 2147463168 Aug 10 12:23 rand.11.0
-rw-r--r--. 1 root root 2147467264 Aug 10 12:23 rand.12.0
-rw-r--r--. 1 root root 2147475456 Aug 10 12:23 rand.13.0
-rw-r--r--. 1 root root 2147479552 Aug 10 12:23 rand.14.0
-rw-r--r--. 1 root root 2147479552 Aug 10 12:23 rand.15.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.16.0
-rw-r--r--. 1 root root 2147479552 Aug 10 12:23 rand.17.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.18.0
-rw-r--r--. 1 root root 2147467264 Aug 10 12:23 rand.19.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.2.0
-rw-r--r--. 1 root root 2147475456 Aug 10 12:23 rand.20.0
-rw-r--r--. 1 root root 2147479552 Aug 10 12:23 rand.21.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.22.0
-rw-r--r--. 1 root root 2147459072 Aug 10 12:23 rand.23.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.24.0
-rw-r--r--. 1 root root 2147471360 Aug 10 12:23 rand.25.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.26.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.27.0
-rw-r--r--. 1 root root 2147479552 Aug 10 12:23 rand.28.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.29.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.3.0
-rw-r--r--. 1 root root 2147442688 Aug 10 12:23 rand.30.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.31.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.4.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.5.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.6.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.7.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.8.0
-rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.9.0

real0m7.478s
user0m0.001s
sys 0m0.005s
 [root@client-host ~]#

--
Deepak

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Deepak Naidu
Sent: Wednesday, August 10, 2016 2:18 PM
To: Vijay Bellur
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS 
mounts


I did strace & its waiting on IO.

--
Deepak

-Original Message-
From: Vijay Bellur [mailto:vbel...@redhat.com]
Sent: Wednesday, August 10, 2016 2:17 PM
To: Deepak Naidu
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS 
mounts

On 08/10/2016 05:12 PM, Deepak Naidu wrote:
> Before we can try physical we wanted POC on VM.
>
> Just a note the VMs are decently powerful 18cpus, 10gig NIC, 45GB Ram 1TB SSD 
> drives. This is per node spec.
>
> I don't see the ls -l command hanging when I try to list the files from the 
> gluster-node VMs itself So the question.

The reason I alluded to a physical setup was to remove the variables that can 
affect performance in a virtual setup. The behavior is not usual for the scale 
of deployment that you mention. You could use strace in conjunction with 
gluster volume profile to determine where the latency is stemming from.

Regards,
Vijay

>
> --
> Deepak
>
>> On Aug 10, 2016, at 2:01 PM, Vijay Bellur  wrote:
>>
>>> On 08/10/2016 04:54 PM, Deepak Naidu wrote:
>>> Anyone who has seen the issue in their env ?
>>
>>
>>> --
>>> Deepak
>>>
>>> -Original Message-
>>> From: gluster-users-boun...@gluster.org 
>>> [mailto:gluster-users-boun...@gluster.org] On Behalf Of Deepak Naidu
>>> Sent: Tuesday, August 09, 2016 9:14 PM
>>> To: gluster-users@gluster.org
>>> Subject: [Gluster-users] Linux (ls -l) command pauses/slow on 
>>> GlusterFS mounts
>>>
>>> Greetings,
>>>
>>> I have 3node GlusterFS on VM for POC each node has 2x bricks of 200GB. 
>>> Regardless of what type of volume I create when listing files under 
>>> directory using ls command the GlusterFS mount hangs pauses for few 
>>> seconds. This is same if there're 2-5 19gb file each or 2gb file each. 
>>> There are less than  10 files under the GlusterFS mount.
>>>
>>> I am using NFS-Ganesha for NFS server with GlusterFS and the Linux client 
>>> is mounted using GlusterFS fuse mount with direct-io enabled.
>>>
>>> GlusterFS version 3.8(latest)
>>>
>>>
>>> Any insight is appreciated.
>>
>> This does not seem usual for the deployment that you describe. Can you try 
>> on a physical setup to see if the same behavior is observed?
>>
>> -Vijay
>>
>>

Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts

2016-08-10 Thread Deepak Naidu
I did strace & its waiting on IO.

--
Deepak

-Original Message-
From: Vijay Bellur [mailto:vbel...@redhat.com] 
Sent: Wednesday, August 10, 2016 2:17 PM
To: Deepak Naidu
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS 
mounts

On 08/10/2016 05:12 PM, Deepak Naidu wrote:
> Before we can try physical we wanted POC on VM.
>
> Just a note the VMs are decently powerful 18cpus, 10gig NIC, 45GB Ram 1TB SSD 
> drives. This is per node spec.
>
> I don't see the ls -l command hanging when I try to list the files from the 
> gluster-node VMs itself So the question.

The reason I alluded to a physical setup was to remove the variables that can 
affect performance in a virtual setup. The behavior is not usual for the scale 
of deployment that you mention. You could use strace in conjunction with 
gluster volume profile to determine where the latency is stemming from.

Regards,
Vijay

>
> --
> Deepak
>
>> On Aug 10, 2016, at 2:01 PM, Vijay Bellur  wrote:
>>
>>> On 08/10/2016 04:54 PM, Deepak Naidu wrote:
>>> Anyone who has seen the issue in their env ?
>>
>>
>>> --
>>> Deepak
>>>
>>> -Original Message-
>>> From: gluster-users-boun...@gluster.org 
>>> [mailto:gluster-users-boun...@gluster.org] On Behalf Of Deepak Naidu
>>> Sent: Tuesday, August 09, 2016 9:14 PM
>>> To: gluster-users@gluster.org
>>> Subject: [Gluster-users] Linux (ls -l) command pauses/slow on 
>>> GlusterFS mounts
>>>
>>> Greetings,
>>>
>>> I have 3node GlusterFS on VM for POC each node has 2x bricks of 200GB. 
>>> Regardless of what type of volume I create when listing files under 
>>> directory using ls command the GlusterFS mount hangs pauses for few 
>>> seconds. This is same if there're 2-5 19gb file each or 2gb file each. 
>>> There are less than  10 files under the GlusterFS mount.
>>>
>>> I am using NFS-Ganesha for NFS server with GlusterFS and the Linux client 
>>> is mounted using GlusterFS fuse mount with direct-io enabled.
>>>
>>> GlusterFS version 3.8(latest)
>>>
>>>
>>> Any insight is appreciated.
>>
>> This does not seem usual for the deployment that you describe. Can you try 
>> on a physical setup to see if the same behavior is observed?
>>
>> -Vijay
>>
>>
>

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts

2016-08-10 Thread Vijay Bellur

On 08/10/2016 05:12 PM, Deepak Naidu wrote:

Before we can try physical we wanted POC on VM.

Just a note the VMs are decently powerful 18cpus, 10gig NIC, 45GB Ram 1TB SSD 
drives. This is per node spec.

I don't see the ls -l command hanging when I try to list the files from the 
gluster-node VMs itself So the question.


The reason I alluded to a physical setup was to remove the variables 
that can affect performance in a virtual setup. The behavior is not 
usual for the scale of deployment that you mention. You could use strace 
in conjunction with gluster volume profile to determine where the 
latency is stemming from.
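A rough sketch of that (VOLNAME is a placeholder; /mnt/gluster is the mount
point from your earlier mail):

    gluster volume profile VOLNAME start
    strace -T -tt ls -l /mnt/gluster/    # -T prints time spent in each syscall
    gluster volume profile VOLNAME info

The per-syscall times from strace and the per-fop latencies from the profile
output together should point at where the delay is coming from.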


Regards,
Vijay



--
Deepak


On Aug 10, 2016, at 2:01 PM, Vijay Bellur  wrote:


On 08/10/2016 04:54 PM, Deepak Naidu wrote:
Anyone who has seen the issue in their env ?




--
Deepak

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Deepak Naidu
Sent: Tuesday, August 09, 2016 9:14 PM
To: gluster-users@gluster.org
Subject: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts

Greetings,

I have 3node GlusterFS on VM for POC each node has 2x bricks of 200GB. 
Regardless of what type of volume I create when listing files under directory 
using ls command the GlusterFS mount hangs pauses for few seconds. This is same 
if there're 2-5 19gb file each or 2gb file each. There are less than  10 files 
under the GlusterFS mount.

I am using NFS-Ganesha for NFS server with GlusterFS and the Linux client is 
mounted using GlusterFS fuse mount with direct-io enabled.

GlusterFS version 3.8(latest)


Any insight is appreciated.


This does not seem usual for the deployment that you describe. Can you try on a 
physical setup to see if the same behavior is observed?

-Vijay






___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts

2016-08-10 Thread Deepak Naidu
Before we can try physical hardware, we wanted a POC on VMs.

Just a note: the VMs are decently powerful, with 18 CPUs, a 10-gig NIC, 45GB RAM,
and 1TB SSD drives. This is the per-node spec.

I don't see the ls -l command hanging when I list the files from the
gluster-node VMs themselves, hence the question.

--
Deepak

> On Aug 10, 2016, at 2:01 PM, Vijay Bellur  wrote:
> 
>> On 08/10/2016 04:54 PM, Deepak Naidu wrote:
>> Anyone who has seen the issue in their env ?
> 
> 
>> --
>> Deepak
>> 
>> -Original Message-
>> From: gluster-users-boun...@gluster.org 
>> [mailto:gluster-users-boun...@gluster.org] On Behalf Of Deepak Naidu
>> Sent: Tuesday, August 09, 2016 9:14 PM
>> To: gluster-users@gluster.org
>> Subject: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS 
>> mounts
>> 
>> Greetings,
>> 
>> I have 3node GlusterFS on VM for POC each node has 2x bricks of 200GB. 
>> Regardless of what type of volume I create when listing files under 
>> directory using ls command the GlusterFS mount hangs pauses for few seconds. 
>> This is same if there're 2-5 19gb file each or 2gb file each. There are less 
>> than  10 files under the GlusterFS mount.
>> 
>> I am using NFS-Ganesha for NFS server with GlusterFS and the Linux client is 
>> mounted using GlusterFS fuse mount with direct-io enabled.
>> 
>> GlusterFS version 3.8(latest)
>> 
>> 
>> Any insight is appreciated.
> 
> This does not seem usual for the deployment that you describe. Can you try on 
> a physical setup to see if the same behavior is observed?
> 
> -Vijay
> 
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts

2016-08-10 Thread Vijay Bellur

On 08/10/2016 04:54 PM, Deepak Naidu wrote:

Anyone who has seen the issue in their env ?





--
Deepak

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Deepak Naidu
Sent: Tuesday, August 09, 2016 9:14 PM
To: gluster-users@gluster.org
Subject: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts

Greetings,

I have 3node GlusterFS on VM for POC each node has 2x bricks of 200GB. 
Regardless of what type of volume I create when listing files under directory 
using ls command the GlusterFS mount hangs pauses for few seconds. This is same 
if there're 2-5 19gb file each or 2gb file each. There are less than  10 files 
under the GlusterFS mount.

I am using NFS-Ganesha for NFS server with GlusterFS and the Linux client is 
mounted using GlusterFS fuse mount with direct-io enabled.

GlusterFS version 3.8(latest)


Any insight is appreciated.



This does not seem usual for the deployment that you describe. Can you 
try on a physical setup to see if the same behavior is observed?


-Vijay


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts

2016-08-10 Thread Deepak Naidu
Anyone who has seen the issue in their env ?

--
Deepak

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Deepak Naidu
Sent: Tuesday, August 09, 2016 9:14 PM
To: gluster-users@gluster.org
Subject: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts

Greetings,

I have a 3-node GlusterFS setup on VMs for a POC; each node has 2 bricks of 200GB.
Regardless of what type of volume I create, when listing files under a directory
using the ls command the GlusterFS mount hangs/pauses for a few seconds. This is
the same whether there are 2-5 files of 19GB each or of 2GB each. There are fewer
than 10 files under the GlusterFS mount.

I am using NFS-Ganesha as the NFS server with GlusterFS, and the Linux client is
mounted using the GlusterFS FUSE mount with direct-io enabled.

GlusterFS version 3.8 (latest)


Any insight is appreciated.

--
Deepak
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] NFS-Ganesha lo traffic

2016-08-10 Thread Mahdi Adnan
Thank you very much. I just noticed that even without Ganesha NFS I see this kind
of traffic to the lo address, and the warning message about the health status only
happens when I hit 100% brick utilization, so it should be fine anyway. I'll keep
digging.
Thanks again.
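A rough way to check whether the lo traffic is self-heal related, following
the suggestion quoted below (VOLNAME is a placeholder; the port range matches
the brick ports mentioned below):

    gluster volume heal VOLNAME info         # pending entries point at the shd
    tcpdump -i lo -s 0 -c 1000 -w /var/tmp/lo.pcap 'tcp and portrange 49152-49251'

Then check in the capture which client process owns the source ports.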



-- 



Respectfully

Mahdi A. Mahdi



> Subject: Re: [Gluster-users] NFS-Ganesha lo traffic
> To: mahdi.ad...@outlook.com
> CC: gluster-users@gluster.org; nfs-ganesha-de...@lists.sourceforge.net
> From: skod...@redhat.com
> Date: Wed, 10 Aug 2016 11:05:50 +0530
> 
> 
> 
> On 08/09/2016 09:06 PM, Mahdi Adnan wrote:
> > Hi,
> > Thank you for your reply.
> >
> > The traffic is related to GlusterFS;
> >
> > 18:31:20.419056 IP 192.168.208.134.49058 > 192.168.208.134.49153: Flags
> > [.], ack 3876, win 24576, options [nop,nop,TS val 247718812 ecr
> > 247718772], length 0
> > 18:31:20.419080 IP 192.168.208.134.49056 > 192.168.208.134.49154: Flags
> > [.], ack 11625, win 24576, options [nop,nop,TS val 247718812 ecr
> > 247718772], length 0
> > 18:31:20.419084 IP 192.168.208.134.49060 > 192.168.208.134.49152: Flags
> > [.], ack 9861, win 24576, options [nop,nop,TS val 247718812 ecr
> > 247718772], length 0
> > 18:31:20.419088 IP 192.168.208.134.49054 > 192.168.208.134.49155: Flags
> > [.], ack 4393, win 24568, options [nop,nop,TS val 247718812 ecr
> > 247718772], length 0
> > 18:31:20.420084 IP 192.168.208.134.49052 > 192.168.208.134.49156: Flags
> > [.], ack 5525, win 24576, options [nop,nop,TS val 247718813 ecr
> > 247718773], length 0
> > 18:31:20.420092 IP 192.168.208.134.49049 > 192.168.208.134.49158: Flags
> > [.], ack 6657, win 24576, options [nop,nop,TS val 247718813 ecr
> > 247718773], length 0
> > 18:31:20.421065 IP 192.168.208.134.49050 > 192.168.208.134.49157: Flags
> > [.], ack 4729, win 24570, options [nop,nop,TS val 247718814 ecr
> > 247718774], length 0
> >
> 
> Looks like that is the traffic coming to the bricks local to that node 
> (>4915* ports are used by glusterfs brick processes). It could be from 
> nfs-ganesha or any other glusterfs client processes (like self-heal 
> daemon etc). Do you see this traffic even when there is no active I/O 
> from the nfs-client? If so, it could be from the self-heal daemon then. 
> Verify if there are any files/directories to be healed.
> 
> > Screenshot from wireshark can be found in the attachments.
> > 208.134 is the server IP address, and it's looks like it talking to
> > itself via the lo interface, im wondering if this is a normal behavior
> > or not.
> yes. It is the expected behavior when there are clients actively 
> accessing the volumes.
> 
> > and regarding the Ganesha server logs, how can I debug them to find out why
> > the server is not responding to the requests on time?
> 
> I suggest again to take tcpdump. Sometimes nfs-ganesha server (glusterfs 
> client) may have to communicate with all the bricks over the network 
> (like LOOKUP) and that may result in delay if there are lots of bricks 
> involved. Try capturing packets from the node where the nfs-ganesha 
> server is running and examine the packets between any of the NFS-client 
> request and its corresponding reply packet.
> 
> I usually use below cmd to capture the packets on all the interfaces -
> #tcpdump -i any -s 0 -w /var/tmp/nfs.pcap tcp and not port 22
> 
> Thanks,
> Soumya
> >
> >
> > --
> >
> > Respectfully
> > Mahdi A. Mahdi
> >
> >
> >
> >> Subject: Re: [Gluster-users] NFS-Ganesha lo traffic
> >> To: mahdi.ad...@outlook.com
> >> From: skod...@redhat.com
> >> CC: gluster-users@gluster.org; nfs-ganesha-de...@lists.sourceforge.net
> >> Date: Tue, 9 Aug 2016 18:02:01 +0530
> >>
> >>
> >>
> >> On 08/09/2016 03:33 PM, Mahdi Adnan wrote:
> >> > Hi,
> >> >
> >> > I'm using NFS-Ganesha to access my volume; it's working fine for now, but
> >> > I'm seeing lots of traffic on the loopback interface. In fact it's the
> >> > same amount of traffic as on the bonding interface. Can anyone please
> >> > explain to me why this is happening?
> >>
> >> Could you please capture packets on those interfaces using tcpdump and
> >> examine the traffic?
> >>
> >> > also, i got the following error in the ganesha log file;
> >> >
> >> > 09/08/2016 11:35:54 : epoch 57a5da0c : gfs04 :
> >> > ganesha.nfsd-1646[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health
> >> > status is unhealthy. Not sending heartbeat
> >> > 09/08/2016 11:46:04 : epoch 57a5da0c : gfs04 :
> >> > ganesha.nfsd-1646[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health
> >> > status is unhealthy. Not sending heartbeat
> >> > 09/08/2016 11:54:39 : epoch 57a5da0c : gfs04 :
> >> > ganesha.nfsd-1646[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health
> >> > status is unhealthy. Not sending heartbeat
> >> > 09/08/2016 12:06:04 : epoch 57a5da0c : gfs04 :
> >> > ganesha.nfsd-1646[dbus_heartbeat] dbus_heartbeat_cb :DBUS :WARN :Health
> >> > status is unhealthy. Not sending heartbeat
> >> >
> >> > is it something i should care about ?
> >>
> >> Above warnings are thrown when the 

Re: [Gluster-users] gluster profile mode - showing unexpected WRITE fops

2016-08-10 Thread Shyam

Hi,

How was the profile data collected? Was it a cumulative profile output 
or an incremental profile output?


How did the initial data get written to the 20 bricks, before the read 
workload was started?


What I suspect here is that, you are collecting cumulative output, which 
is possibly showing up the initial writes that were performed to 
populate the volume.
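For what it's worth, a quick way to separate the two (a sketch; whether the
incremental keyword is accepted depends on your gluster CLI version):

    gluster volume profile VOLNAME info              # cumulative since profiling began
    gluster volume profile VOLNAME info incremental  # only the interval since the last info

If the writes show up only in the cumulative view, they are almost certainly
the initial data population.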


From the code I do not see seek tracked in iostats, so it is not the 
*seek* that is bloating these numbers up.


Regards,
Shyam

On 08/08/2016 04:47 PM, Jackie Tung wrote:

Hi,

I’m doing some benchmarking vs our trial GlusterFS setup (distributed 
replicated, 20 bricks configured as 10 pairs).  I’m running 3.6.9 currently.  
Our benchmarking load involves a large number of concurrent readers that 
continuously pick random file / offsets to read.  No writes are ever issued.  
However I’m seeing the following gluster profiling output:

   Block Size:      32768b+     65536b+    131072b+
 No. of Reads:            0           0     6217035
No. of Writes:        87967       25443     4372800

Why are there writes at all?  Are seeks being shown here as writes?

Thanks,
Jackie


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Upgrade guide to 3.8 missing

2016-08-10 Thread Gandalf Corvotempesta
On 10 Aug 2016 at 14:17, "ML mail"  wrote:
>
> Good point Gandalf! I really don't feel adventurous on a production
cluster...
>
>

This is pretty much the only point that keeps me away from gluster for any
production storage.

If there isn't an official, safe, and recommended rolling-upgrade procedure
with no downtime, I'll never use gluster in production.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Upgrade guide to 3.8 missing

2016-08-10 Thread ML mail
Good point Gandalf! I really don't feel adventurous on a production cluster...


On Wednesday, August 10, 2016 2:14 PM, Gandalf Corvotempesta 
 wrote:



On 10 Aug 2016 at 11:59, "ML mail"  wrote:
>
> Hi,
>
> The Upgrading to 3.8 guide is missing from:
>
>
> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/README/
>
Additionally, all upgrade guides say the following about a rolling upgrade:
"feel adventurous".
So, the only safe way to upgrade gluster is to bring down the whole cluster?
If a rolling upgrade is supported and safe, that "warning" should be removed and
a rolling upgrade should be the suggested method to use (no one wants to bring
down the whole storage infrastructure to upgrade).
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Upgrade guide to 3.8 missing

2016-08-10 Thread Gandalf Corvotempesta
On 10 Aug 2016 at 11:59, "ML mail"  wrote:
>
> Hi,
>
> The Upgrading to 3.8 guide is missing from:
>
>
> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/README/
>

Additionally, all upgrade guides say the following about a rolling
upgrade:
"feel adventurous"

So, the only safe way to upgrade gluster is to bring down the whole
cluster?

If a rolling upgrade is supported and safe, that "warning" should be removed
and a rolling upgrade should be the suggested method to use (no one wants to
bring down the whole storage infrastructure to upgrade).
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Upgrade guide to 3.8 missing

2016-08-10 Thread ML mail
Hi,

The Upgrading to 3.8 guide is missing from:


http://gluster.readthedocs.io/en/latest/Upgrade-Guide/README/

Regards,
ML
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS-3.7.14 released

2016-08-10 Thread Serkan Çoban
Hi,

Any progress on the patch?

On Thu, Aug 4, 2016 at 10:16 AM, Pranith Kumar Karampuri
 wrote:
>
>
> On Thu, Aug 4, 2016 at 11:30 AM, Serkan Çoban  wrote:
>>
>> Thanks Pranith,
>> I am waiting for the RPMs to show up; I will do the tests as soon as possible
>> and inform you.
>
>
> I guess on 3.7.x the RPMs are not automatically built. Let me find out how it
> can be done. I will inform you after finding that out. Give me a day.
>
>>
>>
>> On Wed, Aug 3, 2016 at 11:19 PM, Pranith Kumar Karampuri
>>  wrote:
>> >
>> >
>> > On Thu, Aug 4, 2016 at 1:47 AM, Pranith Kumar Karampuri
>> >  wrote:
>> >>
>> >>
>> >>
>> >> On Thu, Aug 4, 2016 at 12:51 AM, Serkan Çoban 
>> >> wrote:
>> >>>
>> >>> I use rpms for installation. Redhat/Centos 6.8.
>> >>
>> >>
>> >> http://review.gluster.org/#/c/15084 is the patch. In some time the rpms
>> >> will be built actually.
>> >
>> >
>> > In the same URL above it will actually post the rpms for fedora/el6/el7
>> > at
>> > the end of the page.
>> >
>> >>
>> >>
>> >> Use gluster volume set <volname> disperse.shd-max-threads <count (range: 1-64)>
>> >>
>> >> While testing this I thought of ways to decrease the number of crawls as
>> >> well, but they are a bit involved. Try to create the same set of data and
>> >> see how long it takes to complete heals as you increase the number of
>> >> parallel heal threads from 1 to 64.
>> >>
>> >>>
>> >>> On Wed, Aug 3, 2016 at 10:16 PM, Pranith Kumar Karampuri
>> >>>  wrote:
>> >>> >
>> >>> >
>> >>> > On Thu, Aug 4, 2016 at 12:45 AM, Serkan Çoban
>> >>> > 
>> >>> > wrote:
>> >>> >>
>> >>> >> I prefer 3.7 if it is ok for you. Can you also provide build
>> >>> >> instructions?
>> >>> >
>> >>> >
>> >>> > 3.7 should be fine. Do you use rpms/debs/anything-else?
>> >>> >
>> >>> >>
>> >>> >>
>> >>> >> On Wed, Aug 3, 2016 at 10:12 PM, Pranith Kumar Karampuri
>> >>> >>  wrote:
>> >>> >> >
>> >>> >> >
>> >>> >> > On Thu, Aug 4, 2016 at 12:37 AM, Serkan Çoban
>> >>> >> > 
>> >>> >> > wrote:
>> >>> >> >>
>> >>> >> >> Yes, but I can create 2+1(or 8+2) ec using two servers right? I
>> >>> >> >> have
>> >>> >> >> 26 disks on each server.
>> >>> >> >
>> >>> >> >
>> >>> >> > On which release-branch do you want the patch? I am testing it on
>> >>> >> > master-branch now.
>> >>> >> >
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> On Wed, Aug 3, 2016 at 9:59 PM, Pranith Kumar Karampuri
>> >>> >> >>  wrote:
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > On Thu, Aug 4, 2016 at 12:23 AM, Serkan Çoban
>> >>> >> >> > 
>> >>> >> >> > wrote:
>> >>> >> >> >>
>> >>> >> >> >> I have two of my storage servers free, I think I can use them
>> >>> >> >> >> for
>> >>> >> >> >> testing. Is two server testing environment ok for you?
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > I think it would be better if you have at least 3. You can
>> >>> >> >> > test
>> >>> >> >> > it
>> >>> >> >> > with
>> >>> >> >> > 2+1
>> >>> >> >> > ec configuration.
>> >>> >> >> >
>> >>> >> >> >>
>> >>> >> >> >>
>> >>> >> >> >> On Wed, Aug 3, 2016 at 9:44 PM, Pranith Kumar Karampuri
>> >>> >> >> >>  wrote:
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> > On Wed, Aug 3, 2016 at 6:01 PM, Serkan Çoban
>> >>> >> >> >> > 
>> >>> >> >> >> > wrote:
>> >>> >> >> >> >>
>> >>> >> >> >> >> Hi,
>> >>> >> >> >> >>
>> >>> >> >> >> >> May I ask if multi-threaded self heal for distributed
>> >>> >> >> >> >> disperse
>> >>> >> >> >> >> volumes
>> >>> >> >> >> >> implemented in this release?
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> > Serkan,
>> >>> >> >> >> > At the moment I am a bit busy with different work,
>> >>> >> >> >> > Is
>> >>> >> >> >> > it
>> >>> >> >> >> > possible
>> >>> >> >> >> > for you to help test the feature if I provide a patch?
>> >>> >> >> >> > Actually
>> >>> >> >> >> > the
>> >>> >> >> >> > patch
>> >>> >> >> >> > should be small. Testing is where lot of time will be spent
>> >>> >> >> >> > on.
>> >>> >> >> >> >
>> >>> >> >> >> >>
>> >>> >> >> >> >>
>> >>> >> >> >> >> Thanks,
>> >>> >> >> >> >> Serkan
>> >>> >> >> >> >>
>> >>> >> >> >> >> On Tue, Aug 2, 2016 at 5:30 PM, David Gossage
>> >>> >> >> >> >>  wrote:
>> >>> >> >> >> >> > On Tue, Aug 2, 2016 at 6:01 AM, Lindsay Mathieson
>> >>> >> >> >> >> >  wrote:
>> >>> >> >> >> >> >>
>> >>> >> >> >> >> >> On 2/08/2016 5:07 PM, Kaushal M wrote:
>> >>> >> >> >> >> >>>
>> >>> >> >> >> >> >>> GlusterFS-3.7.14 has been released. This is a regular
>> >>> >> >> >> >> >>> minor
>> >>> >> >> >> >> >>> release.
>> >>> >> >> >> >> >>> The release-notes are available at
>> >>> >> >> >> >> >>>
>> >>> >> >> >> >> >>>
>> >>> >> >> >> >> >>>
>> >>> >> >> >> >> >>>
>> >>> >> >> >> >> >>>
>> >>> >> >> >> >> >>>
>> >>> >> >> >> >> >>>
>> >>> >> >> >> >> >>> https://github.com/gluster/glusterfs/blob/release-3.7/doc/release-notes/3.7.14.md
>> >>> >> >> >> >> >>
>> >>> >> >> >> >> >>
>> >>> >> >> >> >> >> Thanks Kaushal, I'll check it out
>> >>> >> >> >> >> >>
>> >>> >> >> >> >> >
>> >>> >> >> >> >> > So far on my test box its working as