Re: [Gluster-devel] Fw: Re[2]: missing files

2015-02-11 Thread David F. Robinson
I will forward the emails I sent to Shyam to the devel list. 


David  (Sent from mobile)

===
David F. Robinson, Ph.D. 
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

 On Feb 11, 2015, at 8:21 AM, Pranith Kumar Karampuri pkara...@redhat.com 
 wrote:
 
 
 On 02/11/2015 06:49 PM, Pranith Kumar Karampuri wrote:
 
 On 02/11/2015 08:36 AM, Shyam wrote:
Did some analysis with David today on this; here is a gist for the list:
 
1) Volumes are classified as slow (i.e., with a lot of pre-existing data) and 
fast (new volumes carved from the same backend file system that the slow bricks 
are on, with little or no data)
 
2) We ran an strace of tar and also collected io-stats outputs from these 
volumes; both show that create and mkdir are slower on the slow volume as compared to 
the fast volume. This seems to be the overall reason for the slowness.
Did you happen to do an strace of the brick when this happened? If not, David, 
can we get that information as well?
It would be nice to compare the difference in syscalls between the bricks of the two 
volumes to see if there are any extra syscalls that are adding to the delay.
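(For reference, a rough sketch of how that could be captured on the slow volume's brick; the glusterfsd PID and output paths below are placeholders, e.g. taken from 'gluster volume status':)

# full brick trace with per-syscall timestamps/durations while the tar test runs
strace -f -tt -T -o /tmp/brick-slow.strace -p <glusterfsd-pid>
# or just a per-syscall count/time summary, which is easier to diff between the two volumes
strace -c -f -o /tmp/brick-slow.summary -p <glusterfsd-pid>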
 
 Pranith
 
 Pranith
 
3) The tarball extraction is to a new directory on the gluster mount, so 
all lookups etc. happen within this new namespace on the volume.
 
4) Checked memory footprints of the slow bricks and fast bricks, etc.; 
nothing untoward was noticed there.
 
5) Restarted the slow volume, just as a test case to do things from 
scratch; no improvement in performance.
 
 Currently attempting to reproduce this on a local system to see if the same 
 behavior is seen so that it becomes easier to debug etc.
 
 Others on the list can chime in as they see fit.
 
 Thanks,
 Shyam
 
 On 02/10/2015 09:58 AM, David F. Robinson wrote:
 Forwarding to devel list as recommended by Justin...
 
 David
 
 
 -- Forwarded Message --
 From: David F. Robinson david.robin...@corvidtec.com
 To: Justin Clift jus...@gluster.org
 Sent: 2/10/2015 9:49:09 AM
 Subject: Re[2]: [Gluster-devel] missing files
 
Bad news... I don't think it is the old linkto files. Bad because if
that was the issue, cleaning up all of the bad linkto files would have fixed
it. It seems like the system just gets slower as you add data.
 
First, I set up a new clean volume (test2brick) on the same system as the
 old one (homegfs_bkp). See 'gluster v info' below. I ran my simple tar
 extraction test on the new volume and it took 58-seconds to complete
 (which, BTW, is 10-seconds faster than my old non-gluster system, so
 kudos). The time on homegfs_bkp is 19-minutes.
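(For anyone wanting to reproduce the numbers, a minimal sketch of the timing test; the mount point and tarball path are placeholders for my actual test case:)

# extract into a fresh directory on the gluster mount and time it
cd /mnt/test2brick && mkdir run1 && cd run1
time tar xf /path/to/testcase.tar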
 
 Next, I copied 10-terabytes of data over to test2brick and re-ran the
 test which then took 7-minutes. I created a test3brick and ran the test
 and it took 53-seconds.
 
 To confirm all of this, I deleted all of the data from test2brick and
 re-ran the test. It took 51-seconds!!!
 
 BTW. I also checked the .glusterfs for stale linkto files (find . -type
 f -size 0 -perm 1000 -exec ls -al {} \;). There are many, many thousands
 of these types of files on the old volume and none on the new one, so I
 don't think this is related to the performance issue.
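(If a per-brick count is useful, something like this sketch should do it; the brick path is taken from the volume info below:)

# count zero-byte mode-1000 files (stale linkto candidates) under a brick's .glusterfs
find /data/brick01bkp/homegfs_bkp/.glusterfs -type f -size 0 -perm 1000 | wc -l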
 
 Let me know how I should proceed. Send this to devel list? Pranith?
 others? Thanks...
 
 [root@gfs01bkp .glusterfs]# gluster volume info homegfs_bkp
 Volume Name: homegfs_bkp
 Type: Distribute
 Volume ID: 96de8872-d957-4205-bf5a-076e3f35b294
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp
 Bricks:
 Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp
 Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp
 
 [root@gfs01bkp .glusterfs]# gluster volume info test2brick
 Volume Name: test2brick
 Type: Distribute
 Volume ID: 123259b2-3c61-4277-a7e8-27c7ec15e550
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp
 Bricks:
 Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/test2brick
 Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/test2brick
 
 [root@gfs01bkp glusterfs]# gluster volume info test3brick
 Volume Name: test3brick
 Type: Distribute
 Volume ID: 9b1613fc-f7e5-4325-8f94-e3611a5c3701
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp
 Bricks:
 Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/test3brick
 Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/test3brick
 
 
 From homegfs_bkp:
 # find . -type f -size 0 -perm 1000 -exec ls -al {} \;
---------T 2 gmathur pme_ics 0 Jan 9 16:59
./00/16/00169a69-1a7a-44c9-b2d8-991671ee87c4
---------T 3 jcowan users 0 Jan 9 17:51
./00/16/0016a0a0-fd22-4fb5-b6fb-5d7f9024ab74
---------T 2 morourke sbir 0 Jan 9 18:17
./00/16/0016b36f-32fc-4f2c-accd-e36be2f6c602
---------T 2 carpentr irl 0 Jan 9 18:52
./00/16/00163faf-741c-4e40-8081-784786b3cc71
---------T 3 601 raven 0 Jan 9 22:49
./00/16/00163385-a332-4050-8104-1b1af6cd8249
---------T 3 bangell sbir 0 Jan 9 22:56
 

Re: [Gluster-devel] Problems with ec/nfs.t in regression tests

2015-02-11 Thread Shyam

On 02/11/2015 09:40 AM, Xavier Hernandez wrote:

Hi,

it seems that there are some failures in the ec/nfs.t test in the regression
tests. Doing some investigation, I've found that before applying the
multi-threaded patch (commit 5e25569e) the problem does not seem to happen.


This has an interesting history of failures: on the regression runs for 
the MT epoll patch, this test (i.e. ec/nfs.t) did not fail (there were other 
failures, but not nfs.t).


The patch that allows configuration of MT epoll is where this started 
failing, around Feb 5th (though it later passed); see the patchset 7 failures on 
http://review.gluster.org/#/c/9488/ .


I state the above as it may help narrow down the changes in EC 
(maybe) that could have caused it.


Also, in the latter commit there was an error configuring the number of 
threads, so all regression runs would have run with a single epoll thread 
(the MT epoll patch had this hard-coded, so that would have run with 2 
threads, but did not show up the issue; patch: 
http://review.gluster.org/#/c/3842/).


Again, I state the above as this should not be exposing a 
race/bug/problem due to the multi-threaded nature of epoll, but of 
course it needs investigation.




I'm not sure if this patch is the cause or it has revealed some bug in
ec or any other xlator.


I guess we can reproduce this issue? If so, I would try setting 
client.event-threads to 1 on the master branch, restarting the volume, and 
then running the test (as a part of the test itself, maybe) to eliminate 
the possibility that MT epoll is causing it.
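(A sketch of that check; the volume name is a placeholder, and the test path assumes the usual layout of the source tree:)

# force a single client epoll thread, restart the volume, then re-run the test
gluster volume set <volname> client.event-threads 1
gluster volume stop <volname>      # confirm the prompt, or append 'force'
gluster volume start <volname>
prove -v tests/*/ec/nfs.t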


My belief that MT epoll is causing it is in doubt, as the runs failed on 
http://review.gluster.org/#/c/9488/ (the configuration patch), which had the 
thread count at 1 due to a bug in that code.




I can try to identify it (any help will be appreciated), but it may take
some time. Would it be better to remove the test in the meantime?


I am checking if this is reproducible on my machine, so that I can 
possibly see what is going wrong.


Shyam
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-11 Thread David F. Robinson
My base filesystem has 40-TB, and the tar takes 19 minutes. I copied over 10-TB, 
and that took the tar extraction from 1 minute to 7 minutes. 

My suspicion is that it is related to the number of files and not necessarily file 
size. Shyam is looking into reproducing this behavior on a Red Hat system. 

David  (Sent from mobile)

===
David F. Robinson, Ph.D. 
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

 On Feb 11, 2015, at 7:38 AM, Justin Clift jus...@gluster.org wrote:
 
 On 11 Feb 2015, at 12:31, David F. Robinson david.robin...@corvidtec.com 
 wrote:
 
 Some time ago I had a similar performance problem (with 3.4 if I remember 
 correctly): a just created volume started to work fine, but after some time 
 using it performance was worse. Removing all files from the volume didn't 
 improve the performance again.
 
I guess my problem is a little better, depending on how you look at it. If I 
delete the data from the volume, the performance goes back to that of an empty 
volume. I don't have to delete the .glusterfs entries to regain my 
performance. I only have to delete the data from the mount point.
 
 Interesting.  Do you have somewhat accurate stats on how much data (eg # of 
 entries, size
 of files) was in the data set that did this?
 
 Wondering if it's repeatable, so we can replicate the problem and solve. :)
 
 + Justin
 
 --
 GlusterFS - http://www.gluster.org
 
 An open source, distributed file system scaling to several
 petabytes, and handling thousands of clients.
 
 My personal twitter: twitter.com/realjustinclift
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Problems with ec/nfs.t in regression tests

2015-02-11 Thread Xavier Hernandez
Thanks for the information. I'll do some tests changing the number of 
threads for epoll.


Xavi

On 02/11/2015 04:20 PM, Shyam wrote:

On 02/11/2015 09:40 AM, Xavier Hernandez wrote:

Hi,

it seems that there are some failures in the ec/nfs.t test in the regression
tests. Doing some investigation, I've found that before applying the
multi-threaded patch (commit 5e25569e) the problem does not seem to
happen.


This has an interesting history of failures: on the regression runs for
the MT epoll patch, this test (i.e. ec/nfs.t) did not fail (there were other
failures, but not nfs.t).

The patch that allows configuration of MT epoll is where this started
failing, around Feb 5th (though it later passed); see the patchset 7 failures on
http://review.gluster.org/#/c/9488/ .

I state the above as it may help narrow down the changes in EC
(maybe) that could have caused it.

Also, in the latter commit there was an error configuring the number of
threads, so all regression runs would have run with a single epoll thread
(the MT epoll patch had this hard-coded, so that would have run with 2
threads, but did not show up the issue; patch:
http://review.gluster.org/#/c/3842/).

Again, I state the above as this should not be exposing a
race/bug/problem due to the multi-threaded nature of epoll, but of
course it needs investigation.



I'm not sure if this patch is the cause or it has revealed some bug in
ec or any other xlator.


I guess we can reproduce this issue? If so, I would try setting
client.event-threads to 1 on the master branch, restarting the volume, and
then running the test (as a part of the test itself, maybe) to eliminate
the possibility that MT epoll is causing it.

My belief that MT epoll is causing it is in doubt, as the runs failed on
http://review.gluster.org/#/c/9488/ (the configuration patch), which had the
thread count at 1 due to a bug in that code.



I can try to identify it (any help will be appreciated), but it may take
some time. Would it be better to remove the test in the meantime ?


I am checking if this is reproducible on my machine, so that I can
possibly see what is going wrong.

Shyam
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster 3.6.2 On Xeon Phi

2015-02-11 Thread Rudra Siva
Rafi,

I'm preparing the Phi RDMA patch for submission - definitely
performance is better with the buffer pre-registration fixes. My patch
will be without your fixes and doesn't rely on your enhancements so it
can come in at any time. There are two default values that Phi
generally has a problem with:

options->send_count = 4096;
options->recv_count = 4096;

Is there anything that relies on these values being at 4096? Presently
I have them set to 256 for the Phi to initialize quickly - it has been
working fine.


On Mon, Feb 9, 2015 at 7:23 AM, Rudra Siva rudrasiv...@gmail.com wrote:
 In rdma.c : gf_rdma_do_reads : pthread_mutex_lock
 (priv->write_mutex); - lock guards against what?


 On Mon, Feb 9, 2015 at 1:10 AM, Mohammed Rafi K C rkavu...@redhat.com wrote:

 On 02/08/2015 07:52 PM, Rudra Siva wrote:
 Thanks for trying and sending the changes - finally got it all working
 ... it turned out to be a problem with my changes (in
 gf_rdma_post_unref - goes back to lack of SRQ on the interface)

 You may be able to simulate the crash if you set volume parameters to
 something like the following (it would be purely academic):

 gluster volume set data_volume diagnostics.brick-log-level TRACE
 gluster volume set data_volume diagnostics.client-log-level TRACE

 Had those because stuff began from communication problems (queue size,
 lack of SRQ) so things have come a long way from there - will test for
 some more time and make my small changes available.

 The transfer speed of the default VE (Virtual Ethernet) that Intel
 ships with it is ~6 MB/sec - presently with Gluster I see around 80
 MB/sec on the virtual IB (there is no real infiniband card) and with a
 stable gluster mount. The interface benchmarks show it can give 5000
 MB/sec so there looks to be more room for improvement - stable gluster
 mount is required first though for doing anything.

 Questions:

 1. ctx is shared between posts - parts of code with locks and without
 - intentional/oversight?
 I didn't get your question properly. If you are talking about the ctx
 inside the post variable, it is not shared.

 2. iobuf_pool->default_page_size = 128 * GF_UNIT_KB - why is 128 KB
 chosen and not higher?
 For glusterfs the default page size is 128KB. Maybe because fuse is
 limited to 128KB. I'm not sure about the exact reason.


 -Siva


 On Fri, Feb 6, 2015 at 6:12 AM, Mohammed Rafi K C rkavu...@redhat.com 
 wrote:
 On 02/06/2015 05:31 AM, Rudra Siva wrote:
 Rafi,

 Sorry it took me some time - I had to merge these with some of my
 changes - the scif0 (iWARP) does not support SRQ (max_srq : 0) so have
 changed some of the code to use QP instead - can provide those if
 there is interest after this is stable.

 Here's the good -

 The performance with the patches is better than without (esp.
 http://review.gluster.org/#/c/9327/).
 Good to hear. My thought was that http://review.gluster.org/#/c/9506/ will
 give much better performance than the others :-). A rebase is needed
 if it is applied on top of the other patches.

 The bad - glusterfsd crashes for large files so it's difficult to get
 some decent benchmark numbers
 Thanks for raising the bug. I tried to reproduce the problem on the 3.6.2
 version plus the four patches with a simple distributed volume, but I
 couldn't reproduce it and am still trying. (We are using Mellanox IB
 cards.)

 If possible, can you please share the volume info and the workload used for
 the large files?
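 (To be concrete, something along these lines would help; the volume name, mount path, and file size below are placeholders, not the actual reproducer:)

 gluster volume info <volname>
 # plus the large-file workload, e.g. a single big sequential write over the mount
 dd if=/dev/zero of=/mnt/<volname>/largefile bs=1M count=4096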


 - small ones look good - trying to
 understand the patch at this time. Looks like this code comes from
 9327 as well.

 Can you please review the reset of mr_count?
 Yes, the problem could be a wrong value in mr_count. I guess we
 failed to reset the value to zero, so for some I/Os mr_count will be
 incremented a couple of extra times and the variable might have overflowed. Can
 you apply the patch attached to this mail and try with it?

 Info from gdb is as follows - if you need more or something jumps out
 please feel free to let me know.

 (gdb) p *post
 $16 = {next = 0x7fffe003b280, prev = 0x7fffe0037cc0, mr =
  0x7fffe0037fb0, buf = 0x7fffe0096000 "\005\004", buf_size = 4096, aux
 = 0 '\000',
   reused = 1, device = 0x7fffe00019c0, type = GF_RDMA_RECV_POST, ctx =
 {mr = {0x7fffe0003020, 0x7fffc8005f20, 0x7fffc8000aa0, 0x7fffc80030c0,
   0x7fffc8002d70, 0x7fffc8008bb0, 0x7fffc8008bf0, 0x7fffc8002cd0},
 mr_count = -939493456, vector = {{iov_base = 0x77fd6000,
 iov_len = 112}, {iov_base = 0x7fffbf14, iov_len = 131072},
  {iov_base = 0x0, iov_len = 0} <repeats 14 times>}, count = 2,
 iobref = 0x7fffc8001670, hdr_iobuf = 0x61d710, is_request = 0
 '\000', gf_rdma_reads = 1, reply_info = 0x0}, refcount = 1, lock = {
 __data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
 __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
  __size = '\000' <repeats 39 times>, __align = 0}}

 (gdb) bt
 #0  0x7fffe7142681 in __gf_rdma_register_local_mr_for_rdma
 (peer=0x7fffe0001800, vector=0x7fffe003b108, 

Re: [Gluster-devel] Gluster testing on CentOS CI

2015-02-11 Thread Karanbir Singh
Hi!

I'm not sub'd to the gluster-devel list, so unsure if my reply will make
it there.

On 10/02/15 13:23, Justin Clift wrote:
 Hi KB,
 
 We're interested in the offer to have our CI stuff (Jenkins jobs)
 run on the CentOS CI infrastructure.

woo!

 
 We have some initial questions, prior to testing things technically.
 
 *  Is there a document that describes the CentOS Jenkins setup
(like provisioning of the slaves)?

Not at the moment, it's something we're working on - however, there is
quite a lot of flexibility there. If there is a specific setup you need,
we can try to make it happen.

How are the tests run presently? Do you deploy a test environment, deploy a
jenkins-slave, and run the tests? Or are tests run from a central Jenkins
instance that uses a remote transport (like we use for some of the
CentOS distro CI, where the job worker runs the entire suite over ssh)?

 
 * Is it possible for Gluster Community members to log into a Jenkins
   slave and troubleshoot specific failures?

Short Answer : yes.

Long answer: when you set up with ci.centos.org, you can ask for any
number of people to be added in (ideally the list will be small, and
only trusted people). We store their ssh keys in the central management
software (duffy!). When tests run through and pass, we tear down the nodes
right away. However, when tests fail, we can inject the ssh keys for
those people, and they can then ssh in via the jump host. Now, if it's for
the baremetal nodes, ideally we'd want folks to clear out real quick
(it defaults to reinstalling the machine in 12 hrs, but that can be extended).
Alternatively, we can rsync the machine contents into a container and
ship that.

 
   eg if something fails a test, can one of the Community members log
   into the slave node to investigate (with root perms)

Yeah, the keys are set up for user root (note that we reprovision the
test environment after every test run finishes, and machine instances
are only running while there are tests running).

 * We also want to be sure that our Community members (developers) will
   have access in the Jenkins interface to do stuff for our jobs.

Fabian is looking at Jenkins Job Builder, which should give people most
of what they need - alternatively, we can help with plugin additions /
setup etc. as well. It's a multi-tenant setup, so we just need to make
sure everyone is being nice.

   eg create new jobs + editing existing ones (so our tests improve
   over time), rerunning failed jobs, and so on.  We don't need full
   admin access, but we do need the ability to do stuff for our jobs

Sure, that's easy. I believe there are also ways to get admin rights per
project (which can then have jobs under it) without needing admin on
the entire Jenkins setup. I've not used it in the past, but it might be
something we can investigate.

Also, I know you mentioned needing VMs from other distros, etc. - the
libvirt and libguestfs folks are also doing something around that, and
we should be able to host these for you (but we would, as you can imagine,
really prefer you testing with CentOS on baremetal).

regards,


-- 
Karanbir Singh, Project Lead, The CentOS Project
+44-207-0999389 | http://www.centos.org/ | twitter.com/CentOS
GnuPG Key : http://www.karan.org/publickey.asc
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Skip regression run for work in progress patch.

2015-02-11 Thread Emmanuel Dreyfus
On Thu, Feb 12, 2015 at 02:35:46AM -0500, Sachin Pandit wrote:
 Whenever we send out a patch for review, Jenkins parses the necessary
 information and triggers the regression build for the same. Do we have
 a mechanism (a keyword) which indicates Jenkins to skip the regression run.
 If not, can we have a mechanism to skip the regression run for work in
 progress patch.

What about committing with tests/* cleared? Or with a run-tests.sh that
starts with exit 0?
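A rough sketch of the second option, assuming run-tests.sh begins with a
shebang line (this is a per-patch hack, not project policy):

# make the regression script exit immediately for this WIP patch (GNU sed)
sed -i '1a exit 0' run-tests.sh
git add run-tests.sh
git commit --amend     # carry the no-op run-tests.sh in the patch sent for review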

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] GlusterFS 4.0 Call For Participation

2015-02-11 Thread Pranith Kumar Karampuri


On 02/10/2015 03:42 AM, Jeff Darcy wrote:

Interest in 4.0 seems to be increasing.  So is developer activity, but
all of the developers involved in 4.0 are stretched a bit thin.  As a
result, some sub-projects still don't have anyone who's working on them
often enough to make significant progress.  The full list is here:

http://www.gluster.org/community/documentation/index.php/Planning40

In particular, the following sub-projects could benefit from more
volunteers:

* Multi-network support

* Composite operations (small-file performance)
I was thinking of an xlator for doing something similar. I will be happy 
to do this part. Post Feb though, is that fine?


Pranith


* All of the other stuff except code generation

I'm not going to pretend that any of these will be easy to pick up, but
I'd be glad to work with any volunteers to establish the necessary
knowledge baseline.  If you want to get in early and make your mark on
the codebase that will eventually replace some of that hoary old 3.x
cruft, please respond here or let me know some other way.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] REMINDER: Weekly Gluster Community meeting today at 12:00 UTC

2015-02-11 Thread Niels de Vos

Hi all,

In a little more than one hour from now we will have the regular weekly
Gluster Community meeting.

Meeting details:
- location: #gluster-meeting on Freenode IRC
- date: every Wednesday
- time: 7:00 EST, 12:00 UTC, 13:00 CET, 17:30 IST
   (in your terminal, run: date -d "12:00 UTC")
- agenda:https://public.pad.fsfe.org/p/gluster-community-meetings

Currently the following items are listed:
* Roll Call
* Status of last weeks action items
* GlusterFS 3.6
* GlusterFS 3.5
* GlusterFS 3.4
* GlusterFS Next
* Open Floor

The last topic has space for additions. If you have a suitable topic to
discuss, please add it to the agenda.

Thanks,
Niels
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-11 Thread David F. Robinson
Don't think it is the underlying file system. /data/brickxx is the underlying 
xfs, and performance to this is fine. When I create a volume, it just puts the data 
in /data/brick/test2. The underlying filesystem shouldn't know/care that it is 
in a new directory. 

Also, if I create a /data/brick/test2 volume and put data on it, it gets slow 
in gluster. But, writing to /data/brick is still fine. And, after test2 gets 
slow, I can create a /data/test3 volume that is empty and its speed is fine. 

My knowledge is admittedly very limited here, but I don't see how it could be 
the underlying filesystem if the slowdown only occurs on the gluster mount and 
not on the underlying xfs filesystem. 

David  (Sent from mobile)

===
David F. Robinson, Ph.D. 
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

 On Feb 11, 2015, at 12:18 AM, Justin Clift jus...@gluster.org wrote:
 
 On 11 Feb 2015, at 03:06, Shyam srang...@redhat.com wrote:
 snip
 2) We ran an strace of tar and also collected io-stats outputs from these 
 volumes, both show that create and mkdir is slower on slow as compared to 
 the fast volume. This seems to be the overall reason for slowness
 
 Any ideas on why the create and mkdir are slower?
 
 Wondering if it's a case of underlying filesystem parameters (for the bricks)
 + maybe the physical storage structure having become badly optimised over time,
 e.g. if it's on spinning rust, not ssd, and sector placement is now bad.
 
 Any idea if there are tools that can analyse this kind of thing? e.g. metadata
 placement / fragmentation on a drive for XFS/ext4
 
 + Justin
 
 --
 GlusterFS - http://www.gluster.org
 
 An open source, distributed file system scaling to several
 petabytes, and handling thousands of clients.
 
 My personal twitter: twitter.com/realjustinclift
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] missing files

2015-02-11 Thread Xavier Hernandez
Some time ago I had a similar performance problem (with 3.4, if I 
remember correctly): a just-created volume started out working fine, but 
after some time using it the performance got worse. Removing all files from 
the volume didn't improve the performance again.


The only way I found to recover performance similar to the initial one, 
without recreating the volume, was to remove all volume contents and also 
delete all 256 .glusterfs/xx/ directories from all bricks.
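Roughly, the cleanup looked like the sketch below (destructive, and only after
the volume contents had already been removed; the brick path is a placeholder):

# remove the 256 two-hex-digit gfid directories under .glusterfs on each brick
for d in /data/<brick>/<volname>/.glusterfs/[0-9a-f][0-9a-f]; do
    rm -rf "$d"
done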


The backend filesystem was XFS.

Could you check if this is the same case?

Xavi

On 02/11/2015 12:22 PM, David F. Robinson wrote:

Don't think it is the underlying file system. /data/brickxx is the underlying 
xfs. Performance to this is fine. When I created a volume it just puts the data 
in /data/brick/test2. The underlying filesystem shouldn't know/care that it is 
in a new directory.

Also, if I create a /data/brick/test2 volume and put data on it, it gets slow 
in gluster. But, writing to /data/brick is still fine. And, after test2 gets 
slow, I can create a /data/test3 volume that is empty and its speed is fine.

My knowledge is admittedly very limited here, but I don't see how it could be 
the underlying filesystem if the slowdown only occurs on the gluster mount and 
not on the underlying xfs filesystem.

David  (Sent from mobile)

===
David F. Robinson, Ph.D.
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com


On Feb 11, 2015, at 12:18 AM, Justin Clift jus...@gluster.org wrote:


On 11 Feb 2015, at 03:06, Shyam srang...@redhat.com wrote:
snip
2) We ran an strace of tar and also collected io-stats outputs from these 
volumes, both show that create and mkdir is slower on slow as compared to the 
fast volume. This seems to be the overall reason for slowness


Any idea's on why the create and mkdir is slower?

Wondering if it's a case of underlying filesystem parameters (for the bricks)
+ maybe physical storage structure having become badly optimised over time.
eg if its on spinning rust, not ssd, and sector placement is now bad

Any idea if there are tools that can analyse this kind of thing?  eg meta
data placement / fragmentation / on a drive for XFS/ext4

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Minutes of todays Gluster Community meeting

2015-02-11 Thread Niels de Vos
On Wed, Feb 11, 2015 at 11:43:49AM +0100, Niels de Vos wrote:
 Hi all,
 
 In a little more than one hour from now we will have the regular weekly
 Gluster Community meeting.
 
 Meeting details:
 - location: #gluster-meeting on Freenode IRC
 - date: every Wednesday
 - time: 7:00 EST, 12:00 UTC, 13:00 CET, 17:30 IST
(in your terminal, run: date -d "12:00 UTC")
 - agenda:https://public.pad.fsfe.org/p/gluster-community-meetings
 
 Currently the following items are listed:
 * Roll Call
 * Status of last weeks action items
 * GlusterFS 3.6
 * GlusterFS 3.5
 * GlusterFS 3.4
 * GlusterFS Next
 * Open Floor
 
 The last topic has space for additions. If you have a suitable topic to
 discuss, please add it to the agenda.

We ran a little long today, but we had great participation and several
interesting discussions.

You can find the meeting minutes in the links below, but the text format
is also included for your convenience.

Minutes: 
http://meetbot.fedoraproject.org/gluster-meeting/2015-02-11/gluster-meeting.2015-02-11-12.01.html
Minutes (text): 
http://meetbot.fedoraproject.org/gluster-meeting/2015-02-11/gluster-meeting.2015-02-11-12.01.txt
Log: 
http://meetbot.fedoraproject.org/gluster-meeting/2015-02-11/gluster-meeting.2015-02-11-12.01.log.html


Meeting summary
---
* Roll Call  (ndevos, 12:02:21)

* Action Items from last week  (ndevos, 12:05:17)
  * Subtopic: ndevos should publish an article on his blog  (ndevos,
12:05:31)
  * Subtopic: hchiramm will try to fix the duplicate syndication of
posts from ndevos  (ndevos, 12:05:58)
  * Subtopic: hchiramm will start a discussion on the mailinglist about
the RHEL/Community packaging issues and solutions  (ndevos,
12:06:27)
  * ACTION: hchiramm will share the outcome of the non-mailinglist
packaging discussions on the mailinglist  (ndevos, 12:08:52)
  * Subtopic: hagarth to open a feature page for (k)vm hyperconvergence
(ndevos, 12:09:16)
  * ACTION: hchiramm should keep the Gluster Community Board in the loop
on the outcome of the packaging discussions  (ndevos, 12:13:14)
  * Subtopic: keep the Gluster Community Board more informed and
involved  (ndevos, 12:14:38)
  * ACTION: *all*moderators* should post meeting minutes to the Gluster
Community Board bo...@gluster.org  (ndevos, 12:17:17)
  * LINK: http://www.gluster.org/mailman/listinfo/board   (JustinClift,
12:17:26)
  * Subtopic: spot to investigate repetitive posts on social networking
sites  (ndevos, 12:17:54)
  * AGREED: drop the topic from the list, if the issue happens again,
file a bug against the project-infrastructure component in Bugzilla
(ndevos, 12:20:36)
  * Subtopic: spot to reach out to community about website messaging
(ndevos, 12:21:59)
  * Subtopic: hagarth to carry forward discussion on automated builds
for various platforms in gluster-infra ML  (ndevos, 12:23:17)

* Gluster 3.6  (ndevos, 12:26:04)
  * ACTION: raghu` will send a notification about 3.6.3beta1 to the
mailinglists  (ndevos, 12:33:12)

* Gluster 3.5  (ndevos, 12:35:15)
  * ACTION: ndevos to prepare for the next 3.5 beta/update somewhere
next week  (ndevos, 12:36:54)

* Gluster 3.4  (ndevos, 12:37:38)

* Gluster Next  (ndevos, 12:40:16)
  * Subtopic: Gluster 3.7  (ndevos, 12:40:26)
  * Subtopic: Gluster 4.0  (ndevos, 12:49:27)
  * LINK:

http://meetbot.fedoraproject.org/gluster-meeting/2015-02-06/glusterfs_4.0.2015-02-06-12.05.html
(jdarcy, 12:51:14)

* Open Floor  (ndevos, 12:53:40)
  * Subtopic: Maintainer responsibilities  (ndevos, 12:53:57)
  * ACTION: ndevos should send out a reminder about Maintainer
responsibilities to the -devel list  (ndevos, 12:55:59)
  * Subtopic: Upcoming talks at conference  (ndevos, 12:56:24)
  * Subtopic: GSOC 2015  (ndevos, 12:57:09)
  * ACTION: JustinClift should follow up on the GSOC email  (ndevos,
13:00:07)
  * Subtopic: Qemu / Glusterfs integration  (ndevos, 13:01:10)
  * ACTION: telmich will send an email to the gluster-users list about
Gluster support in QEMU on Debian/Ubuntu  (ndevos, 13:06:44)
  * ACTION: jimjag to engage the board, asking for their direction and
input for both 3.7, and 4.0 releases  (JustinClift, 13:10:00)

Meeting ended at 13:14:27 UTC.




Action Items

* hchiramm will share the outcome of the non-mailinglist packaging
  discussions on the mailinglist
* hchiramm should keep the Gluster Community Board in the loop on the
  outcome of the packaging discussions
* *all*moderators* should post meeting minutes to the Gluster Community
  Board bo...@gluster.org
* raghu` will send a notification about 3.6.3beta1 to the mailinglists
* ndevos to prepare for the next 3.5 beta/update somewhere next week
* ndevos should send out a reminder about Maintainer responsibilities to
  the -devel list
* JustinClift should follow up on the GSOC email
* telmich will send an email to the gluster-users list about Gluster
  support in QEMU on Debian/Ubuntu
* jimjag to engage the 

Re: [Gluster-devel] GlusterFS 4.0 Meeting Time Survey

2015-02-11 Thread Jeff Darcy
So far the winner seems to be Thursday@11, with Friday@11 close behind.
Every other time except for Friday@12 has at least one person who can't
make it.  Even though it's 6am my time, I'm going to propose Thursday@11
starting on February 26.  Last call for objections...


- Original Message -
 The inaugural GlusterFS 4.0 meeting on Friday was a great success.
 Thanks to all who attended.  Minutes are here:
 
 http://meetbot.fedoraproject.org/gluster-meeting/2015-02-06/glusterfs_4.0.2015-02-06-12.05.html
 
 One action item was to figure out when IRC meetings (#gluster-meeting on
 Freenode) should occur.  Meetings are likely to be every other week,
 starting February 26 or 27.  Obviously, it would be good to pick a time
 that gets us maximum participation.  If you would be interested in
 attending, please let me know:
 
 * Which day - Thursday or Friday
 
 * What time - 11:00, 12:00, or 13:00 UTC
 
 Please try to keep in mind that our two main constituencies are in IST
 (+0530) and EST (-0500).  With at least one developer in Europe and none
 so far on the US west coast, this is as close to a sweet spot as we
 can get.  If we do pick up developers in time zones where this isn't
 sufficient, we could rotate or I would be willing to run two meetings
 per cycle on opposite schedules; let's cross that bridge when we get
 there.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] GlusterFS 4.0 Meeting Time Survey

2015-02-11 Thread Niels de Vos
On Mon, Feb 09, 2015 at 05:34:04PM -0500, Jeff Darcy wrote:
 The inaugural GlusterFS 4.0 meeting on Friday was a great success.
 Thanks to all who attended.  Minutes are here:
 
 http://meetbot.fedoraproject.org/gluster-meeting/2015-02-06/glusterfs_4.0.2015-02-06-12.05.html
 
 One action item was to figure out when IRC meetings (#gluster-meeting on
 Freenode) should occur.  Meetings are likely to be every other week,
 starting February 26 or 27.  Obviously, it would be good to pick a time
 that gets us maximum participation.  If you would be interested in
 attending, please let me know:
 
 * Which day - Thursday or Friday
 
 * What time - 11:00, 12:00, or 13:00 UTC

Thursday at 13:00 would work, other times not so much.
Any of the proposed times on Friday are OK.

Thanks,
Niels

 
 Please try to keep in mind that our two main constituencies are in IST
 (+0530) and EST (-0500).  With at least one developer in Europe and none
 so far on the US west coast, this is as close to a sweet spot as we
 can get.  If we do pick up developers in time zones where this isn't
 sufficient, we could rotate or I would be willing to run two meetings
 per cycle on opposite schedules; let's cross that bridge when we get
 there.
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel


pgpXzbs8YA1L4.pgp
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Skip regression run for work in progress patch.

2015-02-11 Thread Sachin Pandit
Hi,

Whenever we send out a patch for review, Jenkins parses the necessary
information and triggers the regression build for it. Do we have
a mechanism (a keyword) which tells Jenkins to skip the regression run?
If not, can we have a mechanism to skip the regression run for
work-in-progress patches?

If any developer thinks that the patch he is sending out is not ready, but
he wants to send the patch anyhow so that his peers can take a look,
then he can add WIP or Work in Progress at the start of the commit
message of the patch. Looking at that keyword, Jenkins can skip the regression
run.
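As a sketch, the job-side check could be as simple as the snippet below
(hypothetical, not existing Jenkins configuration):

# first step of the regression job: bail out early for WIP patches
if git log -1 --pretty=%s | grep -qiE '^(WIP|Work in progress)'; then
    echo "Work-in-progress patch, skipping regression run"
    exit 0
fi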

Please provide your valuable inputs.

Best regards,
Sachin Pandit.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel