Re: [Gluster-devel] Release 4.0: Branched
On 01/23/2018 03:17 PM, Shyam Ranganathan wrote:
> 4.0 release has been branched!
>
> I will follow this up with a more detailed schedule for the release, and
> also the granted feature backport exceptions that we are waiting on.
>
> Feature backports would need to make it in by this weekend, so that we
> can tag RC0 by the end of the month.

Backports need to be ready for merge on or before Jan 29th, 2018, 3:00 PM Eastern TZ.

Features that requested, and hence are granted, backport exceptions are as follows:

1) Dentry fop serializer xlator on brick stack
   https://github.com/gluster/glusterfs/issues/397
   @Du, please backport this to the 4.0 branch, as the patch in master is merged.

2) Leases support on GlusterFS
   https://github.com/gluster/glusterfs/issues/350
   @Jiffin and @ndevos, there is one patch pending against master,
   https://review.gluster.org/#/c/18785/ -- please do the needful and backport
   this to the 4.0 branch.

3) Data corruption in write ordering of rebalance and application writes
   https://github.com/gluster/glusterfs/issues/308
   @susant, @du, if we can conclude on the strategy here, please backport as needed.

4) A couple of patches that are tracked for a backport:
   https://review.gluster.org/#/c/19223/
   https://review.gluster.org/#/c/19267/ (prep for ctime changes in later releases)

Other features discussed are not in scope for a backport to 4.0. If you asked
for one and do not see it in this list, shout out!

> Only exception could be: https://review.gluster.org/#/c/19223/
>
> Thanks,
> Shyam

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel
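[Editor's note: for contributors new to the backport flow described above, the mechanics are essentially a cherry-pick of the merged master commit onto the release branch. Below is a minimal sketch in a throwaway repo; the branch and commit names are illustrative, and in the real workflow the resulting commit is pushed to Gerrit for review rather than committed directly.]

```shell
#!/bin/sh
# Minimal backport sketch in a throwaway repo (names are illustrative).
# The fix lands on the default branch first, then is cherry-picked onto
# release-4.0 with -x so the commit message records the original SHA.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name "Dev"

echo base > file.c
git add file.c
git commit -qm "initial code"

git branch release-4.0                 # release branch forks here

echo fix >> file.c
git add file.c
git commit -qm "fix: merged on master"
sha=$(git rev-parse HEAD)

git checkout -q release-4.0
git cherry-pick -x "$sha"              # the backport; -x records the origin SHA
git log -1 --format=%B                 # message now ends with "(cherry picked from commit ...)"
```

The `-x` flag matters for release tracking: it lets anyone map a release-branch commit back to the master commit it came from.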
Re: [Gluster-devel] Regression tests time
On Thu, Jan 25, 2018 at 3:03 PM, Jeff Darcy wrote:
>
> On Wed, Jan 24, 2018, at 9:37 AM, Xavi Hernandez wrote:
> > That happens when we use arbitrary delays. If we use an explicit check,
> > it will work on all systems.
>
> You're arguing against a position not taken. I'm not expressing opposition
> to explicit checks. I'm just saying they don't come for free. If you don't
> believe me, try adding explicit checks in some of the harder cases where
> we're waiting for something that's subject to OS scheduling delays, or for
> large numbers of operations to complete. Geo-replication or multiplexing
> tests should provide some good examples. Adding explicit conditions is the
> right thing to do in the abstract, but as a practical matter the returns
> must justify the cost.
>
> BTW, some of our longest-running tests are in EC. Do we need all of those,
> and do they all need to run as long, or could some be eliminated/shortened?

Some tests were already removed some time ago. Anyway, with the changes
introduced, it takes between 10 and 15 minutes to execute all ec related
tests from basic/ec and bugs/ec (an average of 16 to 25 seconds per test).
Before the changes, the same tests were taking between 30 and 60 minutes.
AFR tests have also improved from almost 60 minutes to around 30.

> > I agree that parallelizing tests is the way to go, but if we reduce the
> > total time to 50%, the parallelized tests will also take 50% less of
> > the time.
>
> Taking 50% less time but failing spuriously 1% of the time, or all of the
> time in some environments, is not a good thing. If you want to add explicit
> checks that's great, but you also mentioned shortening timeouts and that's
> much more risky.

If we have a single test that takes 45 minutes (as we currently have in some
executions: bugs/nfs/bug-1053579.t), parallelization won't help much. We need
to make this test run faster.
Some tests that were failing after the changes have revealed errors in the
tests themselves or even in the code, so I think it's a good thing.

Currently I'm investigating what seems to be a race in the rpc layer during
connections that causes some tests to fail. This is a real problem that high
delays or slow machines were hiding. It seems to cause some gluster requests
to fail spuriously after reconnecting to a brick or glusterd. I'm not 100%
sure about this yet, but initial analysis seems to indicate that.

Xavi
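[Editor's note: the "explicit check instead of arbitrary delay" pattern debated in this thread can be sketched as below. The `wait_for` helper name is illustrative; the GlusterFS test framework's own `EXPECT_WITHIN` plays the same role in real .t tests.]

```shell
#!/bin/sh
# Sketch: replace a fixed sleep with a bounded poll. The helper name is
# illustrative, not from the GlusterFS test framework.
# wait_for TIMEOUT CMD...: succeed as soon as CMD does, fail after TIMEOUT.
wait_for() {
    timeout=$1; shift
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Instead of "sleep 20" and hoping the event happened, poll for it.
# Here a background job creates a marker file after ~2 seconds:
marker=$(mktemp -u)
( sleep 2; touch "$marker" ) &
wait_for 10 test -e "$marker" && echo "condition met"
rm -f "$marker"
```

The win is twofold: on fast machines the test proceeds as soon as the condition holds (seconds, not the full timeout), and on slow machines it keeps waiting up to the bound instead of failing spuriously, which is exactly the trade-off being discussed above.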
Re: [Gluster-devel] Regression tests time
On Wed, Jan 24, 2018, at 9:37 AM, Xavi Hernandez wrote:
> That happens when we use arbitrary delays. If we use an explicit
> check, it will work on all systems.

You're arguing against a position not taken. I'm not expressing opposition
to explicit checks. I'm just saying they don't come for free. If you don't
believe me, try adding explicit checks in some of the harder cases where
we're waiting for something that's subject to OS scheduling delays, or for
large numbers of operations to complete. Geo-replication or multiplexing
tests should provide some good examples. Adding explicit conditions is the
right thing to do in the abstract, but as a practical matter the returns
must justify the cost.

BTW, some of our longest-running tests are in EC. Do we need all of those,
and do they all need to run as long, or could some be eliminated/shortened?

> I agree that parallelizing tests is the way to go, but if we reduce
> the total time to 50%, the parallelized tests will also take 50% less
> of the time.

Taking 50% less time but failing spuriously 1% of the time, or all of the
time in some environments, is not a good thing. If you want to add explicit
checks that's great, but you also mentioned shortening timeouts and that's
much more risky.
[Gluster-devel] Glusto failures with dispersed volumes + Samba
Hi gluster-devel experts,

I've stumbled upon the very same issue/observations as the one mentioned in
http://lists.gluster.org/pipermail/gluster-devel/2017-July/053234.html.
Nigel Babu and Pranith Kumar Karampuri told me that this selinux related
issue had also been fixed in >= v3.13.0.

The test setup is as follows:

OS: OpenSUSE Leap 42.3 (selinux features installed/enabled)
SW: gluster 3.13.1

- 3 gluster nodes (gnode[1,2,3], with their own vlan for gluster communication)
- 1 virtualization server (snode) on which the test samba server instance is
  running under KVM/QEMU. The image is provided by the gluster volume vmvol.
  The gluster volume vol1 is for the samba share.

snode:~ # ssh gnode1 gluster vol info all

Volume Name: vmvol
Type: Replicate
Volume ID: a03b8fc1-4fcb-4268-bf09-0f554ba5e7a5
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gs-gnode1:/data/glusterfs/vmvol/brick1/brick
Brick2: gs-gnode2:/data/glusterfs/vmvol/brick1/brick
Brick3: gs-gnode3:/data/glusterfs/vmvol/brick1/brick
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.low-prio-threads: 32
network.remote-dio: enable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 1
features.shard: on
user.cifs: off
server.allow-insecure: on
storage.owner-uid: 500
storage.owner-gid: 500

Volume Name: vol1
Type: Disperse
Volume ID: fb081b58-bffc-4ddd-bf62-a87a13abec9b
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: gs-gnode1:/data/glusterfs/vol1/brick1/brick
Brick2: gs-gnode2:/data/glusterfs/vol1/brick1/brick
Brick3: gs-gnode3:/data/glusterfs/vol1/brick1/brick
Brick4: gs-gnode1:/data/glusterfs/vol1/brick2/brick
Brick5: gs-gnode2:/data/glusterfs/vol1/brick2/brick
Brick6: gs-gnode3:/data/glusterfs/vol1/brick2/brick
Options Reconfigured:
performance.readdir-ahead: on
storage.batch-fsync-delay-usec: 0
performance.stat-prefetch: off
nfs.disable: on
transport.address-family: inet
server.allow-insecure: on

The samba server config is:

samba:~ # cat /etc/samba/smb.conf
[global]
        workgroup = TESTGROUP
        server string = Samba Server Version %v
        log file = /var/log/samba/log.%m
        log level = 2
        map to guest = Bad User
        unix charset = UTF-8
        idmap config * : backend = autorid
        idmap config * : range = 100-199

[gluster]
        kernel share modes = no
        vfs objects = acl_xattr glusterfs
        comment = glusterfs based volume
        browseable = Yes
        read only = No
        writeable = Yes
        public = Yes
        guest ok = Yes
        inherit acls = Yes
        path = /shares/data/
        glusterfs:volume = vol1
        glusterfs:volfile_server = gs-gnode1.origenis.de gs-gnode2.origenis.de gs-gnode3.origenis.de
        # glusterfs:volfile_server = 172.17.20.1 172.17.20.2 172.17.20.3
        glusterfs:loglevel = 9
        glusterfs:logfile = /var/log/samba/glusterfs-vol1.%M.log

The corresponding logs on the samba server are:

* The vfs_gluster log shows lots of:

[2018-01-15 15:27:49.349995] D [logging.c:1817:__gf_log_inject_timer_event] 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
[2018-01-15 15:27:49.351598] D [rpc-clnt.c:1047:rpc_clnt_connection_init] 0-gfapi: defaulting frame-timeout to 30mins
[2018-01-15 15:27:49.351625] D [rpc-clnt.c:1061:rpc_clnt_connection_init] 0-gfapi: disable ping-timeout
[2018-01-15 15:27:49.351644] D [rpc-transport.c:279:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/3.13.1/rpc-transport/socket.so
[2018-01-15 15:27:49.352372] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-gfapi: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2018-01-15 15:27:49.352402] T [MSGID: 0] [options.c:86:xlator_option_validate_int] 0-gfapi: no range check required for 'option remote-port 24007'
[2018-01-15 15:27:49.354865] D [socket.c:4236:socket_init] 0-gfapi: Configued transport.tcp-user-timeout=0
[2018-01-15 15:27:49.354885] D [socket.c:4254:socket_init] 0-gfapi: Reconfigued transport.keepalivecnt=9
[2018-01-15 15:27:49.354901] D [socket.c:4339:socket_init] 0-gfapi: SSL support on the I/O path is NOT enabled
[2018-01-15 15:27:49.354914] D [socket.c:4342:socket_init] 0-gfapi: SSL support for glusterd is NOT enabled
[2018-01-15 15:27:49.354926] D [socket.c:4359:socket_init] 0-gfapi: using system polling thread
[2018-01-15 15:27:49.354943] D [rpc-clnt.c:1567:rpcclnt_cbk_program_register] 0-gfapi: New program registered: GlusterFS Callback, Num: 52743234, Ver: 1
[2018-01-15 15:27:49.354958] T [rpc-clnt.c:406:rpc_clnt_reconnect] 0-gfapi: attempting reconnect
[2018-01-15 15:27:49.354971] T [socket.c:3146:socket_connect] 0-gfapi: connecting
[Gluster-devel] Coverity covscan for 2018-01-25-b7844629 (master branch)
GlusterFS Coverity covscan results are available from
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-01-25-b7844629
Re: [Gluster-devel] [FAILED][master] tests/basic/afr/durability-off.t
On Thu, Jan 25, 2018 at 3:09 PM, Milind Changire wrote:
> could AFR engineers check why tests/basic/afr/durability-off.t fails in
> brick-mux mode;

The issue seems to be something with connections to the bricks at the time of
mount:

*09:30:04* dd: opening `/mnt/glusterfs/0/a.txt': Transport endpoint is not connected
*09:30:10* ./tests/basic/afr/durability-off.t ..

> here's the job URL:
> https://build.gluster.org/job/centos6-regression/8654/console
>
> --
> Milind

-- 
Pranith
[Gluster-devel] [FAILED][master] tests/basic/afr/durability-off.t
could AFR engineers check why tests/basic/afr/durability-off.t fails in
brick-mux mode; here's the job URL:
https://build.gluster.org/job/centos6-regression/8654/console

-- 
Milind
Re: [Gluster-devel] [Gluster-Maintainers] Release 4.0: Branched
Shyam,

We need to have a 4.0 version created in Bugzilla for GlusterFS, which is
currently missing. I have a patch to backport into this branch.

On Wed, Jan 24, 2018 at 1:47 AM, Shyam Ranganathan wrote:
> 4.0 release has been branched!
>
> I will follow this up with a more detailed schedule for the release, and
> also the granted feature backport exceptions that we are waiting on.
>
> Feature backports would need to make it in by this weekend, so that we
> can tag RC0 by the end of the month.
>
> Only exception could be: https://review.gluster.org/#/c/19223/
>
> Thanks,
> Shyam