Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down for stabilization (unlocking the same)
On Mon, Aug 13, 2018 at 10:55 PM Shyam Ranganathan wrote:
> On 08/13/2018 02:20 AM, Pranith Kumar Karampuri wrote:
> > - At the end of 2 weeks, reassess master and nightly test status, and
> > see if we need another drive towards stabilizing master by locking down
> > the same and focusing only on test and code stability around the same.
> >
> > When will there be a discussion about coming up with guidelines to
> > prevent lock down in future?
>
> A thread for the same is started in the maintainers list.

Could you point me to the thread, please? I am only finding a thread with the subject "Lock down period merge process".

> > I think it is better to lock-down specific components by removing commit
> > access for the respective owners for those components when a test in a
> > particular component starts to fail.
>
> Also I suggest we move this to the maintainers thread, to keep the noise
> levels across lists in check.
>
> Thanks,
> Shyam

--
Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] [regression tests] seeing files from previous test run
All,

I was consistently seeing failures for the test
https://review.gluster.org/#/c/glusterfs/+/20639/12/tests/bugs/readdir-ahead/bug-1390050.t

    TEST glusterfs --volfile-server=$H0 --volfile-id=$V0 $M0
    rm -rf $M0/*
    TEST mkdir -p $DIRECTORY
    #rm -rf $DIRECTORY/*
    TEST touch $DIRECTORY/file{0..10}
    EXPECT "0" stat -c "%s" $DIRECTORY/file4
    #rdd_tester="$(dirname $0)/rdd-tester"
    TEST build_tester $(dirname $0)/bug-1390050.c -o $(dirname $0)/rdd-tester
    TEST $(dirname $0)/rdd-tester $DIRECTORY $DIRECTORY/file4

However, if I uncomment the line "rm -rf $DIRECTORY/*", the test succeeds. I also added a sleep just after "mkdir -p $DIRECTORY" and manually checked the directory: it turns out there are files left from the previous run. So it looks like files left over from previous runs are causing the failures. Are there any changes to the cleanup sequence which could have caused this?

regards,
Raghavendra
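The failure mode Raghavendra describes can be reproduced outside the test framework: `touch` on a pre-existing file does not truncate it, so residue from an earlier run survives into the size check. A minimal sketch in plain shell (DIR is a stand-in for the test's $DIRECTORY; this is not the actual gluster test harness):

```shell
#!/bin/sh
# Sketch: stale files from a previous run break the size check,
# because "touch" on an existing file does not truncate it.
DIR=$(mktemp -d)

# Residue from a "previous run": file4 already exists with content.
echo stale > "$DIR/file4"

# The test re-creates its files with touch, which leaves the old
# contents in place...
touch "$DIR/file4"
S1=$(stat -c "%s" "$DIR/file4")   # 6 bytes of residue, not 0

# ...so the directory must be cleared before the files are created.
rm -rf "$DIR"/*
touch "$DIR/file4"
S2=$(stat -c "%s" "$DIR/file4")   # now 0, as EXPECTed

echo "before cleanup: $S1, after cleanup: $S2"
rm -rf "$DIR"
```

This is why uncommenting the "rm -rf $DIRECTORY/*" line makes the test pass regardless of what the harness-level cleanup did.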
Re: [Gluster-devel] Master branch lock down: RCA for tests (UNSOLVED bug-1110262.t)
On 08/12/2018 08:42 PM, Shyam Ranganathan wrote:
> As a means of keeping the focus going and squashing the remaining tests
> that were failing sporadically, request each test/component owner to,
>
> - respond to this mail changing the subject (testname.t) to the test
> name that they are responding to (adding more than one in case they have
> the same RCA)
> - with the current RCA and status of the same
>
> List of tests and current owners as per the spreadsheet that we were
> tracking are:
>
> ./tests/bugs/bug-1110262.t    TBD

The above test fails as follows.

Run: https://build.gluster.org/job/line-coverage/427/consoleFull
Log snippet (retried and passed, so no further logs):

18:50:33 useradd: user 'dev' already exists
18:50:33 not ok 13 , LINENUM:42
18:50:33 FAILED COMMAND: useradd dev
18:50:33 groupadd: group 'QA' already exists
18:50:33 not ok 14 , LINENUM:43
18:50:33 FAILED COMMAND: groupadd QA

Basically, the user/group already existed and hence the test failed.

I tried getting to the build history of the machine that failed this test, in an effort to understand which previous run failed, but Jenkins has not been cooperative. One other test case, tests/bugs/bug-1584517.t, uses the same user and group names, but it runs after this test. So I do not know how that user and group name leaked, causing this test case to fail.

Bug filed: https://bugzilla.redhat.com/show_bug.cgi?id=1615604

Shyam
Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down: RCA for tests (UNSOLVED ./tests/basic/stats-dump.t)
On Mon, Aug 13, 2018 at 02:32:19PM -0400, Shyam Ranganathan wrote:
> On 08/12/2018 08:42 PM, Shyam Ranganathan wrote:
> > As a means of keeping the focus going and squashing the remaining tests
> > that were failing sporadically, request each test/component owner to,
> >
> > - respond to this mail changing the subject (testname.t) to the test
> > name that they are responding to (adding more than one in case they have
> > the same RCA)
> > - with the current RCA and status of the same
> >
> > List of tests and current owners as per the spreadsheet that we were
> > tracking are:
> >
> > ./tests/basic/stats-dump.t    TBD
>
> This test fails as follows:
>
> 01:07:31 not ok 20 , LINENUM:42
> 01:07:31 FAILED COMMAND: grep .queue_size
> /var/lib/glusterd/stats/glusterfsd__d_backends_patchy1.dump
>
> 18:35:43 not ok 21 , LINENUM:43
> 18:35:43 FAILED COMMAND: grep .queue_size
> /var/lib/glusterd/stats/glusterfsd__d_backends_patchy2.dump
>
> Basically when grep'ing for a pattern in the stats dump it is not
> finding the second grep pattern of "queue_size" in one or the other bricks.
>
> The above seems incorrect, if it found "aggr.fop.write.count" it stands
> to reason that it found a stats dump, further there is a 2 second sleep
> as well in the test case and the dump interval is 1 second.
>
> The only reason for this to fail could hence possibly be that the file
> was just (re)opened (by the io-stats dumper thread) for overwriting
> content, at which point the fopen uses the mode "w+", and the file was
> hence truncated, and the grep CLI also opened the file at the same time,
> and hence found no content.

This sounds like a dangerous approach in any case. Truncating a file while there are potential other readers should probably not be done; I wonder if there is a good reason for it. A safer solution would be to create a new temporary file, write the stats to that, and once done rename it to the expected filename. Any process reading from the 'old' file will still have its file descriptor open and can continue to read the previous, consistent contents.

Niels
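Niels' temp-file-plus-rename suggestion can be sketched in shell (a generic illustration of the pattern, not the actual io-stats code): because a rename within one filesystem is atomic, a concurrent grep either opens the complete old dump or the complete new one, never a half-written or freshly truncated file.

```shell
#!/bin/sh
# Sketch of write-to-temp-then-rename. DUMP stands in for the
# io-stats dump file path; the printf stands in for the dumper
# writing out a full stats snapshot.
DIR=$(mktemp -d)
DUMP="$DIR/stats.dump"

write_stats() {
    # Write the complete new contents to a temporary file first...
    printf 'aggr.fop.write.count=42\nqueue_size=7\n' > "$DUMP.tmp"
    # ...then atomically move it into place. A reader that already
    # has the old file open keeps its (consistent) contents; a new
    # reader only ever sees a finished file.
    mv -f "$DUMP.tmp" "$DUMP"
}

write_stats
FOUND=$(grep -c queue_size "$DUMP")
echo "matches: $FOUND"
rm -rf "$DIR"
```

The same pattern is implemented with rename(2) in C; the key property is that no path ever names a partially written file.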
[Gluster-devel] Glusterfs v4.1.1 issue encountered while executing test case ./tests/basic/namespace.t
Hi,

I am working on Glusterfs v4.1.1 for Ubuntu 16.04 on a big-endian architecture. After a successful build, running the test suite produced one failure:

- ./tests/basic/namespace.t

In this test, a NAMESPACE_HASH is generated by calling the SuperFastHash() function on the corresponding folder names. This hash differs between big-endian and little-endian architectures, so I changed the code accordingly. However, another subtest still fails with the following error:

TEST 30 (line 119): Y check_samples CREATE 1268089390 /namespace3/file patchy0
getfattr: /d/backends/patchy0/namespace3/file: No such file or directory

As seen above, the error occurs because the folder /d/backends/patchy0/namespace3/ does not contain "file". I can make this subtest pass by changing the folder to /d/backends/patchy6/namespace3/, where "file" is actually present. On little-endian architectures, however, the test passes without any changes. The filesystem type of /d/backends is ext4 and there is enough space allocated to the directory.

Could you please provide some more insight into why this is happening?

Thanks and Regards,
Abhay Singh
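The symptom described above is what one would expect from a hash that consumes its input as multi-byte words: the same byte sequence yields different word values, and hence different hash results, on big- and little-endian hosts. A tiny illustration of the underlying effect using shell arithmetic (this demonstrates the principle only, not SuperFastHash itself):

```shell
#!/bin/sh
# The byte sequence 01 02 03 04 read as a single 32-bit word yields
# different values depending on host byte order. A hash that reads
# its input as 16- or 32-bit words therefore produces different
# results on big- and little-endian machines unless it normalizes
# the byte order first.
LE=$(( 0x04030201 ))   # value seen by a little-endian reader
BE=$(( 0x01020304 ))   # value seen by a big-endian reader
echo "little-endian: $LE"
echo "big-endian:    $BE"
```

Because the test hard-codes hash values (and hence which brick a file lands on), a byte-order-dependent hash shifts files to a different brick on big-endian hosts, matching the patchy0-vs-patchy6 observation.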
Re: [Gluster-devel] Master branch lock down: RCA for tests (UNSOLVED ./tests/basic/stats-dump.t)
On 08/13/2018 02:32 PM, Shyam Ranganathan wrote:
> I will be adding a bug and a fix that tries this in a loop to avoid the
> potential race that I see above as the cause.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1615582
Potential fix: https://review.gluster.org/c/glusterfs/+/20726

Shyam
Re: [Gluster-devel] Master branch lock down: RCA for tests (UNSOLVED ./tests/basic/stats-dump.t)
On 08/12/2018 08:42 PM, Shyam Ranganathan wrote:
> As a means of keeping the focus going and squashing the remaining tests
> that were failing sporadically, request each test/component owner to,
>
> - respond to this mail changing the subject (testname.t) to the test
> name that they are responding to (adding more than one in case they have
> the same RCA)
> - with the current RCA and status of the same
>
> List of tests and current owners as per the spreadsheet that we were
> tracking are:
>
> ./tests/basic/stats-dump.t    TBD

This test fails as follows:

01:07:31 not ok 20 , LINENUM:42
01:07:31 FAILED COMMAND: grep .queue_size /var/lib/glusterd/stats/glusterfsd__d_backends_patchy1.dump

18:35:43 not ok 21 , LINENUM:43
18:35:43 FAILED COMMAND: grep .queue_size /var/lib/glusterd/stats/glusterfsd__d_backends_patchy2.dump

Basically, when grep'ing for patterns in the stats dump, the test does not find the second pattern, "queue_size", in one or the other brick's dump file.

This seems incorrect: if it found "aggr.fop.write.count", it stands to reason that it found a stats dump; further, there is a 2-second sleep in the test case and the dump interval is 1 second.

The only plausible cause for the failure is hence that the file had just been (re)opened (by the io-stats dumper thread) for overwriting content, at which point the fopen uses mode "w+" and truncates the file, and the grep CLI opened the file at the same time and hence found no content.

I will be adding a bug and a fix that tries this in a loop to avoid the potential race that I see above as the cause. Other ideas/causes welcome!

Also, this has failed in both mux and non-mux environments. Runs with failures:
https://build.gluster.org/job/regression-on-demand-multiplex/175/consoleFull (no logs)
https://build.gluster.org/job/regression-on-demand-full-run/59/consoleFull (has logs)

Shyam
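The proposed fix amounts to retrying the grep until it succeeds or a timeout expires, so that catching the dump file at the instant it is truncated for rewriting does not fail the run. A generic sketch of that retry loop in plain shell (the test framework's EXPECT_WITHIN provides essentially this behaviour):

```shell
#!/bin/sh
# Retry a grep for up to $1 seconds instead of failing on the first
# attempt, tolerating a momentarily empty (mid-rewrite) file.
grep_within() {
    timeout=$1 pattern=$2 file=$3
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        grep -q "$pattern" "$file" && return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

f=$(mktemp)
# Simulate the dump contents appearing shortly after the check starts.
( sleep 1; echo "queue_size=5" > "$f" ) &
grep_within 5 "queue_size" "$f" && RESULT=ok || RESULT=fail
wait
echo "$RESULT"
rm -f "$f"
```

Note this only papers over the race in the test; the truncate-while-readers-exist behaviour in io-stats itself is a separate concern.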
Re: [Gluster-devel] Master branch lock down: RCA for tests (./tests/bugs/core/bug-1432542-mpx-restart-crash.t)
On 08/12/2018 08:42 PM, Shyam Ranganathan wrote:
> As a means of keeping the focus going and squashing the remaining tests
> that were failing sporadically, request each test/component owner to,
>
> - respond to this mail changing the subject (testname.t) to the test
> name that they are responding to (adding more than one in case they have
> the same RCA)
> - with the current RCA and status of the same
>
> List of tests and current owners as per the spreadsheet that we were
> tracking are:
>
> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t    1608568    Nithya/Shyam

This test had 2 issues:

1. It needed more time in lcov builds, hence its timeout was bumped to 800 seconds; one of the EXPECT_WITHIN checks also needed more tolerance and was bumped up to 120 seconds.

2. The test was OOM-killed at times. To reduce the memory pressure caused by the test, each client mount used for a dd test is now unmounted once its dd completes. This resulted in no more OOM kills for the test.

Shyam (and Nithya)
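Fix (2) above follows a simple pattern: rather than keeping every client mount alive for the whole test, each mount is released as soon as its dd completes, so memory held on behalf of idle mounts is freed early. A hedged sketch of the pattern (the mount/unmount commands are placeholders, not the test's actual steps):

```shell
#!/bin/sh
# Sketch: unmount each client right after its dd completes instead
# of holding all mounts until the end of the test, reducing peak
# memory pressure. The commented mount/umount lines are placeholders
# for the test's real glusterfs mount and unmount commands.
WROTE=0
for i in 1 2 3; do
    mnt=$(mktemp -d)
    # mount -t glusterfs ... "$mnt"     # placeholder mount step
    dd if=/dev/zero of="$mnt/testfile" bs=1M count=1 2>/dev/null
    WROTE=$((WROTE + 1))
    # umount "$mnt"                     # placeholder: release early
    rm -rf "$mnt"
done
echo "dd runs completed: $WROTE"
```

The design point is that the peak number of simultaneously live mounts drops from N to 1, which is what relieved the OOM pressure.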
Re: [Gluster-devel] Master branch lock down: RCA for tests (./tests/bugs/distribute/bug-1042725.t)
On 08/12/2018 08:42 PM, Shyam Ranganathan wrote:
> As a means of keeping the focus going and squashing the remaining tests
> that were failing sporadically, request each test/component owner to,
>
> - respond to this mail changing the subject (testname.t) to the test
> name that they are responding to (adding more than one in case they have
> the same RCA)
> - with the current RCA and status of the same
>
> List of tests and current owners as per the spreadsheet that we were
> tracking are:
>
> ./tests/bugs/distribute/bug-1042725.t    Shyam

When this test failed, it failed to even start glusterd (the first line of the test) properly. On inspection it was noted that the previous test, ./tests/bugs/core/multiplex-limit-issue-151.t, had not completed successfully and also had a different cleanup pattern (trapping cleanup as a TERM exit handler rather than invoking it outright).

./tests/bugs/core/multiplex-limit-issue-151.t was fixed to perform cleanup as appropriate, and no further errors in ./tests/bugs/distribute/bug-1042725.t have been seen since.

Shyam
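The cleanup difference called out above is worth illustrating: a cleanup handler registered only for a signal (trap ... TERM) never runs on a normal exit, leaking state into the next test, whereas trapping EXIT (or invoking cleanup outright) runs on every exit path. A minimal shell sketch of the two styles, using subshells as stand-ins for individual test runs:

```shell
#!/bin/sh
# Contrast the two cleanup styles discussed above.
DIR=$(mktemp -d)

# Style 1 (fragile): cleanup only fires on SIGTERM, so a normal
# exit leaves the state file behind for the next "test" to trip on.
( trap 'rm -f "$DIR/state1"' TERM; touch "$DIR/state1" )
[ -e "$DIR/state1" ] && LEAK1=yes || LEAK1=no

# Style 2 (robust): an EXIT trap fires on every exit path,
# normal or signalled, so nothing leaks.
( trap 'rm -f "$DIR/state2"' EXIT; touch "$DIR/state2" )
[ -e "$DIR/state2" ] && LEAK2=yes || LEAK2=no

echo "TERM-only trap leaked: $LEAK1; EXIT trap leaked: $LEAK2"
rm -rf "$DIR"
```

This is exactly the failure chain seen here: leaked state from one test surfacing as a mysterious failure in the next.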
Re: [Gluster-devel] Master branch lock down: RCA for tests (./tests/bugs/distribute/bug-1117851.t)
On 08/12/2018 08:42 PM, Shyam Ranganathan wrote:
> As a means of keeping the focus going and squashing the remaining tests
> that were failing sporadically, request each test/component owner to,
>
> - respond to this mail changing the subject (testname.t) to the test
> name that they are responding to (adding more than one in case they have
> the same RCA)
> - with the current RCA and status of the same
>
> List of tests and current owners as per the spreadsheet that we were
> tracking are:
>
> ./tests/bugs/distribute/bug-1117851.t    Shyam/Nigel

Tests in lcov-instrumented code take more time than normal; this test was pushing towards 180-190 seconds on successful runs. To remove any potential issues around tests that run close to the default timeout of 200 seconds, 2 changes were made:

1) https://review.gluster.org/c/glusterfs/+/20648 added an option to run-tests.sh to set the default timeout to a different value.

2) https://review.gluster.org/c/build-jobs/+/20655 changed the line-coverage job to use the above option to set the default timeout to 300 seconds for the test run.

Since these changes, this test has not failed in lcov runs.

Shyam (and Nigel/Nithya)
Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down for stabilization (unlocking the same)
On 08/13/2018 02:20 AM, Pranith Kumar Karampuri wrote:
> - At the end of 2 weeks, reassess master and nightly test status, and
> see if we need another drive towards stabilizing master by locking down
> the same and focusing only on test and code stability around the same.
>
> When will there be a discussion about coming up with guidelines to
> prevent lock down in future?

A thread for the same is started in the maintainers list.

> I think it is better to lock-down specific components by removing commit
> access for the respective owners for those components when a test in a
> particular component starts to fail.

Also I suggest we move this to the maintainers thread, to keep the noise levels across lists in check.

Thanks,
Shyam
Re: [Gluster-devel] [Gluster-users] Gluster Outreachy
This is great! One thing that I'm noticing is that most proposed projects do not have mentors at this time, which is pretty crucial to the success of these projects. Will signups close on 20 August for that as well?
- amye

On Thu, Aug 9, 2018 at 11:16 PM Bhumika Goyal wrote:
> Hi all,
>
> *Gentle reminder!*
>
> The doc[1] for adding project ideas for Outreachy will be open for editing
> till August 20th. Please feel free to add your project ideas :).
> [1]: https://docs.google.com/document/d/16yKKDD2Dd6Ag0tssrdoFPojKsF16QI5-j7cUHcR5Pq4/edit?usp=sharing
>
> Thanks,
> Bhumika
>
> On Wed, Jul 4, 2018 at 4:51 PM, Bhumika Goyal wrote:
>> Hi all,
>>
>> Gnome has been working on an initiative known as Outreachy[1] since 2010.
>> Outreachy is a three-month remote internship program. It aims to increase
>> the participation of women and members from under-represented groups in
>> open source. The program is held twice a year. During the internship
>> period, interns contribute to a project under the guidance of one or more
>> mentors.
>>
>> For the next round (Dec 2018 - March 2019) we are planning to propose
>> projects from Gluster. We would like you to propose project ideas or/and
>> come forward as mentors/volunteers.
>> Please feel free to add project ideas in this doc[2]. The doc[2] will be
>> open for editing till the end of July.
>>
>> [1]: https://www.outreachy.org/
>> [2]: https://docs.google.com/document/d/16yKKDD2Dd6Ag0tssrdoFPojKsF16QI5-j7cUHcR5Pq4/edit?usp=sharing
>>
>> Outreachy timeline:
>> Pre-Application Period - Late August to early September
>> Application Period - Early September to mid-October
>> Internship Period - December to March
>>
>> Thanks,
>> Bhumika

--
Amye Scavarda | a...@redhat.com | Gluster Community Lead
[Gluster-devel] Coverity covscan for 2018-08-13-febee007 (master branch)
GlusterFS Coverity covscan results for the master branch are available from http://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-08-13-febee007/ Coverity covscan results for other active branches are also available at http://download.gluster.org/pub/gluster/glusterfs/static-analysis/ ___ Gluster-devel mailing list Gluster-devel@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Setting up machines from softserve in under 5 mins
This is so nice. I tried it and successfully created a test machine. It would be great if there were a provision to extend the lifetime of VMs beyond the time provided during creation. Also, I first ran the ansible-playbook from the loaned VM itself before realizing it has to be executed from an outside machine; maybe we can mention that in the doc.

Regards
Rafi KC

- Original Message -
From: "Nigel Babu"
To: "gluster-devel"
Cc: "gluster-infra"
Sent: Monday, August 13, 2018 3:38:17 PM
Subject: [Gluster-devel] Setting up machines from softserve in under 5 mins

Hello folks,

Deepshikha did the work a while ago to make loaning a machine for running your regressions faster. I've tested it a few times today to confirm it works as expected. In the past, Softserve[1] machines would be a clean CentOS 7 image. Now we have an image with all the dependencies installed and *almost* set up to run regressions. It just needs a few steps run on it, and we have a simplified playbook that will set up *just* those steps. This brings the time to set up a machine down from around 30 mins to less than 5 mins. The instructions[2] are on the softserve wiki for now, but will move to the site itself in the future. Please let us know if you face trouble by filing a bug.[3]

[1]: https://softserve.gluster.org/
[2]: https://github.com/gluster/softserve/wiki/Running-Regressions-on-loaned-Softserve-instances
[3]: https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS=project-infrastructure

--
nigelb
Re: [Gluster-devel] Master branch is closed
Oops, I apparently forgot to send out a note. Master has been open since ~7 am IST.

On Mon, Aug 13, 2018 at 4:25 PM Atin Mukherjee wrote:
> Nigel,
>
> Now that the master branch is reopened, can you please revoke the commit
> access restrictions?
>
> On Mon, 6 Aug 2018 at 09:12, Nigel Babu wrote:
>> Hello folks,
>>
>> Master branch is now closed. Only a few people have commit access now and
>> it's to be exclusively used to merge fixes to make master stable again.
>>
>> --
>> nigelb
>
> --
> - Atin (atinm)

--
nigelb
Re: [Gluster-devel] Master branch is closed
Nigel,

Now that the master branch is reopened, can you please revoke the commit access restrictions?

On Mon, 6 Aug 2018 at 09:12, Nigel Babu wrote:
> Hello folks,
>
> Master branch is now closed. Only a few people have commit access now and
> it's to be exclusively used to merge fixes to make master stable again.
>
> --
> nigelb

--
- Atin (atinm)
Re: [Gluster-devel] Master branch lock down: RCA for tests (remove-brick-testcases.t)
On 08/13/2018 06:12 AM, Shyam Ranganathan wrote:
> As a means of keeping the focus going and squashing the remaining tests
> that were failing sporadically, request each test/component owner to,
>
> - respond to this mail changing the subject (testname.t) to the test
> name that they are responding to (adding more than one in case they have
> the same RCA)
> - with the current RCA and status of the same
>
> List of tests and current owners as per the spreadsheet that we were
> tracking are:
>
> ./tests/bugs/glusterd/remove-brick-testcases.t    TBD

In this case the .t passed, but the self-heal daemon (which, by the way, has no role in this test because there is no I/O or healing in this .t) crashed with the following backtrace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7ff8c6bc0b4f in _IO_cleanup () from ./lib64/libc.so.6
[Current thread is 1 (LWP 17530)]
(gdb) bt
#0  0x7ff8c6bc0b4f in _IO_cleanup () from ./lib64/libc.so.6
#1  0x7ff8c6b7cb8b in __run_exit_handlers () from ./lib64/libc.so.6
#2  0x7ff8c6b7cc27 in exit () from ./lib64/libc.so.6
#3  0x0040b14d in cleanup_and_exit (signum=15) at glusterfsd.c:1570
#4  0x0040de71 in glusterfs_sigwaiter (arg=0x7ffd5f270d20) at glusterfsd.c:2332
#5  0x7ff8c757ce25 in start_thread () from ./lib64/libpthread.so.0
#6  0x7ff8c6c41bad in clone () from ./lib64/libc.so.6

I am not able to find the reason for the crash; any pointers are appreciated. The regression run/core can be found at https://build.gluster.org/job/line-coverage/432/consoleFull .

Thanks,
Ravi
Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down for stabilization (unlocking the same)
On Mon, Aug 13, 2018 at 6:05 AM Shyam Ranganathan wrote:
> Hi,
>
> So we have had master locked down for a week to ensure we only get fixes
> for failing tests in order to stabilize the code base, partly for
> release-5 branching as well.
>
> As of this weekend, we (Atin and myself) have been looking at the
> pass/fail rates on the tests, and whether we are discovering newer
> failures or more of the same.
>
> Our runs with patch sets 10->11->12 are looking better than where we
> started, and we have a list of tests that we need to still fix.
>
> But there are other issues and fixes that are needed in the code that
> are lagging behind due to the lock down. The plan going forward is as
> follows:
>
> - Unlock master, and ensure that we do not start seeing newer failures
> as we merge other patches in; if so, raise them on the lists and as bugs,
> and let's work towards ensuring these are addressed. *Maintainers*,
> please pay special attention when merging patches.
>
> - Address the current pending set of tests that have been identified as
> failing, over the course of the next 2 weeks. *Contributors*, continue
> the focus here, so that we do not have to end up with another drive
> towards the same in 2 weeks.
>
> - At the end of 2 weeks, reassess master and nightly test status, and
> see if we need another drive towards stabilizing master by locking down
> the same and focusing only on test and code stability around the same.

When will there be a discussion about coming up with guidelines to prevent lock down in the future?

I think it is better to lock down specific components by removing commit access for the respective owners of those components when a test in a particular component starts to fail.

> Atin and Shyam

--
Pranith