Re: [Gluster-devel] Shard test failing more commonly on master

2018-12-19 Thread Krutika Dhananjay
Sent https://review.gluster.org/c/glusterfs/+/21889 to fix the original
issue.
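
For context: the quoted failures below are the stat-based size checks on
shard files under .shard/ on the bricks. When the expected shard is absent,
the stat in the check either fails outright ("No such file or directory")
or, if the path expansion comes up empty, complains about a missing
operand, and EXPECT then sees "" instead of the expected size. A minimal
sketch of that kind of check (illustrative only, not the exact lines from
zero-flag.t; it assumes the usual TEST/EXPECT helpers and the $B0/$V0
variables from include.rc):

    #!/bin/bash
    # Illustrative sketch, not the real zero-flag.t.
    # Assumes the standard helpers (TEST, EXPECT) and variables
    # ($B0 = brick root, $V0 = volume name) from include.rc.
    . $(dirname $0)/../../include.rc

    gfid="41fed5c6-636e-44d6-b6ed-068b941843cd"   # taken from the log below

    # The shard for block 2 should exist on one of the bricks with the
    # expected size. If it was never created, the first check fails with
    # "cannot stat ... No such file or directory" and the second compares
    # "2097152" against the empty output of the command substitution.
    TEST stat $B0/${V0}*/.shard/$gfid.2
    EXPECT "2097152" echo $(stat -c %s $B0/${V0}*/.shard/$gfid.2)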

-Krutika

On Wed, Dec 5, 2018 at 10:58 AM Atin Mukherjee  wrote:

> We can't afford to keep a bad test hanging for more than a day, since it
> blocks other fixes from getting in (I see at least 4-5 more patches that
> failed on the same test today). I thought we already had a rule to mark a
> test as bad as soon as this starts happening. Not sure why we haven't done
> that yet. In any case, I have marked this test as bad through
> https://review.gluster.org/#/c/glusterfs/+/21800/ ; please review and
> merge.
>
> On Tue, Dec 4, 2018 at 7:46 PM Shyam Ranganathan 
> wrote:
>
>> Test: ./tests/bugs/shard/zero-flag.t
>>
>> Runs:
>>   - https://build.gluster.org/job/centos7-regression/3942/console
>>   - https://build.gluster.org/job/centos7-regression/3941/console
>>   - https://build.gluster.org/job/centos7-regression/3938/console
>>
>> Failures seem to occur at common points across the runs, like so:
>>
>> 09:52:34 stat: missing operand
>> 09:52:34 Try 'stat --help' for more information.
>> 09:52:34 not ok 17 Got "" instead of "2097152", LINENUM:40
>> 09:52:34 FAILED COMMAND: 2097152 echo
>>
>> 09:52:34 stat: cannot stat
>> ‘/d/backends/patchy*/.shard/41fed5c6-636e-44d6-b6ed-068b941843cd.2’: No
>> such file or directory
>> 09:52:34 not ok 27 , LINENUM:64
>> 09:52:34 FAILED COMMAND: stat
>> /d/backends/patchy*/.shard/41fed5c6-636e-44d6-b6ed-068b941843cd.2
>> 09:52:34 stat: missing operand
>> 09:52:34 Try 'stat --help' for more information.
>> 09:52:34 not ok 28 Got "" instead of "1048602", LINENUM:66
>> 09:52:34 FAILED COMMAND: 1048602 echo
>>
>> Krutika, is this something you are already chasing down?
>>
>> Thanks,
>> Shyam
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Infra Update for Nov and Dec

2018-12-19 Thread Nigel Babu
Hello folks,

The infra team has not been sending regular updates recently because we’ve
been caught up in several pieces of work that ran longer than our two-week
sprint cycles. This is a summary of what we’ve done since the last update.

* Bugzilla updates are now handled by a Python script, and there is a patch
to also handle a change being abandoned and restored. It’s pending merge
and deployment after the holiday season.
* Added smoke jobs for Python linting and shell linting.
* Added smoke jobs for 32-bit builds.

The big piece the infra team has been spending time on is identifying the
best way to write end-to-end tests for GCS (Gluster for Container Storage).
We started with the assumption that we want a test framework that sticks as
closely as possible to the upstream Kubernetes and OpenShift Origin tests.
We have taken a three-pronged approach to this over the last two months.

1. We want to use machines we have access to right now to verify that the
deployment scripts we publish work as intended. To this end, we created a
job on CentOS CI that runs the deployment exactly the way we recommend
anyone run the scripts in the gcs repository[1]. We’re running into a
couple of failures, and Mrugesh is working on identifying and fixing them.
We hope to have this complete in the first week of January.
2. We want to use the upstream end-to-end test framework built on Ginkgo
and Gomega. The framework already uses the kubectl client to talk to a
Kubernetes cluster. We had a conversation with the upstream Storage SIG
developers yesterday that pointed us in the right direction, and we’re very
close to having a first test (a rough sketch of the kind of check it will
encode follows this list). When the first test in the end-to-end framework
lands, we’ll hook it up to the test run we have in (1). Deepshikha and I
are actively working on making this happen. We plan to have a proof of
concept in the second week of January and to write documentation and demos
for the GCS team.
3. We want to do testing that actively tries to break a production-sized
cluster and observe how our stack handles failures. There’s a longer plan
for how to do this, but the work is currently on hold until we get the
first two pieces running. It is also blocked on us having access to
infrastructure where we can make this happen. Mrugesh will lead this
activity once the other blockers are removed.
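
To make (2) a little more concrete: the tests themselves will be written in
Go against Ginkgo and Gomega, but the first check we want the framework to
encode is roughly the shell sketch below: deploy via (1), then confirm the
GCS pods come up. The namespace name here is a placeholder/assumption, not
something taken from the real manifests.

    #!/bin/bash
    # Rough shell approximation of the first end-to-end check; the real
    # test will live in the Go/Ginkgo framework. "gcs" as the namespace
    # is a placeholder.
    NS=${NS:-gcs}
    for attempt in $(seq 1 60); do
        total=$(kubectl get pods -n "$NS" --no-headers 2>/dev/null | wc -l)
        not_ready=$(kubectl get pods -n "$NS" --no-headers 2>/dev/null \
                    | awk '$3 != "Running" && $3 != "Completed"' | wc -l)
        if [ "$total" -gt 0 ] && [ "$not_ready" -eq 0 ]; then
            echo "all $total pods in namespace $NS are Running"
            exit 0
        fi
        sleep 10
    done
    echo "pods in namespace $NS did not reach Running in time" >&2
    exit 1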

Once we have the first proof-of-concept test written, we will hand over
writing the tests to the GCS development team, and the infra team will move
on to building out the infrastructure for running these new tests. We will
continue to work in close collaboration with the Kubernetes Storage SIG and
the OKD Infrastructure teams to avoid duplicating work.

[1]: https://ci.centos.org/view/Gluster/job/gluster_anteater_gcs/


-- 
nigelb
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] too many failures on mpx-restart-crash.t on master branch

2018-12-19 Thread Amar Tumballi
Since yesterday, at least 10 patches have failed regression on
./tests/bugs/core/bug-1432542-mpx-restart-crash.t

Help debugging this soon would be appreciated.


Regards,

Amar


-- 
Amar Tumballi (amarts)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel