----- Original Message -----
> From: "Shwetha Panduranga" <spand...@redhat.com>
> To: "Nigel Babu" <nig...@redhat.com>
> Cc: "Gluster Devel" <gluster-devel@gluster.org>
> Sent: Wednesday, August 30, 2017 6:29:08 AM
> Subject: Re:
I submitted a patch making the changes:
https://review.gluster.org/#/c/18152/2
On Wed, Aug 30, 2017 at 4:45 PM, Shwetha Panduranga wrote:
We had the first 'rebalance status' for logging purposes.
wait_for_rebalance_to_complete will get the xml command output for
validations. --xml outputs go to debug log levels.
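To illustrate the split described above, here is a rough Python sketch; the function name and the injected `run_cmd` runner are hypothetical, not the actual glusto-tests API. The plain `rebalance status` output is logged at INFO for humans, while the `--xml` output used for validation goes to DEBUG:

```python
import logging

log = logging.getLogger("glusto")

def log_rebalance_status(run_cmd, volname):
    """Hypothetical sketch: log rebalance status at two log levels.

    `run_cmd` is an assumed command runner returning (rc, stdout, stderr).
    """
    # First call: plain `rebalance status`, purely for logging purposes.
    _, out, _ = run_cmd("gluster volume rebalance %s status" % volname)
    log.info("Rebalance status of %s:\n%s", volname, out)

    # Second call: the --xml output is meant for machine validation,
    # so keep it out of the INFO log and emit it at DEBUG instead.
    _, xml_out, _ = run_cmd("gluster volume rebalance %s status --xml" % volname)
    log.debug("Rebalance status --xml of %s: %s", volname, xml_out)
    return xml_out
```

A caller would pass its own runner (e.g. one that SSHes to a server node) and hand the returned XML to the validation step.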
On Wed, Aug 30, 2017 at 4:35 PM, Nigel Babu wrote:
> Why are we failing because the first
It doesn't make any sense to me to do a rebalance status at first go and then
go for a rebalance status in a loop. Instead we should go for a rebalance
status in a loop immediately.
On Wed, Aug 30, 2017 at 4:28 PM, Shwetha Panduranga wrote:
Maybe I should change the log message from 'Checking rebalance status' to
'Logging rebalance status', because the first 'rebalance status' command
just does that: it executes 'rebalance status'. Now
wait_for_rebalance_to_complete validates that rebalance is 'completed' within 5
minutes (the default time
On Wed, Aug 30, 2017 at 4:23 PM, Shwetha Panduranga wrote:
> This is the first check where we just execute 'rebalance status'. That's
> the command which failed and hence failed the test case. If you see the test
> case, the next step is wait_for_rebalance_to_complete (status
The return code is in the log message you copy-pasted:
2017-08-28 15:13:58,952 INFO (test_expanding_volume_when_io_in_progress)
Successfully started rebalance on the volume
testvol_distributed-dispersed
2017-08-28 15:13:58,952 INFO (test_expanding_volume_when_io_in_progress)
Checking Rebalance
Ok, Nigel helped me understand that the traceback time is not something
we should look at, and the right way to dig through this problem is by looking
at the glusto logs. As per the last rebalance instance from the log I see the
following:
volume rebalance: testvol_distributed-dispersed: success:
Case 2:
1) remove-brick when IO is in progress (i.e., remove-brick state)
2) Immediately triggered remove-brick status (no delay b/w remove-brick
start and status)
3) wait_for_rebalance_to_complete (This gets the xml output of rebalance
status and keeps checking for the rebalance status to be
Case 1:
1) add-brick when IO is in progress, wait for 30 seconds
2) Trigger rebalance
3) Execute 'rebalance status' (there is no time delay b/w 2) and 3))
4) wait_for_rebalance_to_complete (This gets the xml output of rebalance
status and keeps checking for the rebalance status to be 'complete')
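The polling step described above can be sketched roughly as follows. This is not the actual glusto-tests implementation: the helper names are hypothetical, and the element names in the sample XML only approximate the gluster CLI's `--xml` output shape:

```python
import time
import xml.etree.ElementTree as ET

def rebalance_completed(xml_output):
    """Return True if the aggregate rebalance status reads 'completed'.

    The <aggregate>/<statusStr> path is an assumption about the CLI's
    --xml layout, not a verified schema.
    """
    root = ET.fromstring(xml_output)
    return root.findtext(".//aggregate/statusStr") == "completed"

def wait_for_rebalance_to_complete(fetch_status_xml, timeout=300, interval=10):
    """Poll `fetch_status_xml` (a callable returning the --xml output)
    until rebalance completes or `timeout` seconds elapse."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if rebalance_completed(fetch_status_xml()):
            return True
        time.sleep(interval)
    return False

SAMPLE = """<cliOutput>
  <volRebalance>
    <aggregate><statusStr>completed</statusStr></aggregate>
  </volRebalance>
</cliOutput>"""
```

In a test, `fetch_status_xml` would wrap `gluster volume rebalance <vol> status --xml` run on a server node; the sample string above only stands in for that output.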
On Wed, Aug 30, 2017 at 6:03 AM, Atin Mukherjee wrote:
On Wed, 30 Aug 2017 at 00:23, Shwetha Panduranga wrote:
> Hi Shyam, we are already doing it. we wait for rebalance status to be
> complete. We loop. we keep checking if the status is complete for '20'
> minutes or so.
>
Are you saying in this test rebalance status was
Hi Shyam, we are already doing it. We wait for rebalance status to be
complete. We loop. We keep checking if the status is complete for '20'
minutes or so.
-Shwetha
On Tue, Aug 29, 2017 at 7:04 PM, Shyam Ranganathan wrote:
On Tue, Aug 29, 2017 at 4:13 AM, Shyam Ranganathan wrote:
Nigel, Shwetha,
The latest Glusto run [a] that was started by Nigel, post fixing the
prior timeout issue, failed (much later though) again.
I took a look at the logs and my analysis is here [b]
@atin, @kaushal, @ppai can you take a look and see if the analysis is
correct?
In short
I have sent a patch to fix this issue last week:
https://review.gluster.org/18099
I will send another patch to make all the hard-coded timeouts configurable.
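As an illustration of that direction (the variable names and the environment-variable mechanism are my assumption, not the actual patch), a hard-coded wait can become an overridable default:

```python
import os

# Hypothetical sketch (not the actual glusto-tests change): resolve a
# timeout from an environment variable with a sane default, instead of
# hard-coding the value at every call site.

DEFAULT_REBALANCE_TIMEOUT = 300  # seconds

def get_timeout(env_var, default=DEFAULT_REBALANCE_TIMEOUT):
    """Return an integer override from the environment, else `default`."""
    value = os.environ.get(env_var)
    if value is None:
        return default
    try:
        return int(value)
    except ValueError:
        # Ignore malformed overrides rather than failing the test run.
        return default
```

A helper like wait_for_rebalance_to_complete could then accept `timeout=get_timeout("GLUSTO_REBALANCE_TIMEOUT")` instead of a literal, so CI can tune waits without patching the library.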
-Shwetha
On Mon, Aug 28, 2017 at 8:57 AM, Nigel Babu wrote:
Shwetha,
Is this timeout configurable? Or is it hard-coded into the glusto-tests
repo?
On Sat, Aug 26, 2017 at 1:59 AM, Shyam Ranganathan wrote:
Nigel was kind enough to kick off a glusto run on 3.12 head a couple of
days back. The status can be seen here [1].
The run failed, but managed to get past what Glusto does on master (see
[2]). Not that this is a consolation, but just stating the fact.
The run [1] failed at 17:05:57