Re: Review Request 26298: Use a less broad retry loop for RPCs.

2014-10-03 Thread Bill Farner


> On Oct. 3, 2014, 12:21 a.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/api/scheduler_client.py, line 286
> > 
> >
> > Why not just add a break into the Exception catcher instead?
> 
> Maxim Khutornenko wrote:
> Actually, we are throwing there already. Wait, how is the exception is 
> trapped in that loop?
> 
> Bill Farner wrote:
> I scratched my head at this for a while, and wound up with this review: 
> https://reviews.apache.org/r/26308/
> 
> AFAICT `threading.Event` does not override `__bool__`, so those branches 
> are never entered, and we loop until the timer is up.
> 
> Maxim Khutornenko wrote:
> I am assuming this patch will be revisited then?
> 
> Bill Farner wrote:
> Yup.  I'm going to open a new review to introduce the sleep between 
> transient error retries, though.

Scratch that - didn't notice the local raise->catch on TransientError.  Closing.


- Bill


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26298/#review55295
---


On Oct. 3, 2014, 12:18 a.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26298/
> ---
> 
> (Updated Oct. 3, 2014, 12:18 a.m.)
> 
> 
> Review request for Aurora, Mark Chu-Carroll and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> A few times when making changes, i've found myself confused at a stalled test 
> and spiked CPU, only to find that my test should have failed, but an 
> exception is trapped in this retry loop.  The key change here is that unknown 
> exceptions will break the loop.
> 
> Making this change pointed out what should have been a test failure in 
> `test_transient_error`, where an exception caused by an unexpected call to 
> `getVersion` was swallowed.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> b400cb2dbdb35077fc2c4a6e161c2959a9217317 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> 1cbfbf86e903d890baac7d34461109f9beaff442 
> 
> Diff: https://reviews.apache.org/r/26298/diff/
> 
> 
> Testing
> ---
> 
> ./pants src/test/python:all -vxs
> 
> 
> Thanks,
> 
> Bill Farner
> 
>



Re: Review Request 26298: Use a less broad retry loop for RPCs.

2014-10-03 Thread Bill Farner


> On Oct. 3, 2014, 12:21 a.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/api/scheduler_client.py, line 286
> > 
> >
> > Why not just add a break into the Exception catcher instead?
> 
> Maxim Khutornenko wrote:
> Actually, we are throwing there already. Wait, how is the exception is 
> trapped in that loop?
> 
> Bill Farner wrote:
> I scratched my head at this for a while, and wound up with this review: 
> https://reviews.apache.org/r/26308/
> 
> AFAICT `threading.Event` does not override `__bool__`, so those branches 
> are never entered, and we loop until the timer is up.
> 
> Maxim Khutornenko wrote:
> I am assuming this patch will be revisited then?

Yup.  I'm going to open a new review to introduce the sleep between transient 
error retries, though.


- Bill


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26298/#review55295
---


On Oct. 3, 2014, 12:18 a.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26298/
> ---
> 
> (Updated Oct. 3, 2014, 12:18 a.m.)
> 
> 
> Review request for Aurora, Mark Chu-Carroll and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> A few times when making changes, i've found myself confused at a stalled test 
> and spiked CPU, only to find that my test should have failed, but an 
> exception is trapped in this retry loop.  The key change here is that unknown 
> exceptions will break the loop.
> 
> Making this change pointed out what should have been a test failure in 
> `test_transient_error`, where an exception caused by an unexpected call to 
> `getVersion` was swallowed.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> b400cb2dbdb35077fc2c4a6e161c2959a9217317 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> 1cbfbf86e903d890baac7d34461109f9beaff442 
> 
> Diff: https://reviews.apache.org/r/26298/diff/
> 
> 
> Testing
> ---
> 
> ./pants src/test/python:all -vxs
> 
> 
> Thanks,
> 
> Bill Farner
> 
>



Re: Review Request 26298: Use a less broad retry loop for RPCs.

2014-10-03 Thread Maxim Khutornenko


> On Oct. 3, 2014, 12:21 a.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/api/scheduler_client.py, line 286
> > 
> >
> > Why not just add a break into the Exception catcher instead?
> 
> Maxim Khutornenko wrote:
> Actually, we are throwing there already. Wait, how is the exception is 
> trapped in that loop?
> 
> Bill Farner wrote:
> I scratched my head at this for a while, and wound up with this review: 
> https://reviews.apache.org/r/26308/
> 
> AFAICT `threading.Event` does not override `__bool__`, so those branches 
> are never entered, and we loop until the timer is up.

I am assuming this patch will be revisited then?


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26298/#review55295
---


On Oct. 3, 2014, 12:18 a.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26298/
> ---
> 
> (Updated Oct. 3, 2014, 12:18 a.m.)
> 
> 
> Review request for Aurora, Mark Chu-Carroll and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> A few times when making changes, i've found myself confused at a stalled test 
> and spiked CPU, only to find that my test should have failed, but an 
> exception is trapped in this retry loop.  The key change here is that unknown 
> exceptions will break the loop.
> 
> Making this change pointed out what should have been a test failure in 
> `test_transient_error`, where an exception caused by an unexpected call to 
> `getVersion` was swallowed.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> b400cb2dbdb35077fc2c4a6e161c2959a9217317 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> 1cbfbf86e903d890baac7d34461109f9beaff442 
> 
> Diff: https://reviews.apache.org/r/26298/diff/
> 
> 
> Testing
> ---
> 
> ./pants src/test/python:all -vxs
> 
> 
> Thanks,
> 
> Bill Farner
> 
>



Re: Review Request 26298: Use a less broad retry loop for RPCs.

2014-10-02 Thread Bill Farner


> On Oct. 3, 2014, 12:21 a.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/api/scheduler_client.py, line 286
> > 
> >
> > Why not just add a break into the Exception catcher instead?
> 
> Maxim Khutornenko wrote:
> Actually, we are throwing there already. Wait, how is the exception is 
> trapped in that loop?

I scratched my head at this for a while, and wound up with this review: 
https://reviews.apache.org/r/26308/

AFAICT `threading.Event` does not override `__bool__`, so those branches are 
never entered, and we loop until the timer is up.


- Bill


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26298/#review55295
---


On Oct. 3, 2014, 12:18 a.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26298/
> ---
> 
> (Updated Oct. 3, 2014, 12:18 a.m.)
> 
> 
> Review request for Aurora, Mark Chu-Carroll and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> A few times when making changes, i've found myself confused at a stalled test 
> and spiked CPU, only to find that my test should have failed, but an 
> exception is trapped in this retry loop.  The key change here is that unknown 
> exceptions will break the loop.
> 
> Making this change pointed out what should have been a test failure in 
> `test_transient_error`, where an exception caused by an unexpected call to 
> `getVersion` was swallowed.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> b400cb2dbdb35077fc2c4a6e161c2959a9217317 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> 1cbfbf86e903d890baac7d34461109f9beaff442 
> 
> Diff: https://reviews.apache.org/r/26298/diff/
> 
> 
> Testing
> ---
> 
> ./pants src/test/python:all -vxs
> 
> 
> Thanks,
> 
> Bill Farner
> 
>



Re: Review Request 26298: Use a less broad retry loop for RPCs.

2014-10-02 Thread Maxim Khutornenko


> On Oct. 3, 2014, 12:21 a.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/api/scheduler_client.py, line 286
> > 
> >
> > Why not just add a break into the Exception catcher instead?

Actually, we are throwing there already. Wait, how is the exception is trapped 
in that loop?


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26298/#review55295
---


On Oct. 3, 2014, 12:18 a.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26298/
> ---
> 
> (Updated Oct. 3, 2014, 12:18 a.m.)
> 
> 
> Review request for Aurora, Mark Chu-Carroll and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> A few times when making changes, i've found myself confused at a stalled test 
> and spiked CPU, only to find that my test should have failed, but an 
> exception is trapped in this retry loop.  The key change here is that unknown 
> exceptions will break the loop.
> 
> Making this change pointed out what should have been a test failure in 
> `test_transient_error`, where an exception caused by an unexpected call to 
> `getVersion` was swallowed.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> b400cb2dbdb35077fc2c4a6e161c2959a9217317 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> 1cbfbf86e903d890baac7d34461109f9beaff442 
> 
> Diff: https://reviews.apache.org/r/26298/diff/
> 
> 
> Testing
> ---
> 
> ./pants src/test/python:all -vxs
> 
> 
> Thanks,
> 
> Bill Farner
> 
>



Re: Review Request 26298: Use a less broad retry loop for RPCs.

2014-10-02 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26298/#review55295
---



src/main/python/apache/aurora/client/api/scheduler_client.py


Why not just add a break into the Exception catcher instead?


- Maxim Khutornenko


On Oct. 3, 2014, 12:18 a.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26298/
> ---
> 
> (Updated Oct. 3, 2014, 12:18 a.m.)
> 
> 
> Review request for Aurora, Mark Chu-Carroll and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> A few times when making changes, i've found myself confused at a stalled test 
> and spiked CPU, only to find that my test should have failed, but an 
> exception is trapped in this retry loop.  The key change here is that unknown 
> exceptions will break the loop.
> 
> Making this change pointed out what should have been a test failure in 
> `test_transient_error`, where an exception caused by an unexpected call to 
> `getVersion` was swallowed.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> b400cb2dbdb35077fc2c4a6e161c2959a9217317 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> 1cbfbf86e903d890baac7d34461109f9beaff442 
> 
> Diff: https://reviews.apache.org/r/26298/diff/
> 
> 
> Testing
> ---
> 
> ./pants src/test/python:all -vxs
> 
> 
> Thanks,
> 
> Bill Farner
> 
>



Re: Review Request 26298: Use a less broad retry loop for RPCs.

2014-10-02 Thread Bill Farner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26298/
---

(Updated Oct. 3, 2014, 12:18 a.m.)


Review request for Aurora, Mark Chu-Carroll and Maxim Khutornenko.


Changes
---

Corrected description.


Repository: aurora


Description (updated)
---

A few times when making changes, i've found myself confused at a stalled test 
and spiked CPU, only to find that my test should have failed, but an exception 
is trapped in this retry loop.  The key change here is that unknown exceptions 
will break the loop.

Making this change pointed out what should have been a test failure in 
`test_transient_error`, where an exception caused by an unexpected call to 
`getVersion` was swallowed.


Diffs
-

  src/main/python/apache/aurora/client/api/scheduler_client.py 
b400cb2dbdb35077fc2c4a6e161c2959a9217317 
  src/test/python/apache/aurora/client/api/test_scheduler_client.py 
1cbfbf86e903d890baac7d34461109f9beaff442 

Diff: https://reviews.apache.org/r/26298/diff/


Testing
---

./pants src/test/python:all -vxs


Thanks,

Bill Farner



Re: Review Request 26298: Use a less broad retry loop for RPCs.

2014-10-02 Thread Bill Farner


> On Oct. 3, 2014, 12:13 a.m., Maxim Khutornenko wrote:
> > | I also changed the test to avoid sleeping, which was eating a fixed 10 
> > seconds of unit test time.
> > Not really sure why it takes that long for you:
> > 
> > $ time ./pants src/test/python/apache/aurora/client/api:scheduler_client -k 
> > test_transient_error
> > ...
> > real0m3.602s
> > user0m2.679s
> > sys 0m0.696s

You're right, my own change added another sleep when a transient error is 
encountered:

self._terminating.wait(self.RPC_RETRY_INTERVAL.as_(Time.SECONDS))

That's what was causing the delay.  It's the right behavior to do this, rather 
than a tight retry loop, but you're correct that this change is not speeding up 
our tests.


- Bill


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26298/#review55292
---


On Oct. 3, 2014, 12:05 a.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26298/
> ---
> 
> (Updated Oct. 3, 2014, 12:05 a.m.)
> 
> 
> Review request for Aurora, Mark Chu-Carroll and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> A few times when making changes, i've found myself confused at a stalled test 
> and spiked CPU, only to find that my test should have failed, but an 
> exception is trapped in this retry loop.  The key change here is that unknown 
> exceptions will break the loop.
> 
> Making this change pointed out what should have been a test failure in 
> `test_transient_error`, where an exception caused by an unexpected call to 
> `getVersion` was swallowed.
> 
> I also changed the test to avoid sleeping, which was eating a fixed 10 
> seconds of unit test time.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> b400cb2dbdb35077fc2c4a6e161c2959a9217317 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> 1cbfbf86e903d890baac7d34461109f9beaff442 
> 
> Diff: https://reviews.apache.org/r/26298/diff/
> 
> 
> Testing
> ---
> 
> ./pants src/test/python:all -vxs
> 
> 
> Thanks,
> 
> Bill Farner
> 
>



Re: Review Request 26298: Use a less broad retry loop for RPCs.

2014-10-02 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26298/#review55292
---


| I also changed the test to avoid sleeping, which was eating a fixed 10 
seconds of unit test time.
Not really sure why it takes that long for you:

$ time ./pants src/test/python/apache/aurora/client/api:scheduler_client -k 
test_transient_error
...
real0m3.602s
user0m2.679s
sys 0m0.696s

- Maxim Khutornenko


On Oct. 3, 2014, 12:05 a.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26298/
> ---
> 
> (Updated Oct. 3, 2014, 12:05 a.m.)
> 
> 
> Review request for Aurora, Mark Chu-Carroll and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> A few times when making changes, i've found myself confused at a stalled test 
> and spiked CPU, only to find that my test should have failed, but an 
> exception is trapped in this retry loop.  The key change here is that unknown 
> exceptions will break the loop.
> 
> Making this change pointed out what should have been a test failure in 
> `test_transient_error`, where an exception caused by an unexpected call to 
> `getVersion` was swallowed.
> 
> I also changed the test to avoid sleeping, which was eating a fixed 10 
> seconds of unit test time.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/client/api/scheduler_client.py 
> b400cb2dbdb35077fc2c4a6e161c2959a9217317 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> 1cbfbf86e903d890baac7d34461109f9beaff442 
> 
> Diff: https://reviews.apache.org/r/26298/diff/
> 
> 
> Testing
> ---
> 
> ./pants src/test/python:all -vxs
> 
> 
> Thanks,
> 
> Bill Farner
> 
>



Review Request 26298: Use a less broad retry loop for RPCs.

2014-10-02 Thread Bill Farner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26298/
---

Review request for Aurora, Mark Chu-Carroll and Maxim Khutornenko.


Repository: aurora


Description
---

A few times when making changes, i've found myself confused at a stalled test 
and spiked CPU, only to find that my test should have failed, but an exception 
is trapped in this retry loop.  The key change here is that unknown exceptions 
will break the loop.

Making this change pointed out what should have been a test failure in 
`test_transient_error`, where an exception caused by an unexpected call to 
`getVersion` was swallowed.

I also changed the test to avoid sleeping, which was eating a fixed 10 seconds 
of unit test time.


Diffs
-

  src/main/python/apache/aurora/client/api/scheduler_client.py 
b400cb2dbdb35077fc2c4a6e161c2959a9217317 
  src/test/python/apache/aurora/client/api/test_scheduler_client.py 
1cbfbf86e903d890baac7d34461109f9beaff442 

Diff: https://reviews.apache.org/r/26298/diff/


Testing
---

./pants src/test/python:all -vxs


Thanks,

Bill Farner