[Twisted-Python] List of required builds before a merge

2018-03-12 Thread Adi Roiban
Hi,

It is not clear to me what builders need to pass before we can merge something.

I expect that all supported "platforms" need to pass, but it is not
clear what the currently supported platforms are.

We have this info in the wiki but it does not help.
https://twistedmatrix.com/trac/wiki/ReviewProcess#Authors:Howtomergethechangetotrunk

In GitHub I can see Travis / Appveyor and OS X from Buildbot marked as "Required".

Is that all?

--

If I check the "supported" group in Buildbot, I see many more builders.
The problem is that a significant number of slaves are down and those
builders are not available.



Is Fedora still supported and required?

---

I suggest using the GitHub "Required" marker to document which platforms
are supported.

We don't have time to maintain the infrastructure, so I suggest dropping
support for anything that is not covered by Travis and Appveyor.

I know that this might be disruptive, but I think we need it in order to
raise awareness that supporting a platform is not easy.
If someone (including me) cares about a platform, they should find a way
to help the project support that platform.

What do you think?
-- 
Adi Roiban



Re: [Twisted-Python] List of required builds before a merge

2018-03-12 Thread Jean-Paul Calderone
On Mon, Mar 12, 2018 at 7:59 AM, Adi Roiban  wrote:

> Hi,
>
> It is not clear to me what builders need to pass before we can merge
> something.
>
> I expect that all supported "platforms" need to pass, but it is not
> clear what are the currently supported platforms.
>
> We have this info in the wiki but it does not help.
> https://twistedmatrix.com/trac/wiki/ReviewProcess#Authors:
> Howtomergethechangetotrunk
>
> In GitHub I can see Travis / Appveyor and OS X from Buildbot as "Required"
>
> Is that all?
>
> --
>
> If I check the "supported" group in Buildbot, I see many more builders.
> The problem is that a significant number of slaves are down and those
> builders are not available.
>
> 
>
> Is Fedora still supported and required?
>
> ---
>
> I suggest to use GitHub "Required" marker to document what platforms
> are supported.
>
> We don't have time to maintain the infrastructure, so I suggest to
> drop support for anything that is not supported by Travis and
> Appveyor.
>

It would help to have a list of what coverage this would remove.  What
platforms are only covered by Travis and Appveyor?   What tests are only
run there?  What platforms are only covered by Buildbot?  What tests are
only run there?

Without this information, it's not really possible to make an informed
decision.  No user cares about whether we drop buildbot.  Some user might
care if we, for example, drop HTTP support.


>
> I know that this might be disruptive.
> I think that we need it in order to raise awareness that supporting a
> platform is not easy.
> If someone (including me) cares about a platform they should find a
> way to help to project supporting that platform.
>
>
Note that some people cared about some platforms and they found a way to
help by donating a buildslave.  Do the operators of the offline slaves
*know* that the slaves are offline?  Maybe all that's missing is some
notification to the operators when their slave goes away.  If that's all,
jumping straight to "throw away all of buildbot" seems like an overreaction.

Jean-Paul





[Twisted-Python] Waiting for a contended resource

2018-03-12 Thread Richard van der Hoff

Hi folks,

I thought I'd poll the list on the best way to approach a problem in 
Twisted.


The background is that we have a number of resources which can be 
requested by a REST client, and which are calculated on demand. The 
calculation is moderately expensive (can take multiple seconds), so the 
results of the calculation are cached so multiple lookups of the same 
resource are more efficient.


The problem comes in trying to handle multiple clients requesting the 
same resource at once. Obviously if 200 clients all request the same 
resource at the same time, we don't want to fire off 200 calculation 
requests.


The approach we adopted was, effectively, to maintain a lock for each 
resource:



from twisted.internet import defer

lock = defer.DeferredLock()
cached_result = None

@defer.inlineCallbacks
def getResource():
    # The cache is module-level state shared by all callers.
    global cached_result
    yield lock.acquire()
    try:
        if cached_result is None:
            cached_result = yield do_expensive_calculation()
        defer.returnValue(cached_result)
    finally:
        lock.release()


(Of course one can optimise the above to avoid getting the lock if we 
already have the cached result - I've omitted that for simplicity.)


That's all very well, but it falls down when we get more than about 200 
requests for the same resource: once the calculation completes, we can 
suddenly serve all the requests, and the Deferreds returned by 
DeferredLock end up chaining together in a way that overflows the stack.


I reported this as http://twistedmatrix.com/trac/ticket/9304 and, at the 
time, worked around it by adding a call to reactor.callLater(0) into our 
implementation. However, Jean-Paul's comments on that bug implied that 
we were approaching the problem in completely the wrong way, and instead 
we should be avoiding queuing up work like this in the first place.


It's worth reiterating that the requests arrive from REST clients which 
we have no direct control over. We *could* keep track of the number of 
waiting clients, and make the API respond with a 5xx error or similar if 
that number gets too high, with the expectation that the client retries 
- but one concern would be that the load from the additional HTTP 
traffic would outweigh any efficiency gained by not stacking up Deferreds.
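
For concreteness, here is roughly what that load-shedding could look like
with twisted.web (the MAX_WAITERS threshold, the Retry-After value, and the
assumption that getResource() fires with a ready-to-send bytes body are all
illustrative, not something we actually run):

from twisted.web import resource, server

MAX_WAITERS = 200  # illustrative threshold, not tuned

class CachedResource(resource.Resource):
    isLeaf = True
    waiting = 0  # requests currently blocked on the shared calculation

    def render_GET(self, request):
        if self.waiting >= MAX_WAITERS:
            # Shed load: ask the client to come back later.
            request.setResponseCode(503)
            request.setHeader(b"Retry-After", b"5")
            return b"busy, please retry\n"
        self.waiting += 1
        d = getResource()  # the cached lookup above, assumed to fire with bytes

        def deliver(body):
            self.waiting -= 1
            request.write(body)
            request.finish()

        def failed(reason):
            self.waiting -= 1
            request.setResponseCode(500)
            request.finish()

        d.addCallbacks(deliver, failed)
        return server.NOT_DONE_YET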


So, I'd welcome any advice on better ways to approach the problem.

Richard



Re: [Twisted-Python] Waiting for a contended resource

2018-03-12 Thread L. Daniel Burr
Hi Richard,
On March 12, 2018 at 1:49:41 PM, Richard van der Hoff (rich...@matrix.org) 
wrote:

Hi folks,

I thought I'd poll the list on the best way to approach a problem in 
Twisted.

The background is that we have a number of resources which can be 
requested by a REST client, and which are calculated on demand. The 
calculation is moderately expensive (can take multiple seconds), so the 
results of the calculation are cached so multiple lookups of the same 
resource are more efficient.

The problem comes in trying to handle multiple clients requesting the 
same resource at once. Obviously if 200 clients all request the same 
resource at the same time, we don't want to fire off 200 calculation 
requests.

The approach we adopted was, effectively, to maintain a lock for each 
resource:

> lock = defer.DeferredLock()
> cached_result = None
>
> @defer.inlineCallbacks
> def getResource():
>     yield lock.acquire()
>     try:
>         if cached_result is None:
>             cached_result = yield do_expensive_calculation()
>         defer.returnValue(cached_result)
>     finally:
>         lock.release()

(Of course one can optimise the above to avoid getting the lock if we 
already have the cached result - I've omitted that for simplicity.)

That's all very well, but it falls down when we get more than about 200 
requests for the same resource: once the calculation completes, we can 
suddenly serve all the requests, and the Deferreds returned by 
DeferredLock end up chaining together in a way that overflows the stack.

I reported this as http://twistedmatrix.com/trac/ticket/9304 and, at the 
time, worked around it by adding a call to reactor.callLater(0) into our 
implementation. However, Jean-Paul's comments on that bug implied that 
we were approaching the problem in completely the wrong way, and instead 
we should be avoiding queuing up work like this in the first place.


You mention using callLater to solve this problem, so I’m guessing that instead 
of using a lock you are re-scheduling the call to getResource if there is no 
cached_result value.  I’ve used this solution plenty of times across multiple 
projects, and have found it both simple and reliable.  Is there some reason why 
this solution is not desirable in your case?
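
Just to be concrete, the shape I have in mind is roughly the following
sketch (the 0.1 second poll interval, the module-level flags, and the
assumption that do_expensive_calculation returns a Deferred are all mine,
for illustration only):

from twisted.internet import defer, reactor, task

cached_result = None
calculation_started = False

def getResource():
    global cached_result, calculation_started
    if cached_result is not None:
        return defer.succeed(cached_result)
    if not calculation_started:
        # First caller kicks off the calculation and caches the result.
        calculation_started = True
        d = do_expensive_calculation()  # assumed to return a Deferred

        def store(result):
            global cached_result
            cached_result = result
            return result

        return d.addCallback(store)
    # A calculation is already in flight: check again shortly instead of
    # queueing this caller on a shared lock or Deferred.
    return task.deferLater(reactor, 0.1, getResource)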

It's worth reiterating that the requests arrive from REST clients which 
we have no direct control over. We *could* keep track of the number of 
waiting clients, and make the API respond with a 5xx error or similar if 
that number gets too high, with the expectation that the client retries 
- but one concern would be that the load from the additional HTTP 
traffic would outweigh any efficiency gained by not stacking up Deferreds.


Have you validated this concern through load-testing?  You may find that there 
is no meaningful negative impact to this approach.

So, I'd welcome any advice on better ways to approach the problem.

Richard
Hope this helps,

L. Daniel Burr


Re: [Twisted-Python] Waiting for a contended resource

2018-03-12 Thread Ilya Skriblovsky
Hi, Richard,

I've used a class like this to cache the result of the expensive calculation:

from twisted.internet import defer

class DeferredCache:
    pending = None
    result = None
    failure = None

    def __init__(self, expensive_func):
        self.expensive_func = expensive_func

    def __call__(self):
        if self.pending is None:
            # First call: start the calculation and remember its outcome.
            def on_ready(result):
                self.result = result

            def on_fail(failure):
                self.failure = failure

            self.pending = defer.maybeDeferred(
                self.expensive_func).addCallbacks(on_ready, on_fail)

        # Every caller gets a callback on the same pending Deferred.
        return self.pending.addCallback(self._return_result)

    def _return_result(self, _):
        return self.failure or self.result

Using it you can get rid of DeferredLocks:

deferred_cache = DeferredCache(do_expensive_calculation)

def getResource():
    return deferred_cache()

It will start `expensive_func` on the first call. The second and
subsequent calls will return Deferreds that resolve with the result when
expensive_func is done. If you call it when the result is already
available, it will return an already-fired Deferred.

Of course, it will require some more work if you need to pass arguments to
`expensive_func` and memoize results per argument values.
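
For example, a keyed variant might look roughly like this (just a sketch:
it assumes the arguments are hashable, and `KeyedDeferredCache` is a
made-up name wrapping the `DeferredCache` class above):

class KeyedDeferredCache:
    """Memoize expensive_func separately for each argument tuple."""

    def __init__(self, expensive_func):
        self.expensive_func = expensive_func
        self._caches = {}  # args tuple -> DeferredCache

    def __call__(self, *args):
        if args not in self._caches:
            # One single-result DeferredCache per distinct argument tuple.
            self._caches[args] = DeferredCache(
                lambda args=args: self.expensive_func(*args))
        return self._caches[args]()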

-- ilya



Re: [Twisted-Python] Waiting for a contended resource

2018-03-12 Thread Jean-Paul Calderone
On Mon, Mar 12, 2018 at 3:52 PM, Ilya Skriblovsky  wrote:

> Hi, Richard,
>
> I've used class like this to cache the result of Expensive Calculation:
>
> class DeferredCache:
>     pending = None
>     result = None
>     failure = None
>
>     def __init__(self, expensive_func):
>         self.expensive_func = expensive_func
>
>     def __call__(self):
>         if self.pending is None:
>             def on_ready(result):
>                 self.result = result
>             def on_fail(failure):
>                 self.failure = failure
>
>             self.pending = defer.maybeDeferred(
>                 self.expensive_func).addCallbacks(on_ready, on_fail)
>
>         return self.pending.addCallback(self._return_result)
>
>
This seems like basically a correct answer to me.  However, I suggest one
small change.

You probably want to create and return a new Deferred for each result.  If
you don't, then your internal `pending` Deferred is now reachable by
application code.

As written, an application might (very, very reasonably):

d = getResource()
d.addCallback(long_async_operation)

Now `pending` has `long_async_operation` as a callback on its chain.  This
will prevent anyone else from getting a result until `long_async_operation`
is done.

You can fix this by:

result = Deferred()
self.pending.addCallback(self._return_result).chainDeferred(result)
return result

Now the application can only reach `result`.  Nothing they do to `result`
will make much difference to `pending` because `chainDeferred` only puts
`callback` (and `errback`) onto `pending`'s callback chain.  `callback` and
`errback` don't wait on anything.

You have to be a little careful with `chainDeferred` because it doesn't
have the recursion-avoidance logic that implicit chaining has.  However,
that doesn't matter in this particular case because the chain depth is
fixed at two (`pending` and `result`).  The problems only arise if you
extend the chain out in this direction without bound.
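
Putting the two pieces together, the whole class might end up looking
roughly like this (a sketch of the idea described above, not something
taken from Twisted or tested against the original workload):

from twisted.internet import defer

class DeferredCache:
    """The cache above, reworked so each caller gets its own Deferred."""

    def __init__(self, expensive_func):
        self.expensive_func = expensive_func
        self.pending = None
        self.result = None
        self.failure = None

    def __call__(self):
        if self.pending is None:
            self.pending = defer.maybeDeferred(self.expensive_func)
            self.pending.addCallbacks(self._on_ready, self._on_fail)
        # Hand out a fresh Deferred per caller; callbacks the caller adds to
        # it cannot hold up anyone else waiting on the shared `pending`.
        result = defer.Deferred()
        self.pending.addCallback(self._return_result).chainDeferred(result)
        return result

    def _on_ready(self, result):
        self.result = result

    def _on_fail(self, failure):
        self.failure = failure

    def _return_result(self, _):
        return self.failure or self.result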

Jean-Paul



Re: [Twisted-Python] Waiting for a contended resource

2018-03-12 Thread Ilya Skriblovsky
Thanks for the correction, Jean-Paul, you're absolutely right.


Re: [Twisted-Python] Waiting for a contended resource

2018-03-12 Thread Richard van der Hoff
Thank you all for the answers so far, particularly to Ilya and
Jean-Paul, who provided some very helpful code samples.


It's interesting to realise that, by avoiding locking, we can end up 
with a much more efficient implementation. I'll have to figure out how 
widely we can apply this technique - and how often it's going to be 
worth rewriting things to allow that. Thanks for some useful pointers!


Richard




Re: [Twisted-Python] List of required builds before a merge

2018-03-12 Thread Glyph


> On Mar 12, 2018, at 4:59 AM, Adi Roiban  wrote:
> 
> Hi,
> 
> It is not clear to me what builders need to pass before we can merge 
> something.
> 
> I expect that all supported "platforms" need to pass, but it is not
> clear what are the currently supported platforms.
> 
> We have this info in the wiki but it does not help.
> https://twistedmatrix.com/trac/wiki/ReviewProcess#Authors:Howtomergethechangetotrunk
> 
> In GitHub I can see Travis / Appveyor and OS X from Buildbot as "Required"

These are marked as "required" because they're necessary, but not sufficient.

> Is that all?

Ideally all the supported buildbots should be passing.  It's a real shame that 
the offline buildbots do not report any status, because it makes it very easy 
to miss them.  (I personally did not realize the ramifications of the way 
buildbot reports status until I looked at 
https://buildbot.twistedmatrix.com/boxes-all?branch=trunk&num_builds=10 just 
now.)

> If I check the "supported" group in Buildbot, I see many more builders.
> The problem is that a significant number of slaves are down and those
> builders are not available.

So, normally I'd say, like Jean-Paul did, that we should just get in touch with 
the maintainers of the buildbots in question.

But it seems the buildbots in question were the ones we had running on our 
donated Rackspace Cloud account.

Logging into the control panel for that account, literally all the servers 
except for the buildmaster (i.e. buildbot.twistedmatrix.com) have been 
deleted.  Not just shut down, but completely gone.  This is baffling to me.  
I do not know who could have done this or why.  There does not appear to be an 
audit log I can consult.  Based on billing data, and consistent with the 
buildbot logs, it appears that this occurred some time in early January.

> Is Fedora still supported and required?

That's the hope.  Those buildbots appear to be online.

> I suggest to use GitHub "Required" marker to document what platforms are 
> supported.

I want to agree with you.  However, our tests are not reliable or performant 
enough for this.

The "required" marker makes it impossible to merge changes without a passing 
status or an administrator override.  This has an unfortunate set of 
corollaries.  Assuming a non-administrator reviewer:

1. If a single builder has a temporary configuration issue and you're not an 
   administrator, you can't merge any code.

2. Let's say the probability of an intermittent test failure on any given 
   builder is 1 in 50, a 2% chance, so the probability of the test suite 
   passing on that builder is 98%.  We have 36 supported builders.  The 
   probability of every builder passing in a single run is then 0.98^36, 
   roughly 48%, so something like half of all valid branches would fail to 
   land on the first try (see the quick check below this list).  (I think our 
   probability is actually quite a bit better than this these days, but you 
   get my drift.)

3. Even if a contributor can force all the builds to re-run (which requires 
   special permissions, and thus needs to wait for a project member), getting 
   a successful run on every builder could require 2 or 3 tries, which could 
   be 2 or 3 hours of waiting just to get one successful run on a platform 
   that you know is not relevant to the change you're testing.
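
A quick check of that arithmetic, assuming 36 independent builders and a 2% 
intermittent-failure rate on each (both numbers are the rough estimates from 
point 2 above, not measurements):

per_builder_pass = 0.98   # one builder's chance of a run with no intermittent failure
builders = 36

all_green = per_builder_pass ** builders
print("chance a valid branch goes green everywhere in one run: %.0f%%" % (all_green * 100))
print("expected full re-runs to get one all-green result: %.1f" % (1 / all_green))
# Prints roughly 48% and 2.1 respectively.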

Therefore keeping a small core set of "must pass" statuses and allowing for 
some human judgement about the rest is a practical necessity given the level of 
compute resources available to us.

> We don't have time to maintain the infrastructure, so I suggest to
> drop support for anything that is not supported by Travis and
> Appveyor.

My preference would be to simply drop all the buildbots which have been (for 
some reason) destroyed from the supported build matrix, since the buildbots are 
still covering a multiplicity of kernels and environments that Travis and 
Appveyor aren't.  But I don't have the time to do much more than write this 
email, so if we have no other volunteers for maintenance, I will support your 
decision to tear down the buildbots for now.

Jean-Paul recently pointed out that CircleCI has much more performant macOS 
builds than Travis, so if someone were motivated to make that change but didn't 
want to keep maintaining hardware, that might be one way to go.

> I know that this might be disruptive.
> I think that we need it in order to raise awareness that supporting a
> platform is not easy.

I do hope that this will provoke some potential volunteers to come forward to 
help maintain our failing infrastructure.

> If someone (including me) cares about a platform they should find a way to 
> help to project supporting that platform.

> What do you think?

I do hope that if you're going to make a change, you'll consider something 
slightly less drastic than blowing up the buildbots entirely :).  But with a 
dozen servers having just disappeared with no explanation, it's a course of 
action which at least makes sense.

-g


Re: [Twisted-Python] List of required builds before a merge

2018-03-12 Thread Amber Brown
The buildbots went away after someone said that the RAX hosting was going away,
and I (and a few others) didn't get the (annoyingly quiet) correction that
it only applied to new projects (which was not how the original was written).
All the Twisted list got was the original letter from the SFC without a
"false alarm" followup. By the time people pointed it out, it was too late.

I have the ansible configs to rebuild them all, but unfortunately, Life has
not stopped since January and hasn't got worse. If anyone wants to take a
stab, the ansible configs are in the twisted-infra repo.

- Amber


Re: [Twisted-Python] List of required builds before a merge

2018-03-12 Thread Glyph
On Mar 12, 2018, at 10:27 PM, Amber Brown  wrote:
> 
> The buildbots went after someone said that the RAX hosting was going away, 
> and I (and a few others) didn't get the (annoyingly quiet) correction that it 
> was only for new projects (which was not how the original was written). All 
> the twisted list got was the original letter from the SFC without a "false 
> alarm" followup. By the time people pointed it out, it was too late.
> 
> I have the ansible configs to rebuild them all, but unfortunately, Life has 
> not stopped since January and hasn't got worse. If anyone wants to take a 
> stab, the ansible configs are in the twisted-infra repo.

Thanks for the explanation; I was tearing my hair out (what little remains, 
anyway) trying to figure out what the heck had happened! :)  I do remember that 
ill-fated email.  Normally I'd say we should never deprovision infrastructure 
until it's torn, bloody, from our lifeless hands, but the way the free 
Rackspace account is set up means overages go to bill the SFC and deplete their 
general fund, so I can see not wanting to have anything unnecessary hanging 
around there if the discount were to end.

(For those of you that may not have been informed at the time: 
https://www.theregister.co.uk/2017/10/20/rackspace_ends_discount_hosting_for_open_source_projects/ )

-g
