Re: [fedora-arm] Regarding Seneca Issues

2012-11-29 Thread Peter Robinson
On Wed, Nov 28, 2012 at 8:45 PM, Jon Chiappetta wrote:
>> > 6) Your topic here
>>
>> Koji: There have been a number of issues over the last couple of weeks
>> that have been brought up and I haven't seen any form of update from
>> Seneca as to the state:
>>
>> - DB perf tuning, it was done or at least there was an outage. What
>> was the outcome
>
> * The dump and restore worked in shrinking our db size and cleaning
> old entries from our tables. Auto-vacuum seems to be broken or not
> working as we expected it but manual vacuum seems to work well on
> all of the tables.

Excellent, thanks for the update.

>> - repo issues (the generally perl based build failures due to repo
>> issues). I reported I thought I had found the offending host but the
>> issue appears to have come back. Was the host re-enabled, what testing
>> has Seneca done?
>
> * Could you please provide the specific task links as examples or names
> of hosts that are causing the problem so we could diagnose the problem
> and look into resolving it?

No, I can't, because I don't have spare time to dig through koji
builds. I have provided lots of them in the past on the IRC channel,
but basically, look through this list:

http://arm.koji.fedoraproject.org/koji/tasks?state=failed&view=tree&method=all&order=-id

If you get a build like this:
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=1271590

that has the "BuildrootError: could not init mock buildroot, mock
exited with status 30; see root.log for more information" error, you're
basically guaranteed to find 100s of examples, and it covers most of
our failures nowadays.
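
As a rough illustration of how failures of this kind could be tallied,
here is a minimal sketch. It assumes the failure messages have already
been collected as plain strings; the helper name count_mock_failures is
hypothetical, and nothing here queries Koji itself.

```python
import re
from collections import Counter

# Pattern for the error text quoted above; the exit status is captured.
MOCK_ERROR = re.compile(
    r"BuildrootError: could not init mock buildroot, "
    r"mock exited with status (\d+)"
)


def count_mock_failures(messages):
    """Count failure messages by mock exit status (hypothetical helper)."""
    counts = Counter()
    for message in messages:
        # Collapse whitespace so messages wrapped across lines still match.
        match = MOCK_ERROR.search(" ".join(message.split()))
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Grouping by exit status keeps the buildroot init failures separate from
unrelated build errors, which are simply ignored by the pattern.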

>> - builder issues, seeing issues with things like the disk space on the
>> large builders without a resolution, or a resolution being reported.
>> What is the status, is it fixed?
>
> * What are the problems with the large builders? Are there RAM issues,
> permission issues, low space issues? Which builders need to be looked at
> specifically because it's hard to solve this without the needed info.

There are a number of issues that I've pinged on IRC about. I don't
have access to the platform (and I'm quite happy to keep it that way),
so when I see an issue with builders I normally disable them so they
stop causing build issues, then report the builder to the channel with
as much detail as I can tell from my build experience and what I
suspect is wrong. I expect the people in charge of the platform to
investigate (I can't and don't want to do everything, I don't scale
that well). The large builder channel had issues with the Trimslices
due to the HDD partition sizing, and some of them dgilmore couldn't
access, and he was questioned "Why do you need access to this".

>> - Some people have remote access to the builders via a ssh key but it
>> appears that not all build hosts are configured for this. What's the
>> steps to resolution so that people can support the platform?
>
> * We specifically setup bcfg2 across all builders which helps to distribute
> our keys to them. If you would like access as you are describing then
> please contact one of us and we can generate an ssh key on hongkong so
> that you can login to the builders without a password. I believe this option
> was offered by one of my co-workers but was denied by a certain person so
> ya. :/

I don't want access, I've never requested access, and I've never been
offered access to it. I'm quite happy with that situation. Dennis has
access and always has had, but there were particular build hosts that
he couldn't access while he could access others, and he should be
able to access all of them. So this is a corner case for specific
hosts which obviously have missing bits in their bcfg2 config or
something (pure speculation here).

The only real issue I have with all of the above is communication.
It's the only real problem I've ever had and something I've asked
people to be aware of regularly, whether it be communication about
maintenance, acknowledgement of an issue and who is working on it, or
when it's fixed. The reason I brought this up is because it seems like
a black hole at the moment. This is not directed at anyone in
particular but is a general observation. If you're not sure what I'm
asking, please ask me and I'll give you as much information and
direction as I can to help out.

Peter
___
arm mailing list
arm@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/arm

Re: [fedora-arm] Regarding Seneca Issues

2012-11-28 Thread Chris Tyler
On Wed, 2012-11-28 at 23:17 -0500, Jon Masters wrote:
> Thing is, and not speaking for Brendan, but I'm sure he agrees. None of
> this is personal or meant to detract from the great work being done at
> Seneca. You guys have been great, and I know that sometimes we all get a
> bit frustrated, and even ranty on IRC. Let's try this though. As
> custodians of Koji you get an awesome new project, which is to figure
> out the best way to do change tracking and notification of outages and
> the like.

This isn't that complicated :-)  We're going to try a Trac instance
connected to our team email - watch for URLs tomorrow.

And... just to help clarify who is who and does what here, I've done a
long-overdue blog post to introduce the current team members:
http://blog.chris.tylers.info/index.php?/archives/268-.html

--
Chris


Re: [fedora-arm] Regarding Seneca Issues

2012-11-28 Thread Jon Masters
On 11/28/2012 09:59 PM, Brendan Conoboy wrote:
> On 11/28/2012 12:45 PM, Jon Chiappetta wrote:
>>> - repo issues (the generally perl based build failures due to repo
>>> issues). I reported I thought I had found the offending host but the
>>> issue appears to have come back. Was the host re-enabled, what testing
>>> has Seneca done?
>>
>> * Could you please provide the specific task links as examples or names
>> of hosts that are causing the problem so we could diagnose the problem
>> and look into resolving it?
>>
>>> - builder issues, seeing issues with things like the disk space on the
>>> large builders without a resolution, or a resolution being reported.
>>> What is the status, is it fixed?
>>
>> * What are the problems with the large builders? Are there RAM issues,
>> permission issues, low space issues? Which builders need to be looked at
>> specifically because it's hard to solve this without the needed info.
> 
> There's a common theme here: Issues are coming up and there isn't an 
> obvious way to report them, track them, or otherwise make sure they're 
> resolved, much less documented.  We need to fix that.

Let me jump in here...

Thing is, and not speaking for Brendan, but I'm sure he agrees. None of
this is personal or meant to detract from the great work being done at
Seneca. You guys have been great, and I know that sometimes we all get a
bit frustrated, and even ranty on IRC. Let's try this though. As
custodians of Koji you get an awesome new project, which is to figure
out the best way to do change tracking and notification of outages and
the like. Perhaps someone could take that on as a project exercise?
Heck, I'd give huge course credit - these are hard problems! :)

Jon.


Re: [fedora-arm] Regarding Seneca Issues

2012-11-28 Thread Brendan Conoboy
On 11/28/2012 12:45 PM, Jon Chiappetta wrote:
>> - repo issues (the generally perl based build failures due to repo
>> issues). I reported I thought I had found the offending host but the
>> issue appears to have come back. Was the host re-enabled, what testing
>> has Seneca done?
>
> * Could you please provide the specific task links as examples or names
> of hosts that are causing the problem so we could diagnose the problem
> and look into resolving it?
>
>> - builder issues, seeing issues with things like the disk space on the
>> large builders without a resolution, or a resolution being reported.
>> What is the status, is it fixed?
>
> * What are the problems with the large builders? Are there RAM issues,
> permission issues, low space issues? Which builders need to be looked at
> specifically because it's hard to solve this without the needed info.

There's a common theme here: Issues are coming up and there isn't an
obvious way to report them, track them, or otherwise make sure they're
resolved, much less documented.  We need to fix that.

>> - Some people have remote access to the builders via a ssh key but it
>> appears that not all build hosts are configured for this. What's the
>> steps to resolution so that people can support the platform?
>
> * We specifically setup bcfg2 across all builders which helps to distribute
> our keys to them. If you would like access as you are describing then
> please contact one of us and we can generate an ssh key on hongkong so
> that you can login to the builders without a password. I believe this option
> was offered by one of my co-workers but was denied by a certain person so ya. :/

Don't really understand this.

> I'm sorry if it seems like I'm not following along here in close detail
> but our team has various other projects that we are working on simultaneously
> and sometimes we don't have time to monitor in close detail what exactly is
> happening in the farm. If you're able to point out specific examples of something
> failing and highlight all of our names in channel, I will at least definitely be
> able to see who it is and what is happening, hopefully.

This doesn't scale well.  The contact names change on a somewhat regular
basis and #fedora-arm and #seneca are not support trackers; they're chat
channels running 24/7, hours that none of us keep.  What's needed is
a consistent point of contact, whether it be a bot, a web tracker, a
separate mailing list, or some other mechanism.  Suggestions welcome!


--
Brendan Conoboy / Red Hat, Inc. / b...@redhat.com