Re: [Toolserver-l] Status of the toolserver

2013-05-25 Thread Dr. Trigon
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

 yes, there is a minor problem with nginx I haven't had time to fix yet.
 Also there is a harmless error message about quota at login.

The funny thing about the quota error message is that it works correctly
when I am over quota at login, but not when the quota is not
exceeded... ;))

Greetings

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlGgiMQACgkQAXWvBxzBrDDIaQCgr4kDw3LP3BckJmYxMUd4u7YA
gqIAoM+U4xSdPdxLQz4TsjCWJgN4X4OH
=THtV
-END PGP SIGNATURE-

___
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: 
https://wiki.toolserver.org/view/Mailing_list_etiquette

Re: [Toolserver-l] Status of the toolserver

2013-05-25 Thread Tim Landscheidt
(anonymous) wrote:

 yes, there is a minor problem with nginx I haven't had time to fix yet.
 Also there is a harmless error message about quota at login.

 The funny thing about the quota error message is that it works correctly
 when I am over quota at login, but not when the quota is not
 exceeded... ;))

The more irritating thing is that it works on willow, yet
the mounts look identical:

| [tim@passepartout ~]$ for HOST in willow yarrow; do ssh $HOST.toolserver.org mount | grep -w /sge; done
| /sge on ha-sge.esi:/global/misc/sge remote/read/write/setuid/devices/rstchown/vers=3/proto=udp/xattr/dev=4b6 on Sun May 19 20:45:25 2013
| ha-sge.esi:/global/misc/sge on /sge type nfs (rw,proto=udp,vers=3,addr=10.24.1.16)
| [tim@passepartout ~]$

Either Solaris's quota is silent about not being able to access
some file systems, or ha-sge.esi seems to be blocked
from yarrow, but not from willow (host ha-nfs.esi yields
the same on both).

Tim
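
The "mounts look identical" comparison above can be made mechanical. The
following is only a minimal sketch in Python, assuming the two output formats
shown in the transcript (the first line in Solaris `mount` style, the second
in Linux style); the function names parse_mount_line and same_nfs_mount are
hypothetical and not part of any Toolserver tooling. It compares only what
`mount` reports, so it cannot tell whether ha-sge.esi is actually reachable
from a given host.

```python
import re

# Linux `mount` output: <source> on <mountpoint> type <fstype> (<opt,opt,...>)
LINUX = re.compile(r"^(?P<src>\S+) on (?P<mnt>\S+) type \S+ \((?P<opts>[^)]*)\)$")
# Solaris `mount` output: <mountpoint> on <source> <opt/opt/...> on <date>
SOLARIS = re.compile(r"^(?P<mnt>\S+) on (?P<src>\S+) (?P<opts>\S+) on .+$")


def parse_mount_line(line):
    """Return (source, mountpoint, options) for one line of `mount` output."""
    line = line.strip()
    match, sep = LINUX.match(line), ","
    if match is None:
        match, sep = SOLARIS.match(line), "/"
    if match is None:
        raise ValueError("unrecognised mount line: %r" % line)
    options = {}
    for item in match.group("opts").split(sep):
        key, _, value = item.partition("=")
        options[key] = value  # flags without "=" get an empty value
    return match.group("src"), match.group("mnt"), options


def same_nfs_mount(line_a, line_b, keys=("vers", "proto")):
    """True if both lines name the same source, mount point and listed options."""
    src_a, mnt_a, opts_a = parse_mount_line(line_a)
    src_b, mnt_b, opts_b = parse_mount_line(line_b)
    return (src_a, mnt_a) == (src_b, mnt_b) and all(
        opts_a.get(k) == opts_b.get(k) for k in keys
    )


# The two lines are copied from the transcript above.
first = ("/sge on ha-sge.esi:/global/misc/sge "
         "remote/read/write/setuid/devices/rstchown/vers=3/proto=udp/xattr/dev=4b6 "
         "on Sun May 19 20:45:25 2013")
second = "ha-sge.esi:/global/misc/sge on /sge type nfs (rw,proto=udp,vers=3,addr=10.24.1.16)"
print(same_nfs_mount(first, second))
```

Only the NFS version and transport are compared by default, since the two
systems name their remaining flags differently.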



Re: [Toolserver-l] Status of the toolserver

2013-05-23 Thread Patricia Pintilie
General chat • Re: Navigating Multiboot GRUB2 menu entries successfully.
http://forum.porteus.org/viewtopic.php?t=2195p=15042#p15042
On May 20, 2013 5:44 PM, DaB. w...@daniel.baur4.info wrote:

 Hello all,
 At Tuesday 21 May 2013 00:42:31 DaB. wrote:
Is SVN supposed to be down still?

 yes, there is a minor problem with nginx I haven't had time to fix yet. Also
 there is a harmless error message about quota at login.
 I will try to fix the SVN problem tomorrow.

 Sincerely,
 DaB.


 --
 Userpage: [[:w:de:User:DaB.]] — PGP: 0x2d3ee2d42b255885


Re: [Toolserver-l] Status of the toolserver

2013-05-20 Thread Dr. Trigon
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Is SVN supposed to be down still?

DrTrigon


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlGZ4K0ACgkQAXWvBxzBrDAc9gCgyba+G0ZPh1zJhm2xm08y7Ii7
h0sAn3Tj7pwG2QnSjkpeiPT4a6hbTonF
=zOZd
-END PGP SIGNATURE-


Re: [Toolserver-l] Status of the toolserver

2013-05-20 Thread Dr. Trigon
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Please make SVN work... :)

On 20.05.2013 12:39, Jason Y. Lee wrote:
 Please let me know if there is any assistance I can provide.
 
 Regards,
 
 AllyUnion
 
 
 On Mon, May 20, 2013 at 1:37 AM, Dr. Trigon dr.tri...@surfeu.ch wrote:
 
 Is SVN supposed to be down still?
 
 DrTrigon
 
 

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlGaRBIACgkQAXWvBxzBrDDivgCcCYo4eJdVnzYilaRoiT2xx5pF
y1sAnj04GD4z4s2A4qJCVzL0gmOeCSjV
=+Yr1
-END PGP SIGNATURE-


Re: [Toolserver-l] Status of the toolserver

2013-05-19 Thread DaB.
Hello all,
At Sunday 19 May 2013 23:46:02 DaB. wrote:
 SGE will not be down the whole time, but better expect that it can be down
 at any time during that timeframe.
 The LDAP move will happen at an unknown time tomorrow, but the
 downtime should not be more than a few minutes

the SGE move went more or less without a problem and everything seems to
work AFAIS. It was noticed during the move that the Solaris version of
qcronsub was broken, and that was fixed on the fly too.
The LDAP move is not complete yet, and Nosy will continue with it tomorrow. So
if you notice an LDAP or (file) permission problem tomorrow, that is nothing to
worry about.

Good night,
DaB.

-- 
Userpage: [[:w:de:User:DaB.]] — PGP: 0x2d3ee2d42b255885



Re: [Toolserver-l] Status of the toolserver

2013-05-18 Thread DaB.
Hello all,
At Saturday 18 May 2013 22:29:07 DaB. wrote:
 We can only think of one solution: replacing the Solaris on the ha-nodes
 with Linux. But this cannot start before Friday, and it will take some
 time until everything is moved over.

I started today to move some services to the Linux version of the ha-cluster.
So far Nagios (without web interface), the SQL tunnel to the WMF and the
ts-irc-bot have moved over. The next big things are SGE and LDAP, which will
move tomorrow (Sunday). For this I announce a total downtime of SGE for

TOMORROW, between 18:00 and 22:00 UTC.

SGE will not be down the whole time, but better expect that it can be down
at any time during that timeframe.
The LDAP move will happen at an unknown time tomorrow, but the downtime
should not be more than a few minutes.

Sincerely,
DaB.


-- 
Userpage: [[:w:de:User:DaB.]] — PGP: 0x2d3ee2d42b255885



Re: [Toolserver-l] Status of the toolserver

2013-05-15 Thread Patricia Pintilie
Sure
On May 14, 2013 2:53 PM, Marc A. Pelletier m...@uberbox.org wrote:

 On 05/13/2013 05:01 PM, DaB. wrote:
  The problem is that both ha-nodes run Solaris and none of the roots are
  Solaris experts, which makes it hard for us to find errors or, in this case,
  impossible.

 There is a former colleague of mine with whom I've kept in contact who is
 a serious high-grade guru with Solaris.  Would you like me to put him in
 contact with you guys?  Maybe he can give a hand or lend expertise?

 -- Marc




Re: [Toolserver-l] Status of the toolserver

2013-05-14 Thread Alex Brollo
Just some quick feedback: some hours ago I could log in again, but qstat gave
an error message while crontab was running normally; now qstat runs again.

At present only an IRC script is running under the Alebot account, which can
be considered a test routine; should I stop it to make the server update easier?

Alex




Re: [Toolserver-l] Status of the toolserver

2013-05-14 Thread Russell Blau
On Mon, May 13, 2013, at 05:01 PM, DaB. wrote:
 Repairing the ha-nodes has top priority, so everything else will
 be delayed (Linux update, database reimports, account creation (for VERY
 important ones send me a mail), etc.).
 
 If you have questions, please send them to the ML.

Is the current outage of replication on sql-s1-user (now approaching 48
hours) related to this ha-node problem?  At least some other dbs seem to
still have replication working.

-- 
  Russell Blau
  russb...@imapmail.org


Re: [Toolserver-l] Status of the toolserver

2013-05-14 Thread Patricia Pintilie
Linux is your best bet. Also, errors 404 and 401 are non-responsive. I can
connect to all servers, but on 2 of them msg/nickserver/password is the 401
and 404 error stub. See if this information helps you; if not, write me back.
Best Regards [MILASTARX]:[TS]


[Toolserver-l] Status of the toolserver

2013-05-13 Thread DaB.
Hello all,

as you have surely noticed, the toolserver is even more unstable and unreliable
than normal at the moment. The reason is that our ha-nodes are no longer
working as intended, and neither Nosy nor I am able to fix this.

A quick word on what ha-nodes are: the "ha" stands for "high availability", and
we have 2 servers for that. Some services at the toolserver are so important
that downtime is unacceptable (like /home, LDAP or the DNS), and for this reason
these services live on the ha-nodes. If one server goes down or crashes, then
the other can continue to operate all services with little or no interruption
and without intervention by a root. That worked great as long as River was
here and not so well in the last months, but now it is totally broken.
The problem is that both ha-nodes run Solaris and none of the roots are Solaris
experts, which makes it hard for us to find errors, or in this case impossible.
We have set up a very ugly workaround, but it is not stable, and so the downtime
of important services causes downtime for the whole toolserver, and more work
for the roots.

We can only think of one solution: replacing the Solaris on the ha-nodes with
Linux. But this cannot start before Friday, and it will take some time until
everything is moved over. It will also cause some hours of complete downtime
while /home is copied (we will announce this separately). In the best case
everything will be working again when Whitsun is over; in the worst case it
will need 2 weeks (I will be away between the 21st and the 26th for the general
meeting of WMDE).
Repairing the ha-nodes has top priority, so everything else will be
delayed (Linux update, database reimports, account creation (for VERY
important ones send me a mail), etc.).

If you have questions, please send them to the ML.

Sincerely,
DaB.

-- 
Userpage: [[:w:de:User:DaB.]] — PGP: 0x2d3ee2d42b255885

