[Toolserver-l] Two beginner questions
1. I'm testing my skills, and I run my script under cron. The Python script begins with these lines (and it runs):

    # -*- coding: utf-8 -*-
    #!/usr/bin/python
    import os,sys
    if not sys.platform=='win32':
        sys.path.append('/home/alebot/pywikipedia')
        os.chdir('/home/alebot/scripts')

Then I tried to move to batch job scheduling, but... my script gives an error: now the server dislikes the sys.path line. Why? I obviously have to study more, but what should I study, and where? :-(

2. The script brings to life a Python bot that reads RecentChanges at 10-minute intervals through a cron routine. In your opinion, would an IRC bot listening to the it.wikisource IRC channel for recent changes be more efficient? Where can I find a good Python script to read IRC channels?

Thanks - I apologize for such banal questions.

Alex
___
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette
Re: [Toolserver-l] Two beginner questions
IRC listening with Python is fairly easy; just use a socket:

    import socket

    IRC = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    IRC.connect(('irc.freenode.net', 6667))
    while True:
        text = IRC.recv(1024)
        msgs = text.split('\n')
        for msg in msgs:
            # answer the server's keep-alive PING
            if msg.split(' ', 1)[0] == 'PING':
                pong = msg.split(' ', 1)[1]
                IRC.send('PONG %s' % pong)
            print msg

If you want to do things periodically, like writing the output to a file every 10 minutes, you have to set a timeout. Otherwise the script will block at the recv line until it receives data.
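The timeout mentioned above could look roughly like this. It is a minimal Python 3 sketch, not the code from the post: the helper names `read_with_timeout` and `run`, the polling period, and the `max_polls` parameter are assumptions made for illustration. It relies only on the standard `socket.settimeout` call, which makes `recv` raise `socket.timeout` instead of blocking forever.

```python
import socket
import time


def read_with_timeout(sock, timeout):
    """Return received bytes, or None if nothing arrived within `timeout` seconds."""
    sock.settimeout(timeout)
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None


def run(sock, handle_data, periodic_task, interval=600.0, poll=1.0, max_polls=None):
    """Read from `sock`, calling `periodic_task` roughly every `interval` seconds.

    `max_polls` is a hypothetical knob for demonstrations; None means run forever.
    """
    next_run = time.monotonic() + interval
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        data = read_with_timeout(sock, poll)
        if data == b"":
            break  # the peer closed the connection
        if data is not None:
            handle_data(data)
        # The periodic task runs even when no data arrived during this poll.
        if time.monotonic() >= next_run:
            periodic_task()
            next_run += interval
```

With `poll=1.0` the loop wakes up at least once per second, so the periodic task is late by at most about one second even when the channel is silent.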
Re: [Toolserver-l] Two beginner questions
Oops, I forgot to put a newline after the PONG message, like this:

    IRC.send('PONG %s\n' % pong)

The IRC server only tries to process a line once it finds a \n in your message.
Re: [Toolserver-l] Two beginner questions
Sumurai8 (DD):

    text = IRC.recv(1024)
    msgs = text.split('\n')

This seems to have a bug: if there are more than 1024 bytes waiting, you could receive only part of the final message. You will truncate that message, and the next recv will receive the other half (which will then effectively be junk).

- river.
Re: [Toolserver-l] Two beginner questions
2010/12/9 Bryan Tong Minh bryan.tongm...@gmail.com

> Please give the specific error message. It is hard to believe that the error is "the server dislikes sys.path". :-)

It gives an error for that line, precisely mentioning sys.path. I didn't save the message, but I can try to reproduce it.

Alex
Re: [Toolserver-l] Two beginner questions
Alex Brollo wrote:
> 2. The script brings to life a Python bot that reads RecentChanges at 10-minute intervals through a cron routine. In your opinion, would an IRC bot listening to the it.wikisource IRC channel for recent changes be more efficient?

Yes. Especially since you presumably want to get *all* RecentChanges, which makes the 10-minute value arbitrary.
Re: [Toolserver-l] Two beginner questions
2010/12/9 Platonides platoni...@gmail.com

> Yes. Especially since you presumably want to get *all* RecentChanges, which makes the 10-minute value arbitrary.

Thanks to all of you. My 10-minute polling was only a trick to work around my lack of skill with continuous listening. I'll study the socket stuff and your code a little, and then, I guess, I'll ask you again about details and troubles. :-) Bear in mind that I'm VERY slow at learning new routines, and at present I have no idea what precisely a socket is. :-)

Alex
Re: [Toolserver-l] Two beginner questions
It's just a rough idea of how you can make an IRC bot. Possible solutions are making the buffer bigger or preserving the last message if it doesn't end with a \n. For WikiLinkBot the first solution works just fine. (If reading the recent changes every 10 minutes works fine for you, a bigger buffer should do the job: at most 500 edits in 600 seconds, so just make the buffer a little bigger.)

Sumurai8
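The second option, preserving the last message when it does not end with a \n, can be sketched as below. This is a minimal Python 3 sketch under assumptions: the class name `LineBuffer` and its `feed` method are hypothetical, not part of any bot mentioned in this thread. Since TCP is a byte stream, a larger buffer makes split lines less frequent but cannot rule them out, which is why the sketch keeps the unfinished tail.

```python
class LineBuffer:
    """Collects raw bytes from recv() and hands back only complete lines."""

    def __init__(self):
        self._pending = b""  # bytes of a line whose end has not arrived yet

    def feed(self, data):
        """Add freshly received bytes; return the list of complete lines."""
        self._pending += data
        # Everything before the last b"\n" is complete; the rest is kept
        # for the next call (it is b"" when the data ended on a newline).
        *lines, self._pending = self._pending.split(b"\n")
        # IRC lines end in \r\n, so drop the carriage return as well.
        return [line.rstrip(b"\r") for line in lines]
```

In a receive loop this would be used as `for line in buf.feed(sock.recv(1024)): ...`, so a message cut in half by one `recv` is completed by the next one instead of being treated as junk.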
Re: [Toolserver-l] Two beginner questions
On Thu, Dec 9, 2010 at 5:36 PM, Platonides platoni...@gmail.com wrote:
> According to the protocol, it should be a CRLF (\r\n). Although a bare \n seems to be commonly accepted as well.

In fact, some ircds only look at the first 4 chars, PONG, regardless of whether there is a newline at all.

Bryan
Re: [Toolserver-l] Two beginner questions
Well... you can actually send a PONG message every 3 minutes without listening to the IRC channel, and the server will gladly accept that ^_^ . That's what I did back when I didn't know about the timeout option of a socket :) But most of the time it is just better to follow the rules: end each line with \r\n (nice, I didn't know about that, so I changed it in my script :) ), send a PONG message followed by everything that was sent after the PING, and so on.
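Following the rules described above (echo whatever came after PING, and terminate the line with \r\n) might look like this. It is a minimal Python 3 sketch; the function name `pong_reply` is a hypothetical helper introduced for illustration only.

```python
def pong_reply(line):
    """Return the PONG reply for a PING line, or None for any other line."""
    line = line.rstrip("\r\n")
    command, _, payload = line.partition(" ")
    if command != "PING":
        return None
    # Echo the payload unchanged and end the line with CRLF.
    return "PONG %s\r\n" % payload
```

On a connected socket the reply would be sent with something like `sock.sendall(reply.encode("utf-8"))` whenever `pong_reply` returns a string.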
Re: [Toolserver-l] Two beginner questions
Long ago I noticed that the IRC server was kicking my bot out after some time, for some reason. Then I looked closer and noticed that a server ping arrived around the time of each mishap. Alright, so I just added an ad-hoc pong:

    public void responsePing(String line) {
        try {
            out.println("PONG :" + line.substring(line.indexOf(":") + 1));
        } catch (Throwable th) {
            // ...
        }
    }

And told it to go to hell. Pure storytelling is not why I am writing this, though. I have a question: I was returning to the server whatever it sent me as a ping. This is how it worked about two years ago. Has something changed?

M
[Toolserver-l] mysql queries killing script
Hello,

Some time ago it was announced that long-running queries are killed when the replag exceeds some value. I've added a simple piece of code to my tools which prints replag info whenever a query is killed, and a few days ago I got the following result:

--
ERROR 1317 (70100) at line 6874: Query execution was interrupted
last replicated timestamp: 20101207214400
replag: 00:00:01
--

Could anyone explain how a query (even a long-running one) could have been killed when the replag was so low?

mashiah
Re: [Toolserver-l] Two beginner questions
Михајло Анђелковић wrote:
> I was returning to the server whatever it sent me as a ping. This is how it worked about two years ago. Has something changed?

No.
Re: [Toolserver-l] Two beginner questions
Sumurai8 (DD) wrote:
> Well... you can actually send a PONG message every 3 minutes without listening to the IRC channel, and the server will gladly accept that ^_^ .

Some ircds will, with every right to do so, not complete your login to the network in that case. Strangely, I don't see that kind of protection in freenode's ircd-seven, despite its allegedly being protected from the JavaScript spam that plagued the last days of hyperion[1].

1- http://blog.freenode.net/2010/01/javascript-spam/
Re: [Toolserver-l] Two beginner questions
MZMcBride wrote:
> Alex Brollo wrote:
>> Where can I find a good python script to read #irc channels?
>
> Gahhh, this list. Nobody suggested just using Python's Twisted?[1] So much easier than trying to write your own script in Python using sockets and manual pongs and all that jazz.

The process of IRC listening is not that dramatic, regardless of language. It can easily be done by hand.

> You're more than welcome to look around my home directory (check /home/mzmcbride/scripts/irc/) for some IRC bots. The bot I specifically use to relay irc.wikimedia.org to irc.freenode.net is on another server, but I'd be happy to post the code for you if you'd like. His name is snitch and he supports all Wikimedia wikis, multiple channels, and stalks per-page, per-user, or per-wiki.

Interesting. Here’s my RE that parses the RC IRC message in all aspects I know of:

    regexp {:[^ ]+ PRIVMSG #([^ ]+) :(.*?)} $line - channel message
    regexp {\00314\[\[\00307(.*)\00314\]\]\0034 (.*)\00310 \00302(.*)\003 \0035\*\003 \00303(.*)\003 \0035\*\003 \(*\002*\+*([^)]*)\002*\)* \00310(.*?)\003*} $message - title action url user bytes comment

The first line splits the server line into the actual IRC message and the channel (i.e. the wiki) it is coming from. The sending nick is ignored, since no one is allowed to talk at all and because it may change. The second line splits the message into its 6 constituent parts. That works for every single line at the moment (sometimes a detail changes and we are left with a mess), even for a log entry rather than an ordinary edit, because the surrounding markup is present on every line.

Sometimes the message is too long for the IRC format (which allows for 512 bytes including the final \r\n), so beware of cut-off lines.

The REs are in the Tcl-style re_syntax(n) format (since this is taken from my MediaWiki Tcl Library [~gifti/bot/irc.tcl]), but I assume they can easily be adapted to other languages. I use \003 and \002 instead of the raw control characters for better readability and portability. Note that the color codes sometimes have leading zeros and sometimes do not.

Giftpflanze
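As a rough illustration of adapting the two REs above to another language, here is a Python 3 transliteration. It is a sketch under assumptions, not a tested port of the Tcl library: the names `SERVER_LINE`, `RC_MESSAGE`, `parse_rc_line` and the sample line are hypothetical. The trailing groups are anchored with `$`, because in Python an unanchored trailing non-greedy group would match the empty string. When the byte count is wrapped in bold codes, the captured field can still carry a trailing \x02 in this version, so strip it if needed.

```python
import re

# Splits a raw server line into the channel (i.e. the wiki) and the message.
SERVER_LINE = re.compile(r":[^ ]+ PRIVMSG #([^ ]+) :(.*)$")

# Splits the message into title, action, url, user, bytes and comment.
# \x03 is the IRC color control character, \x02 the bold one.
RC_MESSAGE = re.compile(
    r"\x0314\[\[\x0307(.*)\x0314\]\]\x034 (.*)\x0310 "
    r"\x0302(.*)\x03 \x035\*\x03 \x0303(.*)\x03 \x035\*\x03 "
    r"\(*\x02*\+*([^)]*)\x02*\)* \x0310(.*?)\x03*$"
)


def parse_rc_line(line):
    """Return (channel, title, action, url, user, bytes, comment), or None."""
    server = SERVER_LINE.match(line.rstrip("\r\n"))
    if server is None:
        return None
    channel, message = server.groups()
    fields = RC_MESSAGE.match(message)
    if fields is None:
        return None
    return (channel,) + fields.groups()


# Hypothetical sample line, built by hand to fit the pattern above.
SAMPLE = (
    ":rcbot!rc@example.org PRIVMSG #it.wikisource :"
    "\x0314[[\x0307Pagina:Esempio\x0314]]\x034 M\x0310 "
    "\x0302http://example.org/diff\x03 \x035*\x03 \x0303Alebot\x03 \x035*\x03 "
    "(+42) \x0310fixed typo\x03\r\n"
)
```

Lines that do not fit the pattern (for example a PING, or a message cut off at the 512-byte limit) make `parse_rc_line` return None, so the caller can skip or log them.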
Re: [Toolserver-l] Two beginner questions
2010/12/10 Giftpflanze m.p.ropp...@web.de

> Gahhh, this list. Nobody suggested just using Python's Twisted?[1] So much easier than trying to write your own script in Python using sockets and manual pongs and all that jazz.

I'm going to dig as deep as I can into http://krondo.com/?p=1209. Thanks for the suggestion. This will help me with the second step: now that I have my cleanly parsed IRC message... how can I use it for my tasks, sometimes simple, sometimes far from simple, while listening for other messages? I'd try a DIY (do it yourself) way... but I guess it's not such an exotic problem, and that it's much better to study a little first.

> Here’s my RE that parses the RC IRC message in all aspects I know of: [...]

VERY interesting, thank you!

Alex