Re: JUST GOT HACKED
On 02.10.2013 13:03, Νίκος wrote:
> I have to make some money and that needs for some reason to happen now as we speak, so i have no alternative than to hop into a car and learn to drive during the process, hoping i will not bang-smash the car.

I'm really sorry that your livelihood seems to depend on your current mess, but: this is not the way to learn to administer servers or how to program. Do that first in one way or another - and then start making money off it. And, from my personal experience, you are exacerbating your problems by acting as you currently are: at some point the mess you leave behind - and which you persistently choose to ignore - will become a liability for the rest of your livelihood, which _will_ then get you into real trouble.

-- --- Heiko.
-- https://mail.python.org/mailman/listinfo/python-list
Re: I haev fixed it
On 01.10.2013 13:06, Νίκος wrote:
> But it seems you don't want to provide an explanation although i think you might have a theory.

You need a theory? 1) Your password(s) has/have been leaked (see the URL referenced somewhere before, and IIRC you also posted your GMail password some time ago), and 2) you reused passwords, so that an attacker who got access to one password was able to compromise more than one of your accounts.

-- --- Heiko.
Re: I haev fixed it
On 01.10.2013 14:06, Νίκος wrote:
> i know about the link you mentioned and i have deleted the source code from there.

Guess what: Google keeps a cache. See here: http://webcache.googleusercontent.com/search?q=cache:http://superhost.gr/~dauwin/cgi-bin/metrites.py

So if you haven't changed your password(s), you'd better do that now.

-- --- Heiko.
Re: Tryign to send mail via a python script by using the local MTA
On 17.09.2013 01:41, Steven D'Aprano wrote:
> I cannot fathom for the life of me a legitimate reason for your website to use a fake IP address and hostname when sending email.

In addition to that: it's amazing that Nikos thinks TCP will still work in the presence of spoofed IP addresses. Email without TCP is a challenge, to say the least.

-- --- Heiko.
Re: Tryign to send mail via a python script by using the local MTA
On 17.09.2013 13:55, Joel Goldstick wrote:
> At least if you want to add to this nonsense, read each of the (several?) dozen entries.

Actually, I have read each of the troll cycles (just as I read much of clp, although I haven't participated much for the last five years), and found most of them to be rather interesting reads (in their digression from the original topic). And: I actually find it rather valuable to answer indirectly to things noticed along the ride (or to read what other people answer indirectly in the same manner) - see my post in this thread where I pointed out that the original code does not sanitize the inputs to the shell command it uses to send mail. But, again, your impression may differ, and I can respect that.

-- --- Heiko.
Re: Having both if() and for() statements in one liner
On 17.09.2013 15:21, Ferrous Cranus wrote:
> ... there must be written on soem way.

You've already given yourself the answer in the initial post. The Python way to write this is:

    if person == "George":
        for times in range(5):
            ...

Why not just use what works and get some actual work done?

-- --- Heiko.
Re: Tryign to send mail via a python script by using the local MTA
On 16.09.2013 13:21, Denis McMahon wrote:
> If he's trying to prove communication works, he might be better off using a message subject of test and a message body of this is a test message.

Generally, he might be best off if he didn't use os.system() with string-interpolated (without escaping or any such) and user-specified (!) parameters to send out the mail using mailx through a sub-shell. This begs for his mailer script to be used for code injection as his web-server user, and I'm amazed that nobody has commented on that so far.

-- --- Heiko.
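The injection risk described above can be avoided by never handing user input to a shell. The following is a minimal sketch, assuming a POSIX-style mailx that accepts `-s SUBJECT RECIPIENT` and reads the message body from standard input; the helper names `build_mail_command` and `send_mail` are hypothetical and not taken from the original script.

```python
import shlex
import subprocess


def build_mail_command(subject, recipient):
    """Build an argument list; no shell ever parses these strings."""
    return ["mailx", "-s", subject, recipient]


def send_mail(subject, body, recipient):
    """Send the mail by running mailx directly, body on standard input."""
    subprocess.run(
        build_mail_command(subject, recipient),
        input=body.encode("utf-8"),
        check=True,  # raise CalledProcessError if mailx fails
    )


# A hostile subject stays one harmless argument instead of a second command.
hostile_subject = "x; rm -rf ~"
argv = build_mail_command(hostile_subject, "user@example.com")

# If a shell string truly cannot be avoided, quote every piece explicitly.
quoted_subject = shlex.quote(hostile_subject)
```

Even with an argument list, a recipient that starts with `-` could still be read as an option by mailx, so user-supplied addresses should be validated as well.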
Re: Tryign to send mail via a python script by using the local MTA
On 16.09.2013 13:37, Ferrous Cranus wrote:
> What i want now is to be able to alter the hostname of my server so the mails wont indicate that they derive from superhost.gr as they aare now sen in the mail headers.

There is no way to do that, as the Received: header which you complain about is inserted by Google's mail servers.

-- --- Heiko.
Re: Tryign to send mail via a python script by using the local MTA
On 16.09.2013 14:11, Ferrous Cranus wrote:
> But even so, if we alter for example the hostname of our server to a different name then wouldn't Google use that to identify the server thus protecting the real identity(hostname that is) of the server that initiated the connection?

Why on earth would you want to do that? Mail routing headers are there for a reason.

-- --- Heiko.
Re: Cannot form correctly the FORM part of the header when sending mail
On 03.09.2013 09:48, Ferrous Cranus wrote:
> Si there a workaround for that please?

Yes, use/set up your own mail server. Google will not allow you to send as (i.e., From:) an arbitrary address other than the one you've authenticated as.

-- --- Heiko.
Re: FSR and unicode compliance - was Re: RE Module Performance
On 29.07.2013 13:43, wxjmfa...@gmail.com wrote:
> 3.2
> timeit.timeit("r = dir(list)")
> 22.300465007102908
>
> 3.3
> timeit.timeit("r = dir(list)")
> 27.13981129541519
>
> For the record, I do not put your example to contradict you. I was expecting such a result even before testing. Now, if you do not understand why, you do not understand. There nothing wrong.

Please give a single *proof* (not your gut feeling) that this is related to the FSR, and not rather due to other side effects such as changes in how dir() works or (as Chris pointed out) more members on the list type in 3.3. If you can't or won't give that proof, there's no sense in continuing the discussion.

-- --- Heiko.
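One way to check the "more members on the list type" explanation is to print the member count next to the timing on each interpreter being compared. A minimal sketch; the helper name `time_dir_list` and the repetition count are arbitrary choices, and the absolute numbers will differ by machine and Python version.

```python
import sys
import timeit


def time_dir_list(number=10_000):
    """Return the seconds taken by `number` executions of r = dir(list)."""
    return timeit.timeit("r = dir(list)", number=number)


# How many names dir() has to collect and sort on this interpreter.
member_count = len(dir(list))
elapsed = time_dir_list()

print(sys.version_info[:2], member_count, "names,", round(elapsed, 4), "s")
```

If the member count differs between the two versions, the timing difference cannot be attributed to the string representation alone.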
Don't feed the troll... (was: Re: A few questiosn about encoding)
On 14.06.2013 10:37, Nick the Gr33k wrote:
> So everything we see like: 16474 nikos abc123 everything is a string and nothing is a number? not even number 1?

Come on now, this is _so_ obviously trolling, it's not even remotely funny anymore. Why doesn't killfiling work with the mailing list version of the python list? :-(

-- --- Heiko.
Re: Don't feed the troll...
On 14.06.2013 11:32, Nick the Gr33k wrote:
> I'mm not trolling man, i just have hard time understanding why numbers acts as strings.

If you can't grasp the conceptual difference between a number and its representation, it's probably best if you stayed away from programming altogether. I don't think you're actually as thick as you sound; rather, either you're simply too damn lazy to take the time to inform yourself from all the hints/links/information you've been given, or you're trolling. I'm still leaning towards the second.

-- --- Heiko.
Re: Don't feed the help-vampire
On 14.06.2013 14:09, rusi wrote:
> Since identifying a disease by the right name is key to finding a cure: Nikos is not trolling or spamming; he is help-vampiring.

Just to explain the trolling allegation: I'm not talking about him wanting to get his scripts fixed; that's most certainly help-vampiring, and an extreme form of it (thanks, btw., for pointing me to that term, whoever did). I was talking about his repeated attempts at making conversation by asking questions about encoding, short-circuit evaluation and such, which seem relevant to solving his problem, but which - due to his persistence in understanding things the wrong way, not understanding them at all, or repeating the same misunderstandings time after time - have drifted off into endless repetitions of the same facts by helpful posters, and have gotten a lot of people seriously annoyed (also due to other things, such as him changing his NNTP hosts and/or From-addresses, which breaks kill-filing).

Now, if that latter behaviour isn't trolling, I don't know what is. Simply nobody who takes what he does at least a little bit seriously is _as_ thick as he makes himself seem.

-- --- Heiko.
Re: Don't feed the troll...
On 14.06.2013 14:45, Nick the Gr33k wrote:
> we are all benefit out of this.

Let's nominate you for a Nobel prize, saviour of python-list!

-- --- Heiko.
Re: Changing filenames from Greeklish = Greek (subprocess complain)
On 05.06.2013 18:44, MRAB wrote:
> From the previous posts I guessed that the filename might be encoded using ISO-8859-7:
>
> s = b"\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3"
> s.decode("iso-8859-7")
> 'Ευχή\\ του\\ Ιησού.mp3'
>
> Yes, that looks the same.

Most probably, his terminal is set to ISO-8859-7, so that when he issues the rename command on the command line of his shell session, the mv command gets a stream of bytes as the new file name which happens to be the ISO-8859-7 encoding of the file name he'd like the file to have. This is what's stored on disk.

So, his biggest problem isn't that the operating system is encoding-agnostic wrt. filenames (i.e., treats them as a stream of bytes), but rather that he's using an ISO-8859-7 terminal window while having set up UTF-8 as his operating system locale, and expects filenames to be encoded in UTF-8 when he's not passing in UTF-8 byte streams from his client computer at all.

-- --- Heiko.
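The mismatch described above can be reproduced without touching the file system. A minimal sketch, assuming the bytes on disk really are ISO-8859-7 as guessed in the thread; the helper name `reencode_filename` is hypothetical.

```python
def reencode_filename(raw, source_encoding="iso-8859-7"):
    """Decode a raw file name with the terminal's encoding, return UTF-8 bytes."""
    return raw.decode(source_encoding).encode("utf-8")


# The bytes from the thread, written with hex escapes and plain spaces.
raw_name = b"\xc5\xf5\xf7\xde \xf4\xef\xf5 \xc9\xe7\xf3\xef\xfd.mp3"

# Interpreted as ISO-8859-7, the bytes are the intended Greek file name.
text_name = raw_name.decode("iso-8859-7")

# Interpreted as UTF-8 (the system locale), the same bytes are invalid.
try:
    raw_name.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# The name the file would need in order to match a UTF-8 locale.
utf8_name = reencode_filename(raw_name)
```

On POSIX systems os.rename() accepts bytes paths, so `os.rename(raw_name, utf8_name)` would be one way to repair an existing file once the source encoding has been confirmed.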
Re: Changing filenames from Greeklish = Greek (subprocess complain)
On 06.06.2013 12:35, Νικόλαος Κούρας wrote:
> ni...@superhost.gr [~/www/data/apps]# ls -l | file -
> /dev/stdin: ASCII text

Did you actually try to understand what I wrote?

-- --- Heiko.
Re: Changing filenames from Greeklish = Greek (subprocess complain)
On 06.06.2013 13:00, Νικόλαος Κούρας wrote:
> Heiko, the ssh client i used to 'mv' the .mp3 was putty.Do you mean that putty is responsible for the encoding mess?

Exactly. Check the encoding that putty uses for the terminal session. If it doesn't use UTF-8, switch your terminal session to UTF-8 and try the rename again. If it does, try another terminal client (I recommend the Cygwin suite).

-- --- Heiko.
Re: Changing filenames from Greeklish = Greek (subprocess complain)
On 06.06.2013 13:24, Νικόλαος Κούρας wrote:
> ni...@superhost.gr [~/www/data/apps]# ls *.mp3 | file -
> /dev/stdin: ASCII text

Again, did you actually read (and try to understand) what I wrote? I said to redo the rename after you change your terminal session to UTF-8.

-- --- Heiko.
Re: Apache and suexec issue that wont let me run my python script
On 05.06.2013 10:53, Νικόλαος Κούρας wrote:
> I ALSO HAVE GIVEN ROOT ACCESS TO ANOTHER MEMBER OF THIS LIST AND HE IN FACT TRIED TO HELP ME INSTEAD OF DOING WHAT YOU DID. AND FROM 2 OTHER PEOPLE AS SOME OTHER FORUMS TOO.

You know what you're saying there? You've given (at least) four people you don't know at all (you know, on the internet nobody knows you're a dog and stuff) - and as such shouldn't trust at all, either - free and full access to a system that's critical to you. That's like handing out keys to the front door of your home to any passer-by on the street who you feel like talking to - and then later wondering why your belongings are suddenly gone.

Seeing how riled up you get about this, what Chris did is for the better. At least it seems that you won't be able to change your root password back, either, and as such you won't have root access to your system for the time being, which makes your system and the internets a safer place for now.

-- --- Heiko.
Re: Apache and suexec issue that wont let me run my python script
On 05.06.2013 11:19, Chris Angelico wrote:
> Not quite accurate; he can change his root password back as soon as he logs in as the non-root user and cats one little file.

I understood that - I rather got the impression that he (as a person) wasn't technically capable of changing it. Alas, the internets didn't remain a better place for long. :-)

-- --- Heiko.
Re: Apache and suexec issue that wont let me run my python script
On 05.06.2013 11:33, Νικόλαος Κούρας wrote:
> It will remain, if you go away.

Look, pal, I work as a programmer for a (medium-sized) network service provider, and due to that I (should) know my networking security 101. It's generally people like you who are:

1) extremely careless about their system
2) intolerably naive and persistently refusing to learn

and who as a consequence hand out root logins for hosts with big (!) pipes to people that should - under no circumstances ever, EVER - be trusted, who are in turn causing the scourge of the public internets that's called a botnet.

It doesn't matter whether you're simply so stupid (yes, I said it!) as to hand out actual root logins or whether you refuse to update your system or whether you use weak passwords: in all cases, your system is compromised, and due to the rather big pipe that your system has, it in turn compromises the integrity of the whole network that the system is connected to.

Chris is completely right: you shouldn't thank him for not doing 'rm -rf /' on your system (that's utter peanuts, and only hits you), you should rather thank him for not copying your complete client data (and in turn their clients' data, let's talk about identity theft) and/or for not installing a bot on your system, which would in turn cause me headaches when the bot is misused for a DDoS or any other form of network-based attack on the network that I need to administer.

It's you who's the untrustworthy, completely unreliable and utterly irresponsible member of the community of networks that's called the Internet. Please go somewhere else.

-- --- Heiko.
Re: Apache and suexec issue that wont let me run my python script
On 05.06.2013 12:21, Νικόλαος Κούρας wrote:
> I dont care what you do for a living, you never helped me a bit in anything, you just presented to me your self 1 hour ago to join the party.

Guess why I did so: you're presently touching on a subject (network safety) that I hold dear, and not only being a troll.

-- --- Heiko.
Re: Apache and suexec issue that wont let me run my python script
On 05.06.2013 12:30, Νικόλαος Κούρας wrote:
> You and Heiko of course would be excluded from the programmer for hire list.

Guess what: I have a job. And I don't give a damn.

-- --- Heiko.
Re: Apache and suexec issue that wont let me run my python script
On 05.06.2013 13:07, Νικόλαος Κούρας wrote:
> Btw, since history doesnt show me his history comamnds when he logged in from .au(why not really?), how can i tell what exactly did he do when he logged on to the server?

As root has full access to your system (i.e., can change file contents and system state at will), and you gave him root access: you can't. And he made sure to remove things such as .bash_history and the syslog contents, I guess. At least that's what I'd have done to prove a point.

-- --- Heiko.
Re: Apache and suexec issue that wont let me run my python script
On 05.06.2013 13:19, Νικόλαος Κούρας wrote:
> Is there some logging utility i can use next time iam offering root access to someone(if i do it) or perhaps logging a normal's account activity?

Short answer: Not for root, no.

Long answer: as I've already said, root can change file contents, or more explicitly _any_ system state, and (s)he can do that at will. As such, once you've had a malicious root user on your system, you can never be sure that what any form of logging tells you is the truth.

Now: think again why it's such a plain stupid and incredibly bad idea to hand out root credentials to people you shouldn't trust, and why people (like me) keep telling you that you're naive and a fool to even consider handing out root logins.

PS: the same is true for normal logins. You don't know whether some form of privilege escalation exists on your system, so even by handing out supposedly safe non-root accounts, your installation might get compromised due to insecure SUID software or due to privilege escalation bugs in the kernel.

-- --- Heiko.
Re: convert string to bytes without changing data (encoding)
On 28.03.2012 11:43, Peter Daum wrote:
> ... in my example, the variable s points to a string, i.e. a series of bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters.

No; a string contains a series of Unicode codepoints, representing natural language characters (at least in the simplistic view, I'm not talking about surrogates). These can be encoded to different binary storage representations, of which ascii is (a common) one.

> What I am looking for is a general way to just copy the raw data from a string object to a byte object without any attempt to decode or encode anything ...

There is logically no raw data in the string, just a series of codepoints, as stated above. You'll have to specify the encoding to use to get at raw data, and from what I gather you're interested in the latin-1 (or iso-8859-15) encoding, as you're specifically referencing chars >= 0x80 (which hints at your mindset being in LATIN-land, so to speak).

-- --- Heiko.
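The closest thing to "copying raw data" is an encoding that maps every byte value to the codepoint with the same number, which is what latin-1 does. A minimal sketch illustrating that property; the variable names are arbitrary, and the euro-sign check is only there to show that iso-8859-15 is similar to latin-1 but not identical.

```python
# Every possible byte value, 0x00 through 0xFF.
raw = bytes(range(256))

# latin-1 decodes byte N to codepoint N, so nothing is lost or rejected.
text = raw.decode("latin-1")
code_points = [ord(ch) for ch in text]

# Encoding with the same codec gives back exactly the original bytes.
round_trip = text.encode("latin-1")

# iso-8859-15 has the euro sign at 0xA4; latin-1 has no euro sign at all.
euro_latin9 = "\u20ac".encode("iso-8859-15")
try:
    "\u20ac".encode("latin-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False
```

The round trip only works because decoding and encoding use the same codec; a string containing codepoints above 0xFF cannot be encoded as latin-1.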
Re: convert string to bytes without changing data (encoding)
On 28.03.2012 19:43, Peter Daum wrote:
> As it seems, this would be far easier with python 2.x. With python 3 and its strict distinction between str and bytes, things gets syntactically pretty awkward and error-prone (something as innocently looking like s=s+'/' hidden in a rarely reached branch and a seemingly correct program will crash with a TypeError 2 years later ...)

It seems that you're mixing things up wrt. the string/bytes distinction; it's not as complicated as it might seem.

1) Strings

    s = "This is a test string"
    s = 'This is another test string with single quotes'
    s = """And this is a multiline test string."""
    s = 'c' # This is also a string...

all create/refer to string objects. How Python internally stores them is none of your concern (actually, that's rather complicated anyway, at least with the upcoming Python 3.3), and processing a string basically means that you'll work on the natural language characters present in the string. Python strings can store (pretty much) all characters and surrogates that Unicode allows, and when the Python interpreter/compiler reads strings from input (I'm talking about source files), a default encoding defines how the bytes in your input file get interpreted as Unicode codepoint encodings (generally, it depends on your system locale or file header indications) to construct the internal string object you're using to access the data in the string. There is no such thing as a type for a single character; single characters are simply strings of length 1 (and so indexing also returns a [new] string object). Single and double quotes work no differently.

2) Bytes

    s = b'this is a byte-string'
    s = b'\x22\x33\x44'

The above define bytes. Think of the bytes type as an array of 8-bit integers, representing nothing but a buffer which you can process as an array of fixed-width integers. Reading raw data from stdin/a file (i.e., in binary mode) gets you bytes, and not a string, because Python cannot automagically guess what format the input is in. Indexing the bytes type returns an integer (which is the clearest distinction between string and bytes). Being able to input string-looking data in source files as bytes is a debatable feature (IMHO; see the first example), simply because it blurs the semantic difference between the two types in the eye of the programmer looking at the source.

3) Conversions

To get from bytes to string, you have to decode the bytes buffer, telling Python what kind of character data is contained in the array of integers. After decoding, you'll get a string object which you can process using the standard string methods. For decoding to succeed, you have to tell Python how the natural language characters are encoded in your array of bytes:

    b'hello'.decode('iso-8859-15')

To get from string back to bytes (you want to write the natural language character data you've processed to a file), you have to encode the data in your string buffer, which gets you an array of 8-bit integers to write to the output:

    'hello'.encode('iso-8859-15')

Most output methods will happily do the encoding for you, using a standard encoding, and if that happens to be ASCII, you'll get UnicodeEncodeErrors which tell you that a character in your string source is unsuited to be transmitted using the encoding you've specified.

If the above doesn't make the string/bytes distinction and usage clearer, and you have a C# background, check out the distinction between byte[] (which the System.IO streams get you), and how you have to use a System.Text.Encoding-derived class to get at actual System.String objects to manipulate character data. Python's type system wrt. character data is pretty much similar, except that it lacks a single-character type (char).

Anyway, back to what you wrote: how are you getting the input data? Why are there high bytes in there for which you do not know the encoding? Generally, from what I gather, you'll decode data from some source, process it, and write it back using the same encoding which you used for decoding, which should do exactly what you want and not get you into any trouble with encodings.

-- --- Heiko.
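The three points above (str indexing, bytes indexing, and decode/process/encode with one encoding) can be tried out in a few lines. A minimal sketch, assuming iso-8859-15 as the data's encoding; the helper name `append_slash` is hypothetical and simply stands in for "some processing".

```python
ENCODING = "iso-8859-15"  # assumption: use whatever the data source really uses


def append_slash(raw, encoding=ENCODING):
    """Decode bytes, work on the text, encode back with the same encoding."""
    text = raw.decode(encoding)
    text = text.upper() + "/"   # str + str: ordinary text processing
    return text.encode(encoding)


text = "h\u00e9llo"             # a str containing one non-ASCII character
data = text.encode(ENCODING)    # the same characters as a bytes buffer

second_char = text[1]           # indexing a str gives a str of length 1
second_byte = data[1]           # indexing bytes gives an int

# Mixing the two types fails immediately instead of guessing an encoding.
try:
    data + "/"
    mixed_ok = True
except TypeError:
    mixed_ok = False

result = append_slash(data)
```

Keeping all processing on str and converting only at the input and output boundaries is what avoids the "TypeError two years later" scenario from the quoted message.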
Re: Inconsistency between os.getgroups and os.system('groups') after os.setgroups()
On 25.03.2012 23:32, jeff wrote:
> After the os.setgroups, os.getgroups says that the process is not in any groups, just as you would expect... I can suppress membership in the root group only by doing os.setgid and os.setuid before the os.system call (in which case I wind up in the group of the new user instead of root), but I have to be able to get back to root privilege so I can't use setgid and setuid.

Simply not possible: you can't drop root privileges (be it by setuid()/setgid() or by removing yourself from groups with setgroups()) and later reacquire them _in the same process_. See the discussion of how to implement privilege separation at http://www.citi.umich.edu/u/provos/ssh/privsep.html (which discusses how this is implemented in OpenSSH): you run multiple processes which communicate through IPC mechanisms, each of which drops the rights it doesn't require. Using IPC to implement reduced-privilege process spawning has a long history; Postfix also comes to mind as an early adopter of a privilege separation mechanism.

-- --- Heiko.
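The multi-process approach can be sketched with a plain fork: the child drops its privileges and does the unprivileged work, while the parent never gives anything up. This is a minimal POSIX-only sketch, not the OpenSSH design; the helper names `drop_privileges` and `run_in_child` are hypothetical, and the drop is only attempted when the process actually runs as root.

```python
import os


def drop_privileges(uid, gid):
    """Permanently drop root in the current process (the order matters)."""
    os.setgroups([])  # clear supplementary groups while still root
    os.setgid(gid)    # then switch the group ID
    os.setuid(uid)    # user ID last; root is gone for this process afterwards


def run_in_child(func, uid=None, gid=None):
    """Fork, optionally drop privileges in the child, and run func there.

    Returns the child's exit code (0 on success, 1 if func raised).
    The parent process keeps whatever privileges it had.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: nothing done here changes the parent's credentials.
        status = 1
        try:
            if uid is not None and os.geteuid() == 0:
                drop_privileges(uid, gid)
            func()
            status = 0
        finally:
            os._exit(status)  # leave without running the parent's cleanup code
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)


euid_before = os.geteuid()
exit_code = run_in_child(lambda: None)
euid_after = os.geteuid()
```

A real setup would also need a channel (a pipe or socket) for passing results back to the parent, which is the IPC part mentioned above.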
Re: how to read serial stream of data [newbie]
On 07.02.2012 14:48, Antti J Ylikoski wrote:
> On 7.2.2012 14:13, Jean Dupont wrote:
>> ser2 = serial.Serial(voltport, 2400, 8, serial.PARITY_NONE, 1, rtscts=0, dsrdtr=0, timeout=15)
>
> In Python, if you want to continue the source line into the next text line, you must end the line to be continued with a backslash '\'.

Absolutely not true, and this is bad advice (stylistically). When (any form of) brackets are open at the end of a line, Python does not start a new statement on the next line but rather continues the bracketed content. So:

    ser2 = serial.Serial(voltport, 2400, 8, serial.PARITY_NONE, 1,
                         rtscts=0, dsrdtr=0, timeout=15)

is perfectly fine and certainly the recommended way of putting this. Adding the backslash continuation is always _possible_, but only _required_ when there are no open brackets. So:

    x = "hello" \
        "test"

is equivalent to:

    x = ("hello"
         "test")

in assigning:

    x = "hello" "test"

-- --- Heiko.
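The same rule applies to every kind of bracket, not just function-call parentheses. A small runnable sketch with made-up values, showing implicit continuation inside square brackets and parentheses next to the backslash form:

```python
# Square brackets: the list literal continues until the closing bracket.
numbers = [
    1, 2, 3,
    4, 5, 6,
]

# Parentheses: a call (here with a generator expression) spans several lines.
total = sum(
    n * n
    for n in numbers
)

# No open bracket: a backslash is required to continue the statement.
joined = "hello" \
    "test"

# With parentheses the backslash is unnecessary.
grouped = ("hello"
           "test")
```

Adjacent string literals are concatenated at compile time, which is why both `joined` and `grouped` end up as the same single string.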
Re: [Perl Golf] Round 1
On 05.02.2012 12:49, Alec Taylor wrote:
> Solve this problem using as few lines of code as possible[1].

Pardon me, but where's the problem? If your intention is to propose a challenge, say so, and state the associated problem clearly.

-- --- Heiko.
Re: [Perl Golf] Round 1
On 05.02.2012 23:15, Neal Becker wrote:
> Heiko Wundram wrote:
>> On 05.02.2012 12:49, Alec Taylor wrote:
>>> Solve this problem using as few lines of code as possible[1].
>>
>> Pardon me, but where's the problem? If your intention is to propose a challenge, say so, and state the associated problem clearly.
>
> But this really misses the point. Python is not about coming up with some clever, cryptic, one-liner to solve some problem. It's about clear code. If you want clever, cryptic, one-liner's stick with perl.

You're only allowed to bash him for one-liners as soon as he formulates something that in some way or another resembles a programming challenge, and not some incoherent listing of words without actual intent... ;-)

-- --- Heiko.
Re: Looking under Python's hood: Will we find a high performance or clunky engine?
On 22.01.2012 16:50, Rick Johnson wrote:
> What does Python do when presented with this code?
>
> py [line.strip('\n') for line in f.readlines()]
>
> If Python reads all the file lines first and THEN iterates AGAIN to do the strip; we are driving a Fred flintstone mobile. If however Python strips each line of the lines passed into readlines in one fell swoop, we made the correct choice. Which is it Pythonistas? Which is it?

You aren't one (considering how vocal you are in arguing for changes to the language)? So: shouldn't you be able to answer your own question?

-- --- Heiko.
Re: Hash stability
On 16.01.2012 09:44, Christian Heimes wrote:
> On 16.01.2012 09:18, Peter Otten wrote:
>> I've taken a quick look into the suds source; the good news is that you have to change a single method, reader.Reader.mangle(), to fix the problem with hash stability.
>
> However, I didn't see any code to deal with hash collisions at all. It smells like suds is vulnerable to cache poisoning.

That it is, yes, at least partially. Generally, this is only relevant in case you are actually caching DTDs (which is the default) and in case you are querying untrusted SOAP servers (in which case you most likely will not, and should not, use caching anyway), and in case the attacker has control over the URL namespace of a DTD-serving host (because the host part of the DTD URL is used in the cache filename, unhashed; only the actual path is hashed to form the cache index).

The easier way to poison the cache is most probably through actual traffic modification, as most DTD URLs are served through plain http and thus are subject to MitM modifications anyway.

--
--- Heiko.
Re: Hash stability
Am 15.01.2012 11:13, schrieb Stefan Behnel: That's a stupid design. Using a hash function that the application does not control to index into persistent storage just screams for getting the code broken at some point. I agree completely with that (I hit the corresponding problem with suds while transitioning from 32-bit Python to 64-bit Python, where hashes aren't stable either), but as stated in my mail: that wasn't the original question. ;-) -- --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
Re: Hash stability
Am 15.01.2012 13:22, schrieb Peter Otten: Heiko Wundram wrote: I agree completely with that (I hit the corresponding problem with suds while transitioning from 32-bit Python to 64-bit Python, where hashes aren't stable either), but as stated in my mail: that wasn't the original question. ;-) I'm curious: did you actually get false cache hits or just slower responses? It broke the application using suds, not due to false cache hits, but due to not getting a cache hit anymore at all. Long story: to interpret WSDL-files, suds has to get all related DTDs for the WSDL file, and Microsoft (as I wrote I was querying Exchange Web Services) insists on using http://www.w3.org/2001/xml.dtd for the XML spec path. This path is sometimes functional as a GET URL, but mostly not (due to overload of the W3-servers), so basically I worked around the problem by creating an appropriate cache entry with the appropriate name based on hash() using a local copy of xml.dtd I had around. This took place on a development machine (32-bit), and when migrating the application to a production machine (64-bit), the cache file wasn't used anymore (due to the hash not being stable). It's not that this came as a surprise (I quickly knew the workaround by simply rehashing on the target machine and moving the cache file appropriately), and I already said that this is mostly just a plain bad design decision on the part of the suds developers, but it's one of those cases where a non-stable hash() can break applications, and except if you know the internal workings of suds, this will seriously bite the developer. I don't know the prevalence of suds, but I guess there's more people than me using it to query SOAP-services - all of those will be affected if the hash() output is changed. Additionally, if hash() isn't stable between runs (the randomized hash() solution which is preferred, and would also be my preference), suds caching becomes completely useless. And for the results, see above. -- --- Heiko. 
Re: Hash stability
Am 15.01.2012 17:13, schrieb Chris Angelico: On Mon, Jan 16, 2012 at 3:07 AM, Heiko Wundrammodeln...@modelnine.org wrote: I don't know the prevalence of suds, but I guess there's more people than me using it to query SOAP-services - all of those will be affected if the hash() output is changed. Additionally, if hash() isn't stable between runs (the randomized hash() solution which is preferred, and would also be my preference), suds caching becomes completely useless. And for the results, see above. Or you could just monkey-patch it so that 'hash' points to an old hashing function. If the current hash() is kept in builtins as (say) hash_320() or hash_272() or something, then anyone who wants the old version of the hash can still get it. Or even easier: overwrite the default caching module (called FileCache) with something that implements sensible caching, for example by using the complete URL (with special characters replaced) of the DTD as a cache index, instead of hash()ing it. ;-) There's workarounds, I know - and I may be implementing one of them if the time comes. Again, my mail was only to point at the fact that there are (serious) projects out there relying on the stableness of hash(), and that these will get bitten when hash() is replaced. Which is not a bad thing if you ask me. ;-) -- --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
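A rough sketch of the "use the URL instead of hash()" idea from the post above. This is not suds code: both function names and the ".xml" suffix are hypothetical, and wiring such a function into a suds cache class is left out entirely. The point is only that the file name depends on nothing but the URL, so it is the same on 32-bit and 64-bit builds and across interpreter runs.

```python
import hashlib
from urllib.parse import quote


def cache_name_from_url(url):
    """Readable, stable file name: the URL with special characters escaped."""
    return quote(url, safe="") + ".xml"


def cache_name_from_digest(url):
    """Fixed-length, stable file name based on a SHA-256 digest of the URL."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest() + ".xml"


url = "http://www.w3.org/2001/xml.dtd"
print(cache_name_from_url(url))
print(cache_name_from_digest(url))
```

The escaped form keeps the cache directory readable by a human; the digest form is the safer choice when URLs can get longer than the file system allows for a file name.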
Re: Hash stability
Am 14.01.2012 10:46, schrieb Peter Otten: Steven D'Aprano wrote: How many people rely on hash(some_string) being stable across Python versions? Does anyone have code that will be broken if the string hashing algorithm changes? Nobody who understands the question ;) Erm, not exactly true. There are actually some packages out there (take suds [https://fedorahosted.org/suds/], for example) that rely on the hashing algorithm to be stable to function properly (suds uses hash() of strings to create caches of objects/XML Schemas on the filesystem). This, in a different context, bit me at the end of last week, when required to use suds to access EWS. I'd personally start debating the sensibility of this decision on the part of the suds developers, but... That's not the question. ;-) -- --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
Re: Avoid race condition with Popen.send_signal
Am 03.01.2012 02:19, schrieb Adam Skutt: On Jan 2, 6:09 pm, Jérômejer...@jolimont.fr wrote: What is the clean way to avoid this race condition ? The fundamental race condition cannot be removed nor avoided. Ideally, avoid the need to send the subprocess a signal in the first place. If it cannot be avoided, then trap the exception. Yes, it can be avoided, that's what the default SIGCHLD-handling (keeping the process as a zombie until it's explicitly collected by a wait*()) is for, which forces the PID not to be reused by the operating system until the parent has acknowledged (by actively calling wait*()) that the child has terminated. -- --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
Re: Avoid race condition with Popen.send_signal
Am 03.01.2012 14:40, schrieb Adam Skutt: On Jan 3, 7:31 am, Heiko Wundrammodeln...@modelnine.org wrote: Yes, it can be avoided, that's what the default SIGCHLD-handling (keeping the process as a zombie until it's explicitly collected by a wait*()) is for, which forces the PID not to be reused by the operating system until the parent has acknowledged (by actively calling wait*()) that the child has terminated. No, you still can see ESRCH when sending signals to a zombie process. Code that sends signals to child processes via kill(2) must be prepared for the call to fail at anytime since the process can die at anytime. It can't handle the signal, so it's treated as if it doesn't exist by kill(2) in this case. However, you don't have to worry about sending the signal to the wrong process. Getting an error on kill (which you can catch) is not about the race that the posters were speculating about (i.e., sending the signal to the wrong process), and that's what I was trying to put straight. The only advice that I wanted to give is: 1) before calling wait to collect the child, call kill as much as you like, and in case it errors, ignore that, 2) after calling wait, never, ever kill, and you don't need to, because you already know the process is gone. There's no race possibility in this, _except_ if you alter handling of SIGCHLD away from the default (i.e., to autocollect children), in which case you have the possibility of a race and shooting down unrelated processes (which the discussion was about). -- --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
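The two rules above can be sketched with the subprocess module as follows. This assumes a POSIX system and default SIGCHLD handling; the sleeping child is just a stand-in for a real worker process.

```python
import signal
import subprocess
import sys

# Start a child that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# 1) Before wait(): signalling is safe. The PID cannot be recycled while the
#    child is uncollected, so at worst the call fails, which we ignore.
try:
    child.send_signal(signal.SIGTERM)
except OSError:
    pass

# 2) Collect the child. After this point the PID may be reused, so do not
#    signal it again; the return code already says the process is gone.
returncode = child.wait()
print("child exited with", returncode)
```

On POSIX a negative return code means the child was terminated by that signal number.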
Re: socket.gethostbyaddr( os.environ['REMOTE_ADDR'] error
Am 02.01.2012 14:25, schrieb Νικόλαος Κούρας: On 23 Δεκ 2011, 19:14, Νικόλαος Κούραςnikos.kou...@gmail.com wrote: I dont know why this line host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0] fails sometimes and some other times works ok retrieving the hostnames correctly. Please i need some help. My webpage doesn't work due to this error... The error herror: (1, ...) says it all: the DNS-name (i.e., the something.in-addr.arpa name) you're trying to resolve is unknown. Not all hosts (or rather, IPs) on the internet have reverse lookups: try the IP 81.14.209.35 from which I'm posting, and dig/nslookup will tell you that it has no reverse resolution, which would result in gethostbyaddr() throwing an herror-instance. Basically: make the reverse lookup conditional by wrapping it in a try:/except herror: and assigning an appropriate default for host in case reverse lookup fails. -- --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
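A small sketch of the try/except suggested above. The helper name reverse_lookup and the choice of falling back to the bare IP are assumptions made for illustration; any default that suits the page will do.

```python
import socket


def reverse_lookup(ip, default=None):
    """Return the host name for ip, or a fallback if there is no reverse entry."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        # herror: no reverse record for this address.
        # gaierror: the address itself could not be resolved.
        return ip if default is None else default
```

In the CGI script the failing line would then become host = reverse_lookup(os.environ['REMOTE_ADDR']).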
Re: Misleading error message of the day
Am 08.12.2011 15:47, schrieb Robert Kern: Would including the respective numbers help your thought processes? ValueError: too many values to unpack (expected 2, got 3) Not possible in the general case (as the right-hand side might be an arbitrary iterable/iterator...). -- --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
Re: Misleading error message of the day
On 08.12.2011 16:42, Roy Smith wrote:
> The exception was raised when i() returned its third value, so saying "expected 2, got 3" is exactly correct. Yes, it is true that it might have gotten more if it kept going, but that's immaterial; the fact that it got to 3 is what caused the Holy Hand Grenade to be thrown.

Please explain how that error message (in case you're not aiming at the actual count of elements in the source) differs from the current wording "too many values", as you're simply displaying "expected n, got n+1" where n is visible from the immediate exception output...

--
--- Heiko.
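To make the "n+1" argument concrete, here is a sketch (Python 3 syntax, which is an assumption; the thread itself predates it) with an endless generator that records how many values the unpacking asks for before it gives up. The names numbers and pulled are made up.

```python
pulled = []


def numbers():
    """Endless iterator that records every value it hands out."""
    n = 0
    while True:
        n += 1
        pulled.append(n)
        yield n


try:
    a, b = numbers()
except ValueError as exc:
    message = str(exc)

print(message)
print("values pulled before giving up:", pulled)
```

The unpacking stops as soon as it sees one value too many, so for an arbitrary iterator the only count it could ever report is the expected number plus one.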
Re: SSE4a with ctypes in python? (gcc __builtin_popcount)
Am 31.10.2011 04:13, schrieb est: Is it possible to rewrite the above gcc code in python using ctypes (preferably Win/*nix compatible)? No; the (gcc-injected) functions starting with __builtin_* are not real functions in the sense that they can be called by calling into a library, but rather are converted to a series of assembler instructions by the compiler directly. Wrapping this (distance) primitive by writing a C-module for Python, thus exposing the respective gcc-generated assembler code to Python through a module, won't yield any relevant speedups either, because most of the time will be spent in the call sequence for calling the function, and not in the actual computation. -- --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
Re: Problem receiving UDP broadcast packets.
On 21.04.2011 03:35, Dan Stromberg wrote:
> I think tcpdump and tshark (was tethereal) will put the interface into promiscuous mode so it can see more traffic; on OSF/1 (Tru64), we had to do this manually for said programs to see all that was possible (barring the presence of a switch not repeating packets the way routers and hubs would).

It actually depends on the network adapter/card that's in use: many modern cards (especially those in the lower price segment, e.g. Realtek) don't (properly) implement MAC filtering at the hardware level, and in this case, there's no difference for the operating system between promiscuous mode and non-promiscuous mode (because the card will forward all packets that it sees coming in over the ethernet bus to the operating system, which will then discard those at the ethernet level it doesn't deem necessary to process at a higher level, for example because the destination MAC is unicast, but not the card's own, so the destination wasn't the system itself).

For pricier cards/chips, this filtering (which also includes restricting the multicast destinations that are forwarded to the operating system, think IPv6 multicast which uses quite a range of multicast MAC addresses for its neighbour discovery) is implemented at the hardware level, and the ethernet adapter throws away uninteresting packets and doesn't signal the operating system (think of the cost of interrupts you save; on high throughput links, this makes perfect sense). Putting the card into promiscuous mode basically disables this filtering, so that the card will again forward all packets to the operating system.

This is why tcpdump for example puts the network adapter into promiscuous mode, but normally (see above, depending on the network adapter), that's not required because the operating system sees all ethernet packets anyway.

--
--- Heiko.
Re: learnpython.org - an online interactive Python tutorial
On 21.04.2011 09:19, Chris Angelico wrote:
> On Thu, Apr 21, 2011 at 5:10 PM, Algis Kabaila akaba...@pcug.org.au wrote:
>> False: Python IS strongly typed, without doubt (though the variables are not explicitly declared.)
>
> Strongly duck-typed though. If I create a class that has all the right members, it can simultaneously be a file, an iterable, a database, and probably even a web browser if it feels like it. Is that strong typing or not?

Yes, that's strong typing, because your class only works in those contexts that you explicitly allow it to work in (through implementing an interface, be it an iterator, a file, etc.), independent of duck-typing (which is pretty much described by the term "interface-based typing" IMHO).

The difference between strong typing and weak typing is best described by:

    Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
    [GCC 4.3.4 20090804 (release) 1] on cygwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 1+'2'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for +: 'int' and 'str'

which means that the interface for implementing + on the input types int and str isn't implemented (i.e., TypeError). Weakly typed languages allow this to work:

    modelnine@gj-celle ~ $ php
    <?php echo 1+'2'; ?>
    3
    modelnine@gj-celle ~ $

through all kinds of type-casting magic, which isn't explicitly specified as interfaces on the objects (PHP also has integer and string objects) themselves.

--
--- Heiko.
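The same contrast as a runnable script rather than a console session (Python 3 syntax assumed). The variable names are only for illustration: the mixed-type addition is refused, and the programmer has to say which conversion is meant.

```python
try:
    result = 1 + "2"
except TypeError as exc:
    result = None
    print("refused:", exc)

# The conversion has to be spelled out explicitly, in one direction or the other.
as_number = 1 + int("2")
as_text = str(1) + "2"
print(as_number, as_text)
```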
Re: is there a difference between one line and many lines
Am 21.04.2011 11:55, schrieb vino19: I am asking about what happens in Python interpreter? Why is there a difference between running one line like a=1;b=1 and two lines like a=1 \n b=1? Does it decide to locate memory in different types depend on a code? There is no difference between the two. You've not given the initializers for a/b in the two statement groups you showed, so that what Chris Angelico said is probably what's happening here (i.e.: in the first case, you stay in the singleton range, in the second case which builds on the first, you don't). -- --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
Re: is there a difference between one line and many lines
Am 21.04.2011 11:59, schrieb Heiko Wundram: Am 21.04.2011 11:55, schrieb vino19: I am asking about what happens in Python interpreter? Why is there a difference between running one line like a=1;b=1 and two lines like a=1 \n b=1? Does it decide to locate memory in different types depend on a code? There is no difference between the two. ... Erm, sorry, forget my post. I misread a=-6 as a-=6, etc... So: what Chris said. Anyway, there is semantically no difference between the two, and that stands. -- --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
Re: Problem receiving UDP broadcast packets.
On 20.04.2011 01:54, Grant Edwards wrote:
> I guess the problem is that I expected to receive a packet on an interface anytime a packet was received with a destination IP address that matched that of the interface. Apparently there's some filtering in the network stack based on the _source_ address as well (that seems very counter-intuitive to me).

Just to pitch in here (because nobody's mentioned it yet AFAICT): yes, there's a filtering done (at least under Linux, and I'd guess something similar on xBSD too) to packets based on the source address coming in on an interface, and it's called the reverse path filter and is on by default (the tunable on Linux is /proc/sys/net/ipv4/conf/*/rp_filter).

The idea behind the reverse path filter is that your machine won't accept packets coming in over an interface when a return packet (i.e., the presumed response) won't be routed over the same interface, and from what I gather, this is what makes the TCP/IP stack drop the packets because your machine will not route packets to 192.168.x.x over the same interface it sees the packet coming in. This is a _security_ feature, because it makes address spoofing harder.

If you need to see the packets regardless, either use a promiscuous mode sniffer (i.e., tcpdump, but that's relatively easy to mirror in Python using SOCK_RAW, capturing packets at the ethernet level), or add a route on your system for the 192.168.x.x network on the same interface.

HTH!

--
--- Heiko.
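To check what the tunable mentioned above is currently set to, something like the following sketch should do on Linux. The function name read_rp_filter is made up; the meaning of the values (0 = no source validation, 1 = strict, 2 = loose) is taken from the kernel's ip-sysctl documentation. On systems without /proc the function simply returns an empty dict.

```python
import glob
import os


def read_rp_filter(pattern="/proc/sys/net/ipv4/conf/*/rp_filter"):
    """Map interface name -> rp_filter value (0 = off, 1 = strict, 2 = loose)."""
    settings = {}
    for path in glob.glob(pattern):
        interface = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as handle:
                settings[interface] = int(handle.read().strip())
        except (OSError, ValueError):
            continue  # unreadable or unexpected content: skip this entry
    return settings


for name, value in sorted(read_rp_filter().items()):
    print(name, value)
```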
Re: Problem receiving UDP broadcast packets.
On 20.04.2011 16:30, Grant Edwards wrote:
>> If you need to see the packets regardless, either use a promiscuous mode sniffer (i.e., tcpdump, but that's relatively easy to mirror in Python using SOCK_RAW, capturing packets at the ethernet level), or add a route on your system for the 192.168.x.x network on the same interface.
>
> I've thought about the SOCK_RAW option, but the CPU load of looking at all received Ethernet packets in user-space would be a big down-side.

Not necessarily: instead of using UDP datagrams to send the data, use ethernet datagrams (without any IP/UDP header) with your own ethernet type (there is a range of local types that you can use for your own local use case), and then simply create a RAW socket that only listens on packets that have the specified ethernet type. We use something similar at work for a high-availability application.

The server side looks something like:

    from socket import socket, AF_PACKET, SOCK_DGRAM, htons

    PKT_TYPE = 0x1234  # My very own ethertype.

    # Linux only, and needs sufficient privileges to open a packet socket.
    sock = socket(AF_PACKET, SOCK_DGRAM, htons(PKT_TYPE))
    sock.bind(("ethxyz", PKT_TYPE))  # "ethxyz": name of the interface to listen on
    while True:
        # The address is (ifname, proto, pkttype, hatype, hardware address).
        data, (_, _, _, _, addr) = sock.recvfrom(1500)
        print("I got: %r from etheraddr: %r" % (data, addr))

The client side looks similar.

Because you're using UDP broadcast, you have unreliable transport anyway, and if the client side supports sending ethernet datagrams (with a broadcast address), I'd rather advise to use that for your use case. This makes you independent of IP configuration (and as I can see, you're actually not interested in the routing that IP gives you, but rather interested in contacting all nodes on a local ethernet; why not use ethernet directly?).

--
--- Heiko.
Re: Copy-on-write when forking a python process
On 08.04.2011 18:14, John Connor wrote:
> Has anyone else looked into the COW problem? Are there workarounds and/or other plans to fix it? Does the solution I am proposing sound reasonable, or does it seem like overkill? Does anyone foresee any problems with it?

Why'd you need a fix like this for something that isn't broken? COW doesn't just refer to the object reference count, but to the object itself, too. _All_ memory of the parent (and, as such, all objects, too) becomes unrelated to memory in the child once the fork is complete.

The initial object reference-count state of the child is guaranteed to be sound for all objects (because the parent's final reference-count state was, before the process image got cloned [remember, COW is just an optimization for a complete clone, and it's up to the operating system to make sure that you don't notice different semantics from a complete copy]), and what you're proposing (opting in/out of reference counting) breaks that.

--
--- Heiko.
Re: Copy-on-write when forking a python process
On 08.04.2011 20:34, jac wrote:
> I disagree with your statement that COW is an optimization for a complete clone, it is an optimization that works at the memory page level, not at the memory image level. In other words, if I write to a copy-on-write page, only that page is copied into my process' address space, not the entire parent image. To the best of my knowledge by preventing the child process from altering an object's reference count you can prevent the object from being copied (assuming the object is not altered explicitly of course.)

As I said before: COW for sharing a process's forked memory is simply an implementation detail, and an _optimization_ (and of course a sensible one at that) for fork; there is no provision in the semantics of fork that an operating system should use COW memory pages for implementing the copying (and early UNIXes didn't do that; they explicitly copied the complete process image for the child).

The only semantic that is specified for fork is that the parent and the child have independent process images, that are equivalent copies (except for some details) immediately after the fork call has returned successfully (see SUSv4).

What you're thinking of (and what's generally useful in the context you're describing) is shared memory; Python supports putting objects into shared memory using e.g. POSH (which is an extension that allows you to place Python objects in shared memory, using the SysV IPC feature set that most UNIXes implement today).

--
--- Heiko.
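The "independent process images" semantic can be observed directly. The following sketch assumes a POSIX system with os.fork(); the list and the use of the exit status as a reporting channel are just for illustration. Whether the kernel copies pages lazily or eagerly makes no visible difference to either process.

```python
import os

data = [1, 2, 3]

pid = os.fork()
if pid == 0:
    # Child: works on its own copy of the process image.
    data.append(4)
    os._exit(len(data))  # report the child's view through the exit status

# Parent: wait for the child, then look at its own, untouched copy.
_, status = os.waitpid(pid, 0)
child_len = os.WEXITSTATUS(status)
print("child saw", child_len, "items; parent still has", len(data))
```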
Re: Find class of an instance?
On Wednesday, 06.08.2008, 08:44 -0400, Neal Becker wrote:
> Sounds simple, but how, given an instance, do I find the class?

inst.__class__

For example:

    Python 2.5.2 (r252:60911, Aug 5 2008, 03:26:50)
    [GCC 4.3.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = "hello"
    >>> x.__class__
    <type 'str'>

--- Heiko.
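The same lookup for a user-defined class, as a short sketch. The class name Spam is made up; for new-style classes (and for every class in Python 3) type(inst) gives the same object as inst.__class__.

```python
class Spam(object):
    pass


inst = Spam()

print(inst.__class__)           # the class object itself
print(type(inst) is Spam)       # type() gives the same answer here
print(inst.__class__.__name__)  # just the name, as a string
```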
Re: proposal, change self. to .
On 03.08.2008, 12:51, Equand [EMAIL PROTECTED] wrote:
> how about changing the precious self. to . imagine self.update() .update() simple right?

What about:

    class x:
        def x(self, ob):
            ob.doSomethingWith(self)

? Not so simple anymore, is it?

If you're not trolling, there are hundreds of reasons why the explicit self is as it is, and it's not going to go away, just as a thread that produced immense amounts of response demonstrated around a week ago. Read that, and rethink.

--- Heiko.
Re: Help me
Am 02.08.2008, 18:02 Uhr, schrieb [EMAIL PROTECTED]: snip I'll help you by giving some good advice: homework is meant to be homework, so you should get started reading and processing the assignment. If you have any specific questions besides text comprehension, come back to ask. --- Heiko. -- http://mail.python.org/mailman/listinfo/python-list
Re: How smart is the Python interpreter?
On Thursday, 31 July 2008 13:09:57, ssecorp wrote:
> def str_sort(string):
>     s = ""
>     for a in sorted(string):
>         s += a
>     return s
>
> if i instead do:
>
> def str_sort(string):
>     s = ""
>     so = sorted(string)
>     for a in so:
>         s += a
>     return s
>
> will that be faster or the interpreter can figure out that it only has to do sorted(string) once? or that kind of cleverness is usually reserved for compilers and not interpreters?

In a statement of the form

    for name in iterable:

the expression iterable will only be evaluated once (to retrieve an iterator), so basically, both ways of stating it are equivalent and make negligible difference in runtime (if anything, the second version will be marginally slower, because you have additional code to assign/fetch a local).

Anyway, if you care about speed, probably:

    def str_sort(string):
        return "".join(sorted(string))

will be the fastest way of stating this.

--
Heiko Wundram
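That the iterable expression of a for loop is evaluated only once can be checked with a wrapper that counts its own calls. The sketch below uses made-up helper names (tracked_sorted, str_sort_loop, str_sort_join) and is not a benchmark; it only shows the evaluation count and that the join variant returns the same string.

```python
calls = []


def tracked_sorted(text):
    """sorted() with a side effect, so each evaluation can be counted."""
    calls.append(text)
    return sorted(text)


def str_sort_loop(text):
    s = ""
    for ch in tracked_sorted(text):  # evaluated once, before the first iteration
        s += ch
    return s


def str_sort_join(text):
    return "".join(sorted(text))


print(str_sort_loop("python"), len(calls), str_sort_join("python"))
```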
Re: problem when reading file
Am Donnerstag, 31. Juli 2008 15:44:33 schrieb shrimpy: hi every one, i am new to python, and coz i want to write a handy command for my linux machine, to find a word in all the files which are under the current folder. What about grep -R myword . ? Even works on regular expression (with e/fgrep). Type grep --help to see all the options you get (context display, ignoring anything that's not a proper file or directory, only printing filenames with matches, not the matches themselves, etc.). -- Heiko Wundram -- http://mail.python.org/mailman/listinfo/python-list
Re: Boolean tests [was Re: Attack a sacred Python Cow]
On Wednesday, 30 July 2008 08:30:48, Russ P. wrote:
> On Jul 29, 11:09 pm, Erik Max Francis [EMAIL PROTECTED] wrote:
>> I'm getting this sneaking suspicion that you guys are all putting us on.
>
> As I said in an earlier post, I realize that this would only work if there were only one copy of empty (as there is only one copy of None). I don't know off hand if that is feasible or not. Your reply reeks of the kind of pedantic snobbishness that makes me sick.

I can understand (and pretty much sympathise) that you get this kind of reply, simply because the point you and Carl Banks (formulated somewhat differently) put up has been answered again and again (in this thread), and I can only repeat it once more: __nonzero__(), i.e. the cast to boolean, is THE WAY to test whether a container is empty or not. Like this design decision, or don't like it, but the discussion is not going to go anywhere unless you concede that there is a (very explicit!) way to test for non-emptiness of a container already, and you're currently simply discussing adding/using syntactic sugar (different means of expressing the test) to suit your own personal taste better.

Anyway, check the documentation for __nonzero__(): if the object doesn't implement that, but implements __len__(), the interpreter replaces the __nonzero__() test by __len__() > 0, so I guess someone in the design department must have found it logical, at some point in time, for the truth value of a container to express the test len(x) > 0.

There cannot be an argument about missing/misplaced functionality (that's what you make it sound like) if the functionality for doing what you want to do is there and you simply don't like the syntax. I can somewhat relate to that, because style is a personal thing, even though I don't see either of the points made by you or Carl Banks: implicit casting to bool is so common in pretty much every programming language to test for truth of an object, and IMHO it's completely logical to extend that idea to containers to mean empty/non-empty.

Erik Max Francis tried to explain why your syntactic enhancement would come at a much greater price than it's worth, and he's absolutely right in that, as it's an abuse of the is operator, but again, that's a somewhat different point. It changes nothing about the fact that all this discussion centers around something that is a non-point, but simply a matter of personal taste.

--
Heiko Wundram
Re: Boolean tests [was Re: Attack a sacred Python Cow]
Am Mittwoch, 30. Juli 2008 09:18:48 schrieb Russ P.: Oh, Lordy. I understand perfectly well how boolean tests, __len__, and __nonzero__ work in Python. It's very basic stuff. You can quit patronizing me (and Carl too, I'm sure). I'll stop repeating what the current state is (which might sound like I'm patronizing, but that's not the real intent actually, I'm just trying to get the discussion straight with what is fact, namely that there already exists an explicit way which doesn't seem to be recognized by some people here) if you agree to my point that we're not talking about a problem with Python, but about a personal stylistic issue with the language design. That's just what I said in my last mail: I can concede that you, personally, have an issue with the current state, but for me and seemingly also for a majority of people who have posted in this thread, that's a non-issue. The point that you seem to be missing, or refuse to acknowledge for some reason, is that if x can be mistakenly applied to any object when the programmer thinks that x is a list -- and the programmer will receive no feedback on the error. I have made errors like that, and I could have saved some time had I used an empty method that only applies to a list or other sequence. For me, I've never had this kind of problem, simply because if I test for the truth value of something, I'm going to do something with it later on, and as soon as I'm doing something with it, I'll see whether the object supports the interface I want it to support (and get an error there if the object doesn't support the basic notions of a sequence type, for example, i.e. __iter__()). Testing for truth is IMHO not doing something with an object, and as such I'm more than happy that it's foolproof. This is one thing that I personally find attractive about Python's way of duck-typing. But, if you personally have been bitten by this, give an example, and I'm sure that we can start discussing from there. 
I've already given an example why the explicit test for length is less polymorphic than the explicit test for truth of a container elsewhere as a proof for the point I'm trying to make. -- Heiko Wundram -- http://mail.python.org/mailman/listinfo/python-list
Re: Proxy server?
On Wednesday, 30 July 2008 13:48:08, Gary wrote:
> Diez B. Roggisch [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Gary wrote: Diez B. Roggisch [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] You can't make any TCP/IP communication run through a proxy, unless it's transparent.
>
> Thanks for all the info.

This is not entirely true. There are libc plugins (i.e. LD_PRELOAD hacks) which use SOCKS (which is a generic proxying protocol for [TCP/]IP) to redirect all locally originating TCP/IP traffic _which is managed through the socket interface of the libc_ in the application that you applied the LD_PRELOAD hack to through a specified SOCKS proxy (this should capture pretty much everything, except for communication originating in the *nix kernel itself). I seem to recall that something similar exists for WinSock, but I wouldn't know for sure. Check the web for documentation on setting up a SOCKS proxy, and for the respective libc plugins or WinSock SOCKS hack.

If you cannot make the user use SOCKS through a means like this (in which case there has to be no application support) or by instructing a specific application to use a SOCKS proxy directly (which all browsers can out of the box AFAIK), and you don't have the possibility to put yourself somewhere in the middle by means of a transparent proxy (i.e., a firewall appliance which does this; I seem to recall that there was some FreeBSD-based software which basically did just this kind of transparent proxying for a network), you're out of luck, just like Diez said.

--
Heiko Wundram
Re: Boolean tests [was Re: Attack a sacred Python Cow]
On Tuesday, 29 July 2008 10:37:45, Carl Banks wrote:
> You keep bringing up this notion of "more complex with no benefit", which I'm simply not interested in talking about at this time, and I won't respond to any of your points. I am seeking the answer to one question: whether "if x" can usefully do something a simple explicit test can't. Everyone already knows that "if x" requires fewer keystrokes and parses to fewer nodes.

Yes, there are quite a lot of use cases. Think of a polymorphic function, where the input can be any object that implements the iterator protocol (concerning base types, I'm thinking of strings, tuples, lists, dicts and sets here, which are all iterable and yield a chain of singular values) and you want to check whether the iterable object is empty or not for special-casing that.

"if x" uses the special interface method __nonzero__() if that's implemented (which all of the above types implement as returning True iff the container yields at least one value when iterated over, i.e., it isn't empty), and falls back to a test for __len__() != 0; otherwise x is considered to be true.

Now, explicitly comparing x against the five empty values of the container types I specified above would be broken design in such a function: when I implement a container class myself, which implements the __iter__() and __nonzero__() methods, I can directly use it with the polymorphic function I wrote, and the special case for an empty container will work out of the box. In the case of explicit comparisons, I have to modify the polymorphic function to accept my container type in addition to those it already processes to be able to special-case the empty container for my type.

I can't dig up a simple example from code I wrote quickly, but because explicit comparisons always hamper polymorphism (which might not be needed initially, but you never know what comes up later, thinking of reusability of components), I personally always stick to the idiom "if x" rather than comparing it to an empty value, even when I'm sure that the type of x is a singular type.

Additionally, IMHO "if x" is so much more readable than "if x != something".

Just my 2 (euro)cents.

-- Heiko Wundram
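The argument in the message above can be sketched as follows. This is only an illustration under stated assumptions: it is written for Python 3, where the hook called __nonzero__() in the thread is spelled __bool__(), and the Bag class and both function names are made up for this example.

```python
class Bag:
    """Hypothetical user-defined container (not from the original post)."""

    def __init__(self, *items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)

    def __bool__(self):              # spelled __nonzero__ in Python 2
        return bool(self._items)


def describe(container):
    """Special-case the empty container using only its truth value."""
    if not container:
        return "empty"
    return ", ".join(str(item) for item in container)


def describe_explicit(container):
    """Explicit comparisons: only recognises the empty values listed here."""
    if container in ("", (), [], {}, set()):
        return "empty"
    return ", ".join(str(item) for item in container)
```

With the truth-value test, Bag() is treated as empty without touching describe(); the explicit variant does not recognise it and would have to be extended for every new container type.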
Re: Boolean tests [was Re: Attack a sacred Python Cow]
On Tuesday, 29 July 2008 11:15:05, Heiko Wundram wrote:
> I can't dig up a simple example from code I wrote quickly...

Just to get back to that: an example I found where "if x" (the generic __nonzero__() test) will work to test for emptiness/non-emptiness of a container, whereas "if len(x) > 0" (the specific test for this example) will not, is my own integer set type I wrote a while back (which you can find on ASPN).

The corresponding set type allows you to create infinitely sized sets of integers (which of course are stored as pairs of start,stop-values, so the storage itself for the set is bounded), for which len(x) does not have a proper meaning anymore, and __len__() is limited to returning a (platform dependent) ssize_t anyway IIRC, so even with a bounded set, the length of the set might not necessarily be accessible using len(x); that's why the set type additionally got a member function called .len() to work around this restriction.

I should think this is a non-contrived example where the generic test whether the object considers itself True/False (which for containers means non-empty/empty) is preferable over the special-case test whether the length is positive. A polymorphic function which, for example, only accesses the first ten members of the container is able to work with an infinite set if it uses the generic test, but is not if it uses len(x) > 0.

-- Heiko Wundram
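A minimal sketch of the "first ten members" scenario from the message above. It is not the ASPN recipe: IntegersFrom is a hypothetical stand-in for an infinite integer set, the function names are invented, and the code assumes Python 3 (__bool__() instead of __nonzero__()).

```python
import itertools


class IntegersFrom:
    """Hypothetical infinite set {start, start + 1, ...}; not the ASPN recipe."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return itertools.count(self.start)

    def __bool__(self):              # __nonzero__ in Python 2
        return True                  # an infinite set is never empty

    def __len__(self):
        raise OverflowError("set is infinite")


def first_ten(container):
    """Generic test: works for any container with a truth value."""
    if not container:
        return []
    return list(itertools.islice(container, 10))


def first_ten_len(container):
    """Specific test: needs a representable length."""
    if len(container) > 0:           # raises OverflowError for IntegersFrom
        return list(itertools.islice(container, 10))
    return []
```

first_ten(IntegersFrom(5)) returns the ten integers starting at 5, whereas first_ten_len(IntegersFrom(5)) fails before it ever looks at an element.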
Re: SWIG and char* newb questions :)
On Tuesday, 29 July 2008 12:51:36, code_berzerker wrote:
> Ok now more seriously. I have a question referring to char* used as function parameters to return values. I have read the SWIG manual to find the best way to overcome that, but there are many warnings about memory leaks and stuff, so I feel confused. Ok to put it more simply: how to safely define a variable in Python and have it modified by a C/C++ function?

At least for strings, this won't work. Python strings are immutable (and Python optimizes some things based on this knowledge), and as such you can pass a Python string(-object) into a C/C++ function and retrieve its value there as a const (!) char* using the PyString_*-API (I have no idea how this is encapsulated in SWIG), but cannot/should not modify it (a const_cast is almost always a sign of bad programming, anyway).

The only real choice you have is to have your wrapper return a new string object, created using one of the PyString_FromString[AndSize] functions. Check the Python C-API documentation for more info on this.

Anyway, on a different note, I personally have always found it simpler to not use SWIG to generate C extensions for Python, but to use the Python C-API directly.

Hope this helps!

-- Heiko Wundram
Re: Boolean tests [was Re: Attack a sacred Python Cow]
On 29.07.2008 at 18:30, Carl Banks [EMAIL PROTECTED] wrote:
> On Jul 29, 5:15 am, Heiko Wundram [EMAIL PROTECTED] wrote:
>> I can't dig up a simple example from code I wrote quickly, but because of the fact that explicit comparisons always hamper polymorphism
> I'm not going to take your word for it. Do you have code that demonstrates how "if x" improves polymorphism relative to simple explicit tests?

As I wrote in the second reply email I sent, check out my integer set recipe on ASPN (and to save you the search: http://code.activestate.com/recipes/466286/).

To test whether the integer set is empty or not (in a polymorphic function which accepts any kind of sequence type), the explicit test would be, as you proposed elsewhere: len(x) > 0. This simply WILL NOT work with some sets of that respective type, because, as I documented for the __len__() method there: the return value of __len__() has to (had to?) be in the range 0 <= len < 2**31 (which I think means, as I tested and implemented it on i386, that the return value has to fit in an ssize_t platform type, but someone with more knowledge of the interpreter internals might be able to comment here; I'm not in the mood for checking this out now).

Another reason why the test for __nonzero__() is beneficial, at least here: testing whether the set is empty or not is easy, because an empty set has no ranges, and a set with at least one element has at least one range (i.e., to test whether the set is non-empty, check whether the _ranges member, a list, is __nonzero__()); taking the len() of a set always means adding the sizes of the ranges together (even though this could of course be precomputed/cached, as the set type is immutable, but I'm not doing that in that recipe's code).

So, adding things up: the interpretation of __nonzero__(), i.e. the direct conversion to bool, for container types, implements THE means to test whether the container is empty or not. Insisting on not using it, because a simple explicit test is supposedly better, will prove to not work in those cases where the container type might not have a representable length (because of the constraints on the return value of __len__()), even though the container has an empty/non-empty state.

I think this does make a very compelling use case for "if x" instead of "if len(x) > 0".

--- Heiko.
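The constraint on the return value of __len__() mentioned above can be observed directly. A small sketch, assuming a current CPython 3 (where the limit is sys.maxsize, i.e. the largest ssize_t, rather than the 2**31 bound quoted for i386); the Huge class and length_or_none() helper are invented for the example.

```python
import sys


class Huge:
    """Hypothetical container claiming more elements than len() can report."""

    def __len__(self):
        return sys.maxsize + 1       # one past the largest ssize_t value

    def __bool__(self):              # __nonzero__ in Python 2
        return True


def length_or_none(obj):
    """Return len(obj), or None if the interpreter refuses the value."""
    try:
        return len(obj)
    except OverflowError:
        return None
```

The object still has a perfectly well-defined empty/non-empty state via bool(), even though len() cannot represent its size.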
Re: Boolean tests [was Re: Attack a sacred Python Cow]
Also, just a couple of points:

On 29.07.2008 at 22:27, Carl Banks [EMAIL PROTECTED] wrote:
> 1. Any container type that returns a length that isn't exactly the number of elements in it is broken.

I agree, but how do you ever expect to return an infinite element count? The direction I took in that recipe was not returning some magic value but raising an OverflowError (for example, you could've also cropped the length at 2**31-1 as meaning anything equal to or larger). This is the thing that breaks your explicit test for non-emptiness using len(x) > 0, but it's also the only possible thing to do if you want to return the number of elements exactly where possible and inform the user when not (and OverflowError should make the point clear).

Anyway, that's why there is a separate member function which is explicitly documented to return a magic value in case of an infinite set (i.e., -1) and an exact element count otherwise, but using that (you could write x.len() != 0 for the type in question to test for non-emptiness) breaks polymorphism.

> 2. The need for __nonzero__ in this case depends on a limitation in the language.

True, but only for sets which are finite. For an infinite set, as I said above: what would you want __len__() to return? There is no proper interpretation of __len__() for an infinite set, even though the set is non-empty, except if you introduced the magic value infinity into Python (which I did as -1 for my personal length protocol).

> 3. On the other hand, I will concede that sometimes calculating len is a lot more expensive than determining emptiness, and at a basic level it's important to avoid these costs. You have found a practical use case for __nonzero__.

This is just a somewhat additional point I was trying to make; the main argument is the two points you see above.

> However, I'd like to point out the contrasting example of numpy arrays. For numpy arrays, "if x" fails (it raises an exception) but "if len(x)!=0" succeeds. The only sane advice for dealing with nonconformant classes like numpy arrays or your integer set is to be wary of nonconformances and don't expect polymorphism to work all the time.

The thing is: my integer set type IS conformant to the protocols of all other sequence types that Python offers directly, and as such can be used in any polymorphic function that expects a sequence type and doesn't test for the length (because of the obvious limitation that the length might not have a bound), but only for emptiness/non-emptiness. It's the numpy array that's non-conformant (at least from what you're saying here; I haven't used numpy yet, so I can't comment).

> So I guess I'll concede that in the occasional cases with nonconformant classes the "if x" might help increase polymorphism a little. (BTW: here's another little thing to think about: the "if x" is useful here only because there isn't an explicit way to test emptiness without len.)

The thing being, again, as others have already stated: __nonzero__() IS the explicit way to test non-emptiness of a container (type)! If I wanted to make things more verbose, I'd not use "if len(x) > 0", but "if bool(x)" anyway, because casting to a boolean calls __nonzero__(). "if len(x) > 0" solves a different problem (even though in set theory the two are logically similar), and might not apply to all container types because of the restrictions on the return value of __len__(), which will always exist.

--- Heiko.
Re: Some notes on a high-performance Python application.
On Wednesday, 26 March 2008 17:33:43, John Nagle wrote:
> ... Using MySQL as a queueing engine across multiple servers is unusual, but it works well. It has the nice feature that the queue ordering can be anything you can write in a SELECT statement. So we put fair queueing in the rating scheduler; multiple requests from the same IP address compete with each other, not with those from other IP addresses. So no one site can use up all the rating capacity. ... Does anyone else architect their systems like this?

A Xen(tm) management system I've written at least shares this aspect, in that the RPC subsystem for communication between the frontend and the backends is basically a (MySQL) database table which is regularly queried by all backends that work on VHosts to change the state (in the form of a command) according to what the user specifies in the (Web-)UI.

FWIW, the system is based on SQLObject and CherryPy, doing most of the parallel tasks threaded from a main process (because the largest part of the backends is dealing with I/O from subprocesses [waiting for them to complete]), which is different from what you do. CherryPy is also deployed with the threading server.

-- Heiko Wundram
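The "database table as RPC channel" idea described above can be sketched roughly like this. Everything here is an assumption made for illustration: sqlite3 from the standard library stands in for MySQL, and the table layout, column names and function names are invented rather than taken from either system in the thread.

```python
import sqlite3


def make_queue():
    """Create an in-memory command table (sqlite3 stands in for MySQL)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE commands ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " backend TEXT NOT NULL,"
        " command TEXT NOT NULL,"
        " done INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def enqueue(db, backend, command):
    """Frontend side: record a command addressed to one backend."""
    db.execute(
        "INSERT INTO commands (backend, command) VALUES (?, ?)",
        (backend, command),
    )
    db.commit()


def poll(db, backend):
    """Backend side: fetch and mark the oldest pending command, if any."""
    row = db.execute(
        "SELECT id, command FROM commands"
        " WHERE backend = ? AND done = 0 ORDER BY id LIMIT 1",
        (backend,),
    ).fetchone()
    if row is None:
        return None
    db.execute("UPDATE commands SET done = 1 WHERE id = ?", (row[0],))
    db.commit()
    return row[1]
```

Each backend would call poll() in a loop with some sleep interval. The ORDER BY clause is where an arbitrary queue ordering could be expressed; a real multi-server setup would additionally need proper transactions or row locking.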
Re: Some notes on a high-performance Python application.
On Wednesday, 26 March 2008 18:54:29, Michael Ströder wrote:
> Heiko Wundram wrote:
>> On Wednesday, 26 March 2008 17:33:43, John Nagle wrote:
>>> ... Using MySQL as a queueing engine across multiple servers is unusual, but it works well. It has the nice feature that the queue ordering can be anything you can write in a SELECT statement. So we put fair queueing in the rating scheduler; multiple requests from the same IP address compete with each other, not with those from other IP addresses. So no one site can use up all the rating capacity. ... Does anyone else architect their systems like this?
>> A Xen(tm) management system I've written at least shares this aspect in that the RPC subsystem for communication between the frontend and the backends is basically a (MySQL) database table which is regularly queried by all backends that work on VHosts to change the state (in the form of a command) according to what the user specifies in the (Web-)UI.
> I vaguely remember that this database approach was taught at my former university as a basic mechanism for distributed systems at least since 1992, but I'd guess much longer...

I didn't say it was unusual or frowned upon (and I was also taught this at uni IIRC as a means to easily distribute systems which don't have specific requirements for response time to RPC requests), but anyway, as you noted for Biztalk, it's much easier to hit bottlenecks with a polling-style RPC than with a true RPC system, as I've come to experience when the number of nodes (i.e., backends) grew over the last year and a half.

That's basically what is causing a reconsideration of moving from DB-style RPC to socket-based RPC, which is going to happen at some point in time for the system noted above (but I've since changed jobs and am now only a consulting developer for that anyway, so it won't be my job to do the dirty migration and the redesign ;-)).

-- Heiko Wundram
Re: what does ^ do in python
On Wednesday, 26 March 2008 19:04:44, David Anderson wrote:
> How can we express pointers as in C or python?

There's no such thing as a pointer in Python, so you can't express them either. Was this what you were trying to ask?

-- Heiko Wundram
Re: what does ^ do in python
On Tuesday, 25 March 2008 23:02:00, Dark Wind wrote:
> In most of the languages ^ is used for 'to the power of'. In python we have ** for that. But what does ^ do?

^ is the bitwise exclusive-or (xor) operator. Possibly it helps to see the following (numbers are in binary) to get the drift:

    00 ^ 01 = 01
    01 ^ 01 = 00
    10 ^ 01 = 11
    11 ^ 01 = 10

-- Heiko Wundram
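The table in the reply above can be reproduced with a few lines of Python. This is just a sketch; the helper names bits() and xor_table() are made up for the example.

```python
def bits(n, width=2):
    """Format n as a zero-padded binary string of the given width."""
    return format(n, "0{}b".format(width))


def xor_table(mask=0b01, width=2):
    """Return one 'a ^ mask = result' line per value of the given bit width."""
    return [
        "{} ^ {} = {}".format(bits(a, width), bits(mask, width), bits(a ^ mask, width))
        for a in range(2 ** width)
    ]


for line in xor_table():
    print(line)
```

Note the contrast with exponentiation: 2 ** 3 is 8, while 2 ^ 3 is 1 (binary 10 xor 11 gives 01).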
Re: getattr for modules not classes
On Wednesday, 24 May 2006 15:43, Piet van Oostrum wrote:
> Heiko Wundram [EMAIL PROTECTED] (HW) wrote:
>> y.py
>> ---
>> from x import test
>> print test.one
>> print test.two
>> print test.three
>> ---
> Or even:
>     import x
>     x = x.test
>     print x.one
>     print x.two
>     print x.three

Or even:

---
from x import test as x
print x.one
print x.two
print x.three
---

--- Heiko.
Re: NEWB: how to convert a string to dict (dictionary)
On Wednesday, 24 May 2006 07:52, manstey wrote:
> Hi, How do I convert a string like:
>     a = "{'syllable': u'cv-i b.v^ y^-f', 'ketiv-qere': 'n', 'wordWTS': u'8'}"
> into a dictionary:
>     b = {'syllable': u'cv-i b.v^ y^-f', 'ketiv-qere': 'n', 'wordWTS': u'8'}

    b = eval(a)

(if a contains a dict-repr)

--- Heiko.
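eval() executes arbitrary expressions, which matters if the string comes from an untrusted source. As a hedged alternative that is not part of the original reply, later Python versions (2.6 and up) offer ast.literal_eval(), which only accepts literals. The wrapper name parse_dict_repr() below is invented for the example.

```python
import ast


def parse_dict_repr(text):
    """Parse the repr of a dict without evaluating arbitrary code."""
    value = ast.literal_eval(text)   # accepts literals only, no calls or names
    if not isinstance(value, dict):
        raise ValueError("not a dict literal: %r" % (text,))
    return value


a = "{'syllable': u'cv-i b.v^ y^-f', 'ketiv-qere': 'n', 'wordWTS': u'8'}"
b = parse_dict_repr(a)
```

A string such as "__import__('os').getcwd()" is rejected with a ValueError instead of being executed.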
Re: PEP-xxx: Unification of for statement and list-comp syntax
On Wednesday, 24 May 2006 06:12, Tim Roberts wrote:
> At one time, it was said that the % operator was the fastest way to concatenate strings, because it was implemented in C, whereas the + operator was interpreted. However, as I recall, the difference was hardly measurable, and may not even exist any longer.

The difference doesn't exist anymore for CPython (if you join a lot of strings), but for Jython (and several other dialects), the fastest way to join strings is still .join(), because there are no optimizations on a += b and the likes of it (replacement for the % operator, e.g.).

--- Heiko.
Re: PEP-xxx: Unification of for statement and list-comp syntax
On Monday, 22 May 2006 11:27, Boris Borcic wrote:
> Mhhh, your unsugared form reminds me of dark hours with primitive BASICs in my youth - the kind Dijkstra commented on. Why don't you write
>     for node in tree:
>         if node.haschildren():
>             do something with node

As I've replied on python-dev, indentation is not always a good thing, especially if the for-body is longer than a few lines. The "if not: continue" form allows you to keep the indentation at one level, so that it's pretty clear what is part of the loop body, and what is not. If you add an extra indentation, your mind has to keep track of the indentation, and will expect an else: somewhere, which in the use case I propose won't happen. At least that's what my mind does, and it is majorly confused if the else doesn't appear.

This certainly is personal taste, and AFAICT there are pretty few people who feel like I do. But, basically, I find it easier to have one level of indentation, and to test for the negated condition, than to put the loop body in an enclosing if-statement, which will always add an extra level of indentation. I put forth the proposal because it allows you to save this level of indentation, which makes the code more readable for me.

Anyway, the PEP has already been rejected on python-dev, and I'm currently just rewriting it with the positive and negative things that have so far been said, so basically, it'll just be there so that people can be pointed at it when anybody else asks for it.

--- Heiko.
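The two loop shapes being compared above, side by side. This is a sketch for illustration only: the Node class is hypothetical, and only the haschildren() method name is taken from the thread.

```python
class Node:
    """Hypothetical tree node; only haschildren() comes from the thread."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def haschildren(self):
        return bool(self.children)


def names_nested(tree):
    """Loop body wrapped in an if: one extra level of indentation."""
    result = []
    for node in tree:
        if node.haschildren():
            result.append(node.name)
    return result


def names_continue(tree):
    """Negated test plus continue: the body stays at one level."""
    result = []
    for node in tree:
        if not node.haschildren():
            continue
        result.append(node.name)
    return result
```

Both functions return the same list; the difference is purely in how deeply the body is indented.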
PEP-xxx: Unification of for statement and list-comp syntax
Hi all!

The following PEP tries to make the case for a slight unification of for statement and list comprehension syntax. Comments appreciated, including on the sample implementation.

===
PEP: xxx
Title: Unification of for-statement and list-comprehension syntax
Version: $Revision$
Last-Modified: $Date$
Author: Heiko Wundram [EMAIL PROTECTED]
Status: Active
Type: Standards Track
Content-Type: text/plain
Created: 21-May-2006
Post-History: 21-May-2006 17:00 GMT+0200


Abstract

    When list comprehensions were introduced, they added the ability
    to add conditions which are tested before the expression which is
    associated with the list comprehension is evaluated. This is often
    used to create new lists which consist only of those items of the
    original list which match the specified condition(s). For example:

        [node for node in tree if node.haschildren()]

    will create a new list which only contains those items of the
    original list (tree) which match the haschildren() condition.
    Generator expressions work similarly.

    With a standard for-loop, this corresponds to adding a continue
    statement testing for the negated expression at the beginning of
    the loop body.

    As I've noticed that I find myself typing the latter quite often
    in code I write, it would only be sensible to add the
    corresponding syntax for the for statement:

        for node in tree if node.haschildren():
            do something with node

    as syntactic sugar for:

        for node in tree:
            if not node.haschildren():
                continue
            do something with node

    There are several other methods (including generator expressions
    or list comprehensions, the itertools module, or the builtin
    filter function) to achieve this same goal, but all of them make
    the code longer and harder to understand and/or require more
    memory, because of the generation of an intermediate list.


Implementation details

    The implementation of this feature requires changes to the Python
    grammar, to allow for a variable number of 'if'-expressions before
    the colon of a 'for'-statement:

        for_stmt: 'for' exprlist 'in' testlist_safe ('if' old_test)* ':'
                  suite ['else' ':' suite]

    This change would replace testlist with testlist_safe as the
    'in'-expression of a for statement, in line with the definition of
    list comprehensions in the Python grammar.

    Each of the 'if'-expressions is evaluated in turn (if present),
    until one is found False, in which case the 'for'-statement
    restarts at the next item from the generator of the
    'in'-expression immediately (the tests are thus short-circuiting),
    or until all are found to be True (or there are no tests), in
    which case the suite body is executed. The behaviour of the
    'else'-suite is unchanged.

    The intermediate code that is generated is modelled after the
    byte-code that is generated for list comprehensions:

        def f():
            for x in range(10) if x == 1:
                print x

    would generate:

        2           0 SETUP_LOOP              42 (to 45)
                    3 LOAD_GLOBAL              0 (range)
                    6 LOAD_CONST               1 (10)
                    9 CALL_FUNCTION            1
                   12 GET_ITER
                   13 FOR_ITER                28 (to 44)
                   16 STORE_FAST               0 (x)
                   19 LOAD_FAST                0 (x)
                   22 LOAD_CONST               2 (1)
                   25 COMPARE_OP               2 (==)
                   28 JUMP_IF_FALSE            9 (to 40)
                   31 POP_TOP

        3          32 LOAD_FAST                0 (x)
                   35 PRINT_ITEM
                   36 PRINT_NEWLINE
                   37 JUMP_ABSOLUTE           13
                   40 POP_TOP
                   41 JUMP_ABSOLUTE           13
                   44 POP_BLOCK
                   45 LOAD_CONST               0 (None)
                   48 RETURN_VALUE

    where all tests are inserted immediately at the beginning of the
    loop body, and jump to a new block if found to be false which pops
    the comparison from the stack and jumps back to the beginning of
    the loop to fetch the next item.


Implementation issues

    The changes are backwards-compatible, as they don't change the
    default behaviour of the 'for'-loop. Also, as the changes that
    this PEP proposes don't change the byte-code structure of the
    interpreter, old byte-code continues to run on Python with this
    addition unchanged.


Implementation

    A sample implementation (with updates to the grammar documentation
    and a small test case) is available at:

        http://sourceforge.net/tracker/index.php?func=detail&aid=1492509&group_id=5470&atid=305470


Copyright

    This document has been placed in the public domain.
===

--- Heiko.
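The alternatives that the PEP's abstract mentions (generator expressions and the builtin filter) can be written out as follows, next to the continue form the proposed syntax would be sugar for. This is only a sketch in Python 3; the function names are invented, and note that in Python 3 filter() is lazy, whereas in the Python 2 of the PEP it built an intermediate list.

```python
def via_continue(items, keep):
    """The desugared form from the PEP: negated test plus continue."""
    out = []
    for item in items:
        if not keep(item):
            continue
        out.append(item)
    return out


def via_genexp(items, keep):
    """Filtering with a generator expression in the 'in' clause."""
    out = []
    for item in (i for i in items if keep(i)):
        out.append(item)
    return out


def via_filter(items, keep):
    """Filtering with the builtin filter()."""
    out = []
    for item in filter(keep, items):
        out.append(item)
    return out
```

For the PEP's own example, all three give the same result when called with range(10) and lambda x: x == 1.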
Re: Feature request: sorting a list slice
On Sunday, 21 May 2006 18:55, Raymond Hettinger wrote:
> If the perf gain is small and the use cases are infrequent, the addition is likely unwarranted. There is an entire class of feature requests that are more appropriate as recipes than for inclusion in the language.

The thing is: having an explicit start/stop argument to reverse() and sort() doesn't slow down the method call much (it's just one if() whose body is skipped when the parameters aren't passed; I'd say that the time that's lost here is pretty insignificant, in the order of 10^-6 seconds, on _any_ modern machine), and on the other hand it yields huge memory gains (if not performance gains by saving a memcpy) when you do need to sort or reverse slices of large lists.

I've had use cases of the latter (reversing large parts of even larger lists in memory) in several data archiving and evaluation programs I wrote, but I can also understand the use case that was made by George for having these arguments for sort(), so that groupBy() can be extended easily to work on subgroups without requiring slicing.

Anyway: having these extensions as a recipe won't work here. The main idea is saving the slicing by having sort() and reverse() do the slicing internally (rather, they just add a constant to the lower and upper bound, and the user specifies these constants; the internal functions they call already work on slices, and the current listobject.c gives them the whole list as the slice by default). The user can't patch this as an extension on the fly, because it requires changes to the underlying listobject.c source. That's why George is asking for inclusion of this patch.

I just wrote the patch because I had the time to do so, and I won't battle for its inclusion, but again, I see the use cases clearly, at the very least for slice support in list.reverse() and array.reverse() (which the patch also implements).

--- Heiko.
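For comparison, this is roughly what is possible without the patch. The sketch below uses invented helper names and assumes 0 <= start <= stop <= len(lst); sort_slice() shows the temporary copy that the message wants to avoid, while reverse_slice_inplace() avoids the copy but runs as interpreted Python rather than inside listobject.c.

```python
def sort_slice(lst, start, stop):
    """Sort lst[start:stop] with the existing API; copies the slice once."""
    lst[start:stop] = sorted(lst[start:stop])


def reverse_slice_inplace(lst, start, stop):
    """Reverse lst[start:stop] by swapping elements, without any copy."""
    i, j = start, stop - 1
    while i < j:
        lst[i], lst[j] = lst[j], lst[i]
        i += 1
        j -= 1
```

Both helpers modify the list they are given and return None, mirroring list.sort() and list.reverse().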
Re: Iterators: Would rewind be a good idea?
On Sunday, 21 May 2006 21:43, Charles D Hixson wrote:
> I was reading through old messages in the list and came up against an idea that I thought might be of some value: Wouldn't it be a good idea if one could rewind an iterator? Not stated in precisely those terms, perhaps, but that's the way I read it.

Yes, that certainly would be a neat idea. But, think of the following: what if the iterator computes the values at runtime, and you're not iterating over a predefined list of some sort? Do you want the machinery to store the state of the iterator at every earlier point in time (sometimes this may not even be possible; think of socket communication handled by iterators, or of some state being kept in a C extension outside of the grasp of the Python interpreter), and to restore the iterator to this point?

Even if the generator you're trying to rewind can be pickled at each yield statement, memory requirements come to mind when thinking about this for the general case, where you simply don't want to rewind, but just iterate forward.

Anyway, an easy way out (if the aforementioned concerns don't apply) is always to create a list of the generator, and then to index that list directly:

    x = list(gen())

A list can be freely indexed, so basically if you implement the iterator logic using a while loop, you're free to rewind as much as you'd like.

--- Heiko.
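The "list plus while loop" suggestion above could look like this. A minimal sketch: gen() is a placeholder generator and walk_with_rewind() is an invented helper that jumps back exactly once, just to show the index manipulation.

```python
def gen():
    """Placeholder generator that computes its values at run time."""
    for n in range(5):
        yield n * n


def walk_with_rewind(values, rewind_at, rewind_to):
    """Iterate by index; jump back once from rewind_at to rewind_to."""
    seen = []
    i = 0
    rewound = False
    while i < len(values):
        seen.append(values[i])
        if i == rewind_at and not rewound:
            i = rewind_to            # "rewind" is just resetting the index
            rewound = True
            continue
        i += 1
    return seen


x = list(gen())                      # materialise the generator once
```

The price is the one named in the message: the whole sequence is held in memory, so this only works for finite generators of manageable size.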
Re: proposal: disambiguating type
On Sunday, 21 May 2006 21:13, gangesmaster wrote:
> i suggest splitting this overloaded meaning into two separate builtins:
> * type(name, bases, dict) - a factory for types
> * typeof(obj) - returns the type of the object

While I personally don't find this proposal to be bad, this is something that should only be considered for Python 3000, because it breaks old code in very unnecessary ways (because it's only for beauty, not for function), considering that most scripts written for Python 1.5.2 still work under current Python without modification, and people expect them to.

--- Heiko.
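The two meanings of type() that the proposal wants to separate can be shown with two thin wrappers. This is only a sketch: typeof() is the name from the proposal, make_type() is invented, and neither is an actual builtin.

```python
def typeof(obj):
    """One-argument form of type(): return the type of an object."""
    return type(obj)


def make_type(name, bases, namespace):
    """Three-argument form of type(): build a new type."""
    return type(name, bases, namespace)


Point = make_type("Point", (object,), {"dims": 2})
```

Here Point is an ordinary class, equivalent to one written with a class statement.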
Re: getattr for modules not classes
On Sunday, 21 May 2006 21:52, Daniel Nogradi wrote:
> Is there something analogous to __getattr__ for modules? I know how to create a class that has attributes from a list and nothing else by overloading __getattr__ and making sure that the accessed attribute appears in my list. Now I would like to do the same with a module, say x.py, in which I have a list, say mylist, and after importing x from another module I would like to be able to say x.one( ) or x.two( ) if 'one' and 'two' are in mylist and raise an exception if they aren't. Is this possible?

Not really. But, why not create an instance of some custom type in x.py, and import that instance into the current namespace? Just as convenient.

x.py
---
mylist = {"one": 1, "two": 2}

class test(object):
    def __getattr__(self, name):
        return mylist.get(name, None)

test = test()
---

y.py
---
from x import test
print test.one
print test.two
print test.three
---

--- Heiko.
Re: escapes in regular expressions
On Sunday, 21 May 2006 19:49, James Thiele wrote:
> re.match('\d', '7').group()
> print '\d'
> \d
> re.match('\\d', '7').group()
> print '\\d'
> \d

'\d' evaluates to \d, because d is not a valid escape sequence.
'\n' evaluates to newline, because n is a valid escape sequence.
'\\' evaluates to \, because \ is a valid escape sequence.

--- Heiko.
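A short check of the escape rules above, written with a raw string and an explicitly doubled backslash. This is a sketch in Python 3; it deliberately avoids the bare '\d' spelling in normal string literals, since recent Python versions warn about unrecognised escapes. The names ESCAPED, RAW and first_digit() are made up for the example.

```python
import re

ESCAPED = "\\d"      # backslash escaped by hand: two characters, \ and d
RAW = r"\d"          # raw string: backslashes are left alone


def first_digit(text):
    """Return the leading digit of text, or None if there is none."""
    match = re.match(RAW, text)
    return match.group() if match else None
```

Both spellings produce the same two-character pattern, which is why the two re.match() calls in the quoted session behave identically.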
Re: Slicing Issues
On Sunday, 21 May 2006 22:52, BJ Swope wrote:
> district_combo=line[85:3]

This asks for the slice from character 85 up to character 3 in the string, read forwards. As Python slices are forgiving about illogical borders, this simply amounts to nothing (an empty string), although it could just as well have amounted to: your indexing boundaries are invalid.

Basically, what you want is:

    district_combo = line[85:88]

where 88 = 85 + 3 (3 being the length). Read up on Python slices...

--- Heiko.
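The start/length arithmetic from the reply above, wrapped in a small helper. The field() function and the sample line are invented for illustration.

```python
def field(line, start, length):
    """Return `length` characters of `line` beginning at index `start`."""
    return line[start:start + length]


# Hypothetical fixed-width record: 85 filler characters, then a 3-character field.
line = "x" * 85 + "ABC" + "yz"
district_combo = field(line, 85, 3)
```

With this record, line[85:3] is the empty string, while field(line, 85, 3) picks out the three characters starting at index 85.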
Re: Feature request: sorting a list slice
On Thursday, 18 May 2006 19:27, George Sakkis wrote:
> It would be useful if list.sort() accepted two more optional parameters, start and stop, so that you can sort a slice in place.

I've just submitted:

http://sourceforge.net/tracker/index.php?func=detail&aid=1491804&group_id=5470&atid=305470

to the bugtracker, which extends the (start, stop) keyword arguments to list.reverse() (which I've needed more than once). The patch updates the test suite, documentation, list object, and sorted() builtin to accept (or specify) the new arguments.

Any comment/feedback would be appreciated.

--- Heiko.
Re: Feature request: sorting a list slice
On Friday, 19 May 2006 23:24, George Sakkis wrote:
> This is great, thanks Heiko ! Any idea on the chances of being considered for inclusion in 2.5 ?

Don't ask me, I'm not one of the core developers... ;-) But, anyway, the people on python-dev are doing their best to review patches. Just: I'd rather write them than review them... ;-)

--- Heiko.
Re: Complex evaluation bug
On Friday, 19 May 2006 18:03, Paul McGuire wrote:
> An eval-less approach - the problem is the enclosing parens. snip

I've just submitted two patches to the Python bugtracker at:

http://sourceforge.net/tracker/index.php?func=detail&aid=1491866&group_id=5470&atid=305470

which either change the repr() format (removing the parentheses), which I find doubtful, because it's not backwards-compatible, or alter the constructor to accept the repr() format for complex numbers (a bracketed number).

Feel free to comment.

--- Heiko.
Re: Reference Counts
On Thursday, 18 May 2006 08:28, raghu wrote:
> #!/usr/bin/python
> import sys
> global a
> print "Total Reference count at the start =",sys.gettotalrefcount()
> a=1
> print "a ref count =",sys.getrefcount(a)
> b=a
> print "a ref count =",sys.getrefcount(a)
> del a
> del b
> print "Total Reference count at the end =",sys.gettotalrefcount()
> ...
> Total Reference count at the start = 16538
> a ref count = 49
> a ref count = 50
> Total Reference count at the end = 16540
> [6416 refs]
> There are a few questions that I am having on this.
> (1) Why should 'a' reference count be 49 before I even made an assignment ?

Because 1 is a special integer. Small integers (-1..100, but this depends on the Python version) are interned, similar to strings, so there are already references to the integer object before you assign it to a (at least one; 1 is such a magic constant that you can guess that there are already other references to it in other places of the stdlib, which has loaded when your script runs, so it's not hard to imagine that 1 already has 48 references outside of your program).

> (2) The Total Reference count at the end has increased by 2 . Why ? Am I leaking memory ?

No. I'd guess that the names a and b were interned as strings (as they are used as dict lookup keys in the globals() dict), and you have one reference to each interned object.

> (3) I have read somewhere that an increase in sys.gettotalrefcount() is indicative of a memory leak ? Aint that correct ?

Yes. It is correct if consecutive runs of your algorithm always yield a higher sys.gettotalrefcount() for each run. In this case (where you run your algorithm only once), it isn't. It just shows you some of the innards of the Python runtime machinery.

Execute the following script to see the result of a memory leak:

    import sys

    x = {}
    i = 0

    def test():
        global x, i
        x[i] = test
        i += 1
        # Forget to clean up x... LEAK a reference to test!

    for j in xrange(1):
        print "Before", j, ":", sys.gettotalrefcount()
        test()
        print "After", j, ":", sys.gettotalrefcount()

And, the following (slightly altered) program doesn't exhibit this memory leak:

    import sys

    x = {}
    i = 0

    def test():
        global x, i
        x[i] = test
        i += 1
        del x[i-1]  # Properly clean up x.

    for j in xrange(1):
        print "Before", j, ":", sys.gettotalrefcount()
        test()
        print "After", j, ":", sys.gettotalrefcount()

I don't have a debug build of Python at hand, so I can't run them now. But, if you're interested in the results, you'll certainly do that yourself. ;-)

--- Heiko.
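sys.gettotalrefcount() only exists in debug builds, but the same leak pattern can be watched on a single object with sys.getrefcount(), which is available in ordinary builds. The following is a sketch for a standard CPython 3 build, not the scripts from the message; Marker, registry and the function names are invented.

```python
import sys


class Marker:
    """Hypothetical object whose reference count we watch."""


marker = Marker()
registry = {}


def leaky(i):
    registry[i] = marker             # reference is stored and never removed


def clean(i):
    registry[i] = marker
    del registry[i]                  # reference is released again


def refcount_growth(func, runs=10):
    """Return how much marker's reference count grew over `runs` calls."""
    before = sys.getrefcount(marker)
    for i in range(runs):
        func(i)
    return sys.getrefcount(marker) - before
```

Calling refcount_growth(clean) shows no growth, while refcount_growth(leaky) grows by one per call: the "always higher on consecutive runs" signature described in the message.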
Re: Python - Web Display Technology
On Thursday, 18 May 2006 08:51, SamFeltus wrote:
> I am trying to figure out why so little web development in Python uses Flash as a display technology. It seems most Python applications choose HTML/CSS/JS as the display technology, yet Flash is a far more powerful and elegant display technology. On the other hand, HTML/JS seems clunky and antiquated. I am a gardener, and not a coder by trade, but Flash seems to integrate just fine with Python. Anyways, what are the technical reasons for this?

There's no Python-specific reason, but I refrain from using Flash because it requires more than just the usual browser (which is available everywhere). Using HTML/CSS/JS, I can make it so that the information I want to give to the user displays right on pretty much every computer that's available out there (think PS3); when I resort to techniques such as Flash or Java, I limit the number of people I can reach.

Take me for example: I'm running Linux on AMD64, and there's no proper Flash implementation yet which I can plug into my Firefox. So, I'm out on any Flash page. If you want to exclude me from viewing the information you want to present, fine, use Flash. If you don't, don't use it.

And: the web is a platform to offer _information_. Not to offer shiny graphics/sound, which is the only benefit Flash has to offer.

To sum it up: Flash/Java considered evil here. But that's just my 5 cents.

--- Heiko.
Re: Reference Counts
On Thursday, 18 May 2006 09:33, raghu wrote:
> However, the 'non-leaky' one showed a funny trend ...it kept increasing the totalrefcount for five iterations (see 1 thru 5) and then dropped down by 5 ( See Before 5 : 16584 After 5 : 16580 ) suddenly and again increase as shown below. However, at the time when the script finished execution, we were not too far from the starting totalrefcount (16584 from 16579),

The cyclic garbage collector isn't run after every byte-code instruction, but only after several have executed (for performance reasons). That's why you see an increase in reference counts until the interpreter calls the garbage collector, which frees the object cycles, and so forth. I don't know exactly what the magic constant (i.e. the number of byte-code instructions between subsequent runs of the garbage collector) is, but I presume it's somewhere in the order of 100 bytecode instructions.

Why you need the cyclic gc to clean up the data structures my sample creates is beyond me, but I'd guess it has something to do with the internal structure of dicts.

Anyway, you can easily test this hypothesis by calling gc.collect() explicitly in the main loop after test() has run (remember to import gc... ;-)). This forces a run of the cyclic gc. If the funny pattern still remains, I wouldn't know of any other explanation... ;-)

But as long as references aren't being leaked (that is, as long as you do see the drop in references after every x instructions), there's nothing to worry about.

--- Heiko.
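The gc.collect() suggestion can be tried without a debug build. What follows is a minimal sketch in Python 3 syntax, not the original test program: the `Node` class and `make_cycle()` helper are made up for illustration. It builds reference cycles on purpose and lets gc.collect() report how many unreachable objects it found. As far as I know, CPython triggers automatic collections from allocation counters (see gc.get_threshold()) rather than from a count of executed bytecodes, so treat the "magic constant" guess with some caution.

```python
import gc


class Node(object):
    """Tiny container used only to build a reference cycle (hypothetical helper)."""

    def __init__(self):
        self.other = None


def make_cycle():
    # Two objects referring to each other; both are unreachable after return,
    # but plain reference counting alone cannot free them.
    a, b = Node(), Node()
    a.other, b.other = b, a


gc.disable()          # keep automatic collections out of the picture
gc.collect()          # start from a clean slate
for _ in range(100):
    make_cycle()
found = gc.collect()  # returns the number of unreachable objects found
gc.enable()

print("unreachable objects found:", found)
print("collection thresholds:", gc.get_threshold())
```

If the count printed after the loop is non-zero, the cyclic collector had work to do that reference counting could not finish by itself.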
Re: Proposal for new operators to python that add syntactic sugar for hierarcical data.
On Thursday, 18 May 2006 13:27, bruno at modulix wrote:
> Adding ugly and unintuitive operators to try to turn a general purpose programming language into a half-baked unusable HTML templating language is of course *much* more pythonic...

What about writing a mini-language that gets translated to Python? Think of Cheetah, which does exactly this (albeit not being limited to templating HTML data).

Adding this kind of operator to Python is an absolute NoNo, because it's nothing general the OP is trying to achieve here. Creating a small wrapper language: why not? (it's not like we have enough templating languages already ;-))

By the way: the language you (the OP) are trying to implement here goes strictly against the MVC model of application programming. You know that, right?

--- Heiko.
Re: Proposal for new operators to python that add syntactic sugar for hierarcical data.
On Thursday, 18 May 2006 15:17, glomde wrote:
>> nothing general the OP is trying to achieve here
>
> Define general :-). I do think I solve something and make it more readable. You could also argue that list comprehension doesnt solve anything general.

Sure, a list comprehension solves something general. It's syntactic sugar for a recurring pattern in a for loop. What you are trying to achieve is syntactic sugar for making namespace definitions look nicer. But: the way you are trying to do so isn't pythonic, because there isn't one obvious way in which your proposal works; you're not even specifying a proper semantic interpretation of your syntax (and you use magic markers, which is even more of a NoNo).

For a better thought-out proposal (IMHO) for stacking and defining namespaces (based on the current metaclass behaviour), look for the PEP on the make keyword (which was sent to Py-Dev some weeks ago).

>> By the way: the language you (the OP) are trying to implement here goes strictly against the MVC model of application programming. You know that, right?
>
> ???. I cant see how this breaks MVC. MVC depends on how you parition your application this doesnt put any constraint on how you should do your application.

Sure it does. If you implement a templating language that simply contains the full power of the Python language, you basically create a (somewhat) crippled PHP. The View part of MVC shouldn't do branching (well, in the strict sense of MVC), shouldn't do looping, and shouldn't modify the data; all that stuff is up to the controller. If you hand the template writer the full power of the Python programming language by integrating your proposal with Python itself, you're bound to water down the split between the three, and to create a maintenance nightmare (as I said, PHP comes to mind, where you basically can't split controller from model properly).

For a proper (and pretty strict) MVC templating language, have a look at Nevow.

--- Heiko.
Re: Python - Web Display Technology
On Thursday, 18 May 2006 16:09, SamFeltus wrote:
> I guess there isn't much to understand.

Sure, there's a lot to understand here. What I guess you can't come to terms with is the fact that the web (hell, the whole Internet) isn't designed for Windows personal computers only, but for a whole range of computer systems which need to interoperate. For that, you need standards. And: Flash isn't one, and will never become one, simply because it's full of bad design decisions, and because the company that has the power over Flash doesn't want to make it an open standard. At least I don't see that happening any time soon.

> If you are satisfied with a text based, static image web, that is light on artistic possabilities, all that HTML stuff is acceptable.

Are you actually familiar with what you can do with JavaScript and HTML/CSS? CSS is pretty powerful. Hell, it's very powerful, even. And: why do I need animated graphics to convey _information_ to a user? I don't surf the web to have the feeling of walking through an art gallery; I surf the web to gather information I need for my daily life. And: HTML is designed for that explicitly. CSS too (as in proper presentation of the content you're trying to convey to the user). And even JavaScript is designed to deal with _content_, not with pretty but meaningless graphical imagery.

I'm not saying that graphics can't convey meaning. But: the tools to deal with images are sufficiently advanced in HTML and CSS that I can display any kind of graphic imagery I need to convey the information to the user.

> Perhaps the HTML/JS group will even get off their rear ends and bring some decent cross platform graphics capabilities to the web one decade? Perhaps even bring some 90's style graphics to the browser one decade?

Same as before: do you actually know what the HTML group (well, the W3C) is doing? They are a very active group, have designed an open format for vector graphics (SVG, which has been referenced here before), and actually have the guts to stand up to MickeySoft and their lackeys to keep the format open, and to keep development of further extensions open.

This is technological advancement at work. Not some company like Macromedia trying to design a proprietary, insufficiently engineered format that's just there for people who think they need to bury the information they are trying to convey in graphic imagery, so that no one will notice that there's no actual content in what they are trying to tell you.

--- Heiko.
Re: Proposal for new operators to python that add syntactic sugar for hierarcical data.
On Friday, 19 May 2006 02:08, Bruno Desthuilliers wrote:
> We'd need the make: statement, but the BDFL has pronounced against. I'm still -2 against your proposition, but it could make a good use case for the make statement. I gave an eye at the new 'with' statement, but I'm not sure it could be used to solve this.

Couldn't. "with" is a blatant misnomer for what its functionality is (basically a protected generator), at least if you know what with does in VB (god, am I really comparing VB with Python? And I've never even programmed in the former...)

--- Heiko.
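To make the "protected generator" remark concrete, here is a small sketch using contextlib.contextmanager from the standard library. The `logged()` helper and its log list are invented for illustration. The generator runs up to its yield when the block is entered, and its finally clause runs when the block is left, even if the body raises.

```python
from contextlib import contextmanager


@contextmanager
def logged(name, log):
    """Record entry to and exit from a with block (hypothetical helper)."""
    log.append("enter " + name)
    try:
        yield name                   # the body of the with block runs here
    finally:
        log.append("exit " + name)   # runs even if the body raises


log = []
try:
    with logged("block", log) as n:
        log.append("body " + n)
        raise RuntimeError("boom")
except RuntimeError:
    pass

print(log)  # ['enter block', 'body block', 'exit block']
```

So the statement guards a block with setup and teardown code; it does not open a namespace the way VB's With does.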
Re: Feature request: sorting a list slice
On Thursday, 18 May 2006 22:13, Raymond Hettinger wrote:
> This is a false optimization. The slicing steps are O(n) and the sort step is O(n log n) unless the data has some internal structure that Timsort can use to get closer to O(n). If you implemented this and timed it in real apps, I would be surprised to find that the slice extraction and assignments were the performance bottleneck. IMO, a worthwhile performance gain is unlikely.

I personally wouldn't find this to be a false optimization, at least not memory-wise. If you sort really large lists, and only want to sort part of the list, the memory gains of not having to do a slice should be enormous, and there should be some time gains too. And, additionally, implementing this (if I understand timsort correctly, I just had a look at the sources) isn't hard.

Rather, I'd pose the usage question: why are you only sorting a part of a list? Lists are basically meant for homogeneous data, and sorting only a part of that should be a pretty meaningless operation, mostly...

Anyway, I've just written up a patch to extend sort() with a start/stop parameter (which behaves just like the default slice notation). Generally, this will make sort() setup slightly slower in the standard case (because there's some more pointer arithmetic involved, but this should be negligible, and is O(1)), but for the actual use case, the following numbers can be seen:

    [EMAIL PROTECTED] ~/mercurial/python-modelnine $ ./python test.py
    New average time: 13.7254650593 ms per loop
    Old average time: 14.839854002 ms per loop
    [10198 refs]
    [EMAIL PROTECTED] ~/mercurial/python-modelnine $

This is just a very, very simple test (testing the case of a list with one million entries, where 99000 are sorted):

    from random import randrange
    from time import time

    x = [randrange(256) for x in xrange(100)]
    timesnew, timesold = [], []

    for reps in range(1000):
        y = x[:]
        timesnew.append(time())
        y.sort(start=1000,stop=1)
        timesnew[-1] = time() - timesnew[-1]

    for reps in range(1000):
        y = x[:]
        timesold.append(time())
        y[1000:1] = sorted(y[1000:1])
        timesold[-1] = time() - timesold[-1]

    print "New average time:", sum(timesnew), "ms per loop"
    print "Old average time:", sum(timesold), "ms per loop"

I'll post the patch to the SF bugtracker tomorrow; it's too late today for me to review it. Generally, though, I wouldn't find it to be a bad addition, if there's actually a general use case for sorting only parts of a list.

--- Heiko.
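For comparison, the "old" approach from the timing test can be wrapped in a small helper. This is only a sketch in Python 3 syntax: `sort_slice()` is a made-up name, and as far as I know the start/stop arguments to sort() exist only in the patch described above, not in any released Python. The helper copies the slice, sorts the copy, and assigns it back, which is exactly the extra memory traffic the patch tries to avoid.

```python
import random


def sort_slice(lst, start=None, stop=None):
    """Sort lst[start:stop] in place, leaving the other elements untouched.

    Hypothetical helper: it makes a temporary copy of the slice, unlike the
    proposed sort(start=..., stop=...), which would sort without copying.
    """
    lst[start:stop] = sorted(lst[start:stop])


data = [random.randrange(256) for _ in range(20)]
before = data[:]
sort_slice(data, 5, 15)

print(before)
print(data)
```

Only positions 5 through 14 change; the head and tail of the list keep their original order.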
Re: List behaviour
On Wednesday, 17 May 2006 17:06, [EMAIL PROTECTED] wrote:
> Maybe I'm missing something but the latter is not the behaviour I'm expecting:
>
> >>> a = [[1,2,3,4], [5,6,7,8]]
> >>> b = a[:]
> >>> b
> [[1, 2, 3, 4], [5, 6, 7, 8]]
> >>> a == b
> True
> >>> a is b
> False

Try:

    a[0] is b[0] and a[1] is b[1]

here, and you'll see that [:] only creates a shallow copy. Thus, the lists inside the list aren't copied; they are shared by the distinct lists a and b.

Hope this clears it up.

--- Heiko.
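A short sketch of the difference, in Python 3 syntax, using copy.deepcopy from the standard library for the case where the inner lists should be copied too. The variable names `shallow` and `deep` are just for illustration.

```python
import copy

a = [[1, 2, 3, 4], [5, 6, 7, 8]]
shallow = a[:]             # new outer list, same inner lists
deep = copy.deepcopy(a)    # new outer list and new inner lists

print(a[0] is shallow[0])  # True: the inner list is shared
print(a[0] is deep[0])     # False: the inner list was copied

shallow[0].append(99)      # mutates the list that a[0] also refers to
print(a[0])                # [1, 2, 3, 4, 99]
print(deep[0])             # [1, 2, 3, 4]
```

If the nested lists must be independent, deepcopy (or a comprehension such as [row[:] for row in a] for one level of nesting) is probably what you want.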
Re: Pyparsing: Grammar Suggestion
On Wednesday, 17 May 2006 17:24, Khoa Nguyen wrote:
> Any suggestions?

If you're not limited to PyParsing, pyrr.ltk/ptk might be appropriate for you here (if you're used to bison/flex). The following file implements a small sample lexer/parser which does exactly what you need. pyrr.ltk (the lexing toolkit) is stable; pyrr.ptk isn't yet, but it's nevertheless available under:

http://hg.modelnine.org/hg/pyrr

as a mercurial repository. I'd advise you to take the version from the repository, if you're interested in it, as my packaged versions always had quirks, which the current head of the repository doesn't, AFAICT.

Anyway, the following implements the parser/lexer for you:

    from pyrr.ltk import LexerBase, IgnoreMatch
    from pyrr.ptk import ParserBase

    class SampleLexer(LexerBase):

        def f(self,match,data):
            r"""
            f1 [10]- /f1/
            f2 [10]- /f2/
            f3 [10]- /f3/
            f4 [10]- /f4/
            f5 [10]- /f5/
            f6 [10]- /f6/

            Create your specific matches for each of the fs here...
            """
            return data

        def fid(self,match,data):
            r"""
            fid - ri/[a-z_][a-z0-9_]*/

            Match a record identifier.
            """
            return data

        def end_of_record(self,match,data):
            r"""
            EOR - /END_OF_RECORD/

            Your end of record marker...
            """

        def operators(self,match,data):
            r"""
            nl - e/\n/
            c - /,/
            eq - /=/

            Newline is something that I have inserted here...
            """

        def ws(self,match,data):
            r"""
            ws - r/\s+/

            Ignore all whitespace that occurs somewhere in the input.
            """
            raise IgnoreMatch

    class SampleParser(ParserBase):
        __start__ = "ifile"

        def ifile(self,data):
            """
            ifile - record+
            """
            return dict(data)

        def record(self,fid,eq,f1,c1,f2,c2,f3,c3,f4,c4,f5,c5,f6,eor,nl):
            """
            record - /fid/ /eq/ /f1/? /c/ /f2/? /c/ /f3/? /c/ /f4/? /c/ /f5/? /c/ /f6/? /EOR/ /nl/
            """
            return (fid,(f1,f2,f3,f4,f5,f6))

    data = r"""recmark = f1,f2,,f4,f5,f6 END_OF_RECORD
    recmark2 = f1,f2,f3,f4,,f6 END_OF_RECORD
    """

    print SampleParser.parse(SampleLexer(data))

HTH!

--- Heiko.
Re: Pyparsing: Grammar Suggestion
On Wednesday, 17 May 2006 17:53, Heiko Wundram wrote:
> If you're not limited to PyParsing, pyrr.ltk/ptk might be appropriate for you here (if you're used to bison/flex). The following file implements a small sample lexer/parser which does exactly what you need. pyrr.ltk (the lexing toolkit) is stable, but pyrr.ptk isn't yet, but it's nevertheless available under:
>
> http://hg.modelnine.org/hg/pyrr

I know replying to oneself is bad, but forget that link. Use:

http://dev.modelnine.org/hg/pyrr

instead (I get mixed up between the name I use for pushing my trees and my development pages)...

--- Heiko.
Re: Pyparsing: Grammar Suggestion. 2nd thought
On Wednesday, 17 May 2006 20:05, Khoa Nguyen wrote:
> On 2nd thought, I don't think this will check for the correct order of the fields. For example, the following would be incorrectly accepted:
>
> f1,f5,f2 END_RECORD
>
> Thanks, Khoa

If I'm not completely mistaken, parsers written using PyParsing can accept a small superset of all languages that an N/DFA can accept, and as such PyParsing isn't a general purpose parsing toolkit (the latter implements matching of a subset of all languages an N/DSA can accept; think SLR, LR(1), LALR(1), GLR or the like), because it doesn't support the notion of left-/right-recursion (at least I didn't find anything like it back in the days when I had a look at PyParsing). But I might be wrong here. If I am, someone enlighten me. ;-)

Anyway, the language you're trying to match here (along with more complex productions for the f's) is nothing that an NFA can ever match. So, either you use PyParsing to implement the tokenization for you, and postprocess using a handwritten parser (LL-parsers are easy to implement, and I'd guess a small LL-parser is sufficient for your needs), or you have a look at one of the available LR-parsing frameworks for Python, such as pyrr (which isn't the only one, by far).

By the way: if you have a variable length argument list, such as:

    f1,f2,,,f5,f9,f12,...

and there is no upper bound on the number of acceptable arguments, no parsing framework that doesn't accept context sensitive grammars (DSA) can ever verify the order for you. You'll have to do the verification of correct order in a later step, after the parsing has been done.

--- Heiko.
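The "later step" can be quite small. The sketch below is in Python 3 syntax and assumes, purely for illustration, that every field is spelled `f` followed by a decimal number and that empty fields are simply skipped. `fields_in_order()` is a made-up name, and the input is the already-split field list, whichever parser produced it.

```python
def fields_in_order(fields):
    """Return True if the non-empty fields appear in strictly ascending order.

    Assumes field names of the form 'f<number>' (an assumption, not part of
    the original grammar); empty strings stand for omitted fields.
    """
    numbers = [int(name[1:]) for name in fields if name]
    return all(a < b for a, b in zip(numbers, numbers[1:]))


print(fields_in_order("f1,f2,,,f5,f9,f12".split(",")))  # True
print(fields_in_order("f1,f5,f2".split(",")))           # False
```

The parser then only has to accept "a list of fields", and the ordering rule lives in ordinary Python where it is easy to change.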
Re: Pyparsing: Grammar Suggestion. 2nd thought
On Wednesday, 17 May 2006 20:24, Heiko Wundram wrote:
> If I'm not completely mistaken, parsers written using PyParsing can accept a small superset of all languages that an N/DFA can accept, snip...

Okay, forget what I said about PyParsing here; using Forward(), you can create recursion, but it took an O'Reilly article to find out about this. ;-)

--- Heiko.
Re: creating a new database with mysqldb
On Wednesday, 17 May 2006 21:23, John Salerno wrote:
> Well, the thing about it is that all the guides I find online seem to begin with using a command prompt or a unix shell, neither of which will work in my case. I'm trying to find a way to access my database server using just a python script. Perhaps that isn't even possible for me to do without shell access. I might just have to use the msqladministrator in my server control panel, instead of using python.

Creating a database is just another SQL command in MySQL (which you can easily send to the MySQL server you're using with Python and MySQLdb):

    CREATE DATABASE dbname

Of course, you need to log on as a user who is allowed to create databases. See the MySQL documentation for more info on the available CREATE commands.

--- Heiko.
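Sending that command from a script might look roughly like this. It is a sketch, not tested against a real server: `create_database()` is a made-up helper that works with any DB-API 2.0 connection, and the host, user and password in the usage comment are placeholders. Since identifiers can't be passed as query parameters, the name is checked by hand before it goes into the statement.

```python
import re


def create_database(conn, name):
    """Issue CREATE DATABASE through a DB-API 2.0 connection (hypothetical helper)."""
    # Identifiers cannot be sent as query parameters, so validate the name first.
    if not re.match(r"^[A-Za-z_][A-Za-z0-9_]*$", name):
        raise ValueError("unsafe database name: %r" % (name,))
    cursor = conn.cursor()
    try:
        cursor.execute("CREATE DATABASE `%s`" % name)
    finally:
        cursor.close()


# Usage sketch (placeholder credentials; the user needs the CREATE privilege):
#   import MySQLdb
#   conn = MySQLdb.connect(host="localhost", user="someuser", passwd="secret")
#   create_database(conn, "dbname")
#   conn.close()
```

No shell access is needed for this; the script only has to be able to reach the MySQL server over the network.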
Re: send an email with picture/rich text format in the body
On Sunday, 14 May 2006 13:24, anya wrote:
> I want to send an email message with picture in it.

This...

> I dont want to put it as attachment but make it in the body of the mail, so every one who open the email will see the picture..

will...

> (it is possible that the solution will be in any other format that will be opened i.e pdf, doc and I will put this in the body )

never...

> Neither in MimeWriter nor using the definition of MymeTypes I was able to do it ..

work.

That is, unless you design your own MIME standard, and get all email clients out there to read your type of structured document, which you inherently need for including pictures directly in an email (body), as MIME (as we know it today) only knows about stacked message parts, not about message content and higher level formatting.

Basically, to include a picture in a body today, there's consensus that you insert an HTML document into one MIME part, and an img link refers to the attachment that comes in another MIME part (by the filename, which is the same as for the attached MIME part). But, as you see, this specifically requires that the recipient is able to view HTML mails (which quite a lot of people, even those using M$ Outlook, have turned off by default).

Anyway, read Ben Finney's response carefully. If you're trying to send out commercial email, I'll be the first person to dump your mail if it doesn't at least come in a format I can read (and understand!) text-only.

--- Heiko.
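For what it's worth, the HTML-plus-image arrangement can be built with the standard library's email package. The sketch below uses the modern Python 3 EmailMessage API, not MimeWriter. `build_message()`, the subject and the example.com domain are made up for illustration. Note that it links the image through a Content-ID (a `cid:` URL), as in the example in the email package documentation, rather than by filename. A plain-text part comes first, so text-only readers still get something.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_message(image_bytes, sender, recipient):
    """Build a mail with a text part and an HTML part showing an inline PNG."""
    msg = EmailMessage()
    msg["Subject"] = "Picture in the body"
    msg["From"] = sender
    msg["To"] = recipient

    # Plain-text part first, for readers who have HTML turned off.
    msg.set_content("This mail contains a picture; enable HTML to see it inline.")

    # make_msgid() returns "<id@domain>"; the HTML needs it without brackets.
    cid = make_msgid(domain="example.com")
    html = '<html><body><p>Here it is:</p><img src="cid:%s"></body></html>' % cid[1:-1]
    msg.add_alternative(html, subtype="html")

    # Attach the image to the HTML part, turning it into multipart/related.
    msg.get_payload()[1].add_related(image_bytes, "image", "png", cid=cid)
    return msg


message = build_message(b"\x89PNG placeholder bytes", "a@example.com", "b@example.com")
print(message.get_content_type())
```

Whether the picture is actually shown still depends on the recipient's mail client, exactly as described above.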
Re: Question regarding checksuming of a file
On Sunday, 14 May 2006 20:51, Andrew Robert wrote:
> def getblocks(f, blocksize=1024):
>     while True:
>         s = f.read(blocksize)
>         if not s: return
>         yield s

This won't work. The following will:

    def getblocks(f,blocksize=1024):
        while True:
            s = f.read(blocksize)
            if not s: break
            yield s

--- Heiko.
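Here is how the block generator might be used for the actual checksum, as a sketch in Python 3 syntax with hashlib. `stream_checksum()` and the choice of SHA-256 are assumptions for illustration, not something from the thread. As far as I can tell, a bare return inside a generator ends it just as break does here, so both variants of getblocks() should yield the same blocks.

```python
import hashlib
import io


def getblocks(f, blocksize=1024):
    """Yield successive blocks from a binary file object until EOF."""
    while True:
        s = f.read(blocksize)
        if not s:
            break
        yield s


def stream_checksum(f, algorithm="sha256", blocksize=1024):
    """Hash a binary file object block by block (hypothetical helper)."""
    digest = hashlib.new(algorithm)
    for block in getblocks(f, blocksize):
        digest.update(block)
    return digest.hexdigest()


payload = b"some file contents " * 500
print(stream_checksum(io.BytesIO(payload)))
```

For a file on disk, open it with open(path, "rb") and pass the file object in; reading in blocks keeps memory use flat regardless of file size.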
Re: Converting String to int
On Sunday, 14 May 2006 22:23, Ognjen Bezanov wrote:
> mynums = "423.523.674.324.342.122.943.421.762.158.830"
> mynumArray = string.split(mynums,".")

This is the old way of using string functions, via the module string. You should simply write this as:

    mynumArray = mynums.split(".")

(using the string methods of string objects directly)

> x = 0
> for nums in mynumArray:

This is misleading. Rename the variable to num, as it only contains a single number.

>     if nums.isalnum() == true:

.isalnum() checks whether the string consists of _alpha_-numeric characters only. So, in this case, it may contain letters as well as digits. .isdigit() checks whether it is a (base = 10) number.

>         x = x + int(nums)
>     else:
>         print "Error, element contains some non-numeric characters"

As you don't know what the offending element is, insert a:

    print nums

here.

>         break

If you change the code as noted above, it works fine for me.

--- Heiko.
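Putting the suggestions together gives something like the following, written as a sketch in Python 3 syntax. `sum_dotted()` is a made-up name, and raising an exception instead of printing and breaking is my own choice, so that the offending element can't be overlooked.

```python
def sum_dotted(numbers):
    """Sum the dot-separated decimal numbers in a string."""
    total = 0
    for num in numbers.split("."):
        if not num.isdigit():
            # Name the offending element so the error is easy to track down.
            raise ValueError("element %r contains non-numeric characters" % (num,))
        total += int(num)
    return total


print(sum_dotted("423.523.674.324.342.122.943.421.762.158.830"))  # 5522
```

An element such as "12a" passes .isalnum() but fails .isdigit(), which is why the latter is the check to use before calling int().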