Re: Is Eval *always* Evil?
On 11 November 2010 09:07, John Nagle na...@animats.com wrote:
> On 10.11.2010 18:56, Simon Mullis wrote:
> Yes, eval is evil, may lead to security issues and it's unnecessarily
> slow, too.
>
> If you have to use eval, use the 2 or 3 argument form with a globals
> and locals dictionary. This lists the variables and functions that
> eval can see and touch.
>
> The Python documentation for this is not very good:
>
>     If the globals dictionary is present and lacks '__builtins__', the
>     current globals are copied into globals before expression is
>     parsed. This means that expression normally has full access to the
>     standard __builtin__ module and restricted environments are
>     propagated.
>
> What this means is that you have to put in __builtins__ to PREVENT all
> built-ins from being imported.

Aren't I already doing this?

    result = eval(xpath_command, {"__builtins__": []}, {"x": x})

SM
--
http://mail.python.org/mailman/listinfo/python-list
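A small sketch of the behaviour the quoted documentation describes, written for Python 3 (an assumption; the thread itself predates it). It only shows what the globals argument controls. Hiding the built-in names this way is not a security boundary, so it should not be read as a safe way to evaluate untrusted input.

```python
# Sketch only: what the globals/locals arguments to eval() control.

# 1. No '__builtins__' key: eval() inserts one, so built-ins stay reachable.
open_globals = {}
length = eval("len('abc')", open_globals)
builtins_were_added = "__builtins__" in open_globals

# 2. An explicit, empty '__builtins__': plain built-in names are hidden.
closed_globals = {"__builtins__": {}}
try:
    eval("len('abc')", closed_globals)
    len_was_visible = True
except NameError:
    len_was_visible = False

# 3. Names supplied through locals are still visible to the expression.
visible_result = eval("x + 1", {"__builtins__": {}}, {"x": 41})
```

So the call in the post does hide names such as `len` and `open` from the expression, which is the point John was making.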
Re: parse date/time from a log entry with only strftime (and no regexen)
This was a long time ago... But just in case anyone googling ever has
the same question, this is what I did (last year). The user just needs
to supply a strftime-formatted string, such as "%A, %e %b %H:%M", and
this class figures out the regex to use on the log entries...

    import re

    class RegexBuilder(object):
        """This class is used to create the regex from the strftime string.

        So, we pass it a strftime string and it returns a regex with
        capture groups.
        """

        lookup_table = {
            '%a' : r"(\w{3})",    # locale's abbrev day name
            '%A' : r"(\w{6,9})",  # locale's full day name (Wednesday has 9 letters)
            '%b' : r"(\w{3})",    # abbrev month name
            '%B' : r"(\w{3,9})",  # full month name (May has 3 letters)
            '%d' : r"(3[0-1]|[1-2]\d|0[1-9]|[1-9])",  # day of month
            '%e' : r"([1-3][0-9]|[1-9])",             # day of month, no leader
            '%H' : r"(2[0-3]|[0-1]\d|\d)",            # Hour (24h clock)
            '%I' : r"(1[0-2]|0[1-9]|[1-9])",          # Hour (12h clock)
            '%j' : r"(36[0-6]|3[0-5]\d|[1-2]\d\d|0[1-9]\d|00[1-9]|[1-9]\d|0[1-9]|[1-9])",  # Day of year
            '%m' : r"(1[0-2]|0[1-9]|[1-9])",          # Month as decimal
            '%M' : r"([0-5]\d|\d)",                   # Minute
            '%S' : r"(6[0-1]|[0-5]\d|\d)",            # Second
            '%U' : r"(5[0-3]|[0-4]\d|\d)",            # Week of year (Sun = 0)
            '%w' : r"([0-6])",                        # Weekday (Sun = 0)
            '%W' : r"(5[0-3]|[0-5]\d|\d)",            # Week of year (Mon = 0)
            '%y' : r"(\d{2})",                        # Year (no century)
            '%Y' : r"(\d{4})",                        # Year with 4 digits
            '%p' : r"(AM|PM)",
            '%P' : r"(am|pm)",
            '%f' : r"(\d+)",      # TODO: microseconds. Only in Py 2.6+
        }

        # Format of the keys in the table above
        strftime_re = r'%\w'

        def __init__(self, date_format):
            r = re.compile(RegexBuilder.strftime_re)
            self.created_re = r.sub(self._lookup, date_format)

        def _lookup(self, match):
            """Regex lookup..."""
            return RegexBuilder.lookup_table[match.group()]

2009/2/3 andrew cooke and...@acooke.org
>> ValueError: unconverted data remains: this is the remainder of the log
>> line that I do not care about
>
> you could catch the ValueError and split at the ':' in the .args
> attribute to find the extra data. you could then find the extra data in
> the original string, use the index to remove it, and re-parse the time.
> ugly, but should work.
>
> andrew

--
Simon Mullis
si...@mullis.co.uk
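For anyone wanting to see the whole round trip, here is a minimal Python 3 sketch of the same idea: translate the strftime format into a regex, pull the timestamp out of the log line, then hand only that substring to strptime. The table is deliberately tiny, and the names `DIRECTIVE_PATTERNS`, `timestamp_regex` and `extract_timestamp` are made up for this illustration; they are not from the post above.

```python
import re
from datetime import datetime

# Hypothetical, deliberately small table: only the directives used below.
DIRECTIVE_PATTERNS = {
    "%Y": r"\d{4}",
    "%m": r"1[0-2]|0[1-9]|[1-9]",
    "%d": r"3[01]|[12]\d|0[1-9]|[1-9]",
    "%H": r"2[0-3]|[01]\d|\d",
    "%M": r"[0-5]\d|\d",
    "%S": r"6[01]|[0-5]\d|\d",
}


def timestamp_regex(date_format):
    """Translate a strftime format into a regex matching such a timestamp."""
    pieces = []
    # re.split with a capture group keeps the directives in the result.
    for part in re.split(r"(%\w)", date_format):
        if re.fullmatch(r"%\w", part):
            if part not in DIRECTIVE_PATTERNS:
                raise ValueError("unsupported directive: " + part)
            pieces.append("(?:" + DIRECTIVE_PATTERNS[part] + ")")
        else:
            # Literal text between directives (dashes, colons, spaces).
            pieces.append(re.escape(part))
    return re.compile("".join(pieces))


def extract_timestamp(line, date_format):
    """Return the first timestamp in line as a datetime, or None."""
    match = timestamp_regex(date_format).search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(), date_format)


stamp = extract_timestamp(
    "2008-07-23 12:18:28 this is the remainder of the log line",
    "%Y-%m-%d %H:%M:%S",
)
```

The user still supplies only the strftime string; the regex is an internal detail.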
Is Eval *always* Evil?
Hi All,

I'm writing a Django App to webify a Python script I wrote that parses
some big XML docs and summarizes certain data contained within. This
app will be used by a closed group of people, all with their own login
credentials to the system (backed off to the corp SSO system). I've
already got Django, Celery and Rabbitmq working to handle uploads and
passing the offline parsing / processing task to the backend.

One way to do the XML processing is to use a number of XPath queries,
and a key point is that the required output (and therefore the list of
queries) is not yet complete (and will always be a little fluid).
Ultimately, I don't want to be stuck with maintaining this list of
XPath queries and want to allow my colleagues to add / change / remove
them as they see fit. I didn't know my xpath from my elbow when I first
started looking into this so I think they'll be able to learn it pretty
easily.

The only easy way to accomplish this I can think of is to create a new
table with { name : xpath command } key, value pairs and allow it to be
edited via the browser (using Django's Admin scaffolding). I plan on
making sure that when amending the list of XPath queries, each entry is
tested against a known good XML file so the user can see sample results
before they're allowed to save it to the database.

I'm aware that this will probably slow things down, as there'll be lots
of serial XPath queries, but even if the job takes 5 minutes longer it
will still save man-days of manual work. I can live with this as long
as it's flexible.

# In the meantime - and as a proof of concept - I'm using a dict instead.

    xpathlib = {
        "houses"          : r'[ y.tag for y in x.xpath("//houses/*") ]',
        "names"           : r'[ y.text for y in x.xpath("//houses/name") ]',
        "footwear_type"   : r'[ y.tag for y in x.xpath("//cupboard/bottom_shelf/*") ]',
        "shoes"           : r'[ y.text for y in x.xpath("//cupboard/bottom_shelf/shoes/*") ]',
        "interface_types" : r'[ y.text[:2] for y in x.xpath("//interface/name") ]',
    }

# (I left a real one at the bottom so you can see I might want to do
# some string manipulation on the results within the list comprehension).

# Then in my backend task processing scripts I would have something like:

    import re

    def apply_xpath(xpath_command, xml_frag):
        x = xml_frag
        # Only evaluate commands that look like the comprehensions above.
        if re.findall(r'\[\sy\.\w+(?:\[.+\])?\s+for y in x\.xpath\("/{1,2}[\w+|\/|\*]+"\)\s\]', xpath_command):
            result = eval(xpath_command, {"__builtins__": []}, {"x": x})
            if type(result).__name__ == "list":
                return result

Yes, the regex is pretty unfriendly, but it works for all of the cases
above and fails for other non-valid examples I've tried.

# Let's try it in ipython:

    from lxml import etree
    f = open('/tmp/sample.xml')
    xml_frag = etree.fromstring(f.read())
    print apply_xpath(xpathlib["footwear_type"], xml_frag)

    ['ballet shoes', 'loafers', 'ballet shoes', 'hiking boots', 'training shoes', 'hiking boots', 'socks']

Is this approach ok?

I've been reading about this and know about the following example that
bypasses the __builtins__ restriction on eval (edited so it won't work).

    __import__('shutill').rmtreee('/')    ===DO NOT TRY AND RUN THIS

As the potential audience for this Web App is both known and controlled,
am I being too cautious? If eval is not the way forward, are there any
suggestions for another way to do this?

Thanks
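One possible eval-free variant, as a sketch only: store each query as data (path, which field to read, optional prefix length) rather than as a Python expression, so nothing user-editable is ever executed. It uses the standard library's `xml.etree.ElementTree`, whose XPath support is a limited subset of what lxml offers (hence the `.//` prefix), and is written for Python 3. The tuple layout, `ALLOWED_FIELDS` and the sample XML are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Only these element attributes may be read by a stored query.
ALLOWED_FIELDS = {"tag", "text"}

# name -> (path, field to read, optional number of leading characters)
XPATH_LIBRARY = {
    "houses": (".//houses/*", "tag", None),
    "names": (".//houses/name", "text", None),
    "interface_types": (".//interface/name", "text", 2),
}


def apply_xpath(query, tree):
    """Run one stored query against an element and return a list of strings."""
    path, field, max_chars = query
    if field not in ALLOWED_FIELDS:
        raise ValueError("field not allowed: %r" % (field,))
    values = [getattr(node, field) or "" for node in tree.findall(path)]
    # value[:None] is the whole string, so None means "no truncation".
    return [value[:max_chars] for value in values]


SAMPLE = """
<root>
  <houses><name>Rose Cottage</name><name>The Elms</name></houses>
  <interface><name>ge-0/0/1</name></interface>
  <interface><name>xe-1/0/0</name></interface>
</root>
"""

root = ET.fromstring(SAMPLE)
results = {name: apply_xpath(q, root) for name, q in XPATH_LIBRARY.items()}
```

The trade-off is flexibility: anything beyond "read a field, maybe truncate it" needs a new, explicitly coded option instead of free-form Python.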
Calling a method with a variable name
Hi All,

I'm collating a bunch of my utility scripts into one, creating a single
script to which I will symbolic link multiple times. This way I only
have to write code for error checking, output-formatting etc a single
time.

So, I have

    ~/bin/foo -> ~/Code/python/mother_of_all_utility_scripts.py
    ~/bin/bar -> ~/Code/python/mother_of_all_utility_scripts.py
    ~/bin/baz -> ~/Code/python/mother_of_all_utility_scripts.py

I would like "bar" to run the bar method (and so on).

-------------
    class Statistic(object):
        def __init__(self):
            pass

        def foo(self):
            return "foo!"

        def bar(self):
            return "bar!"

        # ... and so on...

    def main():
        stats_obj = Statistic()
        name = re.sub("[^A-Za-z]", "", sys.argv[0])
        method = getattr(stats_obj, name, None)
        if callable(method):
            stats_obj.name()    # HERE
        else:
            print "nope, not sure what you're after"
-------------

However, as I'm sure you've all noticed already, there is no method
called "name". I would really prefer to get a nudge in the right
direction before I start evaling variables and so on.

Does my approach make sense? If not, please give me a hint...

Thanks

SM
Re: Calling a method with a variable name
May I be the first to say "Doh!"

Problem solved, many thanks to both Carsten and Diez!

SM

2009/11/4 Carsten Haese carsten.ha...@gmail.com:
> Simon Mullis wrote:
>>     def main():
>>         stats_obj = Statistic()
>>         name = re.sub("[^A-Za-z]", "", sys.argv[0])
>>         method = getattr(stats_obj, name, None)
>>         if callable(method):
>>             stats_obj.name()    # HERE
>>         else:
>>             print "nope, not sure what you're after"
>> -------------
>> However, as I'm sure you've all noticed already, there is no method
>> called "name". I would really prefer to get a nudge in the right
>> direction before I start evaling variables and so on.
>
> At the point you marked "HERE", you've already found the method, and
> you have determined that it is callable. You just need to call it.
> Like this: method().
>
> HTH,
>
> --
> Carsten Haese
> http://informixdb.sourceforge.net

--
Simon Mullis
si...@mullis.co.uk
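Putting Carsten's fix together, a minimal Python 3 sketch of the dispatch might look like this. Using `os.path.basename` instead of the regex, and refusing names that start with an underscore, are additions made for this example rather than anything from the thread.

```python
import os


class Statistic:
    def foo(self):
        return "foo!"

    def bar(self):
        return "bar!"


def dispatch(program_path):
    """Call the Statistic method named after the invoked script, if any."""
    name = os.path.basename(program_path)
    # Keep private and special attributes out of reach of the command line.
    if name.startswith("_"):
        return None
    method = getattr(Statistic(), name, None)
    if callable(method):
        # The bound method is already in hand, so just call it.
        return method()
    return None


outcome = dispatch("/home/sm/bin/bar")
```

In a real script `dispatch(sys.argv[0])` would be the entry point, with each symlink name selecting its method.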
Re: parse date/time from a log entry with only strftime (and no regexen)
That, my friend, is ingenious...!

Thank you

SM

2009/2/3 andrew cooke and...@acooke.org
>> ValueError: unconverted data remains: this is the remainder of the log
>> line that I do not care about
>
> you could catch the ValueError and split at the ':' in the .args
> attribute to find the extra data. you could then find the extra data in
> the original string, use the index to remove it, and re-parse the time.
> ugly, but should work.
>
> andrew

--
Simon Mullis
si...@mullis.co.uk
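Andrew's suggestion can be sketched roughly as follows, in Python 3. It leans on the exact wording of the exception message ("unconverted data remains: ..."), which is an implementation detail of CPython's strptime rather than a documented interface, so treat it as a trick that may need adjusting. The function name is made up for this example.

```python
from datetime import datetime

# Wording used by CPython's strptime; an implementation detail, not an API.
PREFIX = "unconverted data remains: "


def parse_leading_timestamp(line, date_format):
    """Parse the timestamp at the start of line, ignoring whatever follows."""
    try:
        return datetime.strptime(line, date_format)
    except ValueError as exc:
        message = str(exc)
        if not message.startswith(PREFIX):
            raise  # some other parsing problem; let the caller see it
        extra = message[len(PREFIX):]
        # The extra data is the tail of the line, so cut that many characters.
        return datetime.strptime(line[:len(line) - len(extra)], date_format)


parsed = parse_leading_timestamp(
    "2008-07-23 12:18:28 this is the remainder of the log line",
    "%Y-%m-%d %H:%M:%S",
)
```

Slicing by length avoids having to search for the extra data in the original string.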
Should open(sys.stdin) and open(file, 'r') be equivalent?
Hi All

I've written a simple python script that accepts both stdin and a glob
(or at least, that is the plan). Unfortunately, the glob part seems to
hang when it's looped through to the end of the filehandle. And I have
no idea why... ;-)

sys.stdin and a normal file opened with "open" seem to both be
identical filehandles.

    >>> import sys
    >>> foo = sys.stdin
    >>> type(foo)
    <type 'file'>
    >>> repr(foo)
    "<open file '<stdin>', mode 'r' at 0x16020>"
    >>> bar = open("test_file", 'r')
    >>> type(bar)
    <type 'file'>
    >>> repr(bar)
    "<open file 'test_file', mode 'r' at 0x3936e0>"

The stdin version is fine. I want to re-use the code for scan_data (and
all of the other processing methods) so I'd like to be able to iterate
over one line at a time, independently of the source (i.e. either file
or stdin).

Code that illustrates the issue follows:

    # cat test_fh.py
    #!/usr/bin/env python

    import glob, os, sys

    class TestParse(object):
        def __init__(self):
            if (options.stdin):
                self.scan_data(sys.stdin)
            if (options.glob):
                self.files = glob.glob(options.glob)
                for f in self.files:
                    fh = open(f, 'r')
                    self.scan_data(fh)
                    fh.close()

        def scan_data(self,fileobject):
            i = int()
            for line in fileobject:
                print i
                i += 1
                # do stuff with the line...
                pass
            print "finished file"

    def main():
        T = TestParse()

    if __name__ == "__main__":
        from optparse import OptionParser
        p = OptionParser(__doc__, version="testing 1 2 3")
        p.add_option("--glob", dest="glob")
        p.add_option("--stdin", dest="stdin", action="store_true", default="False")
        (options, args) = p.parse_args()
        main()
    #EOF

Running this against stdin outputs a count of lines and then exits fine
(exit code 0).

    # cat test_file | ./test_fh.py --stdin
    ...output...
    # echo $?
    0

Running against --glob "test_file" just hangs.

    # ./test_fh.py --glob "test_file"
    (wait 20 seconds or so...)
    ^CTraceback (most recent call last):
      File "./test_fh.py", line 35, in <module>
        main()
      File "./test_fh.py", line 26, in main
        T = TestParse()
      File "./test_fh.py", line 8, in __init__
        self.scan_data(sys.stdin)
      File "./test_fh.py", line 18, in scan_data
        for line in fileobject:
    KeyboardInterrupt
    # echo $?
    1

So, what am I doing wrong?

Thanks in advance

SM

--
Simon Mullis
si...@mullis.co.uk
Re: Should open(sys.stdin) and open(file, 'r') be equivalent?
Hi Chris

2009/2/5 Chris Rebert c...@rebertia.com
> I'd add some print()s in the above loop (and also the 'for f in files'
> loop) to make sure the part of the code you didn't want to share ("do
> stuff with the line") works correctly, and that nothing is improperly
> looping in some unexpected way.

The point is that even with the very, very simple script I posted above
the behavior of open(sys.stdin) and open(filename, 'r') is different.

The object foo (where foo = sys.stdin) allows me to iterate and then
hands back control after the loop is finished. The object bar (where
bar = open(filename, 'r')) does not. Both foo and bar have the same
type, methods, repr etc.

> Also, there are several series of lines with invalid indentation;
> could be an email artifact or could be the cause of your problem. If
> the print()s don't yield any useful insight, repost the code again
> with absolutely correct indentation.

(code posted again to fix indents)

    #!/usr/bin/env python

    import glob, os, sys

    class TestParse(object):
        def __init__(self):
            if options.stdin:
                self.scan_data(sys.stdin)
            if options.glob:
                self.files = glob.glob(options.glob)
                for f in self.files:
                    fh = open(f, 'r')
                    self.scan_data(fh)
                    fh.close()

        def scan_data(self,fileobject):
            i = 0
            for line in fileobject:
                print i
                i += 1
                # do stuff with the line...
                pass
            print "finished file"

    def main():
        T = TestParse()

    if __name__ == "__main__":
        from optparse import OptionParser
        p = OptionParser(__doc__, version="testing 1 2 3")
        p.add_option("--glob", dest="glob")
        p.add_option("--stdin", dest="stdin", action="store_true", default="False")
        (options, args) = p.parse_args()
        main()

(The code I'm actually using is much more complex than this. I tried to
create the most simple example of it _not_ working as expected...)

> Finally, some stylistic points:
> - don't do 'if (foo):' use the less noisy 'if foo:' instead
> - don't do 'i = int()' use the more obvious 'i = 0' instead

ok.

My question again, to be more explicit: Should the objects created by
sys.stdin and open(filename, 'r') have the same behavior when iterated
over? They both have __iter__ methods.

Thanks in advance for any suggestions

SM
[SOLVED] Re: Should open(sys.stdin) and open(file, 'r') be equivalent?
Forget it all... I was being very very daft!

The default = 'False' in the options for stdin was not being evaluated
as I thought: it is a non-empty string, so it counts as true. The script
was therefore waiting for stdin even when the glob switch was used. No
stdin equals the script seeming to hang.

Ah well.

SM
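To make the cause concrete, here is a small sketch (Python 3, still using optparse as in the thread) comparing a string default with a real boolean. The `build_parser` helper is invented for this example.

```python
from optparse import OptionParser


def build_parser(stdin_default):
    """Build the two-option parser from the thread with a chosen default."""
    parser = OptionParser()
    parser.add_option("--glob", dest="glob")
    parser.add_option("--stdin", dest="stdin", action="store_true",
                      default=stdin_default)
    return parser


argv = ["--glob", "test_file"]

# The string "False" is non-empty, so it is truthy.
buggy_options, _ = build_parser("False").parse_args(argv)
buggy_reads_stdin = bool(buggy_options.stdin)

# The boolean False behaves as intended.
fixed_options, _ = build_parser(False).parse_args(argv)
fixed_reads_stdin = bool(fixed_options.stdin)
```

With the string default, `if options.stdin:` is taken even though `--stdin` was never given, which is exactly the "hang" waiting on the terminal.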
Re: Should open(sys.stdin) and open(file, 'r') be equivalent?
Last try at getting the indenting to appear correctly..

    #!/usr/bin/env python

    import glob, os, sys

    class TestParse(object):
        def __init__(self):
            if options.stdin:
                self.scan_data(sys.stdin)
            if options.glob:
                self.files = glob.glob(options.glob)
                for f in self.files:
                    fh = open(f, 'r')
                    self.scan_data(fh)
                    fh.close()

        def scan_data(self,fileobject):
            i = 0
            for line in fileobject:
                print i
                i += 1
                # do stuff with the line...
                pass
            print "finished file"

    def main():
        T = TestParse()

    if __name__ == "__main__":
        from optparse import OptionParser
        p = OptionParser(__doc__, version="testing 1 2 3")
        p.add_option("--glob", dest="glob", help="use this glob")
        p.add_option("--stdin", dest="stdin", action="store_true", default="False", help="use stdin")
        (options, args) = p.parse_args()
        main()
parse date/time from a log entry with only strftime (and no regexen)
Hi All

I'm writing a script to help with analyzing log file timestamps and
have a very specific question on which I'm momentarily stumped.

I'd like the script to support multiple log file types, so it allows a
strftime format to be passed in as a cli switch (default is
"%Y-%m-%d %H:%M:%S").

When it comes to actually doing the analysis I want to store or discard
the log entry based on certain criteria. In fact, I only need the log
line timestamp.

I'd like to do this in one step and therefore not require the user to
supply a regex as well as a strftime format:

    >>> import datetime
    >>> p = datetime.datetime.strptime("2008-07-23 12:18:28 this is the remainder of the log line that I do not care about", "%Y-%m-%d %H:%M:%S")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/local/lib/python2.5/_strptime.py", line 333, in strptime
        data_string[found.end():])
    ValueError: unconverted data remains:  this is the remainder of the log line that I do not care about
    >>> repr(p)
    NameError: name 'p' is not defined

Clearly the strptime method above can grab the right bits of data but
the object "p" is not created due to the error.

So, my options are:

1 - Only support one log format.
2 - Support any log format but require a regex as well as a strftime
    format so I can extract the timestamp portion.
3 - Create another class/method with a lookup table for the strftime
    options that automagically creates the correct regex to extract the
    right string from the log entry... (or is this overly complicated)
4 - Override the method above (strptime) to allow what I'm trying to do.
5 - Some other very clever and elegant solution that I would not ever
    manage to think of myself.

Am I making any sense whatsoever?

Thanks

SM

(P.S. The reason I don't want the end user to supply a regex for the
timestamp in the log entry is that we're already using 2 other regexes
as cli switches to select the file glob and log line to match.)
dict generator question
Hi,

Let's say I have an arbitrary list of minor software versions of an
imaginary software product:

    l = [ "1.1.1.1", "1.2.2.2", "1.2.2.3", "1.3.1.2", "1.3.4.5"]

I'd like to create a dict with major_version : count.

(So, in this case:

    dict_of_counts = { "1.1" : 1,
                       "1.2" : 2,
                       "1.3" : 2 }
)

Something like:

    dict_of_counts = dict([(v[0:3], "count") for v in l])

I can't seem to figure out how to get "count", as I cannot do x += 1 or
x++ as x may or may not yet exist, and I haven't found a way to create
default values.

I'm most probably not thinking pythonically enough... (I know I could
do this pretty easily with a couple more lines, but I'd like to
understand if there's a way to use a dict generator for this).

Thanks in advance

SM

--
Simon Mullis
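For later readers, one way to get the counts is `collections.Counter` fed by a generator expression. Note this is a sketch in current Python; `Counter` was added to the standard library after this message was posted, so it is not one of the answers given on the list at the time. The helper `major_version` is made up here, and splitting on dots is used instead of `v[0:3]` so that a version like "10.1.0.0" would still work.

```python
from collections import Counter

versions = ["1.1.1.1", "1.2.2.2", "1.2.2.3", "1.3.1.2", "1.3.4.5"]


def major_version(version):
    """Return the first two dotted components, e.g. '1.2.2.3' -> '1.2'."""
    return ".".join(version.split(".")[:2])


# Counter tallies each key as the generator yields it; no default needed.
dict_of_counts = dict(Counter(major_version(v) for v in versions))
```

The counting still happens inside `Counter`; a plain dict comprehension cannot accumulate by itself because each key is assigned only once.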
Re: dict generator question
Haha!

Thanks for all of the suggestions... (I love this list!)

SM

2008/9/18 [EMAIL PROTECTED]:
> On Sep 18, 10:54 am, Simon Mullis [EMAIL PROTECTED] wrote:
>> Hi,
>>
>> Let's say I have an arbitrary list of minor software versions of an
>> imaginary software product:
>>
>>     l = [ "1.1.1.1", "1.2.2.2", "1.2.2.3", "1.3.1.2", "1.3.4.5"]
>>
>> I'd like to create a dict with major_version : count.
>>
>> (So, in this case:
>>
>>     dict_of_counts = { "1.1" : 1,
>>                        "1.2" : 2,
>>                        "1.3" : 2 }
>> )
>>
>> Something like:
>>
>>     dict_of_counts = dict([(v[0:3], "count") for v in l])
>>
>> I can't seem to figure out how to get "count", as I cannot do x += 1
>> or x++ as x may or may not yet exist, and I haven't found a way to
>> create default values.
>>
>> I'm most probably not thinking pythonically enough... (I know I could
>> do this pretty easily with a couple more lines, but I'd like to
>> understand if there's a way to use a dict generator for this).
>>
>> Thanks in advance
>>
>> SM
>>
>> --
>> Simon Mullis
>
> Considering 3 identical "simultpost" solutions I'd say:
> "one obvious way to do it" FTW :-)

--
Simon Mullis
[EMAIL PROTECTED]
Equivalents of Ruby's ! methods?
Hi All,

Quick question, I can't seem to find the answer online (well, at the
moment I think the answer is a simple "no" but I would like to confirm).

Consider the following hash:

    h = { "1" : "a\r", "2" : "b\n" }

In order to strip the dict values in Python I (think) I can only do
something like:

    for k,v in h.items:
        h[k] = v.strip()

While in Ruby - for the equivalent dict/hash - I have the option of an
in-place method:

    h.each_value { |v| v.strip! }

Are there Python equivalents to the "!" methods in Ruby?

The reason I ask is that I have some fairly complex data-structures and
this would make my code a lot cleaner... If this is not an accepted and
pythonic way of doing things then please let me know... and I'll stop!

Thanks in advance

SM

--
Simon Mullis
[EMAIL PROTECTED]
Re: Equivalents of Ruby's ! methods?
Thanks to all for the quick responses.

2008/8/25 Ben Finney [EMAIL PROTECTED]:
> This is a 'dict' instance in Python. A 'hash' is a different concept.
>
>> In order to strip the dict values in Python I (think) I can only do
>> something like:
>>
>>     for k,v in h.items:
>>         h[k] = v.strip()
>
> The above won't do what you describe, since 'h.items' evaluates to
> that function object, which is not iterable. If you want the return
> value of the function, you must call the function:
>
>     for (k, v) in h.items():
>         h[k] = v.strip()

Yes - absolutely, a typo on my part.

> This will create a new value from each existing value, and re-bind
> each key to the new value for that key. Clear, straightforward, and
> Pythonic.
>
> You can also create a new dict from a generator, and re-bind the name
> to that new dict:
>
>     h = dict(
>         (k, v.strip())
>         for (k, v) in h.items())
>
> Also quite Pythonic, but rather less clear if one hasn't yet
> understood generator expressions. Very useful to have when needed,
> though.

Thanks for this - I'll have a look!

>> While in Ruby - for the equivalent dict/hash - I have the option of
>> an in-place method:
>>
>>     h.each_value { |v| v.strip! }
>>
>> Are there Python equivalents to the "!" methods in Ruby?
>
> I'm not overly familiar with Ruby, but the feature you describe above
> seems to rely on mutating the string value in-place. Is that right?

There are a number of methods that can be used to change things
in-place such as:

    String.new().grep_methods("!")
    => ["upcase!", "gsub!", "downcase!", "chop!", "capitalize!", "tr!",
        "chomp!", "swapcase!", "tr_s!", "succ!", "strip!", "delete!",
        "lstrip!", "squeeze!", "next!", "rstrip!", "slice!", "reverse!",
        "sub!"]

Or,

    Array.new().grep_methods("!")
    => ["map!", "shuffle!", "uniq!", "reject!", "compact!", "slice!",
        "sort!", "flatten!", "collect!", "reverse!"]

They normally have a non-"!" partner which is used only for a return
value and does not affect the original object.

But! This isn't a Ruby group so I'll stop now... ;-)

> Strings in Python are immutable (among other reasons, this allows them
> to meet the requirement of dict keys to be immutable, which in turn
> allows dict implementations to be very fast), so you can only get a
> new value for a string by creating a new string instance and re-bind
> the reference to that new value.

Normally I would use a Ruby symbol as a hash key:

    h = { :key1 => "val1", :key2 => "val2" }

From the stdlib docs:

    "The same Symbol object will be created for a given name or string
    for the duration of a program's execution, regardless of the
    context or meaning of that name. Thus if Fred is a constant in one
    context, a method in another, and a class in a third, the Symbol
    :Fred will be the same object in all three contexts."

There is no equivalent in Python (as far as I know, and I'm only in my
second week of Python so I'm more than likely incorrect!).

If you're interested:
http://www.randomhacks.net/articles/2007/01/20/13-ways-of-looking-at-a-ruby-symbol

Thanks again for the pointers.

--
Simon Mullis
[EMAIL PROTECTED]
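A short Python 3 sketch of the distinction Ben describes. The strings themselves cannot be changed, but the dict that holds them can be updated in place, and mutable containers such as lists do come in in-place/copying pairs (`list.sort()` versus `sorted()`), which is probably the closest Python gets to Ruby's "!" convention. The variable names are only for this illustration.

```python
h = {"1": "a\r", "2": "b\n"}
alias = h  # a second reference to the same dict object

# Re-binding the values of existing keys mutates the dict in place;
# each stripped string is a new object, the dict is not.
for key in h:
    h[key] = h[key].strip()

# Building a new dict instead leaves the original untouched.
upper = {key: value.upper() for key, value in h.items()}

# Lists show the in-place versus copying pairing directly.
numbers = [3, 1, 2]
ordered_copy = sorted(numbers)   # returns a new list
snapshot = list(numbers)         # still in the original order
numbers.sort()                   # sorts in place and returns None
```

Because `alias` and `h` are the same object, the stripped values are visible through both names.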
Re: From Ruby to Python?
In case anyone else has the same question:

This has been very useful: http://www.poromenos.org/tutorials/python

Short, concise. Enough to get me going.

SM

2008/8/13 Simon Mullis [EMAIL PROTECTED]
> Hi All,
>
> I just finally found 30 minutes to try and write some code in Python
> and realized after a couple of minor syntactic false starts that I'd
> finished my initial attempt without needing to refer to any
> documentation... And after the first few minutes I stopped noticing
> the whitespace thing that had scared me off previously. (Admittedly
> it's a very basic script using telnetlib to log into a bunch of
> network devices via a few console servers and put the results of some
> remote commands into a data-structure).
>
> So, after this initially promising start:
>
> Are there any good cheatsheets / guides for a Ruby programmer to learn
> Python?
>
> I searched this list and couldn't find anything, and wasn't successful
> with google either...
>
> Many thanks in advance,
>
> SM

--
Simon Mullis
[EMAIL PROTECTED]
From Ruby to Python?
Hi All,

I just finally found 30 minutes to try and write some code in Python
and realized after a couple of minor syntactic false starts that I'd
finished my initial attempt without needing to refer to any
documentation... And after the first few minutes I stopped noticing the
whitespace thing that had scared me off previously. (Admittedly it's a
very basic script using telnetlib to log into a bunch of network
devices via a few console servers and put the results of some remote
commands into a data-structure).

So, after this initially promising start:

Are there any good cheatsheets / guides for a Ruby programmer to learn
Python?

I searched this list and couldn't find anything, and wasn't successful
with google either...

Many thanks in advance,

SM
Re: Regarding Telnet library in python
Hi there,

This works (but bear in mind I'm about only 30 minutes into my Python
adventure...):

------
    import telnetlib

    def connect(host):
        tn = telnetlib.Telnet(host)
        return tn

    def login(session,user,password):
        session.write("\n")
        session.read_until("Login: ")
        session.write(user + "\n")
        session.read_until("Password: ")
        session.write(password + "\n")
        session.read_until("#")

    def apply_command(session,command):
        # The newline is appended here, so callers pass the bare command.
        session.write(command + "\n")
        response = str(session.read_until("#"))
        return response

    def cleanup(session):
        session.write("exit\n")
        session.close()

    tn = connect("1.1.1.1")
    login(tn,"user","password")
    directory = apply_command(tn,"ls -l")
    kernel_release = apply_command(tn,"uname -r")
    cleanup(tn)
------

You'd probably need to apply a regex to the output of the apply_command
method for it to make sense, and I am confident there is a much, much
better way to do this!

Cheers

SM
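As a rough idea of the post-processing mentioned above, here is a Python 3 sketch that tidies one response. It assumes, purely for illustration, that the device echoes the command on the first line and ends with a prompt line finishing in "#"; real devices vary. `clean_response` and the sample text are invented for this example, and no telnet connection is involved.

```python
import re


def clean_response(raw, command, prompt="#"):
    """Drop the echoed command and the trailing prompt line from raw output."""
    lines = raw.replace("\r\n", "\n").split("\n")
    # Assumed: the first line is the device echoing the command back.
    if lines and lines[0].strip() == command.strip():
        lines = lines[1:]
    # Assumed: the last line is the prompt that read_until() stopped on.
    if lines and lines[-1].rstrip().endswith(prompt):
        lines = lines[:-1]
    return "\n".join(lines).strip()


raw_output = "uname -r\r\n2.6.18-92.el5\r\nrouter#"
cleaned = clean_response(raw_output, "uname -r")

# A regex can then pick out the part of interest, here a dotted version.
release_match = re.search(r"\d+\.\d+\.\d+\S*", cleaned)
kernel_release = release_match.group() if release_match else None
```

Plain string handling does most of the work here; the regex is only needed for pulling a specific field out of the cleaned text.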