Re: idea for testing tools
Bruno Desthuilliers [EMAIL PROTECTED] writes:

> http://codespeak.net/py/current/doc/test.html#assert-with-the-assert-statement

Ok, I hadn't come across this before. It didn't work for me though; even the simple case

    #!/usr/bin/python
    a = 1
    b = 2
    def test_some():
        assert a == b

didn't reveal the values for a and b, though some more complex cases showed something.

--
Cheers, Jens
--
http://mail.python.org/mailman/listinfo/python-list
idea for testing tools
Hello,

I find it annoying that one has to write self.assertEqual(x, y) rather than just assert x == y when writing tests. This is a nuisance in all the programming languages I know of (which are not too many). In Python however, there appears to be a better alternative. The piece of code below gives the benefit of printing the violating values in case of a testing failure as well as the concise syntax.

The snippet

    def test_test():
        def foo(x):
            return x + 3
        x = 1
        y = 2
        assert foo(x) < y + x

    try:
        test_test()
    except AssertionError:
        analyse()

would give:

    Traceback (most recent call last):
      File "./ast-post.py", line 138, in ?
        test_test()
      File "./ast-post.py", line 134, in test_test
        assert foo(x) < y + x
    AssertionError

    failure analysis:

                 foo: <function foo at 0xb7c9148c>
                   x: 1
               ( x ): 1
           foo ( x ): 4
                   y: 2
                   x: 1
               y + x: 3
     foo ( x ) < y + x: False

The code that makes this possible relies only on code present in the standard library (being traceback, inspect and the parsing stuff) while being as short as:

    #!/usr/bin/python
    import sys, types
    import traceback, inspect
    import parser, symbol, token
    import StringIO

    def get_inner_frame(tb):
        while tb.tb_next:
            tb = tb.tb_next
        return tb.tb_frame

    def visit_ast(visitor, ast):
        sym = ast[0]
        vals = ast[1:]
        assert len(vals) > 0
        is_simple = len(vals) == 1
        is_leaf = is_simple and type(vals[0]) != types.TupleType
        if not is_leaf:
            visitor.enter()
            for val in vals:
                visit_ast(visitor, val)
            visitor.leave()
        if is_leaf:
            visitor.leaf(sym, vals[0])
        elif is_simple:
            visitor.simple(sym, vals[0])
        else:
            visitor.compound(sym, vals)

    class ast_visitor:
        def enter(self): pass
        def leave(self): pass
        def leaf(self, sym, val): pass
        def simple(self, sym, val): pass
        def compound(self, sym, vals): pass

    class simple_printer(ast_visitor):
        def __init__(self, stream):
            self.stream = stream
        def leaf(self, sym, val):
            print >> self.stream, val,

    def str_from_ast(ast):
        s = StringIO.StringIO()
        visit_ast(simple_printer(s), ast)
        return s.getvalue()

    class assertion_collector(ast_visitor):
        def __init__(self, statements):
            self.statements = statements
        def compound(self, sym, vals):
            if sym == symbol.assert_stmt:
                # two nodes: the assert name and the expression
                self.statements.append(vals[1])

    class pretty_evaluate(ast_visitor):
        def __init__(self, globals_, locals_):
            self.globals = globals_
            self.locals = locals_
        def _expr(self, expression):
            code = compile(expression, 'internal', 'eval')
            try:
                result = eval(code, self.globals, self.locals)
            except Exception, e:
                result = e
            print '%50s: %s' % (expression, str(result))
        def compound(self, sym, vals):
            ast = [ sym ]
            ast.extend(vals)
            expression = str_from_ast(ast)
            self._expr(expression)
        def leaf(self, sym, val):
            if sym == token.NAME:
                self._expr(val)

    def analyse():
        type_, exc, tb = sys.exc_info()
        frame = get_inner_frame(tb)
        try:
            filename, line, fun, context, index = (
                inspect.getframeinfo(frame, 1) )
            ast = parser.suite(context[0].lstrip()).totuple()
            assert_statements = [ ]
            visit_ast(assertion_collector(assert_statements), ast)
            traceback.print_exc()
            print "\nfailure analysis:\n"
            for statement in assert_statements:
                visit_ast(
                    pretty_evaluate(frame.f_globals, frame.f_locals), statement)
        finally:
            del frame

--
Cheers, Jens
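For readers on a current interpreter, the same idea can be sketched with the ast module in place of parser, symbol and token. This is only an illustration and not the code from the post: it assumes Python 3.9 or later (for ast.unparse), the helper name explain_assert is made up here, and it takes the assert statement as a string rather than digging it out of the traceback frame. Note that, like the original, it re-evaluates every sub-expression, so functions with side effects get called again.

```python
import ast


def explain_assert(source, namespace):
    """Evaluate each sub-expression of the first assert found in `source`.

    Returns a list of (expression_text, value) pairs, innermost first.
    `explain_assert` is a hypothetical helper, not part of any library.
    """
    tree = ast.parse(source)
    test = next(n for n in ast.walk(tree) if isinstance(n, ast.Assert)).test
    report = []

    def visit(node):
        # Post-order walk: children are reported before their parent.
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.expr):
                visit(child)
        if isinstance(node, ast.Constant):
            return  # literals need no explanation
        text = ast.unparse(node)
        try:
            code = compile(ast.Expression(node), "<explain>", "eval")
            value = eval(code, namespace)
        except Exception as exc:
            value = exc  # report the exception instead of a value
        report.append((text, value))

    visit(test)
    return report


def foo(x):
    return x + 3


for text, value in explain_assert("assert foo(x) < y + x",
                                  {"foo": foo, "x": 1, "y": 2}):
    print("%20s: %s" % (text, value))
```

The last line printed is the whole condition with the value False, preceded by the values of foo(x) and y + x.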
beginner's refcount questions
Hello,

Python uses gc only where refcounts alone haven't yet done the job. Thus, the following code

    class Foo:
        def __del__(self):
            print "deled!"

    def foo():
        f = Foo()

    foo()
    print "done!"

prints

    deled!
    done!

and not the other way round. In C++, this is a central technique used for all sorts of tasks, whereas in garbage collected languages it's usually not available. Is there a reason not to rely on this in Python? For example, are there alternative Python implementations that behave differently? Or some other subtle problems?

And another minor question: Is there a way to query the use count of an object? This would be useful for debugging and testing.

--
Cheers, Jens
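As a sketch of the second question: CPython exposes the reference count through sys.getrefcount. The example below is written for Python 3 and records events in a list (a choice made here so the order can be inspected) instead of printing. The ordering it shows is a CPython behaviour; implementations without reference counting, such as Jython or PyPy, may run __del__ later, and the exact count reported can differ between CPython versions.

```python
import sys


class Foo:
    def __init__(self, log):
        self.log = log

    def __del__(self):
        self.log.append("deleted")


def foo(log):
    f = Foo(log)
    # The number includes temporary references made by the call itself,
    # so treat it as a debugging aid rather than an exact figure.
    log.append(sys.getrefcount(f))


events = []
foo(events)             # f goes out of scope when foo() returns
events.append("done")
print(events)
```

Under CPython, "deleted" is recorded before "done", matching the behaviour described in the post.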
Graphical introspection utilities?
Hello,

as it would so obviously be a good thing to have a graphical (or maybe curses-based) browser for the dynamic state of a Python program, it probably already exists. Can someone point me to something?

Cheers, Jens
Re: Another try at Python's selfishness
n.estner wrote:

> Yes, I 100% agree to that point! But the point is, the current situation is not newbie-friendly (I can tell, I am a newbie): I declare a method with 3 parameters but when I call it I only pass 2 parameters. That's confusing. If I declare a member variable, I write: self.x = ValueForX, why can't I be equally explicit for declaring member functions?

For someone new to OO in general it might just as well be a good thing, because it makes him realise that there really is a hidden parameter. After all, there is something to understand about self, and this discrepancy between the number of arguments and parameters draws newbies' attention to it.

Jens
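The hidden parameter can be made visible by calling the method through the class. In this small Python sketch the class Greeter and its method are made-up names for illustration only:

```python
class Greeter:
    def greet(self, name, punctuation):
        # Declared with three parameters: self, name, punctuation.
        return "Hello, " + name + punctuation


g = Greeter()
bound = g.greet("Jens", "!")               # two arguments; g is passed as self
explicit = Greeter.greet(g, "Jens", "!")   # all three arguments passed by hand
print(bound == explicit)
```

Both calls do the same thing; the second simply spells out the argument that the first supplies implicitly.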
Re: Python vs C for a mail server
Jay wrote:

> You can do both, but why? *Especially* in a language like C++, where thanks to pointers and casting, there really isn't any type safety anyway. How much time in your C/C++ code is spent casting and trying to trick the compiler into doing something that it thinks you shouldn't be doing?

Not much, frankly. I have no doubt that there is a lot of code that does, but more so in older C++ code.

> How does type safety tell you anything about the current usage of your program?

Quite a bit; of course, it doesn't cover everything and clearly testing the semantics is still needed, but a lot of what code is used in what way is described by the type system. And I admit that I'm used to using the type system as documentation of what's going on.

> And unit tests *might* not tell about the current usage, but integration tests certainly do.

Of course tests will cover a lot of what static typing does and more. I'm just claiming that they can support each other.

> I don't think I've ever seen anyone advocating calling a function like getattr(obj, "foo" + "bar")(). You can do some very powerful things with getattr, thanks to Python's dynamic nature, but I don't think anyone is recommending calling a function like that.

A lot of people got me wrong on that, please see Paul's postings. I really didn't mean that literally.

> And is that fear based simply on feeling, or on actual experience.

The former; that's why I started this branch of the thread, though I'm already regretting it.

> Because in all of my own industry experience, it's been MUCH easier to jump into someone else's Python code than someone else's C++ code (and at my last job, I had to do a lot of both). I find Python to be much more self-documenting, because there's not so much scaffolding all over the place obfuscating the true intention of the code.

That really depends on whose code you're looking at, as in any language. I can believe there are more C++ obfuscators out there than ones for Python. I usually have little trouble jumping into other people's C++ code, but the principal reason for that will be that the people I'm working with have a very clean and expressive coding style.

> You need to look at doctest: http://docs.python.org/lib/module-doctest.html With doctest, tests are EXACTLY where the code is. I've used doctest with incredibly successful results, in industry.

That's indeed a good point for Python.

> Reference counting by itself is not necessarily sufficient (because of circular references). That's why even Python, with its reference counting based system, has additional capabilities for finding circular references.

Whenever I encountered the need for circular references it was because an object that was in some sense owned by another needed a pointer back to its owner. That is solved easily with non-owning C-style pointers or weak pointers. If you have an example where this is not sufficient, I'd be *very* keen on hearing it (it may be easy, I don't know).

> I believe that Alex's official job title at Google is Uber Technical Lead. I'm sure I'll quickly be corrected if that's wrong, but trust me (and everyone else who's spent significant time reading c.l.p.) that Alex Martelli knows what he's talking about.

You seem to think that I was being ironic. That was certainly not my intention. I was just trying to minimise the amount of flames I'll be getting.

> A lot of your arguments are the very typical arguments that people levy against Python, when they're first around it. And that's fine, most people go through that. They are taught programming in C++ or Java, and that *that* is the way you're supposed to program. Then they see that Python does things in different ways, and automatically assume that Python is doing it wrong.

I'm sorry to hear this, and whilst I'm certainly lacking experience in Python, I'm not one of those people.

> All I can say is that if you spend time with Python (and more importantly, read quality, well-established Python code that's already out there), you'll see and understand the Python way of doing things.

I will.

Jens
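The owner/back-pointer pattern mentioned above has a direct counterpart in Python's weakref module. The following is a minimal sketch with made-up class names (Owner, Child); it assumes CPython, where dropping the last strong reference frees the owner immediately because the weak back-pointer does not form a cycle:

```python
import weakref


class Child:
    owner = None  # will hold a weak reference to the owner


class Owner:
    def __init__(self):
        self.children = []

    def add(self, child):
        child.owner = weakref.ref(self)  # non-owning back-pointer
        self.children.append(child)      # owning forward reference


owner = Owner()
child = Child()
owner.add(child)
alive_before = child.owner() is owner    # dereference by calling the weakref

del owner                                # drop the only strong reference
```

Calling child.owner() after the del returns None once the owner has been freed, so code using the back-pointer has to check for that case.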
Re: Python vs C for a mail server
Paul wrote:

> Or should I be looking for some other context here?

Three people were looking at the wrong one, thanks for putting this right. I really should not have made my point that briefly.

Jens
Re: Python vs C for a mail server
Nicolas wrote:

> http://nicolas.lehuen.com/
>
> My two latest problems with coding in C++ are due to the environments : libraries using different string types and the whole problem with the building system. I love the language, but I get a much better leverage through Python and Java due to the quality and ease of use of their built-in and third party libraries. I use C++ only for my core data structure (namely a tuned version of a ternary search tree which I use to build full text indices).

Those points are all valid. I'm using Python for that reason. Another point is that there are good Python bindings for the more important C libraries, but usually no decent C++ wrappers for them.

Jens
Re: Python vs C for a mail server
Alex wrote:

> http://www.artima.com/weblogs/viewpost.jsp?thread=4639
> http://www.mindview.net/WebLog/log-0025
>
> Since Robert Martin and Bruce Eckel (the authors of the two documents linked above) are both acknowledged gurus of statically typechecked languages such as C++, the convergence of their thinking and experience indicated by those documents is interesting.

Indeed, especially Eckel's article shed some light on testing as an alternative to static typing. I still can't quite understand why you can't do both. Clearly unit tests should be part of any software, not only Python software.

Test failures, however, don't tell you anything about the current usage of your program, just about the intended usage at the point where the test was written. Clearly you can't test _everything_? And clearly you can never be sure that all your colleagues did so as well? This is not only about type safety, but simply name safety.

What do you do when you want to know if a certain method or function, say foobar, is actually used from somewhere, in a language which allows (and even encourages) that it could be called by:

    getattr(obj, "foo" + "bar")()

? There is no systematic way to find this call.

In C++, just comment out the definition and the compiler will tell you.

I'm pretty sure I read a PEP about static type safety in Python at some point. It was even thinking about generics, I think.

> The "but without declaration it can't be self-documenting" issue is a red herring. Reading, e.g.:
>
>     int zappolop(int frep) { ...
>
> gives me no _useful_ "self-documenting" information

That's true. If the programmer wants to obfuscate his intention, I'm sure neither Python nor C++ can stop him. The question is how much more work it is to write comprehensible code in one language or the other. I'm a bit afraid about Python on that matter.

Python provides ways to ease literal documentation. But I'd really like to have a way of indicating what I'm talking about in a way that's ensured to be in sync with the code. Tests are not where the code is. I have difficulties remembering the type of a lot of symbols, and looking at testing code to refresh my memory of the usage is more difficult than just jumping to the definition (which the development environment is likely to be able to do).

[smart pointers and GC]

> As you say, smart pointers are not full-blown garbage collection, which is usually what you want, isn't it?

In my (admittedly short) life as a professional developer I have not yet come across a situation where reference counting was not sufficient to model the memory management.

As for the locking: apart from locking whatever you need to lock in your user code, I don't think any special locking is necessary for the memory management. Smart pointers can increment and decrement their ref counts atomically, as far as I know.

We use boost::shared_ptr in multi-threaded code just out of the box. We also use a reference counted string implementation in multi-threaded code.

> At Google, we collectively have rather a lot of experience in these issues, since we use three general-purpose languages: Python, Java, C++.

I have no doubt that Google know what they're doing, and if you're working there then you're likely to know what you're talking about.

I found it especially astonishing what you had to say against the use of smart pointers.

Jens
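The name-safety concern can be shown in a few lines. In this Python sketch (the class Obj is a made-up stand-in), the method name is assembled at run time, so a plain text search for "foobar" finds the definition but not the call:

```python
class Obj:
    def foobar(self):
        return "foobar called"


obj = Obj()
# The string "foobar" never appears in the calling code; it is built
# from two pieces when the line runs.
result = getattr(obj, "foo" + "bar")()
print(result)
```

Removing or renaming foobar would only surface as an AttributeError when this line is actually executed.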
Re: writing large files quickly
Donn wrote:

> > How the heck does that make a 400 MB file that fast? It literally takes a second or two while every other solution takes at least 2 - 5 minutes. Awesome... thanks for the tip!!!
>
> Because it isn't really writing the zeros. You can make these files all day long and not run out of disk space, because this kind of file doesn't take very many blocks. The blocks that were never written are virtual blocks, inasmuch as read() at that location will cause the filesystem to return a block of NULs.

Under which operating system/file system?

As far as I know this should be file system dependent, at least under Linux, as the calls to open and seek are served by the file system driver.

Jens
Re: writing large files quickly
Ivan wrote:

> Steven D'Aprano wrote:
>
> > Isn't this a file system specific solution though? Won't your file system need to have support for sparse files, or else it won't work?
>
> Yes, but AFAIK the only modern (meaning: in wide use today) file system that doesn't have this support is FAT/FAT32.

I don't think ext2fs does this either. At least the du and df commands tell something different.

Actually I'm not sure what this optimisation should give you anyway. The only circumstance under which files with only zeroes are meaningful is testing, and that's exactly when you don't want that optimisation.

On compressing file systems such as NTFS you will get this behaviour as a special case of compression, and compression makes more sense.

Jens
Re: writing large files quickly
Donn wrote:

> Because it isn't really writing the zeros. You can make these files all day long and not run out of disk space, because this kind of file doesn't take very many blocks. The blocks that were never written are virtual blocks, inasmuch as read() at that location will cause the filesystem to return a block of NULs.

Are you sure that's not just a case of asynchronous writing that can be done in a particularly efficient way? df quite clearly tells me that I'm running out of disk space on my ext2fs Linux when I dump it full of zeroes.

Jens
Re: Python vs C for a mail server
Nicolas wrote:

> If it's just a way to throw a programming challenge at your friend's face, then you should check whether it's okay to use Python rather than C/C++, otherwise he could be charged of cheating by using a more productive language :).

Though this comment of mine is likely to start a religious war, it might also bring up some useful points. I'm a bit reluctant to swallow the common view of Python being such a productive language, especially compared to C++. I actually value C++'s productivity much higher than its performance.

To stick to the mail server example, I'm pretty sure I'd do it in C++. I got very excited about Python when I first saw it, but I've encountered several problems that hindered productivity dramatically.

One thing is the lack of static types. The problem is not only that the compiler can't tell you very basic things you're doing wrong, but also that code isn't intrinsically documented:

    def send_mail(mail):
        ...

What can I do with mail? In C++, you look up what type it is (presumably by pressing M-x in Emacs on the word before it) and have a look at its type definition. In Python, it can be quite difficult to tell what you can do with it, because the information about what will be passed in can be several layers up.

Also there is not even symbol safety: self.not_defined will never raise a compile-time error.

And to address the memory management criticism about C++: unless you have cyclic structures (you probably won't have in a mail server), just use smart pointers and you don't have to be concerned more about it than you'd have to be in Python.

I agree on C++ libraries being weak on unicode strings though, or even generally weak in the libraries (you have the C libraries, but they're not very type safe or elegant to use).

I'm aware that C++ is a horrible monstrosity, an argument whose weight depends on the C++ experience of the OP's friend.

Please don't be offended, but if anyone could make a point of how Python's disadvantages in these regards could be alleviated, I'd be very interested.

Jens
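The symbol-safety point can be reproduced in a few lines of Python. The class Mailer below is a made-up example; the misspelled attribute passes definition and construction without complaint and only fails when the method body actually runs:

```python
class Mailer:
    def __init__(self):
        self.queue = []

    def send_mail(self, mail):
        # 'not_defined' is never assigned anywhere, yet nothing
        # complains until this line is executed.
        self.not_defined.append(mail)


mailer = Mailer()   # defining the class and creating an instance succeed
try:
    mailer.send_mail("hello")
    error = None
except AttributeError as exc:
    error = exc     # the mistake shows up only at call time

print(type(error).__name__)
```

A test that exercises send_mail would catch this, which is the trade-off the thread is arguing about.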
Re: writing large files quickly
Ivan wrote:

> ext2 is a reimplementation of BSD UFS, so it does. Here:
>
>     f = file('bigfile', 'w')
>     f.seek(1024*1024)
>     f.write('a')
>
>     $ l afile
>     -rw-r--r-- 1 ivoras wheel 1048577 Jan 28 14:57 afile
>     $ du afile
>     8 afile

Interesting:

    cp bigfile bigfile2
    cat bigfile > bigfile3
    du bigfile*
    8 bigfile2
    1032 bigfile3

So it's not consuming 0's. It just doesn't store unwritten data. And I can think of an application for that: an application might want to write the beginning of a file at a later point, so this makes it more efficient.

I wonder how other file systems behave.

> I read somewhere that it has a use in database software, but the only thing I can imagine for this is when using heap queues (http://python.active-venture.com/lib/node162.html).

That's an article about the efficient heap data structure. Was it your intention to link this?

Jens
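The same experiment can be scripted so that the apparent size and the allocated size are compared directly. This is a Python 3 sketch; the temporary directory and file name are arbitrary, and st_blocks is only available on POSIX-like platforms. Whether the hole is actually left unallocated depends on the file system, which is exactly the question in this thread.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bigfile")
with open(path, "wb") as f:
    f.seek(1024 * 1024)   # move 1 MiB ahead without writing anything
    f.write(b"a")         # write a single byte at that offset

info = os.stat(path)
apparent = info.st_size
# st_blocks counts 512-byte blocks; it is missing on some platforms.
blocks = getattr(info, "st_blocks", None)

print("apparent size:", apparent)
if blocks is not None:
    print("allocated bytes:", blocks * 512)
```

On a file system with sparse file support the allocated figure stays far below the apparent size, which corresponds to the difference between the ls and du output above.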
Re: calling python from C#...
There is IronPython, which compiles to .NET. And there was another project bridging the .NET runtime with the standard Python interpreter, whose name I forgot.

Jens