Re: idea for testing tools

2007-02-08 Thread Jens Theisen
Bruno Desthuilliers [EMAIL PROTECTED] writes:

 http://codespeak.net/py/current/doc/test.html#assert-with-the-assert-statement

Ok, I hadn't come across this before.

It didn't work for me though, even the simple case

  #!/usr/bin/python

  a = 1
  b = 2

  def test_some():
      assert a == b

didn't reveal the values for a and b, though some more complex cases
showed something.

-- 
Cheers, Jens
-- 
http://mail.python.org/mailman/listinfo/python-list


idea for testing tools

2007-02-07 Thread Jens Theisen
Hello,

I find it annoying that one has to write

self.assertEqual(x, y)

rather than just

assert x == y

when writing tests. This is a nuisance in all the programming
languages I know of (which are not too many). In Python however, there
appears to be a better alternative. The piece of code below gives the
benefit of printing the violating values in case of a testing failure
as well as the concise syntax:

The snippet

  def test_test():
      def foo(x):
          return x + 3
      x = 1
      y = 2
      assert foo(x) < y + x


  try:
      test_test()
  except AssertionError:
      analyse()

would give:

  Traceback (most recent call last):
    File "./ast-post.py", line 138, in ?
      test_test()
    File "./ast-post.py", line 134, in test_test
      assert foo(x) < y + x
  AssertionError

  failure analysis:

                  foo: <function foo at 0xb7c9148c>
                    x: 1
                ( x ): 1
            foo ( x ): 4
                    y: 2
                    x: 1
                y + x: 3
    foo ( x ) < y + x: False


The code that makes this possible relies only on code present in the
standard library (being traceback, inspect and the parsing stuff)
while being as short as:

  #!/usr/bin/python

  import sys, types
  import traceback, inspect
  import parser, symbol, token
  import StringIO

  def get_inner_frame(tb):
      # walk to the frame in which the exception was raised
      while tb.tb_next:
          tb = tb.tb_next
      return tb.tb_frame

  def visit_ast(visitor, ast):
      sym  = ast[0]
      vals = ast[1:]

      assert len(vals) > 0
      is_simple = len(vals) == 1
      is_leaf   = is_simple and type(vals[0]) != types.TupleType

      if not is_leaf:
          visitor.enter()
          for val in vals:
              visit_ast(visitor, val)
          visitor.leave()

      if is_leaf:
          visitor.leaf(sym, vals[0])
      elif is_simple:
          visitor.simple(sym, vals[0])
      else:
          visitor.compound(sym, vals)


  class ast_visitor:
      def enter(self):
          pass

      def leave(self):
          pass

      def leaf(self, sym, val):
          pass

      def simple(self, sym, val):
          pass

      def compound(self, sym, vals):
          pass


  class simple_printer(ast_visitor):
      def __init__(self, stream):
          self.stream = stream

      def leaf(self, sym, val):
          # write the token text to the stream, separated by spaces
          print >>self.stream, val,

  def str_from_ast(ast):
      s = StringIO.StringIO()
      visit_ast(simple_printer(s), ast)
      return s.getvalue()

  class assertion_collector(ast_visitor):
      def __init__(self, statements):
          self.statements = statements

      def compound(self, sym, vals):
          if sym == symbol.assert_stmt:
              # two nodes: the assert name and the expression
              self.statements.append(vals[1])

  class pretty_evaluate(ast_visitor):
      def __init__(self, globals_, locals_):
          self.globals = globals_
          self.locals  = locals_

      def _expr(self, expression):
          code = compile(expression, 'internal', 'eval')

          try:
              result = eval(code, self.globals, self.locals)
          except Exception, e:
              result = e

          print '%50s: %s' % (expression, str(result))

      def compound(self, sym, vals):
          ast = [ sym ]
          ast.extend(vals)

          expression = str_from_ast(ast)

          self._expr(expression)

      def leaf(self, sym, val):
          if sym == token.NAME:
              self._expr(val)

  def analyse():
      type_, exc, tb = sys.exc_info()

      frame = get_inner_frame(tb)

      try:
          filename, line, fun, context, index = (
              inspect.getframeinfo(frame, 1)
              )

          ast = parser.suite(context[0].lstrip()).totuple()

          assert_statements = [ ]
          visit_ast(assertion_collector(assert_statements), ast)

          traceback.print_exc()

          print "\nfailure analysis:\n"

          for statement in assert_statements:
              visit_ast(
                  pretty_evaluate(frame.f_globals, frame.f_locals), statement)

      finally:
          del frame
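
For readers on a current Python 3, where the parser module is no longer available, the same idea can be sketched with the ast module. This is only a rough illustration under a few assumptions: the helper name explain and the namespace dictionary are made up for the example, the expression text is passed in by hand rather than taken from the traceback, and ast.get_source_segment needs Python 3.8 or later.

```python
import ast


def explain(expression, globals_, locals_):
    """Evaluate every sub-expression of `expression`; return (source, value) pairs."""
    tree = ast.parse(expression, mode="eval")
    results = []

    def visit(node):
        # children first, so operands are listed before the expression using them
        for child in ast.iter_child_nodes(node):
            visit(child)
        if isinstance(node, ast.expr):
            source = ast.get_source_segment(expression, node)
            try:
                value = eval(source, globals_, locals_)
            except Exception as exc:  # report the error instead of aborting
                value = exc
            results.append((source, value))

    visit(tree.body)
    return results


def foo(x):
    return x + 3


# hypothetical namespace standing in for the failing test's frame
namespace = {"foo": foo, "x": 1, "y": 2}
report = explain("foo(x) < y + x", namespace, namespace)
for source, value in report:
    print("%20s: %r" % (source, value))
```

In a real test runner the expression text and the two namespaces would come from the failing frame, as in the code above.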

-- 
Cheers, Jens



beginner's refcount questions

2006-10-29 Thread Jens Theisen
Hello,

Python uses gc only where refcounts alone haven't yet done the
job. Thus, the following code

class Foo:
    def __del__(self):
        print "deled!"

def foo():
    f = Foo()

foo()
print "done!"

prints

deled!
done!

and not the other way round.

In C++, this is a central technique used for all sorts of tasks,
whereas in garbage collected languages it's usually not available.

Is there a reason not to rely on this in Python? For example, are
there alternative Python implementations that behave differently?  Or
some other subtle problems?

And another minor question: Is there a way to query the use count
of an object? This would be useful for debugging and testing.
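
As far as I can tell, CPython at least exposes the count through sys.getrefcount. A minimal sketch, assuming CPython (other implementations may not provide the function or may return something less meaningful); the class and variable names are just for illustration:

```python
import sys


class Foo:
    pass


f = Foo()
before = sys.getrefcount(f)  # includes the temporary reference held by the call itself
g = f                        # a second name for the same object
after = sys.getrefcount(f)
print(before, after)
```

The absolute numbers are less interesting than the difference: binding one more name raises the count by one.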

-- 
Cheers, Jens


Graphical introspection utilities?

2006-02-04 Thread Jens Theisen
Hello,

As it would so obviously be a good thing to have a graphical (or maybe
curses-based) browser for the dynamic state of a Python program, one
probably exists already.

Can someone point me to something?

Cheers,

Jens



Re: Another try at Python's selfishness

2006-02-04 Thread Jens Theisen
n.estner wrote:

 Yes, I 100% agree to that point!
 But the point is, the current situation is not newbie-friendly (I can
 tell, I am a newbie): I declare a method with 3 parameters but when I
 call it I only pass 2 parameters. That's confusing. If I declare a
 member variable, I write: self.x  = ValueForX, why can't I be equally
 explicit for declaring member functions?

For someone new to OO in general it might just as well be a good thing, because he
realises that there really is a hidden parameter. After all,
there is something to understand about self, and this discrepancy between
the number of arguments and parameters points newbies to it.

Jens



Re: Python vs C for a mail server

2006-01-31 Thread Jens Theisen
Jay wrote:

 You can do both, but why? *Especially* in a language like C++, where
 thanks to pointers and casting, there really isn't any type safety
 anyway. How much time in your C/C++ code is spent casting and trying to
 trick the compiler into doing something that it thinks you shouldn't be
 doing?

Not much, frankly. I have no doubt that there is a lot of code that
does, though more so in older C++ code.

 How does type safety tell you anything about the current usage of your
 program?

Quite a bit; of course, it doesn't cover everything and clearly testing
the semantics is still needed, but a lot of what code is used in what way
is described by the type system. And I admit that I'm used to using the type
system as documentation of what's going on.

 And unit tests *might* not tell about the current usage, but
 integration tests certainly do.

Of course tests will cover a lot of what static typing does and more. I'm
just claiming that they can support each other.

 I don't think I've ever seen anyone advocating calling a function like
 getattr(obj, "foo" + "bar")(). You can do some very powerful things with
 getattr, thanks to Python's dynamic nature, but I don't think anyone is
 recommending calling a function like that.

A lot of people got me wrong on that, please see Paul's postings. I really  
didn't mean that literally.

 And is that fear based simply on feeling, or on actual experience.

The former, that's why I did start this branch of the thread, though I'm  
already regretting it.

 Because in all of my own industry experience, it's been MUCH easier to
 jump into someone else's Python code than someone else's C++ code (and
 at my last job, I had to do a lot of both). I find Python to be much
 more self-documenting, because there's not so much scaffolding all over
 the place obfuscating the true intention of the code.

That really depends on whose code you're looking at, as in any language. I
can believe there are more C++ obfuscators out there than ones for Python.

I usually have little trouble jumping into other people's C++ code, but
the principal reason for that is probably that the people I'm working with
have a very clean and expressive coding style.

 You need to look at doctest:
 http://docs.python.org/lib/module-doctest.html
 With doctest, tests are EXACTLY where the code is. I've used doctest
 with incredibly successful results, in industry.

That's indeed a good point for Python.

 Reference counting by itself is not necessarily sufficient (because of
 circular references). That's why even Python, with its reference
 counting based system, has additional capabilities for finding circular
 references.

Whenever I encountered the need for circular references it was because an
object that was in some sense owned by another needed a pointer back to
its owner. That is easily solved with non-owning C-style pointers or weak
pointers.

If you have an example where this is not sufficient, I'd be *very* keen on  
hearing it (it may be easy, I don't know).
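
To make the owner/back-pointer pattern concrete, here is roughly what I mean, sketched in Python with the weakref module rather than in C++. The class names Owner and Child are made up for the example, and the immediate cleanup at the end relies on CPython's reference counting:

```python
import weakref


class Owner:
    def __init__(self):
        self.children = []

    def add(self, child):
        self.children.append(child)      # owning reference
        child.owner = weakref.ref(self)  # non-owning back reference


class Child:
    owner = None


owner = Owner()
child = Child()
owner.add(child)
assert child.owner() is owner  # the weak reference resolves while the owner lives

del owner                      # drop the only strong reference to the owner
owner_is_gone = child.owner() is None
print(owner_is_gone)
```

Because the back reference does not own anything, there is no cycle for the reference counts to get stuck on.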

 I believe that Alex's official job title at Google is Uber Technical
 Lead. I'm sure I'll quickly be corrected if that's wrong, but trust me
 (and everyone else who's spent significant time reading c.l.p.) that
 Alex Martelli knows what he's talking about.

You seem to think that I was being ironic. That was certainly not my
intention. I was just trying to minimise the amount of flames I'll be  
getting.

 A lot of your arguments are the very typical arguments that people levy
 against Python, when they're first around it. And that's fine, most
 people go through that. They are taught programming in C++ or Java, and
 that that *that* is the way you're supposed to program. Then they see
 that Python does things in different ways, and automatically assume
 that Python is doing it wrong.

I'm sorry to hear this, and whilst I'm certainly lacking experience in  
Python, I'm not one of those people.

 All I can say is that if you spend time with Python (and more
 importantly, read quality, well-established Python code that's already
 out there), you'll see and understand the Python way of doing things.

I will.

Jens




Re: Python vs C for a mail server

2006-01-31 Thread Jens Theisen
Paul wrote:

 Or should I be looking for some other context here?

Three people were looking at the wrong one, thanks for putting this right.

I really should not have given my point that briefly.

Jens



Re: Python vs C for a mail server

2006-01-29 Thread Jens Theisen
Nicolas wrote:

 http://nicolas.lehuen.com/

 My two latest problems with coding in C++ are due to the environments :
 libraries using different string types and the whole problem with the
 building system. I love the language, but I get a much better leverage
 through Python and Java due to the quality and ease of use of their
 built-in and third party libraries. I use C++ only for my core data
 structure (namely a tuned version of a ternary search tree which I use
 to build full text indices).

Those points are all valid. I'm using Python for that reason. And there is
another point: there are good Python bindings for the more
important C libraries, but usually no decent C++ wrappers for them.

Jens



Re: Python vs C for a mail server

2006-01-29 Thread Jens Theisen
Alex wrote:

 http://www.artima.com/weblogs/viewpost.jsp?thread=4639
 http://www.mindview.net/WebLog/log-0025

 Since Robert Martin and Bruce Eckel (the authors of the two documents
 linked above) are both acknowledged gurus of statically typechecked
 languages such as C++, the convergence of their thinking and experience
 indicated by those documents is interesting.

Indeed, especially Eckel's article shed some light on testing as an
alternative to static typing. I still can't quite understand why you can't
do both. Clearly unit tests should be part of any software, not only
Python software.

Test failures, however, don't tell you anything about the current usage of
your program, just about the intended usage at the point where the test
was written. Clearly you can't test _everything_? And clearly you can never
be sure that all your colleagues did so as well? This is not only about type
safety, but simply name safety.

What do you do when you want to know if a certain method or function,
say "foobar", is actually used from somewhere, in a language which allows
(and even encourages) that it could be called by:

getattr(obj, "foo" + "bar")()

?

There is no systematic way to find this call.
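
A small sketch of the problem; the class Obj and its method are made up for the example. Searching the source for calls to foobar turns up nothing here, yet the method is called:

```python
class Obj:
    def foobar(self):
        return "called"


obj = Obj()
name = "foo" + "bar"           # the method name is assembled at run time
result = getattr(obj, name)()  # equivalent to obj.foobar()
print(result)
```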

In C++, just comment out the definition and the compiler will tell you.

I'm pretty sure I read a PEP about static type safety in Python at some
point. I think it even considered generics.

 The "but without declaration it can't be self-documenting" issue is a
 red herring.  Reading, e.g.:

 int zappolop(int frep) { ...

 gives me no _useful_ self-documenting information

That's true. If the programmer wants to obfuscate his intention, I'm sure
neither Python nor C++ can stop him. The question is how much more work it is
to write comprehensible code in one language or the other. I'm a bit
worried about Python in that respect.

Python provides easy ways to write literal documentation. But I'd really like to
have a way of indicating what I'm talking about in a way that's guaranteed to
be in sync with the code. Tests are not where the code is. I have
difficulties remembering the type of a lot of symbols, and looking at
testing code to refresh my memory of the usage is more difficult than just jumping to the
definition (which the development environment is likely to be able to do).

 [smart pointers and GC]

As you say, smart pointers are not full-blown garbage collection, which is
usually what you want, isn't it? In my (admittedly short) life as a
professional developer I have not yet come across a situation where
reference counting was not sufficient to model the memory management.

As for the locking: Apart from locking whatever you need to lock in
your user code, I don't think any special locking is necessary for the
memory management. Smart pointers can increment and decrement their ref
counts atomically as far as I know.

We use boost::shared_ptr in multi-threaded code straight out of the box. We
also use a reference-counted string implementation in multi-threaded code.

 At Google, we collectively have rather a lot of experience in these
 issues, since we use three general-purpose languages: Python, Java, C++.

I have no doubt that Google knows what it's doing, and if you're working
there then you're likely to know what you're talking about.

I found it especially astonishing what you had to say against the use of  
smart pointers.

Jens



Re: writing large files quickly

2006-01-28 Thread Jens Theisen
Donn wrote:

 How the heck does that make a 400 MB file that fast? It literally takes
 a second or two while every other solution takes at least 2 - 5 minutes.
 Awesome... thanks for the tip!!!

 Because it isn't really writing the zeros.   You can make these
 files all day long and not run out of disk space, because this
 kind of file doesn't take very many blocks.   The blocks that
 were never written are virtual blocks, inasmuch as read() at
 that location will cause the filesystem to return a block of NULs.

Under which operating system/file system?

As far as I know this should be file system dependent at least under  
Linux, as the calls to open and seek are served by the file system driver.

Jens



Re: writing large files quickly

2006-01-28 Thread Jens Theisen
Ivan wrote:

 Steven D'Aprano wrote:

 Isn't this a file system specific solution though? Won't your file system
 need to have support for sparse files, or else it won't work?

 Yes, but AFAIK the only modern (meaning: in wide use today) file
 system that doesn't have this support is FAT/FAT32.

I don't think ext2fs does this either. At least the du and df commands  
tell something different.

Actually I'm not sure what this optimisation should give you anyway. The  
only circumstance under which files with only zeroes are meaningful is  
testing, and that's exactly when you don't want that optimisation.

On compressing file systems such as NTFS you will get this behaviour as a
special case of compression, and compression makes more sense.

Jens



Re: writing large files quickly

2006-01-28 Thread Jens Theisen
Donn wrote:

 Because it isn't really writing the zeros.   You can make these
 files all day long and not run out of disk space, because this
 kind of file doesn't take very many blocks.   The blocks that
 were never written are virtual blocks, inasmuch as read() at
 that location will cause the filesystem to return a block of NULs.

Are you sure that's not just a case of asynchronous writing that can be
done in a particularly efficient way? df quite clearly tells me that I'm
running out of disk space on my ext2fs Linux system when I dump it full of
zeroes.

Jens




Re: Python vs C for a mail server

2006-01-28 Thread Jens Theisen
Nicolas wrote:

 If it's just a way to throw a programming challenge at your friend's
 face, then you should check whether it's okay to use Python rather than
 C/C++, otherwise he could be charged of cheating by using a more
 productive language :).

Though this comment of mine is likely to start a religious war, it might  
also bring up some useful points.

I'm a bit reluctant to swallow the common view of Python being such a
productive language, especially compared to C++. I actually value C++'s
productivity much higher than its performance.

To stick to the mailserver example, I'm pretty sure I'd do it in C++. I
got very excited about Python when I first saw it, but I've encountered
several problems that hindered productivity dramatically.

One thing is the lack of static types. The problem is not only that the  
compiler can't tell you very basic things you're doing wrong but also that  
code isn't intrinsically documented:

def send_mail(mail):
    ...

What can I do with mail? In C++, you look up what type it is
(presumably by pressing M-x in Emacs on the word before it) and have a
look at its type definition. In Python, it can be quite difficult to tell
what you can do with it because the information about what will be passed in
can be several layers up.

Also there is not even symbol safety: self.not_defined will never raise a
compile-time error.
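
To illustrate what I mean, here is a tiny sketch; the Mailer class is made up for the example. The module is compiled and the class is defined without any complaint, and the mistake only surfaces once the offending line actually runs:

```python
class Mailer:
    def __init__(self):
        self.sent = 0

    def send(self):
        return self.not_defined  # no such attribute, yet this compiles fine


try:
    Mailer().send()
    error = None
except AttributeError as exc:
    error = exc                  # only raised once the line actually runs
print(type(error).__name__)
```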

And to address the memory management criticism of C++: Unless you have
cyclic structures (you probably won't have in a mail server), just use  
smart pointers and you don't have to be concerned more about it than you'd  
have to be in Python.

I agree on C++ libraries being weak on Unicode strings though, or even
generally weak in the libraries (you have the C libraries, but they're not  
very type safe or elegant to use).

I'm aware that C++ is a horrible monstrosity, an argument whose weight
depends on the C++ experience of the OP's friend.

Please don't be offended, but if anyone could make a point of how Python's  
disadvantages in these regards could be alleviated, I'd be very  
interested.

Jens



Re: writing large files quickly

2006-01-28 Thread Jens Theisen
Ivan wrote:

 ext2 is a reimplementation of BSD UFS, so it does. Here:

 f = file('bigfile', 'w')
 f.seek(1024*1024)
 f.write('a')

 $ l afile
 -rw-r--r--  1 ivoras  wheel  1048577 Jan 28 14:57 afile
 $ du afile
 8 afile

Interesting:

cp bigfile bigfile2

cat bigfile > bigfile3

du bigfile*
8       bigfile2
1032    bigfile3

So it's not consuming 0's. It just doesn't store unwritten data. And I
can think of an application for that: An application might want to write
the beginning of a file at a later point, so this makes it more efficient.

I wonder how other file systems behave.
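
To check a given file system, something along these lines should do. It is only a sketch: the file name is arbitrary, st_blocks is only available on Unix-like systems (and counts 512-byte units), and what it reports depends entirely on the file system underneath.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bigfile")
    with open(path, "wb") as f:
        f.seek(1024 * 1024)  # skip 1 MiB without writing anything
        f.write(b"a")        # a single byte at the end

    info = os.stat(path)
    apparent = info.st_size  # what ls reports
    # st_blocks is in 512-byte units and missing on some platforms
    allocated = getattr(info, "st_blocks", None)
    print("apparent size:", apparent)
    if allocated is not None:
        print("allocated bytes:", allocated * 512)
```

If the allocated figure is far below the apparent size, the file system left the skipped region unallocated.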

 I read somewhere that it has a use in database software, but the only
 thing I can imagine for this is when using heap queues
 (http://python.active-venture.com/lib/node162.html).

That's an article about the efficient heap data structure. Was it your
intention to link this?

Jens




Re: calling python from C#...

2006-01-23 Thread Jens Theisen
There is IronPython which compiles to .NET. And there was another project  
bridging the .NET runtime with the standard Python interpreter of which I  
forgot the name.

Jens
