[web2py:27839] Fwd: My thoughts on web2py

Massimo Di Pierro Sun, 02 Aug 2009 11:22:05 -0700

Thank you Armin, I am passing this to the list and will address them  
asap.


Begin forwarded message:

> From: Armin Ronacher <armin.ronac...@active-4.com>
> Date: August 2, 2009 1:10:29 PM CDT
> To: "DiPierro, Massimo" <mdipie...@cs.depaul.edu>
> Subject: My thoughts on web2py
>
> Hi Massimo,
>
> First a word of warning.  I send you this mail in private because I  
> want
> to share my thoughts on the web2py code and because I hope I can help
> with that somehow.  You're free to publish this mail if you want.
>
> As you might remember I came in contract with web2py a long ago when  
> you
> were evaluating pygments as a syntax highlighter for the web2py
> documentation or something and you decided to implement your own  
> because
> you were unable to "ship" pygments as part of web2py.  As far as I
> remember py2app or py2exe were not able to pick up the on-demand  
> imports
> in the lexer module.
>
> I have been watching the development ever since because (And please
> don't think bad of me because of that) it appeared to me that web2py  
> was
> a joke.  I remember reading some very early revisions of your PDF
> documentation where I remember a code that looks something like that:
>
>    db.define_table('something', db.Field('meh', 'date',
>                    default=date.today()))
>
> That line freaked me out.  This line alone showed me that web2py is
> doing something very, very weird.  From this line I could see that it
> means one of the following two things:
>
> - either the code has a bug for long running applications (in this  
> case
>  when the application code runs longer than a day in which case the
>  default would have to change dynamically)
>
> - or the global scope has different semantics and is reevaluated each
>  request.
>
> Turns out the latter is what happens in web2py.  There still is a race
> condition but only a small one.  The bigger problem with this  
> however is
> that this means the (not so) static definitions are reevaluated each
> request.
>
> Especially for the database table definitions (which cause many  
> function
> calls) this means an enormous per-request penality if you have many of
> them.  I even remember reading a case like this in the web2py  
> newsgroup
> about someone who had twenty table definitions or something and the
> majority of the request time was spent setting up the tables.
>
> However that also means another thing.  Standard Python  
> optimizations no
> longer apply.  In Python usually modules are cached in an interpreter
> bound dictionary for reusing.  This of course does not apply for  
> web2py.
> However the second problem here is that only the local scope gets
> certain optimizations in cpython such as fast-local lookup.  Instead  
> of
> storing the variables in a dictionary they are instead stored in an an
> object where items are looked up by index (fast locals) rather than  
> by a
> string.
>
> Now of course one can argue that this is the way it's intended.   
> That's
> nice but for that, the documentation is lacking.  There is no
> explanation of what implications this has on the code.  And in
> comparison with "enterprise ready" that gives me a strange feeling  
> about
> the project.
>
> I'm not even exactly sure what kills web2py for me in the end, but I'm
> pretty sure that part of the reason is the way the project is  
> advertised.
>
> web2py does not have worse code than many popular libraries out there.
> BeautifulSoup comes to mind.  That library does some really ugly  
> things
> there including monkeypatching of the standard library.  It does
> different things depending on the version of Python it's running on  
> and
> in the code are some really ugly idioms like a unicode subclass that  
> has
> a reference to the parsed tree that results in memory leaks very  
> easily
> because it's nearly impossible to get rid of the reference (actually
> just by exploiting some weird Python internals that are mostly
> undocumented).  But the author of BeautifulSoup does not advertise  
> it as
> a backwards-compatible, enterprise ready HTML parser.
>
> The web2py code on the other hand has some really problematic idioms  
> as
> well.  The execution of models, views and whatnot in a made-up  
> namespace
> comes to mind.  Due to the decision of being backwards compatible it
> will be nearly impossible to "fix" those things.  However there are so
> many implications nobody yet knows about which is terribel for large
> projects.
>
> For example the database table redfinition problem cited above.  Of
> course that particular problem can be worked around with making the
> definition code internally lazy or something but there can be more
> nobody yet knows about.
>
> Especially side-effects of code execution can do some terrible damage.
> Decorators that would register code would instantly leak for example.
> And some libraries provide auto-registration that you would not
> instantly know about.
>
> Also libraries like pickle do not support weird situations like this.
>
> So I guess in the end I'm not (only) complaining about the web2py code
> or decisions there, but mainly how web2py is advertised as enterprise
> ready when nobody knows how these idioms scale or behave.
>
> Another part is that web2py (and this is something web2py has in  
> common
> with so many other web frameworks) is the craze to reimplement
> everything.  I know you don't want to pull depedencies in but from  
> what
> I can see you're providing one-click installers for the framework  
> which
> could ship external dependencies.  Both py2exe and py2app provide
> support for external dependencies and I can't see what "not  
> depending on
> webob/werkzeug" gives you there.
>
> The reason why I'm addressing this is that there are so many
> limitations, problems and weird aspects in WSGI and HTTP that are hard
> to tackle and many people get wrong.  I'm not saying you're getting it
> wrong, I haven't even studied the code in question, but I'm pretty  
> sure
> it would make sense to collaborate on that.
>
> About the code quality itself, not the (IMO) broken concepts I picked
> some of the snippets I found quickly searching with the help of ack:
>
> - Missing `close()` calls with files (ignored matches from  
> compileapp):
>
>  gluon/fileutils.py:160:    f.write(open(tarname, 'rb').read())
>  gluon/fileutils.py:246:        data = open(filename, 'rb').read()
>  gluon/highlight.py:313:    data = open(sys.argv[1]).read()
>  gluon/languages.py:37:    lang_text = open(filename, 'r').read()...
>  gluon/languages.py:153:        data = open(file, 'r').read()
>  gluon/main.py:107:web2py_version = open(os.path.join(web2py_path, ...
>  gluon/rewrite.py:26:        exec open('routes.py', 'r').read() ...
>  gluon/template.py:104:            text = open(os.path.join(path, ...
>  gluon/template.py:118:            parent = open(t, 'rb').read()
>  gluon/template.py:134:            child = open(t, 'rb').read()
>  gluon/tools.py:1934:        return urllib.urlopen(url).read()
>  gluon/widget.py:55:ProgramVersion = open('VERSION', ...
>
> - Same in streamer.py for checking for file existence
>  (open(static_file))
>
> - Template engine seems to be missing proper escapes.  Having ''' in  
> the
>  template should be enough to break it.
>
> - Whitespace between "else" and colon in the template engine is  
> unlikely
>  to work with the current parsing rules as far as I can see.
>
> - xmlrpc system sends text/xml content type which is with a missing
>  charset property set to us-ascii.  No charset is set so it has to be
>  specified explicitly (proper header would be application/xml;
>  charset=utf-8)
>
> - Fileutils has a listdir method that does very inefficient deleting
>  from the list.  Current code.
>
>  50         for dir in dirs[:]:
>  51             if dir.startswith('.'):
>  52                 dirs.remove(dir)
>
>  That creates a shallow copy, for each item it performs an O(n) lookup
>  for the item and the remove is a O(n) operation as well.  Even
>  unecessary because you have the position if you enumerate over the
>  list.
>
>  However you can do that with a list comprehension:
>
>  50         dirs[:] = [x for x in dirs if not x.startswith('.')]
>
> - Having classes in local scope (fileutils.py for the tarfile) makes  
> it
>  incredible hard to reuse code.  Not a good idea, move that into the
>  global scope instead.  Will also help performance.
>
> - There is no bug in shutil.copyfileobj.  That function works perfetly
>  and does what the documentation says it should do.  Maybe you were
>  copying from the wsgi input stream.  That of course won't work that
>  easy because there is no end-of-file marker in the input stream.  See
>  werkzeug's LimitedStream for ways how to fix that problem.  A similar
>  class also exists in WebOb but I don't remember the name.
>
> - You should not inherit from BaseException unless you are dealing  
> with
>  special exceptions that should not be catched by an "except  
> Exception"
>
> - there are realtive imports all over the place.  This is a deprecated
>  unfeature, switch to "from gluon.html import blah" instead of the
>  current "from html import blah".
>
> - the email code in tools is vulnerable to header injections if the
>  input is unfiltered.  If this is intentional it should be documented
>  in the code.
>
> - There is no proper undecoding of the path info in main.py but spaces
>  are transformed into underscores.  However a plus is converted into a
>  space as well which is against the RFC.  Plus is only an alias for  
> %20
>  in the querystring of a URL.
>
> - request.vars is either a dict of lists or a dict of plain strings.
>  This means the user has to perform instance checks each access, not
>  very clever.  Other systems fix that by providing multi-value dicts.
>
> - time.strftime should not be used for header-date formatting because
>  it's affected by locales.  That means if a user is using locales it
>  will break all headers sent that deal with dates.
>
> This list was created in ~15 minutes from quickly checking some files.
> If one does a check on the code more things will come up.  Some things
> are also underspecified or unclear in the code.  For example how the
> code is supposed to deal with unicode (it does that in very different
> ways) and how headers are quoted/unquoted, how URLs are decoded etc.
> Also the code architecture is suboptimal in places.  Hardcoded support
> for Google appengine in places where you wouldn't expect it (like
> validators etc.)
>
> For an "enterprise" framework this is a bit too much smelly code  
> that is
> found in no time.
>
> What would I fix?  Stop advertising the framework as enterprise ready
> until it matured more.  And do less advertising in general.  And if  
> you
> can, fix some of the things in the code.  Those that are caused by
> design choices won't be fixed anyway.
>
>
> Regards,
> Armin
>


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"web2py-users" group.
To post to this group, send email to web2py@googlegroups.com
To unsubscribe from this group, send email to 
web2py+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/web2py?hl=en
-~----------~----~----~----~------~----~------~--~---

[web2py:27839] Fwd: My thoughts on web2py

Reply via email to