Once upon a time tor developers wrote trip reports for neat events they attended. Though rare nowadays, recently meejah and I hit Portland for PyCon, the largest Python developer conference there is.
Lots of great talks worth sharing...

http://blog.atagar.com/pycon2016/

Cheers! -Damian

PS. For those of ya that don't like pretty pictures below is just the text.

PPS. Ok, maybe I went a tad overboard with this...

============================================================

I've been to quite a few conferences. LinuxFest Northwest, SeaGL, PETS, Toorcamp, Defcon, but PyCon was particularly impressive. At over three thousand attendees with five parallel tracks of talks the word 'busy' hardly seems to do the conference justice.

Top TL;DR highlights for me were new capabilities in the Python 3.x series and HTTP 2.0. In particular...

* Python 3.6 releases on Christmas, finally adding string interpolation!

    >>> name, job = 'Damian', 'software engineer'
    >>> print(f'{name} is a {job}')
    Damian is a software engineer

* Python 2.x support will be completely discontinued in 2020.

* New async/await keywords in Python 3.5 provide built-in support for Twisted-style async IO.

* Gradual type syntax in Python 3.5 makes code even more self-documenting and supportive of static analysis.

* First major protocol update since 1999, HTTP 2.0 is now supported by all modern browsers and 60% of users in the wild. Connection multiplexing allows all site assets to be retrieved over a single connection, improving latency on the order of 50%. The new protocol also negates any need for the clever performance hacks we've developed over the years like asset minimization and sprite maps!

  https://www.youtube.com/watch?v=Mou17XxYRZk

PyCon 2017 will be in Portland one more time before moving on to another venue, so if the following sounds interesting then check it out!

============================================================

Serendipity is delightful. My first time taking the train, I strongly suggest Amtrak (particularly the Coast Starlight) if heading down to Portland. Comfortable, scenic, and by happy coincidence I sat with Sarah Leivers: PyCon speaker with roots in the UK deaf community.
Sarah made the interesting point that even for deaf communities in English speaking countries English is often a second language. Signing is their native tongue, putting them at a disadvantage when it comes to involvement in our communities. Part of the larger ESL puzzle, our discussion was a nice reminder of why it's important to keep documentation as linguistically simple and accessible as we can.

In the observation car the Parks Department described sights we passed, my favorite being the Centralia train station. Completed right around the time these newfangled 'airplane' things were taking off, to celebrate they decided to christen the building with champagne. Three bottles were loaded onto a plane and dropped. The first couple bottles missed but the third hit dead on, puncturing right through the roof.

Spoiler alert: this was the last building they christened in such a way.

Go to a conference without exploring the area and you're doing it wrong. My train left me a few hours to explore the city, starting with the Portland Saturday Market. Easily comparable to Pike Place, the market is four city blocks jam packed with all the essentials of life: hand-carved bark houses, tie-dye, and of course fancy hats!

https://www.atagar.com/transfer/pycon_2016/4-street_market.jpg

Next I hit the Lan Su Chinese Garden, a beautiful gem nestled into the heart of downtown...

https://www.atagar.com/transfer/pycon_2016/5-garden.jpg

Of course I visited Ground Kontrol just a block away. Classic arcade that successfully reminded me just how much I suck at Marble Madness. In my defense I haven't played since my good old Amiga 2000...

https://www.atagar.com/transfer/pycon_2016/6-ground_kontrol.jpg

Finally, hidden below my hotel lurked a black light pirate themed putt-putt course. So... seems that's a thing!
https://www.atagar.com/transfer/pycon_2016/7-mini_golf.jpg

------------------------------------------------------------
File Descriptors, Unix Sockets and other POSIX wizardry
------------------------------------------------------------

First talk of the first day, Christian Heimes gave a crash course on *nix file descriptors. In python descriptors are fetched with f.fileno() and Christian demoed interacting with them directly to open his cd tray.

Christian's talk focused on file descriptor basics (which honestly I'm rustier on than I should be)...

* Descriptors 0-2 are reserved for stdin/stdout/stderr with -1 for errors.

* Fork clones the current process while pointing to the same global entry.

* Exec replaces the current program, inheriting the prior descriptors (which is why pipes continue to work).

* Descriptors can be delegated. This is useful in sandboxing situations like seccomp, allowing a broker to open files/sockets on a sandboxed process' behalf.

Lastly Christian walked through a little strace example that illustrates how descriptors are used in a basic scenario...

% cat reader.py
with open('/home/atagar/Desktop/reader.py') as my_file:
    print(my_file.read())

% strace python reader.py
...
open("/home/atagar/Desktop/reader.py", O_RDONLY|O_LARGEFILE) = 3
read(3, "with open('/home/atagar/Desktop/"..., 4096) = 80
read(3, "", 4096) = 0
close(3) = 0
write(1, "with open('/home/atagar/Desktop/"..., 81) = 81

------------------------------------------------------------
Refactoring Python: Why and how to restructure your code
------------------------------------------------------------

Nice presentation by Brett Slatkin, the author of Effective Python, on how and when to make code more maintainable.

As developers we optimize for making things work in our first pass, and for many of us that's where the story ends. To make code that's truly easy to follow requires time and patience to take follow-up passes that optimize for maintainability.
Something most developers don't do.

To illustrate this Brett asked: how much of your coding time goes toward implementation? 90%? 75%? The few developers he knows that write easy to follow code only do so because they spend fully half their time refactoring anything they write. Maintainability isn't cheap, and when faced with deadlines it's often the first thing to go.

Brett's other main takeaway was that without tests you're DOA. Refactoring requires a willingness to make mistakes, and without high coverage any major overhaul of production systems is in practice impossible.

This dovetailed nicely with the following talk, Code Unto Others, which gave a few tips...

* When it comes to maintainability remember that you don't scale. Any rough code you write is something you'll need to explain over and over to every engineer that touches it. That's not really how you want to spend your time, is it?

* Commonly people can track 5-9 things at a time which is why phone numbers are seven digits. Subdivide modules to take advantage of this. As a counter-example they used Mercurial's Repository class, a 17,000 line headache for newcomers.

* Be wary when describing your module uses the word 'and' ("it does this *and* that"). If you need that word you're probably doing it wrong. After reading the first half of a class you should be able to take an educated guess at what you'll see in the second.

------------------------------------------------------------
Finding closure with closures
------------------------------------------------------------

Peek under the hood at how Python implements closures...

>>> import os
>>> def print_greeting(first_name):
...     def msg(last_name):
...         platform = os.uname()[0]
...         return "Hi %s %s, you're running %s" % (first_name, last_name, platform)
...     print(msg('Johnson'))
...     print("co_varnames: %s" % ', '.join(msg.__code__.co_varnames))
...     print("co_names: %s" % ', '.join(msg.__code__.co_names))
...     print("co_freevars: %s" % ', '.join(msg.__code__.co_freevars))
...
>>> print_greeting('Damian')
Hi Damian Johnson, you're running Linux
co_varnames: last_name, platform
co_names: os, uname
co_freevars: first_name

varnames are local variables while freevars are variables we're closing over from the outer scope.

A gotcha that's probably bitten every python dev is that assigning to a closed over variable turns it into a new local rather than updating the outer one...

>>> import random
>>> def get_score():
...     total = 0
...     def add_points():
...         total += random.randint(0, 5)
...     for i in range(3):
...         add_points()
...     return total
...
>>> get_score()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in get_score
  File "<stdin>", line 4, in add_points
UnboundLocalError: local variable 'total' referenced before assignment

Python 3.x adds a new 'nonlocal' keyword for re-binding closures but for those of us stuck in the past our best option is to use the mutable hack. Gross, but it works.

>>> def get_score():
...     total = [0]
...     def add_points():
...         total[0] = total[0] + random.randint(0, 5)
...     for i in range(3):
...         add_points()
...     return total[0]
...
>>> get_score()
8

------------------------------------------------------------
What is and what can be: an exploration from 'type' to Metaclasses
------------------------------------------------------------

Owww, my head. This and another talk the previous day by Mike Graham introduced audiences to the wonderful world of python metaclasses...

"The subject of metaclasses in Python has caused hairs to raise and even brains to explode." -Guido

A method for redefining the fundamental behavior of objects and in doing so tearing the fabric of reality, metaclasses are what you invoke each time you extend object. Dustin demonstrated this by defining his own metaclass that transparently causes method invocations to be accompanied by a bark...
from functools import wraps
from inspect import isfunction

def bark(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
        print("bark!")
        return f(*args, **kwargs)
    return wrapper

class MetaDog(type):
    def __new__(meta, name, bases, attrs):
        # Wrap every plain function defined in the class body. The loop
        # variable must not be called 'name' or it clobbers the class name.
        for attr_name, attr in list(attrs.items()):
            if isfunction(attr):
                attrs[attr_name] = bark(attr)
        return type.__new__(meta, name, bases, attrs)

class Dog(metaclass = MetaDog):
    def sit(self):
        print("*sitting*")

    def stay(self):
        print("*staying*")

d = Dog()
d.sit()

So why would you use this? Well... hopefully you won't. Besides the obvious unforgivability of this sin upon your coworkers, this is the kind of black magic Ruby folks do all the time but Python devs know better. Like redefining builtins, just don't.

That aside, it was interesting to learn a little more about the abstract base class module and how python works under the hood.

------------------------------------------------------------
Building protocol libraries the right way
------------------------------------------------------------

Cory Benfield, author of Requests, urllib3, and other core I/O libraries, discussed a common pitfall that afflicts protocol libraries: mixing I/O with parsing.

Python has as many HTTP parsers as there are I/O libraries. Urllib variants, aiohttp, Twisted, Tornado, and friends all reinvent this wheel.

Code re-use is particularly great when you have a well defined problem with a single correct solution. Arithmetic, compression, and parsing are all examples of this, so why don't they all share a unified parser?

The problem is that we tangle network I/O with parsing of the messages we read. As such all these projects trip over the same obscure edge cases and re-implement the same optimizations.

Cory's message was simple: keep parsing separate. Besides code reuse this greatly improves testability because you don't need to invoke your I/O stack for coverage.

Personally I found this talk interesting because this is exactly something I ran into with Stem.
To work, our I/O handler needs enough understanding of the control-spec to delimit message boundaries, but beyond that parsing is a completely separate module. This has been a great boon for testing...

TEST_MESSAGE = """\
250-version=0.2.3.11-alpha-dev
250 OK"""

def test_single_getinfo_response(self):
    """
    Parses a GETINFO reply response for a single parameter.
    """

    control_message = stem.response.ControlMessage.from_str(TEST_MESSAGE, msg_type = 'GETINFO')
    self.assertEqual({'version': b'0.2.3.11-alpha-dev'}, control_message.entries)

------------------------------------------------------------
HTTP can do that?!
------------------------------------------------------------

Whimsical look at lesser known bits of the HTTP specification...

* Need just the metadata of a GET request? Use HEAD instead for a far lighter response.

* Calling OPTIONS will tell you the HTTP operations a resource supports.

* Besides normal CRUD operations (GET, POST, PUT, DELETE) the HTTP spec has PATCH to update just part of a resource.

* The specification also has TRACE, LINK, and UNLINK methods. Nobody uses them but hey, they're there.

* A few interesting headers include ETag for versioning resources, If-Modified-Since to only solicit a response if the resource has changed, and Cache-Control to define cacheability. Actually, the specification even has a From header in case you want to tell everybody in the world your email address...

* A few standard but infrequently used response codes are...

  * 410 - That resource used to be here but now it's gone.

  * 304 - You asked to get this resource if it's been modified but it hasn't.

  * 451 - Unavailable for legal reasons. Mostly comes up with censorship firewalls.

* Unsurprisingly you can make up your own status codes and reason strings. Sumana had several amusing ones she's found in the wild.
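
To make a few of the points above concrete, here is a minimal sketch (mine, not something shown in the talk) that spins up a throwaway local server with the standard library's http.server and pokes at it with http.client. The DemoHandler class, the request() and run_demo() helpers, the response body, and the "v1" ETag are all made up for illustration. For the 304 it pairs ETag with If-None-Match, the companion of the If-Modified-Since header mentioned above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello pycon\n"   # hypothetical resource body
ETAG = '"v1"'             # hypothetical version tag for that resource


class DemoHandler(BaseHTTPRequestHandler):
    """Toy handler that serves a single resource at any path."""

    def _send_ok_headers(self):
        self.send_response(200)
        self.send_header("ETag", ETAG)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        if self.headers.get("If-None-Match") == ETAG:
            # Client already holds this version: 304 and no body.
            self.send_response(304)
            self.send_header("ETag", ETAG)
            self.end_headers()
        else:
            self._send_ok_headers()
            self.wfile.write(BODY)

    def do_HEAD(self):
        # Same headers as GET, but the body is never written.
        self._send_ok_headers()

    def do_OPTIONS(self):
        self.send_response(204)
        self.send_header("Allow", "GET, HEAD, OPTIONS")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def request(port, method, headers=None):
    """Issue one request and return (status, headers dict, body bytes)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, "/", headers=headers or {})
        response = conn.getresponse()
        return response.status, dict(response.getheaders()), response.read()
    finally:
        conn.close()


def run_demo():
    """Start the toy server on a free port, run four requests, shut down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        return {
            "GET": request(port, "GET"),
            "HEAD": request(port, "HEAD"),
            "OPTIONS": request(port, "OPTIONS"),
            "CONDITIONAL": request(port, "GET", {"If-None-Match": ETAG}),
        }
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for label, (status, headers, body) in run_demo().items():
        print(label, status, headers.get("Allow", ""), len(body))
```

Running it shows HEAD returning the same Content-Length as GET with an empty body, OPTIONS advertising the supported methods through the Allow header, and the conditional GET coming back as a bodiless 304.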

------------------------------------------------------------
Playing with Python bytecode
------------------------------------------------------------

Amusing demonstration of executing raw bytecodes in python, including runtime manipulation to switch a functor's addition operation to multiplication.

Interesting in a 'oh god, you can do that?' sense but even the presenters said 'kids, don't do this at home'. Few (if any?) practical applications, and opcodes change even between minor Python interpreter version bumps making any such hacks a maintenance nightmare.

------------------------------------------------------------
SQLite: Gotchas and Gimmes
------------------------------------------------------------

Tips by Dave Sawyer for SQLite, mostly focusing on the advantages over pickles (performance, safety, etc), common pitfalls, and locking strategies...

* Deferred - Multiple readers/writers.

* Immediate - Multiple readers/single writer.

* Exclusive - Single reader/writer.

WAL (Write-Ahead Logging) is an alternative where readers are unlocked with the writer appending deltas. Upon checkpoints SQLite halts all reads/writes to apply the deltas as a batch.

------------------------------------------------------------
See Python, See Python Go, Go Python Go
------------------------------------------------------------

Last talk I attended and the one I wanted to see most. Imagine a world where performance critical code could be written in Go rather than C. No more memory leaks. No compilers. Sounds great, right?

Well, keep dreaming. Both Python and Go can drop to C and Andrey gave a demo of doing so as a bridge between them, and in the process explained why this is a terrible idea.

The CPython Extension interface requires a bit of boilerplate but can work with no dependencies while CFFI requires some magic but provides a more portable solution. But in either case crossing both the Go-to-C and C-to-Python boundaries drops you to the least common denominator.
This means no Go interfaces or goroutines, and no Python classes or generators. GC, GIL, and JIT all add their own headaches but worse, you need to implement your own memory management. Sharing between Go and Python risks release of memory the other side still references. Andrey got around this by passing his own dereferenceable pointers but... ick.

In the end Andrey's demo worked and in fact was just as performant as a direct Go implementation, but made it clear there be dragons. Frustratingly, it's still better to just call os.system(). :(

------------------------------------------------------------

This being my first PyCon I focused on talks rather than the hallway track but nonetheless had some nice finds...

* Seattle is home to quite a few technical meetups. Hardware hacking, TA3M, Ruby Brigade, you name it and there's probably a group for it. SeaPig has been a fun local python group but sadly it's gone dormant in recent years. Among the booths however I ran into members of PuPPy, another local python group that seems to be quite alive and well!

* Didn't realize in advance but AWS networking ran a booth during the job fair. Fun chats with Shawn - he has a great approach for exciting folks to apply.

* Crossed paths with meejah several times. Together we whipped up a recipe combining our libraries so users can read stem-parsed event objects from txtorcon. Neat stuff!

Simply a great conference, I look forward to hitting PyCon again next year!

_______________________________________________
tor-reports mailing list
tor-reports@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-reports