Hi,

I'm happy to announce a new Python data framework: Bubbles.

Motto: Focus on the process, not the data technology.

Blog post:
http://blog.databrewery.org/posts/bubbles-0-1-released.html

Here is a short presentation of the core concepts:
http://www.slideshare.net/Stiivi/data-brewery-2-data-objects

The concepts are:

* data objects: an abstraction of tabular data; one object can have multiple representations at once (SQL, iterator, ...)
* data stores: an abstraction of dataset collections
* operations (which work on top of representations) and an execution context (with an operation catalog)
* processing pipelines

Priorities of the framework are:

* understandability of the process
* auditability of the data being processed (frequent use of metadata)
* usability
* versatility

Working with data:

* keep data in its original form. For example, represent data by a SQL statement, and do not touch or move the data around unless necessary
* use native operations if possible: compose SQL statements, chain Python iterators, compose APIs
* performance is provided by the technology: the SQL optimizer should know best
* have options: custom operations are easy to create

Bubbles is performance agnostic at the low level of physical data implementation. Performance should be ensured by the data technology and proper use of operations.

Summary of current operations:
http://www.scribd.com/doc/147247069/Bubbles-Brewery2-Operations

More will come; at least basic Mongo ops are planned for 0.2.

GitHub: https://github.com/Stiivi/bubbles

If you have any comments, suggestions or questions, let me know.

Cheers,

Stefan

--
http://mail.python.org/mailman/listinfo/python-announce-list

Support the Python Software Foundation:
http://www.python.org/psf/donations/
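
Appendix: to make the "multiple representations" and "use native operations" points above more concrete, here is a minimal sketch in plain Python using only the standard-library sqlite3 module. It does not use the Bubbles API. The DataObject class, the filter_by_value operation and the people table are hypothetical names invented for this illustration; the real framework may look quite different.

```python
import sqlite3


class DataObject:
    """Hypothetical stand-in for a tabular data object (NOT the Bubbles class).

    It is backed either by a SQL statement plus a connection, or by a plain
    iterable of row tuples, and reports which representations it can offer.
    """

    def __init__(self, fields, rows=None, connection=None,
                 statement=None, params=()):
        self.fields = list(fields)
        self._rows = rows
        self.connection = connection
        self.statement = statement
        self.params = tuple(params)

    def representations(self):
        # A SQL-backed object can also be read as rows, but not the reverse.
        return ["sql", "rows"] if self.statement is not None else ["rows"]

    def rows(self):
        if self.statement is not None:
            # Data is touched only here, when rows are actually requested.
            return iter(self.connection.execute(self.statement, self.params))
        return iter(self._rows)


def filter_by_value(obj, field, value):
    """Keep rows where `field` equals `value`, using the native operation."""
    if field not in obj.fields:
        raise KeyError(field)
    if "sql" in obj.representations():
        # Compose a new statement around the old one; nothing is fetched yet.
        statement = f'SELECT * FROM ({obj.statement}) AS src WHERE "{field}" = ?'
        return DataObject(obj.fields, connection=obj.connection,
                          statement=statement, params=obj.params + (value,))
    # No SQL available: chain a lazy generator over the source rows.
    index = obj.fields.index(field)
    filtered = (row for row in obj.rows() if row[index] == value)
    return DataObject(obj.fields, rows=filtered)


# Hypothetical sample data, used only to exercise both code paths.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, city TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [("Ann", "Bratislava"), ("Bob", "Vienna"),
                  ("Eva", "Bratislava")])

sql_people = DataObject(["name", "city"], connection=conn,
                        statement="SELECT name, city FROM people")
list_people = DataObject(["name", "city"],
                         rows=[("Ann", "Bratislava"), ("Bob", "Vienna")])

sql_result = filter_by_value(sql_people, "city", "Bratislava")
iter_result = filter_by_value(list_people, "city", "Vienna")
```

In this sketch the same operation name picks a different implementation depending on what the object can offer: the SQL-backed object stays a SQL statement (so the database optimizer does the work), while the list-backed object is filtered by a chained generator. Rows are fetched only when rows() is called on the result.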