Re: Best way to (re)load test data in Mongo DB
Nagy László Zsolt writes:

> We have a system where we have to create an exact copy of the original
> database for testing. The database size is over 800GB. [...]

That all sounds pretty cool, but it's precisely the opposite of what I'm trying to achieve: keeping things as simple as possible. Snapshotting is neat for testing, especially the kind of snapshot that writes deltas on top of some base. ZopeDB offers precisely this sort of feature directly, I've read; that's what I'd wish for from every database.

> For much smaller databases, you can (of course) use pure python code to
> insert test data into a test database. If it only takes seconds, then it
> is not a problem, right? I believe that for small tests (e.g. unit
> tests), using python code to populate a test database is just fine.

Yeah... I'd just hoped to push it further down, given that I only have a handful of entries for each collection.

> Regarding question #2, you can always directly give an _id for documents
> if you want:
>
> https://api.mongodb.com/python/current/api/bson/objectid.html#bson.objectid.ObjectId

Cheers. I'll give it another go.

Martin
--
https://mail.python.org/mailman/listinfo/python-list
Re: Best way to (re)load test data in Mongo DB
breamore...@gmail.com writes:

> I was going to suggest mocking. I'm no expert so I suggest that you
> search for "python test mock database" and go from there. Try to run
> tests with a real db and you're likely to be at it until domesday.

Mocking is definitely the order of the day for most tests, but I'd like to test the data layer itself, too, and I want a number of comprehensive functional tests as well, and these need to exercise the whole stack. Mocking is great so long as you also remember to test the things that you mock.

The point of this exercise is to eventually release it as a sort of example project of how to build a complex web application. Testing is particularly important to me since it's too often overlooked in tutorials, or covered only with trivial examples.

Martin
Best way to (re)load test data in Mongo DB
Hi!

I'm toying around with Pyramid and am using Mongo via MongoEngine for storage. I'm new to both Pyramid and MongoEngine. For every test case in the part of my suite that tests the data access layer I want to reload the database from scratch, but it feels like there should be a better and faster way than what I'm doing now.

I have two distinct issues:

1. What's the fastest way of resetting the database to a clean state?
2. How do I load data with Mongo's internal _id being kept persistent?

For issue #1: First of all, I'd very much prefer to avoid external client programs such as mongoimport, to keep the number of dependencies minimal. Thus if there's a good way to do it through MongoEngine or PyMongo, that'd be preferable.

My first shot at populating the database was simply to load data from a JSON file, use this to create my model objects (based on MongoEngine.Document) and save them to the DB. With a single-digit number of test cases and very limited data, this approach already takes close to a second, so I'm thinking there should be a faster way. It's Mongo, after all, not Oracle.

My second version uses the underlying PyMongo module's insert_many() function to add all the documents for each collection in one go, but for this small amount of data it doesn't seem any faster.

Which brings us to issue #2: For both of these strategies I'm unable to insert the Mongo ObjectId type _id. I haven't made _id properties part of my models, because they seem a bit... alien. I'd rather not include them solely to be able to load my test data properly. How can I populate _id as an ObjectId, not just as a string? (I'm assuming there's a difference, but it's never come up until now.)

Am I being too difficult? I haven't been able to find much written about this topic: discussions about mocking drown out everything else the moment you mention 'mongo' and 'test' in the same search.

Martin
Templating and XML modelling
Hi!

At our IT department we've developed a basic templating system for web apps in the spirit of Meld3 (which appears to have been abandoned some time ago), based on lxml. Here's what we like about it:

 * It's just a library, not a template language
 * It uses templates that are valid XHTML
 * It's trivial to generate tables and forms from database metadata
 * It's trivial to fill named elements à la format strings

While we like it, it's more code to maintain. Between the time when we started coding this and now, more new templating systems have appeared than I can reasonably evaluate. So now the question is whether we can find a good replacement or whether we should publish our code and hope that more people will adopt and help maintain it. So...

1) Can you suggest an alternative that sounds like a good fit?
2) Does our templating system sound like just what you've been looking for?

--
Martin Sand Christensen
IT Services, Dept. of Electronic Systems
Re: How decoupled are the Python frameworks?
J Kenneth King ja...@agentultra.com writes:

> [...] (though it sounds like cherrypy would be very good at separating
> dispatching from application code).

True. In CherryPy, each page is represented by one method (the 'default' method is an exception, but that's not for this discussion). This method is expected to return a string representing the page/resource that the user requested. At this simple level, CherryPy can be considered more or less just an HTTP server and dispatcher.

However, we all know that this isn't where it ends. When we want to handle cookies, we need framework-specific code. When we want to return something other than HTML, we need framework-specific code. The list goes on.

However, with a reasonable coding style, it can be quite practical to separate strict application code from the framework-specific code. Here the one-method, one-page principle is a great help. For instance, we quite often use decorators for such things as authentication and role checking, which is both very practical and technically elegant. If a user must have a CAS single sign-on identity AND, say, the administrator role, we'd do as follows:

    @cas
    @role('administrator')
    def protectedpage(self, ...):
        # stuff

If the user isn't currently signed in to our CAS, he'll be redirected to the sign-in page and, after signing in, is returned to the page he originally requested. The role decorator checks his privileges (based on his CAS credentials) and either allows or denies him access. This adds up to a LOT of framework-specific code that's been very easily factored out. The CAS and role modules behind the decorators are, in turn, generic frameworks that we've merely specialised for CherryPy. At some point we'll get around to releasing some code. :-)

As a slight aside, allow me to recommend Meld3 as a good templating library. It's basically ElementTree with a lot of practical templating stuff on top, so it's not a mini-language unto itself, and you don't embed your code in the page.

--
Martin Sand Christensen
IT Services, Dept. of Electronic Systems
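
The decorator approach described in the message above can be sketched without any framework at all. This is only a minimal illustration, not the CAS or role modules the message refers to: `role`, `AccessDenied` and `Site` are hypothetical names, and the user is passed in explicitly as a dict rather than being looked up from a CherryPy session.

```python
import functools


class AccessDenied(Exception):
    """Raised when the user lacks the required role (hypothetical name)."""


def role(required):
    """Return a decorator that only lets users holding `required` through."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(self, user, *args, **kwargs):
            # Assumption: `user` is a dict with a "roles" collection.
            if required not in user.get("roles", ()):
                raise AccessDenied(required)
            return handler(self, user, *args, **kwargs)
        return wrapper
    return decorator


class Site:
    @role("administrator")
    def protectedpage(self, user):
        # Pure application code: no role checking in here.
        return "Hello, %s" % user["name"]
```

The page method itself contains nothing but application logic; swapping the framework would, under these assumptions, mean rewriting only the decorator.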
Re: How decoupled are the Python frameworks?
Lie Ryan lie.1...@gmail.com writes:

> In the end, it is the developer's responsibility not to write
> something too tightly coupled with their framework, isn't it? (or at
> least to minimize the framework-specific code to a certain area)

That's a good summary of my point. However, I have very little experience with other frameworks than CherryPy, so I do not want to draw any general conclusions. My programmer's instincts say that it's true, though.

--
Martin Sand Christensen
IT Services, Dept. of Electronic Systems
Re: New CMS in Python
:-) [EMAIL PROTECTED] writes:

> [...] Please send me how U expect your new CMS would be?

I don't mean to be disparaging, but I'd like to point out a few things that I'm not entirely sure you've thought through.

Firstly, there is currently no shortage of CMSs, and especially not of half-completed, poorly thought out CMSs (not, of course, that I'm saying yours is necessarily poorly thought out). Unless you bring some new ideas to the playing field, or unless you think you can do something better than the existing CMSs already do, the world isn't going to care about your particular CMS. If you're writing your CMS for fun and experience, that's great for you and probably a very good idea, but don't expect many other people to show enthusiasm.

Secondly, and related to my first point, if you're asking for people's input on how they'd like your CMS to be, you've probably not given purpose and design a lot of thought, and so I must reiterate that unless you do something new or something better, don't expect anyone to care.

If you want to build a community around your project, the best thing to do is generally to get something working on your own and then open up to others when there's actually something to work on. And that means that you should have a reasonably clear idea where you're going before setting out.

Anyway, the best of luck to you!

--
Martin Sand Christensen
IT Services, Dept. of Electronic Systems
Re: how would you...?
inhahe == inhahe [EMAIL PROTECTED] writes:

inhahe> Btw, use float() to convert a textual GPA to a number.

It would be much better to use Decimal() instead of float(). A GPA of 3.6001 probably doesn't make much sense; this problem doesn't arise when using the Decimal type.

Martin
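
To illustrate the point about Decimal, here is a small sketch using only the standard library. The GPA value "3.6" is just an example input.

```python
from decimal import Decimal

text = "3.6"

as_float = float(text)
as_decimal = Decimal(text)

# The float is only the nearest binary approximation of 3.6, which shows
# up as soon as more digits are requested.
print("%.20f" % as_float)

# The Decimal keeps exactly the digits that were typed.
print(as_decimal)

# Arithmetic stays exact in decimal terms as well.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
print(0.1 + 0.2 == 0.3)                                    # False
```

Note that the Decimal has to be built from the string; building it from a float would carry the binary approximation along.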
Why don't generators execute until first yield?
Hi!

First a bit of context.

Yesterday I spent a lot of time debugging the following method in a rather slim database abstraction layer we've developed:

    def selectColumn(self, table, column, where={}, order_by=[], group_by=[]):
        """Performs a SQL select query returning a single column

        The column is returned as a list. An exception is thrown if the
        result is not a single column."""
        query = build_select(table, [column], where, order_by, group_by)
        result = DBResult(self.rawQuery(query))
        if result.colcount != 1:
            raise QueryError("Query must return exactly one column", query)
        for row in result.fetchAllRowsAsList():
            yield row[0]

I'd just rewritten the method as a generator rather than returning a list of results. The following test then failed:

    def testSelectColumnMultipleColumns(self):
        res = self.fdb.selectColumn('db3ut1', ['c1', 'c2'],
                                    {'c1':(1, 2)}, order_by='c1')
        self.assertRaises(db3.QueryError, self.fdb.selectColumn,
                          'db3ut1', ['c1', 'c2'], {'c1':(1, 2)}, order_by='c1')

I expected this to raise a QueryError due to the result.colcount != 1 constraint being violated (as was the case before), but that isn't the case. The constraint is not violated until I get the first result from the generator.

Now to the main point. When a generator function is run, it immediately returns a generator, and it does not run any code inside the generator. Not until generator.next() is called is any code inside the generator executed, giving it traditional lazy evaluation semantics. Why don't generators follow the usual eager evaluation semantics of Python and immediately execute up until right before the first yield instead? Giving generators special case semantics for no good reason is a really bad idea, so I'm very curious if there is a good reason for it being this way. With the current semantics it means that errors can pop up at unexpected times rather than the code failing fast.

Martin
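
The behaviour described in the message above can be reproduced in a few lines. This is a stripped-down sketch rather than the database layer from the post: `select_column` and `select_column_checked` are hypothetical stand-ins, and a plain `ValueError` plays the part of `QueryError`. The second function shows one common way to get the check to fail fast, by validating in an ordinary function and returning an inner generator.

```python
def select_column(colcount):
    """Generator version: the check is part of the generator body."""
    if colcount != 1:
        raise ValueError("Query must return exactly one column")
    yield "row"


# Calling the generator function runs none of its body, so no error yet.
gen = select_column(2)

# The check only runs when the first item is requested.
try:
    next(gen)
except ValueError as exc:
    print("raised on first next():", exc)


def select_column_checked(colcount):
    """Plain function: validate eagerly, then hand back a generator."""
    if colcount != 1:
        raise ValueError("Query must return exactly one column")

    def rows():
        yield "row"

    return rows()
```

With the second form, a test built on assertRaises against the call itself would see the exception, since the check no longer lives inside the generator body.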
Re: Why don't generators execute until first yield?
Ian == Ian Kelly [EMAIL PROTECTED] writes:

Ian> Isn't lazy evaluation sort of the whole point of replacing a list
Ian> with an iterator? Besides which, running up to the first yield when
Ian> instantiated would make the generator's first iteration
Ian> inconsistent with the remaining iterations.

That wasn't my idea, although that may not have come across quite clearly enough. I wanted the generator to immediately run until right before the first yield so that the first call to next() would start with the first yield.

My objection is that generators _by default_ have different semantics than the rest of the language. Lazy evaluation as a concept is great for all the benefits it can provide, but, as I've illustrated, strictly lazy evaluation semantics can be somewhat surprising at times and lead to problems that are hard to debug if you don't constantly bear the difference in mind. In this respect, it seems to me that my suggestion would be an improvement. I'm not any kind of expert on languages, though, and I may very well be missing a part of the bigger picture that makes it obvious why things should be as they are.

As for code to slightly change the semantics of generators, that doesn't really address the issue as I see it: if you're going to apply such code to your generators, you're probably doing it exactly because you're aware of the difference in semantics, and you're not going to be surprised by it. You may still want to change the semantics, but for reasons that are irrelevant to my point.

Martin
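
For reference, the kind of "code to slightly change the semantics of generators" mentioned above might look like the following sketch. `eager_start` is a hypothetical name, not something from the thread or the standard library. Note that it runs the generator up to and including its first yield (it has to fetch the first value to get there), which is close to, but not exactly, the "right before the first yield" behaviour proposed in the message.

```python
import functools
import itertools


def eager_start(genfunc):
    """Run the generator up to its first yield at call time."""
    @functools.wraps(genfunc)
    def wrapper(*args, **kwargs):
        gen = genfunc(*args, **kwargs)
        try:
            first = next(gen)          # executes code before the first yield
        except StopIteration:
            return iter(())            # generator produced nothing
        # Put the already-fetched item back in front of the rest.
        return itertools.chain([first], gen)
    return wrapper


@eager_start
def select_column(colcount):
    if colcount != 1:
        raise ValueError("Query must return exactly one column")
    yield "first"
    yield "second"
```

The object returned is an itertools.chain rather than a true generator, so it supports iteration but not send() or throw(). That is one reason such a wrapper only approximates a change in the language semantics.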
Re: Why don't generators execute until first yield?
Duncan == Duncan Booth [EMAIL PROTECTED] writes:

[...]

Duncan> Now try:
Duncan>
Duncan>     for command in getCommandsFromUser():
Duncan>         print "the result of that command was", execute(command)
Duncan>
Duncan> where getCommandsFromUser is a greedy generator that reads from stdin,
Duncan> and see why generators don't work that way.

I don't see a problem unless the generator isn't defined where it's going to be used. In other similar input bound use cases, such as the generator iterating over a query result set in my original post, I see even less of a problem. Maybe I'm simply daft and you need to spell it out for me. :-)

Martin
Re: Code folder with Emacs
Grant == Grant Edwards [EMAIL PROTECTED] writes:

Grant> Has anybody figured out how to do code folding of Python source
Grant> files in emacs?

I use outline-minor-mode with the following home baked configuration:

    ;; Python stuff for outline mode.
    (defvar py-outline-regexp "^\\([ \t]*\\)\\(def\\|class\\|if\\|elif\\|else\\|while\\|for\\|try\\|except\\|with\\)"
      "This variable defines what constitutes a 'headline' to outline mode.")

    (defun py-outline-level ()
      "Report outline level for Python outlining."
      (save-excursion
        (end-of-line)
        (let ((indentation (progn
                             (re-search-backward py-outline-regexp)
                             (match-string-no-properties 1))))
          (if (and (> (length indentation) 0)
                   (string= "\t" (substring indentation 0 1)))
              (length indentation)
            (/ (length indentation) py-indent-offset)))))

    (add-hook 'python-mode-hook
              '(lambda ()
                 (outline-minor-mode 1)
                 (setq outline-regexp py-outline-regexp
                       outline-level 'py-outline-level)))

Martin
Re: new style class
gert == gert [EMAIL PROTECTED] writes:

gert> Why doesn't this new style class work in python 2.5.1 ?

Whether you declare your class as a new style class or an old style class, your code is completely and utterly broken. Calling non-existing methods has never been a good way of getting things done. :-)

Martin
Re: Copy database with python..
Abandoned == Abandoned [EMAIL PROTECTED] writes:

Abandoned> Yes i understand thank you. Now i find that maybe help the
Abandoned> other users.
Abandoned>
Abandoned> import os
Abandoned> os.system("su postgres")
Abandoned> ...

I get the distinct impression that you're trying to replace simple shell scripting with Python. While it's possible, you're probably making things much more complicated than they need to be.

Unless you're actually doing something with all that data of yours, don't use Python where a simple shell script will be much smaller and cleaner.

Martin
Re: Copy database with python..
Abandoned == Abandoned [EMAIL PROTECTED] writes:

Abandoned> I want to copy my database but python give me error when i
Abandoned> use this command. cursor.execute("pg_dump mydata > old.dump")
Abandoned> What is the problem ?

cursor.execute() is for executing SQL commands, and this is not an SQL command, but rather a shell command.

Abandoned> And how can i copy the database with python ? Note: The
Abandoned> database's size is 200 GB

If you want to do this from Python, run it as a separate process.

Martin
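
Running the dump as a separate process could look roughly like this with the standard subprocess module. It is a sketch under assumptions: the helper names `build_pg_dump_command` and `dump_database` are made up for illustration, pg_dump is assumed to be on PATH, and connection details are assumed to come from the usual PostgreSQL environment variables.

```python
import subprocess


def build_pg_dump_command(dbname, outfile):
    """Build the argument list for pg_dump; no shell is involved."""
    # --file tells pg_dump where to write, replacing the shell's "> file".
    return ["pg_dump", "--file", outfile, dbname]


def dump_database(dbname, outfile):
    """Run pg_dump as a child process and raise if it fails."""
    # Assumes pg_dump is on PATH and connection settings come from the
    # environment (PGHOST, PGUSER, ...).
    subprocess.run(build_pg_dump_command(dbname, outfile), check=True)
```

Calling dump_database("mydata", "old.dump") would then correspond to the command in the quoted message. Since pg_dump writes the file itself, none of the 200 GB passes through the Python process.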
Re: Proxying downloads
> But I can't figure out how I would solve the following:
>
> 1 Alice asks Sam for foobar.iso
> 2 Sam can't find foobar.iso in cachedir
> 3 Sam requests foobar.iso from uplink
> 4 Sam saves and forwards to Alice
> 5 At about 30 % of the download Bob asks Sam for foobar.iso
> 6 How do I serve Bob now?

Let every file in your download cache be represented by a Python object. Instead of streaming the file directly to the clients, you can stream the objects. The object will know if the file it represents has finished downloading or not, where the file is located etc. This way you can also, for the sake of persistence, keep partially downloaded files separate from the completely downloaded files, as per a previous suggestion, so that you won't start serving half files after a crash, and it'll be completely transparent in all code except for your proxy file objects.

Martin
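
A proxy file object along the lines suggested above might be sketched as follows. Everything here is an assumption made for illustration: the class name `CachedFile`, its methods, and the choice to keep chunks in memory (a real cache for ISO images would read them back from disk). The point is only that late readers such as Bob can start from the beginning and wait for the parts that haven't arrived yet.

```python
import threading


class CachedFile:
    """One cache entry that can be read while it is still downloading."""

    def __init__(self):
        self._chunks = []
        self._complete = False
        self._cond = threading.Condition()

    def append(self, chunk):
        """Called by the downloader for every chunk received from upstream."""
        with self._cond:
            self._chunks.append(chunk)
            self._cond.notify_all()

    def finish(self):
        """Called by the downloader once the upstream transfer is done."""
        with self._cond:
            self._complete = True
            self._cond.notify_all()

    def stream(self):
        """Yield all chunks from the start, waiting for ones still to come."""
        index = 0
        while True:
            with self._cond:
                while index >= len(self._chunks) and not self._complete:
                    self._cond.wait()
                if index >= len(self._chunks):
                    return
                chunk = self._chunks[index]
            index += 1
            yield chunk
```

In this sketch Sam would call append() and finish() from the thread talking to the uplink, while each client connection iterates over its own stream(). Alice and Bob then share one download regardless of when they asked.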