Re: Best way to (re)load test data in Mongo DB

2019-02-25 Thread Martin Sand Christensen
Nagy László Zsolt  writes:

> We have a system where we have to create an exact copy of the original
> database for testing. The database size is over 800GB. [...]

That all sounds pretty cool, but it's precisely the opposite of what I'm
trying to achieve: keeping things as simple as possible. Snapshotting is
neat for testing, especially the type of snapshot that writes
deltas on top of some base. ZopeDB offers precisely this sort of
feature directly, I've read; that's what I'd wish from every database.

> For much smaller databases, you can (of course) use pure python code to
> insert test data into a test database. If it only takes seconds, then it
> is not a problem, right? I believe that for small tests (e.g. unit
> tests), using python code to populate a test database is just fine.

Yeah... I'd just hoped to push it further down given that I only have
a handful of entries for each collection.

> Regarding question #2, you can always directly give an _id for documents
> if you want:
>
> https://api.mongodb.com/python/current/api/bson/objectid.html#bson.objectid.ObjectId

Cheers. I'll give it another go.


Martin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Best way to (re)load test data in Mongo DB

2019-02-24 Thread Martin Sand Christensen
breamore...@gmail.com writes:
> I was going to suggest mocking. I'm no expert so I suggest that you
> search for "python test mock database" and go from there. Try to run
> tests with a real db and you're likely to be at it until domesday.

Mocking is definitely the order of the day for most tests, but I'd like
to test the data layer itself, too, and I want a number of comprehensive
function tests as well, and these need to exercise the whole stack.
Mocking is great so long as you also remember to test the things that
you mock.

The point of this exercise is to eventually release it as a sort of
example project of how to build a complex web application. Testing is
particularly important to me since tutorials too often overlook it or
cover only trivial examples.


Martin


Best way to (re)load test data in Mongo DB

2019-02-23 Thread Martin Sand Christensen
Hi!

I'm toying around with Pyramid and am using Mongo via MongoEngine for
storage. I'm new to both Pyramid and MongoEngine. For every test case in
the part of my suite that tests the data access layer I want to reload
the database from scratch, but it feels like there should be a better
and faster way than what I'm doing now.

I have two distinct issues:
1. What's the fastest way of resetting the database to a clean state?
2. How do I load data with Mongo's internal _id being kept persistent?

For issue #1:
First of all I'd very much prefer to avoid having to use external client
programs such as mongoimport to keep the number of dependencies minimal.
Thus if there's a good way to do it through MongoEngine or PyMongo,
that'd be preferable.

My first shot at populating the database was simply to load data from a
JSON file, use this to create my model objects (based on
MongoEngine.Document) and save them to the DB. With a single-digit
number of test cases and very limited data, this approach already takes
close to a second, so I'm thinking there should be a faster way. It's
Mongo, after all, not Oracle.

My second version uses the underlying PyMongo module's insert_many()
function to add all the documents for each collection in one go, but for
this small amount of data it doesn't seem any faster.
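A minimal sketch of the reset step using PyMongo alone: drop each collection and bulk-insert its fixture documents in one call. The function name `reset_collections` and the fixture layout (a dict mapping collection names to lists of documents) are assumptions made for illustration; `db` is expected to be a PyMongo `Database` object.

```python
def reset_collections(db, fixture):
    """Drop and repopulate every collection named in `fixture`.

    db      -- a pymongo Database, e.g. MongoClient()["myapp_test"] (assumed name)
    fixture -- {"collection_name": [document, ...]} (assumed layout)
    Returns the number of documents inserted.
    """
    inserted = 0
    for name, documents in fixture.items():
        collection = db[name]
        collection.drop()            # removes the documents and the indexes
        if documents:                # insert_many() rejects an empty list
            # Copy each dict: insert_many() adds an _id to documents lacking one,
            # which would otherwise leak into the fixture between test cases.
            collection.insert_many([dict(doc) for doc in documents])
            inserted += len(documents)
    return inserted
```

If the indexes should survive the reset, `collection.delete_many({})` could replace `collection.drop()`. With only a handful of documents, the round trips to the server probably dominate either way.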

Which brings us to issue #2:
For both of these strategies I'm unable to insert the Mongo ObjectId
type _id. I haven't made _id properties part of my models, because they
seem a bit... alien. I'd rather not include them solely to be able to
load my test data properly. How can I populate _id as an ObjectId, not
just as a string? (I'm assuming there's a difference, but it's never
come up until now.)
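There is a difference: a string and an ObjectId are distinct BSON types, so a document stored with a string `_id` will not match a query for the corresponding ObjectId. One possible approach, sketched below, is to write the ids in the fixture file using MongoDB's Extended JSON notation, `{"$oid": "<24 hex digits>"}`, and convert them while parsing. The name `load_fixture` and the `make_object_id` parameter are assumptions for illustration; the constructor is passed in so that the sketch runs without PyMongo installed, and in practice it would be `bson.ObjectId`.

```python
import json


def load_fixture(text, make_object_id):
    """Parse fixture JSON, turning {"$oid": "<hex>"} values into ObjectIds.

    make_object_id -- callable that builds the id from its hex string;
                      pass bson.ObjectId when PyMongo is installed.
    """
    def hook(obj):
        # A dict whose only key is "$oid" is the Extended JSON form of an ObjectId.
        if set(obj) == {"$oid"}:
            return make_object_id(obj["$oid"])
        return obj

    return json.loads(text, object_hook=hook)


# Usage against a real database (assumes PyMongo is installed):
#   from bson import ObjectId
#   fixture = load_fixture(open("fixture.json").read(), ObjectId)
```

PyMongo's own `bson.json_util.loads()` understands the same notation and may make the hook unnecessary. Either way the models do not need an explicit `_id` field just for the sake of the test data.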


Am I being too difficult? I haven't been able to find much written about
this topic: discussions about mocking drown out everything else the
moment you mention 'mongo' and 'test' in the same search.


Martin


Templating and XML modelling

2012-11-13 Thread Martin Sand Christensen

Hi!

At our IT department we've developed a basic templating system for web 
apps in the spirit of Meld3 (which appears to have been abandoned some 
time ago), based on lxml. Here's what we like about it:


* It's just a library, not a template language
* It uses templates that are valid XHTML
* It's trivial to generate tables and forms from database metadata
* It's trivial to fill named elements à la format strings

While we like it, it's more code to maintain. Between the time when we 
started coding this and now, more new templating systems have appeared 
than I can reasonably evaluate. So now the question is whether we can 
find a good replacement or whether we should publish our code and hope 
that more people will adopt and help maintain it.


So...

1) Can you suggest a good alternative that sounds like a good fit?
2) Does our templating system sound like just what you've been looking for?

--
Martin Sand Christensen
IT Services, Dept. of Electronic Systems


Re: How decoupled are the Python frameworks?

2009-12-08 Thread Martin Sand Christensen
J Kenneth King ja...@agentultra.com writes:
> [...] (though it sounds like cherrypy would be very good at separating
> dispatching from application code).

True. In CherryPy, each page is represented by one method (the 'default'
method is an exception, but that's not for this discussion). This method
is expected to return a string representing the page/resource that the
user requested. At this simple level, CherryPy can be considered more or
less just an HTTP server and dispatcher.

However, we all know that this isn't where it ends. When we want to
handle cookies, we need framework-specific code. When we want to return
something other than HTML, we need framework-specific code. The list
goes on.

However, with a reasonable coding style, it can be quite practical to
separate strict application code from the framework-specific code. Here
the one-method, one-page principle is a great help. For instance, we
quite often use decorators for such things as authentication and role
checking, which is both very practical and technically elegant. For
instance, if a user must have a CAS single sign-on identity AND, say,
the administrator role, we'd do as follows:

@cas
@role('administrator')
def protectedpage(self, ...):
    # stuff

If the user isn't currently signed in to our CAS, he'll be redirected to
the sign-in page and, after signing in, is returned to the page he
originally requested. The role decorator checks his privileges (based on
his CAS credentials) and either allows or denies him access. This adds
up to a LOT of framework-specific code that's been very easily factored
out. The CAS and role modules behind the decorators are, in turn,
generic frameworks that we've merely specialised for CherryPy. At some
point we'll get around to releasing some code. :-)
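To make the idea concrete, here is a framework-neutral sketch of what a role-checking decorator of this kind might look like. It is not the code described above: `AccessDenied`, the `get_roles` parameter, and the `current_roles` set are all hypothetical stand-ins for whatever the real session and CAS lookup would provide.

```python
import functools


class AccessDenied(Exception):
    """Raised when the current user lacks the required role."""


def role(required, get_roles):
    """Decorator factory: allow the call only if `required` is among the user's roles.

    get_roles -- hypothetical zero-argument callable returning the current
                 user's roles; a real version would consult the session.
    """
    def decorate(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            if required not in get_roles():
                raise AccessDenied(required)
            return handler(*args, **kwargs)
        return wrapper
    return decorate


current_roles = set()   # stand-in for per-request session state


class Pages:
    @role("administrator", get_roles=lambda: current_roles)
    def protectedpage(self):
        # Pure application code: no authentication logic in here.
        return "secret"
```

The page method itself stays free of framework-specific checks; only the decorator would need to change if the application moved to another framework.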

As a slight aside, allow me to recommend Meld3 as a good templating
library. It's basically ElementTree with a lot of practical templating
stuff on top, so it's not a mini-language unto itself, and you don't
embed your code in the page.

-- 
Martin Sand Christensen
IT Services, Dept. of Electronic Systems


Re: How decoupled are the Python frameworks?

2009-12-08 Thread Martin Sand Christensen
Lie Ryan lie.1...@gmail.com writes:

> In the end, it is the developer's responsibility not to write
> something too tightly coupled with their framework, isn't it? (or at
> least to minimize the framework-specific code to a certain area)

That's a good summary of my point. However, I have very little
experience with other frameworks than CherryPy, so I do not want to draw
any general conclusions. My programmer's instincts say that it's true,
though.

-- 
Martin Sand Christensen
IT Services, Dept. of Electronic Systems


Re: New CMS in Python

2008-07-31 Thread Martin Sand Christensen
 :-)  [EMAIL PROTECTED] writes:

[...]
> Please send me how U expect your new CMS would be?

I don't mean to be disparaging, but I'd like to point out a few things
that I'm not entirely sure you've thought through.

Firstly, there is currently no shortage of CMSes, and especially not of
half-completed, poorly thought out CMSes (not, of course, that I'm saying
yours is necessarily poorly thought out). Unless you bring some new
ideas to the playing field, or unless you think you can do something
better than the existing CMSes already do, the world isn't going to care
about your particular CMS. If you're writing your CMS for fun and
experience, that's great for you and probably a very good idea, but
don't expect many other people to show enthusiasm.

Secondly and related to my first point, if you're asking for people's
input on how they'd like your CMS to be, you've probably not given
purpose and design a lot of thought, and so I must reiterate that unless
you do something new or something better, don't expect anyone to care.
If you want to build a community around your project, the best thing to
do is generally to get something working on your own and then open up
for others when there's actually something to work on. And that means
that you should have a reasonably clear idea where you're going before
setting out.

Anyway, the best of luck to you!

-- 
Martin Sand Christensen
IT Services, Dept. of Electronic Systems


Re: how would you...?

2008-05-19 Thread Martin Sand Christensen
>>>>> "inhahe" == inhahe  [EMAIL PROTECTED] writes:
inhahe> Btw, use float() to convert a textual GPA to a number.

It would be much better to use Decimal() instead of float(). A GPA of
3.6001 probably doesn't make much sense; this problem
doesn't arise when using the Decimal type.
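A short illustration of the difference, using only the standard library. A binary float cannot represent 3.6 exactly, whereas `Decimal` keeps the value exactly as it was typed; the variable names are for illustration only.

```python
from decimal import Decimal

text = "3.6"                      # a GPA as read from a text field
as_float = float(text)
as_decimal = Decimal(text)

# 17 significant digits reveal the binary approximation stored for 3.6.
print(format(as_float, ".17g"))
# The Decimal prints back exactly as entered.
print(as_decimal)

# The difference also shows up in arithmetic and comparisons:
print(0.1 + 0.2 == 0.3)                                     # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))    # True
```

Note that the `Decimal` should be built from the text, not from an existing float, or the approximation is carried over.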

Martin


Why don't generators execute until first yield?

2008-05-07 Thread Martin Sand Christensen
Hi!

First a bit of context.

Yesterday I spent a lot of time debugging the following method in a
rather slim database abstraction layer we've developed:

,----
| def selectColumn(self, table, column, where={}, order_by=[], group_by=[]):
|     """Performs a SQL select query returning a single column
|
|     The column is returned as a list. An exception is thrown if the
|     result is not a single column."""
|     query = build_select(table, [column], where, order_by, group_by)
|     result = DBResult(self.rawQuery(query))
|     if result.colcount != 1:
|         raise QueryError("Query must return exactly one column", query)
|     for row in result.fetchAllRowsAsList():
|         yield row[0]
`----

I'd just rewritten the method as a generator rather than returning a
list of results. The following test then failed:

,----
| def testSelectColumnMultipleColumns(self):
|     res = self.fdb.selectColumn('db3ut1', ['c1', 'c2'],
|                                 {'c1':(1, 2)}, order_by='c1')
|     self.assertRaises(db3.QueryError, self.fdb.selectColumn,
|                       'db3ut1', ['c1', 'c2'], {'c1':(1, 2)}, order_by='c1')
`----

I expected this to raise a QueryError due to the result.colcount != 1
constraint being violated (as was the case before), but that isn't the
case. The constraint is not violated until I get the first result from
the generator.

Now to the main point. When a generator function is run, it immediately
returns a generator, and it does not run any code inside the generator.
Not until generator.next() is called is any code inside the generator
executed, giving it traditional lazy evaluation semantics. Why don't
generators follow the usual eager evaluation semantics of Python and
immediately execute up until right before the first yield instead?
Giving generators special case semantics for no good reason is a really
bad idea, so I'm very curious if there is a good reason for it being
this way. With the current semantics it means that errors can pop up at
unexpected times rather than the code failing fast.
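The behaviour can be reproduced in a few lines. The sketch below uses made-up names (`checked_numbers`, `checked_numbers_eager`) and the Python 3 spelling `next(gen)` rather than `gen.next()`. The second function shows one common way to get a check that fails fast under the current semantics: do the validation in an ordinary function and return an inner generator.

```python
def checked_numbers(limit):
    """Generator function: nothing in the body runs until iteration starts."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    for n in range(limit):
        yield n


gen = checked_numbers(-1)        # no exception here: the body has not started
try:
    next(gen)                    # the check runs only now
except ValueError as exc:
    print("raised on first next():", exc)


def checked_numbers_eager(limit):
    """Ordinary function: the check runs at call time, the loop stays lazy."""
    if limit < 0:
        raise ValueError("limit must be non-negative")

    def generate():
        for n in range(limit):
            yield n

    return generate()
```

With the second form, a test like the one above that uses assertRaises on the call itself would pass, since the exception is raised before any generator is returned.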

Martin


Re: Why don't generators execute until first yield?

2008-05-07 Thread Martin Sand Christensen
>>>>> "Ian" == Ian Kelly [EMAIL PROTECTED] writes:
Ian> Isn't lazy evaluation sort of the whole point of replacing a list
Ian> with an iterator? Besides which, running up to the first yield when
Ian> instantiated would make the generator's first iteration
Ian> inconsistent with the remaining iterations.

That wasn't my idea, although that may not have come across quite
clearly enough. I wanted the generator to immediately run until right
before the first yield so that the first call to next() would start with
the first yield.

My objection is that generators _by default_ have different semantics
than the rest of the language. Lazy evaluation as a concept is great for
all the benefits it can provide, but, as I've illustrated, strictly lazy
evaluation semantics can be somewhat surprising at times and lead to
problems that are hard to debug if you don't constantly bear the
difference in mind. In this respect, it seems to me that my suggestion
would be an improvement. I'm not any kind of expert on languages,
though, and I may very well be missing a part of the bigger picture that
makes it obvious why things should be as they are.

As for code to slightly change the semantics of generators, that doesn't
really address the issue as I see it: if you're going to apply such code
to your generators, you're probably doing it exactly because you're
aware of the difference in semantics, and you're not going to be
surprised by it. You may still want to change the semantics, but for
reasons that are irrelevant to my point.
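For reference, code of the kind referred to here might look like the following sketch. The decorator name `eager_start` is made up. Note that it runs the body through the first yield, computing the first value up front, which is slightly further than stopping right before the first yield.

```python
import functools
import itertools


def eager_start(genfunc):
    """Hypothetical decorator: start the generator body at call time."""
    @functools.wraps(genfunc)
    def wrapper(*args, **kwargs):
        gen = genfunc(*args, **kwargs)
        try:
            first = next(gen)            # runs the body up to the first yield
        except StopIteration:
            return iter(())              # the generator produced nothing
        # Hand back the value already computed, followed by the rest.
        return itertools.chain([first], gen)
    return wrapper


@eager_start
def rows(fail):
    if fail:
        raise ValueError("query must return exactly one column")
    yield "row 1"
    yield "row 2"
```

Calling `rows(True)` raises immediately, while `rows(False)` returns an iterator that still yields both rows.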

Martin


Re: Why don't generators execute until first yield?

2008-05-07 Thread Martin Sand Christensen
>>>>> "Duncan" == Duncan Booth [EMAIL PROTECTED] writes:
[...]
Duncan> Now try:
Duncan>
Duncan>     for command in getCommandsFromUser():
Duncan>         print "the result of that command was", execute(command)
Duncan>
Duncan> where getCommandsFromUser is a greedy generator that reads from stdin,
Duncan> and see why generators don't work that way.

I don't see a problem unless the generator isn't defined where it's
going to be used. In other similar input bound use cases, such as the
generator iterating over a query result set in my original post, I see
even less of a problem. Maybe I'm simply daft and you need to spell it
out for me. :-)

Martin


Re: Code folder with Emacs

2008-03-25 Thread Martin Sand Christensen
>>>>> "Grant" == Grant Edwards [EMAIL PROTECTED] writes:
Grant> Has anybody figured out how to do code folding of Python source
Grant> files in emacs?

I use outline-minor-mode with the following home baked configuration:

;; Python stuff for outline mode.
(defvar py-outline-regexp "^\\([ \t]*\\)\\(def\\|class\\|if\\|elif\\|else\\|while\\|for\\|try\\|except\\|with\\)"
  "This variable defines what constitutes a 'headline' to outline mode.")

(defun py-outline-level ()
  "Report outline level for Python outlining."
  (save-excursion
    (end-of-line)
    (let ((indentation (progn
                         (re-search-backward py-outline-regexp)
                         (match-string-no-properties 1))))
      (if (and (> (length indentation) 0)
               (string= "\t" (substring indentation 0 1)))
          (length indentation)
        (/ (length indentation) py-indent-offset)))))

(add-hook 'python-mode-hook
          '(lambda ()
             (outline-minor-mode 1)
             (setq
              outline-regexp py-outline-regexp
              outline-level 'py-outline-level)))


Martin


Re: new style class

2007-11-02 Thread Martin Sand Christensen
>>>>> "gert" == gert  [EMAIL PROTECTED] writes:
gert> Why doesn't this new style class work in python 2.5.1 ?

Whether you declare your class as a new style class or an old style
class, your code is completely and utterly broken. Calling non-existing
methods has never been a good way of getting things done. :-)

Martin


Re: Copy database with python..

2007-11-02 Thread Martin Sand Christensen
>>>>> "Abandoned" == Abandoned  [EMAIL PROTECTED] writes:
Abandoned> Yes i understand thank you. Now i find that maybe help the
Abandoned> other users.

Abandoned> import os
Abandoned> os.system("su postgres")
Abandoned> ...

I get the distinct impression that you're trying to replace simple shell
scripting with Python. While it's possible, you're probably making
things much more complicated than they need to be. Unless you're
actually doing something with all that data of yours, don't use Python
where a simple shell script will be much smaller and cleaner.

Martin


Re: Copy database with python..

2007-11-02 Thread Martin Sand Christensen
>>>>> "Abandoned" == Abandoned  [EMAIL PROTECTED] writes:
Abandoned> I want to copy my database but python give me error when i
Abandoned> use this command. cursor.execute("pg_dump mydata > old.dump")
Abandoned> What is the problem ?

cursor.execute() is for executing SQL commands, and this is not an SQL
command, but rather a shell command.

Abandoned> And how can i copy the database with python ? Note: The
Abandoned> database's size is 200 GB

If you want to do this from Python, run it as a separate process.
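A minimal sketch of running the dump as a separate process with the standard `subprocess` module. The helper names `run_to_file` and `dump_database` are made up for illustration, and the sketch assumes `pg_dump` is on the PATH and that the calling user is allowed to read the database.

```python
import subprocess


def run_to_file(argv, path):
    """Run argv as a child process, writing its standard output to path."""
    with open(path, "wb") as out:
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(argv, stdout=out, check=True)


def dump_database(dbname, path):
    """Python equivalent of redirecting pg_dump's output to a file in a shell."""
    run_to_file(["pg_dump", dbname], path)


# Usage (not run here; needs a PostgreSQL installation):
#   dump_database("mydata", "old.dump")
```

The output goes straight from the child process to the file, so the size of the database does not affect Python's memory use.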

Martin


Re: Proxying downloads

2007-10-30 Thread Martin Sand Christensen
> But I can't figure out how I would solve the following:
>
> 1 Alice asks Sam for foobar.iso
> 2 Sam can't find foobar.iso in cachedir
> 3 Sam requests foobar.iso from uplink
> 4 Sam saves and forwards to Alice
> 5 At about 30 % of the download Bob asks Sam for foobar.iso
> 6 How do I serve Bob now?

Let every file in your download cache be represented by a Python object.
Instead of streaming the file directly to the clients, you can stream
the objects. The object will know if the file it represents has finished
downloading or not, where the file is located etc. This way you can
also, for the sake of persistence, keep partially downloaded files
separate from the completely downloaded files, as per a previous
suggestion, so that you won't start serving half files after a crash,
and it'll be completely transparent in all code except for your proxy
file objects.
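A rough sketch of such a proxy file object, assuming a threaded server. The class name `CachedDownload` and its methods are invented for illustration, and chunks are kept in memory only to keep the sketch short; a real cache would write them to a partial file on disk and read from there.

```python
import threading


class CachedDownload:
    """Hypothetical proxy object for one file in the download cache."""

    def __init__(self):
        self._chunks = []                 # data received from the uplink so far
        self._complete = False
        self._cond = threading.Condition()

    def feed(self, chunk):
        """Called by the uplink download as data arrives."""
        with self._cond:
            self._chunks.append(chunk)
            self._cond.notify_all()       # wake up clients waiting for data

    def finish(self):
        """Called once when the uplink download has completed."""
        with self._cond:
            self._complete = True
            self._cond.notify_all()

    @property
    def complete(self):
        with self._cond:
            return self._complete

    def stream(self):
        """Yield the file from the start, waiting for data still in flight."""
        index = 0
        while True:
            with self._cond:
                while index >= len(self._chunks) and not self._complete:
                    self._cond.wait()
                if index >= len(self._chunks):
                    return                # everything has been served
                chunk = self._chunks[index]
            index += 1
            yield chunk
```

Alice and Bob each call `stream()` on the same object. Bob, arriving at 30 %, first receives what is already cached and then waits alongside Alice for the remainder.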

Martin