Re: [Tutor] numerical simulation + SQLite

2009-12-08 Thread Faisal Moledina
On Tue, Dec 1, 2009 at 11:48 AM, Faisal Moledina wrote:
> Eike Welk wrote:
>> Just in case you don't know it, maybe Pytables is the right solution
>> for you. It is a disk storage library specially for scientific
>> applications:
>> http://www.pytables.org/moin
>
> Wow, that looks pretty good. I work with a lot of numpy arrays in this
> simulation, so I'll definitely look into that.

For those of you following along at home, my problem has been solved
with Pytables. Discussion available at
http://www.mail-archive.com/pytables-us...@lists.sourceforge.net/msg01416.html
on the Pytables-users list. Thanks again, Eike.
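
For anyone who lands here later, the rough shape of what I ended up doing
(a sketch, not the exact code from that thread; the row layout is guessed
from the SQLite schema in my earlier message):

import tables

# Row layout guessed from my SQLite Locations table; the real details
# are in the linked pytables-users discussion.
class Location(tables.IsDescription):
    timepoint = tables.Float64Col()
    part_id = tables.Int64Col()
    x = tables.Float64Col()
    y = tables.Float64Col()
    z = tables.Float64Col()

h5 = tables.openFile("results.h5", mode="w")   # open_file in newer PyTables
loc = h5.createTable("/", "locations", Location, "particle positions")

# One append per timestep instead of one INSERT row per particle.
loc.append([(0.0, 1, 0.1, 0.2, 0.3),
            (0.0, 2, 0.4, 0.5, 0.6)])
loc.flush()
h5.close()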

Faisal
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] numerical simulation + SQLite

2009-12-01 Thread Faisal Moledina
Thanks everyone for your responses!

Alan Gauld wrote:
> You may need to be realistic in your expectations.
> A database is writing to disk which will be slower than working in memory. 
> And a 3GB file takes a while to read/traverse, even with indexes. It depends 
> a lot on exactly what you are doing. If it's mainly writing it should not be 
> much slower than writing to a flat file. If you are doing a lot of reading - 
> and you have used indexes - then it should be a lot faster than a file.
> 
> But RAM - if you have enough - will always be fastest, by about 100 times.
> The problem is when you run out, you revert to using files and that's usually 
> slower than a database...
> 
> But without details of your usage pattern and database schema and SQL code 
> etc it is, as you say, impossible to be specific.

I'm running a stochastic simulation of Brownian motion of a number of 
particles, for which I'll present a simplified version here. At each time step, 
I determine if some particles have left the system, determine the next position 
of the remaining particles, and then introduce new particles into the system at 
defined starting points. I have two tables in my SQLite database: one for 
information on each particle and one for all the x, y, z locations for each 
particle.
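
Roughly, each timestep does the following; here is a stripped-down but
runnable sketch (toy tables, toy numbers, and illustrative names, not my
actual code):

import random
import sqlite3

db = sqlite3.connect(":memory:")              # illustrative; mine is results.db
db.execute("CREATE TABLE Particles (part_id INTEGER PRIMARY KEY, status TEXT)")
db.execute("CREATE TABLE Locations (timepoint REAL, part_id INTEGER, "
           "x REAL, y REAL, z REAL)")

particles = {1: [0.0, 0.0, 0.0]}              # part_id -> [x, y, z]
db.execute("INSERT INTO Particles VALUES (1, 'active')")
next_id = 2

for step in range(100):
    t = step * 0.01
    # 1. flag particles that have left (toy criterion: wandered past z = 5)
    gone = [pid for pid, pos in particles.items() if pos[2] > 5.0]
    db.executemany("UPDATE Particles SET status='left' WHERE part_id=?",
                   [(pid,) for pid in gone])
    for pid in gone:
        del particles[pid]
    # 2. Brownian step for the remaining particles, then record positions
    for pos in particles.values():
        for axis in range(3):
            pos[axis] += random.gauss(0.0, 1.0)
    db.executemany("INSERT INTO Locations VALUES (?,?,?,?,?)",
                   [(t, pid, pos[0], pos[1], pos[2])
                    for pid, pos in particles.items()])
    # 3. introduce a new particle at a defined starting point
    particles[next_id] = [0.0, 0.0, 0.0]
    db.execute("INSERT INTO Particles VALUES (?, 'active')", (next_id,))
    next_id += 1

db.commit()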

sqlite> .schema Particles
CREATE TABLE Particles
(part_id INTEGER PRIMARY KEY,
origin INTEGER,
endpoint INTEGER,
status TEXT,
starttime REAL,
x REAL,
y REAL,
z REAL);

sqlite> .schema Locations
CREATE TABLE Locations
(id INTEGER PRIMARY KEY AUTOINCREMENT,
timepoint REAL,
part_id INTEGER,
x REAL,
y REAL,
z REAL);

For particles that have left the system, I create a list of part_id values 
whose status I'd like to update in the database and issue a command within my 
script (for which db=sqlite3.connect('results.db')):

db.executemany("UPDATE Particles SET status='left' WHERE part_id=?",part_id)
db.commit()

To update the position, something like:

db.executemany("UPDATE Particles SET x=?,y=?,z=? WHERE 
part_id=?",Particle_entries)
db.executemany("INSERT INTO Locations (timepoint,lig,x,y,z) VALUES 
(?,?,?,?,?)",Location_entries)
db.commit()

That's about it, just repeated for many particles (on the order of 1e4 to 
1e5). I'm considering whether I really need every location entry or whether 
I could get away with recording only every 10th one, for example.
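
If I go that route, step 2 in the loop I sketched above would only touch the
database every Nth pass, something like:

WRITE_EVERY = 10  # keep every 10th timestep's positions

if step % WRITE_EVERY == 0:
    db.executemany("INSERT INTO Locations VALUES (?,?,?,?,?)",
                   [(t, pid, pos[0], pos[1], pos[2])
                    for pid, pos in particles.items()])
    db.commit()

Independent of dropping rows, committing once every N steps instead of every
step should cut the synchronous-write overhead, which I gather is the usual
SQLite advice anyway.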

Eike Welk wrote:
> Just in case you don't know it, maybe Pytables is the right solution 
> for you. It is a disk storage library specially for scientific 
> applications:
> http://www.pytables.org/moin

Wow, that looks pretty good. I work with a lot of numpy arrays in this 
simulation, so I'll definitely look into that.

bob gailer wrote:
> What do you do with the results after the simulation run?
> 
> How precise do the numbers have to be?

I'm interested in the particles that have left the system (I actually have a 
few ways they can leave) and I'm also interested in the ensemble average of the 
trajectories. As far as precision is concerned, I'm working on the scale of µm 
and each movement is on the order of 0.1 to 10 µm.
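
Computing that ensemble average means reading Locations back grouped by
timepoint, so following Alan's point about indexes, I assume something like
this would help (a sketch using the column names from the simplified schema
above):

db.execute("CREATE INDEX IF NOT EXISTS idx_loc_time ON Locations (timepoint)")
db.execute("CREATE INDEX IF NOT EXISTS idx_loc_part ON Locations (part_id)")

# e.g. mean position per timepoint across all particles:
for t, mx, my, mz in db.execute(
        "SELECT timepoint, AVG(x), AVG(y), AVG(z) "
        "FROM Locations GROUP BY timepoint ORDER BY timepoint"):
    print(t, mx, my, mz)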

Faisal
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] numerical simulation + SQLite

2009-11-30 Thread Faisal Moledina
Hey everyone,

I have a general issue that I'd like to discuss. I'm using Python to
run a numerical simulation where at each time step, I run a number of
operations and store the results before moving to the next timestep.
At first, I used a list of class instances, each holding the data
calculated at each time step. This resulted in whopping memory usage
(2.75 GB RAM, 3.75 GB VM).

So then I decided instead to use SQLite to store that information at
each timestep. This seems to work well, but it also slows down over time.
I'd never used SQLite or Python before this project; I come from a
MATLAB-based engineering background rather than a programming one. I
was wondering if anyone had any tips for using SQLite efficiently.
Maybe a list of dos and don'ts.

I understand that specific help is impossible without a reduced sample
of code. Currently I'm looking for general guidelines and will come
back to this list for specific hangups. Thank you.

Faisal
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor