Thanks everyone for your responses!
Alan Gauld wrote:
> You may need to be realistic in your expectations.
> A database is writing to disk which will be slower than working in memory.
> And a 3GB file takes a while to read/traverse, even with indexes. It depends
> a lot on exactly what you are doing. If its mainly writing it should not be
> much slower than writing to a flat file. If you are doing a lot of reading -
> and you have used indexes - then it should be a lot faster than a file.
>
> But RAM - if you have enough - will always be fastest, by about 100 times.
> The problem is when you run out, you revert to using files and that's usually
> slower than a database...
>
> But without details of your usage pattern and database schema and SQL code
> etc it is, as you say, impossible to be specific.
I'm running a stochastic simulation of Brownian motion of a number of
particles, for which I'll present a simplified version here. At each time step,
I determine if some particles have left the system, determine the next position
of the remaining particles, and then introduce new particles into the system at
defined starting points. I have two tables in my SQLite database: one for
information on each particle and one for all the x y z locations for each
particle.
sqlite> .schema Particles
CREATE TABLE Particles
(part_id INTEGER PRIMARY KEY,
origin INTEGER,
endpoint INTEGER,
status TEXT,
starttime REAL,
x REAL,
y REAL,
z REAL);
sqlite> .schema Locations
CREATE TABLE Locations
(id INTEGER PRIMARY KEY AUTOINCREMENT,
timepoint REAL,
part_id INTEGER,
x REAL,
y REAL,
z REAL);
For particles that have left the system, I create a list of part_id values
whose status I'd like to update in the database and issue a command within my
script (for which db=sqlite3.connect('results.db')):
db.executemany("UPDATE Particles SET status='left' WHERE part_id=?",part_id)
db.commit()
To update the position, something like:
db.executemany("UPDATE Particles SET x=?,y=?,z=? WHERE
part_id=?",Particle_entries)
db.executemany("INSERT INTO Locations (timepoint,lig,x,y,z) VALUES
(?,?,?,?,?)",Location_entries)
db.commit()
That's about it. Just for many particles (i.e. 1e4 to 1e5). I'm considering
whether I need every location entry or if I could get away with every 10
location entries, for example.
Eike Welk wrote:
> Just in case you don't know it, maybe Pytables is the right solution
> for you. It is a disk storage library specially for scientific
> applications:
> http://www.pytables.org/moin
Wow, that looks pretty good. I work with a lot of numpy.array's in this
simulation so I'll definitely look into that.
bob gailer wrote:
> What do you do with the results after the simulation run?
>
> How precise do the numbers have to be?
I'm interested in the particles that have left the system (I actually have a
few ways they can leave) and I'm also interested in the ensemble average of the
trajectories. As far as precision is concerned, I'm working on the scale of µm
and each movement is on the order of 0.1 to 10 µm.
Faisal
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor