
On Thursday, December 4, 2014 12:02:42 AM UTC+1, Ams Fwd wrote:
> On 12/3/14 2:23 PM, Andrea Gavana wrote: 
> > 
> > 
> > On Wednesday, December 3, 2014 10:42:27 PM UTC+1, Jonathan Vanasco wrote: 
> > 
> > 
> > 
> >     On Wednesday, December 3, 2014 4:23:31 PM UTC-5, Ams Fwd wrote: 
> > 
> >         I would recommend just storing them on disk and let the OS VMM 
> >         deal with caching for speed. If you are not constrained for 
> >         space I would recommend not zlib-ing it either. 
> > 
> > 
> >     I'll second storing them to disk.  Large object support in all the 
> >     databases is a pain and not very optimal.  Just pickle/unpickle a 
> >     file and use the db to manage that file. 
> > 
> > 
> > 
> > Thanks to all of you who replied. A couple of issues that I'm sure I 
> > will encounter by leaving the files on disk: 
> > 
> > 1. Other users can easily delete/overwrite/rename the files on disk, 
> > which is something we really, really do not want; 
> If this is Windows, group policies are your friends :). If this is Linux, 
> permissions with a secondary service to access the files are a decent 
> choice. 

Yes, this is Windows, but no, I can't go around telling users that the 
simulation they just saved is no longer accessible. The database is part 
of a much larger user interface application, where users can generate 
simulations and then decide whether or not they are relevant enough to be 
stored in the database. At a rate of 300 MB per simulation (or more), size 
quickly becomes an issue.

> > 2. The whole point of a database was to have everything centralized in 
> > one place, not leaving the simulation files scattered around like a 
> > mess in the whole network drive; 
> The last time I did it, a post-processing step in my data pipeline 
> organized the files into a multi-level folder structure based on the 
> first x characters of their sha1. 
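
For reference, the sha1-based layout described above could look roughly like 
the sketch below. This is only an illustration under assumptions: the 
function name sharded_path, the "store" root, and the choice of two levels 
of two characters each are hypothetical and not taken from anyone's actual 
pipeline.

```python
import hashlib
from pathlib import Path


def sharded_path(root, data, levels=2, width=2):
    """Build a nested path from the leading characters of the sha1 of data.

    levels and width are illustrative defaults, not values from the thread.
    """
    digest = hashlib.sha1(data).hexdigest()
    # e.g. digest "a9993e..." gives the sub-folders "a9" and "99"
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root).joinpath(*parts, digest)


# Hypothetical usage: decide where a blob of simulation bytes would be stored.
example = sharded_path("store", b"abc")
```

The file name is the full digest, so identical content always maps to the 
same location and each folder stays reasonably small.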

Again, I am dealing with non-Python people - and in general with people who 
are extremely good at what they do but don't care about the overall IT 
architecture, as long as it works and is recognizable from a Windows 
Explorer point of view. A file-based approach is unfortunately not a good 
option in the current setup.


> > 3. As an aside, not zlib-ing the files saves about 5 
> > seconds/simulation (over a 20 seconds save) but increases the database 
> > size by 4 times. I'll have to check if this is OK. 
> > 
> To use compression or not depends on your needs. If the difference in 
> time consumed is so stark, I would highly recommend compression. 

I will probably go that way; 5 seconds more or less does not make much of a 
difference overall. I just wish the backends didn't complain when I pass 
them cPickled objects (bytes instead of strings...).
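
A minimal sketch of the pickle-plus-zlib round trip, under assumptions that 
differ from my actual setup: it uses Python 3 (where pickle.dumps returns 
bytes) and the standard-library sqlite3 module instead of SQLAlchemy, and the 
table name, column names, and the pack/unpack helpers are made up for 
illustration.

```python
import pickle
import sqlite3
import zlib


def pack(obj, level=6):
    """Pickle obj and zlib-compress the result; returns bytes."""
    raw = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return zlib.compress(raw, level)


def unpack(blob):
    """Reverse of pack: decompress, then unpickle."""
    return pickle.loads(zlib.decompress(blob))


# In-memory database with a hypothetical table for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE simulations (id INTEGER PRIMARY KEY, name TEXT, payload BLOB)"
)

# Stand-in for a real simulation object.
simulation = {"name": "run-1", "values": [0.0] * 10000}

# In Python 3, sqlite3 stores a bytes parameter as a BLOB.
conn.execute(
    "INSERT INTO simulations (name, payload) VALUES (?, ?)",
    (simulation["name"], pack(simulation)),
)

row = conn.execute(
    "SELECT payload FROM simulations WHERE name = ?", ("run-1",)
).fetchone()
restored = unpack(row[0])
```

The compression level is the knob for the time-versus-size trade-off 
discussed above; how much it buys depends entirely on the data.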

