Hi,

On Thursday, December 4, 2014 12:02:42 AM UTC+1, Ams Fwd wrote:
>
> On 12/3/14 2:23 PM, Andrea Gavana wrote:
> >
> > On Wednesday, December 3, 2014 10:42:27 PM UTC+1, Jonathan Vanasco wrote:
> > >
> > > On Wednesday, December 3, 2014 4:23:31 PM UTC-5, Ams Fwd wrote:
> > > >
> > > > I would recommend just storing them on disk and let the OS VMM
> > > > deal with caching for speed. If you are not constrained for space
> > > > I would recommend not zlib-ing it either.
> > >
> > > I'll second storing them to disk. Large object support in all the
> > > databases is a pain and not very optimal. Just pickle/unpickle a
> > > file and use the db to manage that file.
> >
> > Thanks to all of you who replied. A couple of issues that I'm sure I
> > will encounter by letting the files on disk:
> >
> > 1. Other users can easily delete/overwrite/rename the files on disk,
> > which is something we really, really do not want;
>
> If this is windows group policies are your friends :). If this is linux,
> permissions with a secondary service to access the files are a decent
> choice.

Yes, this is Windows, but no, I can't go around and tell the users that
the simulation they just saved is not accessible anymore. The database is
part of a much larger user interface application, where users can generate
simulations and then decide whether or not they are relevant enough to be
stored in the database. At a rate of 300 MB per simulation (or more), we
quickly run into the "size issue".

> > 2. The whole point of a database was to have everything centralized in
> > one place, not leaving the simulation files scattered around like a
> > mess in the whole network drive;
>
> The last time I did it a post processing step in my data pipeline
> organized the files based on a multi-level folder structure based on the
> first x-characters of their sha1.

Again, I am dealing with non-Python people - and in general with people
who are extremely good at what they do but who don't care about the overall
IT architecture - as long as it works and is recognizable from a Windows
Explorer point of view. A file-based approach is unfortunately not a good
option in the current setup.

> > 3. As an aside, not zlib-ing the files saves about 5
> > seconds/simulation (over a 20 seconds save) but increases the database
> > size by 4 times. I'll have to check if this is OK.
>
> To use compression or not depends on your needs. If the difference in
> time consumed is so stark, I would highly recommend compression.

I will probably go that way; 5 seconds more or less does not make that
much of a difference overall. I just wish the backends didn't complain
when I pass them cPickled objects (bytes instead of strings...).

Andrea.
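
P.S. For reference, a minimal sketch of the pickle + zlib + BLOB round trip
discussed above. It is only an illustration under a few assumptions: Python 3
and the standard-library sqlite3 module rather than SQLAlchemy or the actual
backend, and a made-up table (simulations), made-up column names and made-up
helper functions (pack, unpack, save_simulation, load_simulation). As far as
I understand, handing the driver an explicitly binary value is what usually
avoids the bytes-versus-strings complaints.

```python
import pickle
import sqlite3
import zlib


def pack(obj, level=6):
    """Pickle an object and zlib-compress the result; returns bytes."""
    raw = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return zlib.compress(raw, level)


def unpack(blob):
    """Inverse of pack(): decompress, then unpickle."""
    return pickle.loads(zlib.decompress(blob))


def save_simulation(conn, name, obj):
    # sqlite3.Binary marks the payload as binary so it is stored as a BLOB
    # instead of being treated as text.
    conn.execute(
        "INSERT INTO simulations (name, payload) VALUES (?, ?)",
        (name, sqlite3.Binary(pack(obj))),
    )
    conn.commit()


def load_simulation(conn, name):
    row = conn.execute(
        "SELECT payload FROM simulations WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else unpack(row[0])


# In-memory database and hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE simulations (name TEXT PRIMARY KEY, payload BLOB)")

# Stand-in for a real simulation object (made-up data).
simulation = {"case": "demo", "pressures": [1.0] * 10000}
save_simulation(conn, "demo", simulation)
restored = load_simulation(conn, "demo")
```

The compression level argument to pack() is the knob for trading save time
against database size. Unpickling executes whatever the pickle contains, so
this only makes sense for payloads written by the application itself.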