On Jun 23, 2011, at 10:49 AM, Jean-Denis Muys <jdm...@kleegroup.com> wrote:
>
> On 23 June 2011, at 16:22, Mr. Puneet Kishor wrote:
>
>> On Jun 23, 2011, at 10:18 AM, Stephan Beal <sgb...@googlemail.com> wrote:
>>
>>> Hi, all!
>>>
>>> Today i saw a curious thing: i store 440kb of wiki files in an sqlite3 db
>>> and the db file is only 400kb.
>>>
>>> HTF can that possibly be?
>>>
>>> After poking around i found that the wiki files actually total 360kb (when i
>>> added up their sizes manually, as opposed to using 'df' to get it), and the
>>> extra 80kb were from the hard drive's large block size (slack space reported
>>> by 'df').
>>>
>>> Kinda funny, though, that sqlite3 actually decreases the amount of storage
>>> required in this case.
>>
>> Lots of small files will take up more space because of the fixed minimum
>> block size. For large corpuses this won't matter. Putting them all in one db
>> makes logistical management easier, but you will lose the ability to update
>> just a single file individually. I used to store all my wiki files
>> (punkish.org) in one SQLite db, but now I have them as separate files which
>> allows me to just ssh and edit a single file easily. Six of one, and all
>> that.
>>
>
> Let me add two other drawbacks as well:
>
> - incremental backups: now every time you change one small file, the whole
> database needs to be backed up, needlessly increasing storage size and
> backup time. This applies to systems that do versioning as well as backups
> (such as Time Machine).
>
> - system level indexing: it now becomes much more difficult, if not
> impossible, to do system level indexing and searching (as e.g. in Spotlight).
> This is the reason why Apple stopped using a monolithic database for its
> email application, now storing each mail individually: so that system-wide
> user search can hit emails too.

Yup. Very good points, both of them.
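
The slack-space effect from the quoted thread can be checked with a little arithmetic. What follows is only a minimal sketch: the 4096-byte allocation block and the corpus of 200 pages of 1,800 bytes each are assumptions made up for illustration, not figures from the thread, and real filesystems (and SQLite's own page overhead) will shift the numbers.

```python
def allocated_size(file_size: int, block_size: int = 4096) -> int:
    """Bytes a file occupies on disk if space is handed out in whole blocks."""
    blocks = -(-file_size // block_size)  # integer ceiling division
    return blocks * block_size


def slack(file_sizes, block_size: int = 4096):
    """Return (logical bytes, bytes on disk, wasted bytes) for a set of files."""
    logical = sum(file_sizes)
    on_disk = sum(allocated_size(size, block_size) for size in file_sizes)
    return logical, on_disk, on_disk - logical


# Hypothetical corpus: 200 wiki pages of 1,800 bytes each.
logical, on_disk, wasted = slack([1800] * 200)
print(logical, on_disk, wasted)  # 360000 819200 459200
```

With many small files the wasted share is large; with a few big files it shrinks toward nothing, which matches the "for large corpuses this won't matter" remark.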
I still use the db for metadata, but my files are stored in a directory tree much like CPAN's: /path/<1>/<12>/<123>/filename.txt where 1, 2, and 3 are the first, second and third letters of the filename. I could store the metadata per file within each file; however, without the db I haven't yet found a way to "find the ten most recently edited files" or "find all files edited by <person name>".

> These two drawbacks may or may not apply to your situation.
>
> Jean-Denis
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
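
The split described in the reply above (files in a prefix-sharded tree, metadata in SQLite) could look roughly like this. It is a sketch under assumptions, not the author's actual code: the `pages` table, its column names, the helper functions and the editors "alice" and "bob" are all hypothetical, and `<12>` and `<123>` are read as the first two and first three letters of the filename.

```python
import sqlite3
from pathlib import PurePosixPath


def sharded_path(root: str, filename: str) -> PurePosixPath:
    """Map 'sqlite.txt' to root/s/sq/sql/sqlite.txt (1-, 2- and 3-letter prefixes)."""
    return PurePosixPath(root, filename[:1], filename[:2], filename[:3], filename)


def open_metadata(db_path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) metadata table if it does not exist yet."""
    conn = sqlite3.connect(db_path)
    # edited_at holds ISO 8601 text, so sorting the text sorts by time.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " filename TEXT PRIMARY KEY,"
        " path TEXT NOT NULL,"
        " editor TEXT NOT NULL,"
        " edited_at TEXT NOT NULL)"
    )
    return conn


def record_edit(conn, root, filename, editor, edited_at):
    """Insert or overwrite the metadata row for one file."""
    conn.execute(
        "INSERT OR REPLACE INTO pages VALUES (?, ?, ?, ?)",
        (filename, str(sharded_path(root, filename)), editor, edited_at),
    )


def most_recent(conn, limit=10):
    """Paths of the most recently edited files, newest first."""
    rows = conn.execute(
        "SELECT path FROM pages ORDER BY edited_at DESC LIMIT ?", (limit,)
    )
    return [row[0] for row in rows]


def edited_by(conn, editor):
    """Paths of all files last edited by the given person."""
    rows = conn.execute(
        "SELECT path FROM pages WHERE editor = ? ORDER BY path", (editor,)
    )
    return [row[0] for row in rows]


conn = open_metadata()
record_edit(conn, "/path", "sqlite.txt", "alice", "2011-06-23T10:49:00")
record_edit(conn, "/path", "backup.txt", "bob", "2011-06-22T09:00:00")
print(most_recent(conn))         # ['/path/s/sq/sql/sqlite.txt', '/path/b/ba/bac/backup.txt']
print(edited_by(conn, "alice"))  # ['/path/s/sq/sql/sqlite.txt']
```

Only the small metadata file changes on each edit, so the page files themselves stay individually editable, backed up incrementally, and visible to system-level indexing.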