On Jun 23, 2011, at 10:49 AM, Jean-Denis Muys <jdm...@kleegroup.com> wrote:

> 
> On 23 juin 2011, at 16:22, Mr. Puneet Kishor wrote:
> 
>> 
>> 
>> 
>> On Jun 23, 2011, at 10:18 AM, Stephan Beal <sgb...@googlemail.com> wrote:
>> 
>>> Hi, all!
>>> 
>>> Today i saw a curious thing: i store 440kb of wiki files in an sqlite3 db
>>> and the db file is only 400kb.
>>> 
>>> HTF can that possibly be?
>>> 
>>> After poking around i found that the wiki files actually total 360kb (when i
>>> added up their sizes manually, as opposed to using 'df' to get it), and the
>>> extra 80kb were from the hard drive's large block size (slack space reported
>>> by 'df').
>>> 
>>> Kinda funny, though, that sqlite3 actually decreases the amount of storage
>>> required in this case.
>> 
>> 
>> Lots of small files will take up more space because of the fixed minimum 
>> block size. For large corpora this won't matter. Putting them all in one db 
>> makes logistical management easier, but you lose the ability to update a 
>> single file on its own. I used to store all my wiki files (punkish.org) in 
>> one SQLite db, but now I have them as separate files, which allows me to 
>> just ssh in and edit a single file easily. Six of one, and all that.
>> 
> 
> Let me add two other drawbacks as well:
> 
> - incremental backups: now every time you change one small file, the whole 
> database needs to be backed up, needlessly increasing storage size and 
> backup time. This applies to systems that do versioning as well as backups 
> (such as Time Machine).
> 
> - system-level indexing: it now becomes much more difficult, if not 
> impossible, to do system-level indexing and searching (e.g. in Spotlight). 
> This is the reason why Apple stopped using a monolithic database for its 
> email application and now stores each mail individually: so that 
> system-wide user search can hit emails too.


Yup. Very good points, both of them.

I still use the db for metadata, but my files are stored in a directory tree 
much like CPAN's directories -- /path/<1>/<12>/<123>/filename.txt where <1>, 
<12>, and <123> are the first one, two, and three letters of the filename. I 
could store the metadata per file within each file, but then I haven't yet 
found a way to "find the ten most recently edited files" or "find all files 
edited by <person name>".
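
A minimal sketch of that split in Python, assuming the layout described above: the file body lives under /path/<1>/<12>/<123>/, and a small SQLite table holds the metadata so the two example queries stay simple. The helper names (shard_path, record_edit, recent_files, files_edited_by) and the table and column names (pages, editor, edited_at) are hypothetical, chosen only for illustration; they are not taken from the actual punkish.org setup.

```python
import sqlite3
from pathlib import PurePosixPath


def shard_path(root, filename):
    """Build root/<1>/<12>/<123>/filename from the first letters of the name."""
    # Prefixes of length 1, 2 and 3; names shorter than 3 characters
    # simply repeat their full name in the deeper levels.
    prefixes = [filename[:n] for n in (1, 2, 3)]
    return PurePosixPath(root, *prefixes, filename)


def open_metadata(db_path=":memory:"):
    """Open (or create) the hypothetical metadata table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS pages (
               path      TEXT PRIMARY KEY,  -- sharded location on disk
               editor    TEXT NOT NULL,     -- who made the last edit
               edited_at TEXT NOT NULL      -- ISO 8601 timestamp, sorts as text
           )"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS pages_edited_at ON pages (edited_at)"
    )
    return conn


def record_edit(conn, root, filename, editor, edited_at):
    """Store or refresh the metadata row for one file; return its path."""
    path = str(shard_path(root, filename))
    conn.execute(
        "INSERT OR REPLACE INTO pages (path, editor, edited_at) VALUES (?, ?, ?)",
        (path, editor, edited_at),
    )
    return path


def recent_files(conn, limit=10):
    """Paths of the most recently edited files, newest first."""
    rows = conn.execute(
        "SELECT path FROM pages ORDER BY edited_at DESC LIMIT ?", (limit,)
    )
    return [row[0] for row in rows]


def files_edited_by(conn, editor):
    """Paths of all files whose last recorded edit was made by `editor`."""
    rows = conn.execute(
        "SELECT path FROM pages WHERE editor = ? ORDER BY path", (editor,)
    )
    return [row[0] for row in rows]
```

With this arrangement each file can still be edited in place over ssh, and only the small metadata db changes alongside it. The sketch records just the last editor per file; tracking every edit would need a separate history table.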


> 
> These two drawbacks may or may not apply to your situation.
> 
> Jean-Denis
> 
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
