Due to bugs, we've seen users with 10GB history files, which may
contribute to the complaints.
  http://code.google.com/p/chromium/issues/detail?id=24947

Even if compression ends up being pretty slow, you could imagine using
it for our archived history (history more than a month old).
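
For anyone who wants to rerun the pages_content measurement quoted below
on their own history file, here is a rough Python sketch. It assumes the
compress() used in those queries is a zlib-based SQL function (it is not
built into stock SQLite), so the sketch registers its own. The names
zlib_compress and content_sizes are made up for illustration, and the
demo table only mimics the pages_content column names; it is not a real
fts table.

```python
import sqlite3
import zlib


def zlib_compress(value):
    """Hypothetical stand-in for compress(): zlib-deflate one column value."""
    if value is None:
        return None
    if isinstance(value, str):
        value = value.encode("utf-8")
    return zlib.compress(value)


def content_sizes(conn):
    """Return (raw, compressed) sizes summed over pages_content.

    Mirrors the quoted queries: length() counts characters for TEXT
    columns and bytes for the BLOBs returned by compress().
    """
    conn.create_function("compress", 1, zlib_compress)
    raw, packed = conn.execute(
        "SELECT sum(length(c0url) + length(c1title) + length(c2body)),"
        "       sum(length(compress(c0url)) + length(compress(c1title))"
        "           + length(compress(c2body)))"
        "  FROM pages_content"
    ).fetchone()
    return raw or 0, packed or 0


if __name__ == "__main__":
    # Demo data only: a plain table with the same column names.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages_content (c0url TEXT, c1title TEXT, c2body TEXT)"
    )
    body = "the quick brown fox jumps over the lazy dog " * 200
    conn.executemany(
        "INSERT INTO pages_content VALUES (?, ?, ?)",
        [("http://example.com/%d" % i, "Example page %d" % i, body)
         for i in range(50)],
    )
    raw, packed = content_sizes(conn)
    print("raw: %d  compressed: %d" % (raw, packed))
```

To try it on a real file, point sqlite3.connect() at a copy of the
history index instead of ":memory:" and skip the demo inserts. Short
values such as URLs can come out larger after zlib, so any saving
comes mostly from the body column.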

On Tue, Nov 24, 2009 at 10:21 AM, Elliot Glaysher (Chromium)
<e...@chromium.org> wrote:
> I'm all for it. I vaguely remember people complaining about the size
> of our history files, and most of my history files are over 50M.
>
> -- Elliot
>
> On Tue, Nov 24, 2009 at 10:13 AM, Scott Hess <sh...@chromium.org> wrote:
>> Long ago when developing fts1, I experimented with using zlib
>> compression as part of the implementation.  It fell by the wayside
>> because it really didn't provide enough performance improvement (I
>> needed an order of magnitude, it didn't provide it), and because of
>> licensing issues (fts1/2/3 are part of core SQLite, which does not
>> include zlib).
>>
>> Chromium already has zlib, and I don't think there's any particular
>> reason not to hack our version of fts to support it.  Looking at my
>> October history file, I get the following (numbers are in megabytes):
>>
>> ls -lh History\ Index\ 2009-10
>> # -rw-r--r--@ 1 shess  eng    66M Nov 24 09:38 History Index 2009-10
>> .../sqlite3 History\ Index\ 2009-10
>> select round(sum(length(c0url)+length(c1title)+length(c2body))/1024.0/1024.0,2)
>> from pages_content;
>> # 34.9
>> select round(sum(length(compress(c0url))+length(compress(c1title))+length(compress(c2body)))/1024.0/1024.0,2)
>> from pages_content;
>> # 12.29
>> select round(sum(length(block))/1024.0/1024.0,2) from pages_segments;
>> # 24.6
>> select round(sum(length(compress(block)))/1024.0/1024.0,2) from pages_segments;
>> # 14.3
>>
>> pages_segments is the fts index.  Since it is consulted very
>> frequently, I'd be slightly nervous about compressing it.
>> pages_content is the document data, which is hit after the index (or
>> when doing a lookup by document id), so compressing it shouldn't have
>> much performance impact.
>>
>> Does this seem like a win worth pursuing?
>>
>> -scott
>>
>> --
>> Chromium Developers mailing list: chromium-dev@googlegroups.com
>> View archives, change email options, or unsubscribe:
>>    http://groups.google.com/group/chromium-dev
>>
>
>

