Gzipping on disk requires quite a bit of I/O. I guess that on-the-fly compression should be faster.

C.

Walter Underwood wrote:
About a factor of 2 on a small, optimized index. Gzipping took 20 seconds,
so it isn't free.

$ cd index-copy
$ du -sk
134336  .
$ gzip *
$ du -sk
62084   .

wunder
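
For reference, the arithmetic behind "about a factor of 2" can be sketched from the two `du -sk` figures above. This is only an illustration: the class and method names are made up for the example, and the numbers describe this one small, optimized index rather than indexes in general.

```java
public class CompressionRatio {

    /** Original size divided by compressed size (2.0 means the data halved). */
    static double factor(long originalKb, long compressedKb) {
        return (double) originalKb / compressedKb;
    }

    /** Fraction of the original size that no longer has to be transferred. */
    static double savings(long originalKb, long compressedKb) {
        return 1.0 - (double) compressedKb / originalKb;
    }

    public static void main(String[] args) {
        long before = 134336; // KB, first `du -sk` in the session above
        long after = 62084;   // KB, second `du -sk`, after `gzip *`
        System.out.println("factor:  " + factor(before, after));
        System.out.println("savings: " + savings(before, after));
    }
}
```

With these inputs the factor comes out at roughly 2.16, i.e. a little over half of the bytes saved, at the cost of the 20 seconds of gzip time mentioned above.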

On 10/30/08 8:20 AM, "Otis Gospodnetic" <[EMAIL PROTECTED]> wrote:

Yeah.  I'm just not sure how much data transfer this will actually save.
Has anyone tested this to see if it is even worth it?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
From: Erik Hatcher <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Thursday, October 30, 2008 9:54:28 AM
Subject: Re: replication handler - compression

+1 - the GzipServletFilter is the way to go.

Regarding request handlers reading HTTP headers, yeah,... this will improve,
for sure.

    Erik

On Oct 30, 2008, at 12:18 AM, Chris Hostetter wrote:

: You are partially right. Instead of the HTTP header, we use a request
: parameter. (RequestHandlers cannot read HTTP headers). If the param is

hmmm, i'm with walter: we shouldn't invent new mechanisms for
clients to request compression over HTTP from servers.

replication is both special enough and important enough that if we had to
add special support to make that information available to the handler on
the master we could.

but frankly i don't think that's necessary: the logic to turn on
compression if the client requests it using "Accept-Encoding: gzip" is
generic enough that there is no reason for it to be in a handler.  we
could easily put it in the SolrDispatchFilter, or even in a new
ServletFilter (i'm guessing i've seen about 74 different implementations of
a GzipServletFilter in the wild that could be used as is).

then we'd have double wins: compression for replication, and compression
of all responses generated by Solr if the client requests it.

-Hoss
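
A minimal sketch of the generic logic described above: check whether the client sent "Accept-Encoding: gzip" and, if so, compress the response body on the fly. To stay self-contained it uses only java.util.zip and plain streams rather than the servlet API, so it is not a GzipServletFilter and not Solr code; the class name GzipNegotiation and both method names are hypothetical. A real filter would also have to set "Content-Encoding: gzip" on the response.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

public class GzipNegotiation {

    /** True if the Accept-Encoding value lists gzip with a non-zero q-value. */
    static boolean acceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }
        for (String part : acceptEncoding.split(",")) {
            String[] tokens = part.trim().split(";");
            if (!tokens[0].trim().equalsIgnoreCase("gzip")) {
                continue;
            }
            // Look for an optional "q=" weight; q=0 means "do not send gzip".
            for (int i = 1; i < tokens.length; i++) {
                String param = tokens[i].trim();
                if (param.regionMatches(true, 0, "q=", 0, 2)) {
                    try {
                        return Double.parseDouble(param.substring(2).trim()) > 0.0;
                    } catch (NumberFormatException e) {
                        return false; // malformed weight: fall back to no compression
                    }
                }
            }
            return true; // gzip listed without a weight
        }
        return false;
    }

    /** Copies body to out, gzipping on the fly only when the client asked for it. */
    static void writeBody(String acceptEncoding, InputStream body, OutputStream out)
            throws IOException {
        boolean gzip = acceptsGzip(acceptEncoding);
        GZIPOutputStream gz = gzip ? new GZIPOutputStream(out) : null;
        OutputStream target = gzip ? gz : out;
        byte[] buf = new byte[8192];
        int n;
        while ((n = body.read(buf)) != -1) {
            target.write(buf, 0, n);
        }
        if (gz != null) {
            gz.finish(); // write the gzip trailer without closing the underlying stream
        }
        target.flush();
    }
}
```

Because the decision depends only on the request header, the same code path would serve replication and ordinary query responses alike, and nothing is compressed on disk first.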
