[ 
https://issues.apache.org/jira/browse/SOLR-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648507#action_12648507
 ] 

Hoss Man commented on SOLR-829:
-------------------------------

Let's keep this issue focused on one thing: making it possible to configure a 
"slave" solr instance so that it indicates it can "Accept-Encoding" compressed 
responses during replication (discussion of what the "master" does with that 
information is a separate matter).

From my (naive) reading of the current patch, a few things jump out at me...

1) the "FastOutputStream" changes in ReplicationHandler look like an 
unintentional part of the patch.
2) why does setting the ZIP option to true disable checksums?  i'm not sure 
when/how checksums are currently computed/compared, but if it can be done with 
raw i/o streams right now, it can be done with GZIP i/o streams if the 
response is compressed.
3) the behavior of checkCompressed doesn't seem right. A Content-Encoding 
header is used to indicate that the original content has been compressed in 
order to transfer it over HTTP, but the Content-Type header is used to identify 
the true type of the payload.  we shouldn't silently uncompress files just 
because they happen to have a mime type of "application/x-gzip-compressed".  we 
might be able to get away with it in dealing with replication, but we shouldn't 
need it (and unless i'm severely mistaken, this will break in the event that 
gzip content is sent *with* an additional gzip Content-Encoding).
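
To make points 2 and 3 concrete, here is a rough sketch (not taken from the 
patch) of how the receiving side could handle both.  The class and method names 
are hypothetical, and Adler32 is used only for illustration; i haven't checked 
which checksum the replication code actually uses.  The stream is unwrapped 
only when Content-Encoding says gzip, Content-Type is never consulted, and the 
checksum is computed over the decoded bytes so it matches whether or not the 
transfer was compressed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.Adler32;
import java.util.zip.CheckedInputStream;
import java.util.zip.Checksum;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Solr or of the attached patch.
final class ReplicationStreamSketch {

    /** Undo the transfer encoding only; Content-Type is deliberately ignored. */
    static InputStream decode(InputStream body, String contentEncoding) throws IOException {
        if (contentEncoding != null && contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(body);
        }
        return body; // no Content-Encoding: pass the bytes through untouched
    }

    /** Checksum of the payload after any transfer encoding has been removed. */
    static long checksum(InputStream body, String contentEncoding) throws IOException {
        Checksum sum = new Adler32(); // assumption: any java.util.zip.Checksum would do
        try (InputStream in = new CheckedInputStream(decode(body, contentEncoding), sum)) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // reading through CheckedInputStream updates the checksum
            }
        }
        return sum.getValue();
    }

    /** Test helper: gzip a byte array in memory. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(raw);
        }
        return bytes.toByteArray();
    }
}
```

With this shape, a file that is itself a .gz and is sent with an additional 
gzip Content-Encoding is unwrapped exactly once, so the stored file is still 
the original .gz.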


> replication Compression
> -----------------------
>
>                 Key: SOLR-829
>                 URL: https://issues.apache.org/jira/browse/SOLR-829
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (java)
>            Reporter: Simon Collins
>            Assignee: Shalin Shekhar Mangar
>         Attachments: email discussion.txt, solr-829.patch, solr-829.patch
>
>
> From a discussion on the mailing list solr-user, it would be useful to have 
> an option to compress the files sent between servers for replication purposes.
> Files sent across between indexes can be compressed by a large margin 
> allowing for easier replication between sites.
> ...Noted by Noble Paul 
> we will use a gzip on both ends of the pipe . On the slave side you can say 
> <str name="zip">true</str> as an extra option to compress and send data from 
> server 
> Other thoughts on issue: 
> Do keep in mind that compression is a CPU intensive process so it is a trade 
> off between CPU utilization and network bandwidth.  I have seen cases where 
> compressing the data before a network transfer ended up being slower than 
> without compression because the cost of compression and un-compression was 
> more than the gain in network transfer.
> Why invent something when compression is standard in HTTP? --wunder

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
