[ https://issues.apache.org/jira/browse/SOLR-829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akshay K. Ukey updated SOLR-829:
--------------------------------

    Attachment: solr-829.patch

This patch makes the following change:
It adds a zip configuration parameter to the ReplicationHandler (set on the slave):

{code}
<str name="zip">true</str>
{code}

I have tested it by replicating a 1.1G index between two data centres.
Replication with gzipping took 1012 seconds (about 17 mins), compared to 
1250 seconds (about 21 mins) without gzipping, i.e. roughly 19% less time.
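
For anyone who wants to redo the arithmetic on these numbers, here is a small Python sketch. The two timings are the ones reported above; reading "1.1G" as 1.1 GiB is my assumption, and the helper names are only illustrative, nothing here comes from the patch.

```python
# Timings reported in the comment above.
SECONDS_PLAIN = 1250
SECONDS_GZIP = 1012
# Assumption: "1.1G" means 1.1 GiB, i.e. 1.1 * 1024 MiB.
INDEX_MIB = 1.1 * 1024


def percent_saved(before: float, after: float) -> float:
    """Percentage reduction in elapsed time."""
    return (before - after) / before * 100


def effective_mib_per_s(size_mib: float, seconds: float) -> float:
    """Index size divided by wall-clock time (not the raw link speed)."""
    return size_mib / seconds


if __name__ == "__main__":
    print(f"saved: {SECONDS_PLAIN - SECONDS_GZIP} s "
          f"({percent_saved(SECONDS_PLAIN, SECONDS_GZIP):.1f}%)")
    print(f"plain: {effective_mib_per_s(INDEX_MIB, SECONDS_PLAIN):.2f} MiB/s")
    print(f"gzip:  {effective_mib_per_s(INDEX_MIB, SECONDS_GZIP):.2f} MiB/s")
```

Note that the "effective" rate is the uncompressed index size over elapsed time, so it includes the CPU time spent gzipping on the sending side and gunzipping on the receiving side.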

> replication Compression
> -----------------------
>
>                 Key: SOLR-829
>                 URL: https://issues.apache.org/jira/browse/SOLR-829
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (java)
>            Reporter: Simon Collins
>         Attachments: email discussion.txt, solr-829.patch
>
>
> From a discussion on the mailing list solr-user, it would be useful to have 
> an option to compress the files sent between servers for replication purposes.
> Files sent between indexes can be compressed by a large margin, 
> allowing for easier replication between sites.
> ...Noted by Noble Paul: 
> We will use gzip on both ends of the pipe. On the slave side you can say 
> <str name="zip">true</str> as an extra option to compress and send data from 
> the server.
> Other thoughts on issue: 
> Do keep in mind that compression is a CPU-intensive process, so it is a 
> trade-off between CPU utilization and network bandwidth.  I have seen cases 
> where compressing the data before a network transfer ended up being slower 
> than sending it uncompressed, because the cost of compression and 
> decompression was more than the gain in network transfer.
> Why invent something when compression is standard in HTTP? --wunder
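
The CPU versus bandwidth trade-off described in the issue can be checked roughly with a sketch like the one below. It is a simplified model and not how the patch works: it assumes the data is gzipped, sent, and gunzipped one step after another (a streaming pipe would overlap these), and the function names and the bandwidth parameter are hypothetical.

```python
import gzip
import time


def transfer_seconds(num_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Time to push num_bytes over a link of the given bandwidth."""
    return num_bytes / bandwidth_bytes_per_s


def compression_pays_off(raw: bytes, bandwidth_bytes_per_s: float) -> bool:
    """Return True if gzip + transfer + gunzip beats a plain transfer.

    Assumption: the three steps run sequentially on this machine's CPU.
    """
    start = time.perf_counter()
    packed = gzip.compress(raw)
    gzip.decompress(packed)
    cpu_seconds = time.perf_counter() - start

    plain = transfer_seconds(len(raw), bandwidth_bytes_per_s)
    zipped = cpu_seconds + transfer_seconds(len(packed), bandwidth_bytes_per_s)
    return zipped < plain
```

With highly repetitive data and a slow link, `compression_pays_off(b"solr " * 200_000, 1e5)` should come out true; on an extremely fast link the CPU cost dominates and it should come out false. Real index files will compress less well than this toy input, so the answer depends on measuring with actual data.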

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
