On 8/31/2015 7:23 AM, Maulin Rathod wrote:
> We are using solrcloud 5.2 with 1 shard (in UK Data Center) and 1 replica
> (in Australia Data Center). We observed that data inserted/updated in shard
> (UK Data center) is replicated very slowly to Replica in AUSTRALIA Data
> Center (Due to high latency between UK and AUSTRALIA). We are looking to
> improve the speed of data replication from shard to replica. Can we use
> some sort of compression before sending data to replica? Please let me know
> if any other alternative is available to improve data replication speed
> from shard to replica?

SolrCloud replicates data differently than many people expect, especially if they are familiar with how replication worked prior to SolrCloud's introduction in Solr 4.0. The original document is sent to all replicas and each one indexes it independently. This is HTTP traffic, containing the document data after the initial update processors are finished with it.

TCP connections across international lines, and oceans in particular, are slow because of the high latency. The time it takes a signal to cover the physical distance, even at the speed of light, is one problem, but international links usually also pass through a number of additional routers, each of which adds delay. My employer has been dealing with this problem for years when copying files from one location to another.

One of the things available to help with this problem is a modern TCP stack that scales the TCP window effectively, so more data can be in flight before the sender has to wait for an acknowledgement. If you are running Solr on Linux machines that are running any recent kernel version (2.6 definitely qualifies, but I think 2.4 does as well), and you haven't turned on SYN cookies or explicitly disabled the scaling, you should be automatically scaling your TCP window. If you are on Windows Server 2008 or Windows 7 (or versions later than these) and haven't poked around in the TCP tuning options, then you would also be OK. If either end of the communication is Windows XP, Server 2003, or an older version of Windows, you're out of luck and will need to upgrade the operating system.

The requests involved in SolrCloud indexing may be too short-lived to benefit much from scaling, though. Window scaling typically only helps when the TCP connection lives for more than a few seconds, like an FTP data transfer. Each individual inter-server indexing request is likely only transmitting 10 documents.

Even when TCP window scaling is present, if there is *ANY* packet loss anywhere in a high-latency path, transfer speed will drop dramatically.
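
To put rough numbers on the window and packet loss points, here is a back-of-the-envelope sketch in Python. None of it comes from measurements in this thread: the 130 ms round trip, 1460-byte segment size, 64 KB unscaled window, 100 Mbit/s link, and 0.1% loss rate are all assumed illustrative values, and the function names are made up for the example. The loss estimate uses the well-known Mathis approximation, which only describes long-lived flows, so short indexing requests may never even reach these rates.

```python
import math


def window_limited_throughput(window_bytes, rtt_s):
    """Upper bound in bytes/s when at most one window is in flight per round trip."""
    return window_bytes / rtt_s


def bandwidth_delay_product(link_bits_per_s, rtt_s):
    """Bytes that must be in flight to keep a link of the given speed full."""
    return link_bits_per_s / 8 * rtt_s


def mathis_throughput(mss_bytes, rtt_s, loss_rate):
    """Mathis approximation of steady-state TCP throughput in bytes/s.

    throughput ~= (MSS / RTT) * sqrt(3/2) / sqrt(p), for random loss rate p.
    """
    return (mss_bytes / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)


def to_mbit(bytes_per_s):
    """Convert bytes/s to megabits/s for easier reading."""
    return bytes_per_s * 8 / 1e6


if __name__ == "__main__":
    rtt = 0.130          # assumed round-trip time in seconds
    mss = 1460           # assumed TCP segment size in bytes
    unscaled = 65535     # largest window without window scaling
    link = 100e6         # assumed link speed in bits/s

    print("Unscaled window cap: %.2f Mbit/s"
          % to_mbit(window_limited_throughput(unscaled, rtt)))
    print("Window needed to fill the link: %.0f bytes"
          % bandwidth_delay_product(link, rtt))
    for loss in (0.0001, 0.001, 0.01):
        print("Loss %.2f%%: about %.2f Mbit/s"
              % (loss * 100, to_mbit(mathis_throughput(mss, rtt, loss))))
```

Under these assumptions, an unscaled window caps a single connection at a few megabits per second no matter how fast the link is, and even a small loss rate pulls a scaled connection back down to the same range.
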

In the lab, I built a simulated setup emulating our connection to our UK office. Even with 130 milliseconds of round-trip latency added by the Linux router impersonating the Internet, transfer speeds of photo-sized files on a modern TCP stack were good ... until I also introduced packet loss. Transfer speeds were BADLY affected by even one tenth of one percent packet loss, which is the lowest amount I tested.

SolrCloud is highly optimized for the way it is usually installed -- on multiple machines connected together with one or more LAN switches. This is why it uses lots of little connections. The new cross-data center replication (CDCR) feature is an attempt to better utilize high-latency WAN links.

In Solr 5.x, the web server is more firmly under the control of the Solr development team, so compression and other improvements may be possible, but latency is the major problem here, not a lack of features.

I'm not sure whether the number of documents per update (currently defaulting to 10) is configurable, but with a modern TCP stack, increasing that number could make the transfer more efficient, assuming the communication link is clean.

Thanks,
Shawn