Do keep in mind that compression is a CPU-intensive process, so it is a trade-off between CPU utilization and network bandwidth. I have seen cases where compressing the data before a network transfer ended up being slower than sending it uncompressed, because the cost of compression and decompression was more than the gain in network transfer.
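As a rough sketch of that trade-off, the Python snippet below compares the time to send an index as-is with the time to compress it, send it, and decompress it. The 325 MB and 20 MB sizes and the 10:6 ratio are the figures quoted further down this thread; the link speeds and the compression and decompression throughputs are hypothetical placeholders, not measurements, and the model assumes the three steps run one after another rather than streamed.

```python
MB = 1024 * 1024


def plain_transfer_seconds(raw_bytes, bandwidth_bps):
    """Time to send the uncompressed data over the link."""
    return raw_bytes / bandwidth_bps


def compressed_transfer_seconds(raw_bytes, compressed_bytes, bandwidth_bps,
                                compress_bps, decompress_bps):
    """Time to compress, send, then decompress, run strictly in sequence.

    compress_bps and decompress_bps are throughputs measured in
    uncompressed bytes per second (assumed values, see below).
    """
    compress_time = raw_bytes / compress_bps
    send_time = compressed_bytes / bandwidth_bps
    decompress_time = raw_bytes / decompress_bps
    return compress_time + send_time + decompress_time


def compression_pays_off(raw_bytes, compressed_bytes, bandwidth_bps,
                         compress_bps, decompress_bps):
    """True when the compressed path is faster than sending the data as-is."""
    with_compression = compressed_transfer_seconds(
        raw_bytes, compressed_bytes, bandwidth_bps, compress_bps, decompress_bps)
    return with_compression < plain_transfer_seconds(raw_bytes, bandwidth_bps)


# Hypothetical throughputs: 20 MB/s to compress, 60 MB/s to decompress.
COMPRESS_BPS = 20 * MB
DECOMPRESS_BPS = 60 * MB

# Slow inter-site link (assumed 1 MB/s), 325 MB index that shrinks to 20 MB.
print(compression_pays_off(325 * MB, 20 * MB, 1 * MB,
                           COMPRESS_BPS, DECOMPRESS_BPS))

# Fast local link (assumed 100 MB/s), 10:6 ratio, so 325 MB shrinks to 195 MB.
print(compression_pays_off(325 * MB, 195 * MB, 100 * MB,
                           COMPRESS_BPS, DECOMPRESS_BPS))
```

With real numbers for your own link and hardware plugged in, the same comparison tells you which side of the trade-off you are on.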
Bill

On Wed, Oct 29, 2008 at 7:35 AM, Noble Paul നോബിള് नोब्ळ् <[EMAIL PROTECTED]> wrote:

> Open a JIRA issue. We will use gzip on both ends of the pipe. On
> the slave side you can say
> <str name="zip">true</str>
> as an extra option to compress and send data from the server.
> --Noble
>
> On Wed, Oct 29, 2008 at 3:06 PM, Simon Collins
> <[EMAIL PROTECTED]> wrote:
> > I have now optimized the index - down to 325mb, it compresses down to
> > 20mb.
> >
> > I think the new replication thing in solr is great, but if it could
> > compress the files it's sending, it would be an awful lot more useful
> > when replicating, as we are, between sites.
> >
> > Simon Collins
> > Systems Analyst
> > shoe-shop.com ltd
> >
> > -----Original Message-----
> > From: Noble Paul നോബിള് नोब्ळ् [mailto:[EMAIL PROTECTED]]
> > Sent: 29 October 2008 03:29
> > To: solr-user@lucene.apache.org
> > Subject: Re: replication handler - compression
> >
> > The new replication feature does not use any unix commands, it is
> > pure java. On the fly compression is hard but possible.
> > I wish to repeat the question. Did you optimize the index? Because a
> > 10:1 compression is not usually observed in an optimized index. Our
> > own experiments showed compression of around 10:6 for optimized
> > indexes.
> >
> > --Noble
> >
> > On Wed, Oct 29, 2008 at 3:41 AM, Lance Norskog <[EMAIL PROTECTED]>
> > wrote:
> >> Aha! The hint to the actual problem: "When compressed with winzip".
> >> You are running Solr on Windows.
> >>
> >> Snapshots don't work on Windows: they depend on a Unix file system
> >> feature. You may be copying the entire index. Not just that, it could
> >> be inconsistent.
> >> This is a fine topic for a "best practices for Windows" wiki page.
> >>
> >> The 'scp' program is what you want. It has an option to compress on
> >> the fly without saving anything to disk. 'Rcopy' in particular has
> >> features to only copy what is not already at the target. The Putty
> >> suite 'pscp' program also has the compression feature.
> >>
> >> Lance
> >>
> >> -----Original Message-----
> >> From: Noble Paul നോബിള് नोब्ळ् [mailto:[EMAIL PROTECTED]]
> >> Sent: Monday, October 27, 2008 9:36 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: replication handler - compression
> >>
> >>> It is useful only if your bandwidth is very low.
> >>> Otherwise the cost of copying/compressing/decompressing can take up
> >>> more time than we save.
> >>
> >> I mean compressing and transferring. If the optimized index itself
> >> has a very high compression ratio then it is worth exploring the
> >> option of compressing and transferring. And do not assume that all
> >> the files in the index directory are transferred during replication.
> >> It only transfers the files which are used by the current commit
> >> point and the ones which are absent in the slave.
> >>
> >>> On Tue, Oct 28, 2008 at 2:49 AM, Simon Collins
> >>> <[EMAIL PROTECTED]> wrote:
> >>>> Is there an option on the replication handler to compress the files?
> >>>>
> >>>> I'm trying to replicate off site, and seem to have accumulated about
> >>>> 1.4gb. When compressed with winzip of all things I can get this down
> >>>> to about 10% of the size.
> >>>>
> >>>> Is compression in the pipeline / can it be if not!
> >>>>
> >>>> simon
> >>>
> >>> --
> >>> --Noble Paul
> >>
> >> --
> >> --Noble Paul
> >
> > --
> > --Noble Paul
>
> --
> --Noble Paul