[jira] [Updated] (HADOOP-8065) discp should have an option to compress data while copying.
[ https://issues.apache.org/jira/browse/HADOOP-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Veiss updated HADOOP-8065:
----------------------------------
    Attachment: HADOOP-8065-trunk_2015-11-04.patch

> discp should have an option to compress data while copying.
> -----------------------------------------------------------
>
>                 Key: HADOOP-8065
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8065
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.20.2
>            Reporter: Suresh Antony
>            Priority: Minor
>              Labels: distcp
>             Fix For: 0.20.2
>
>         Attachments: HADOOP-8065-trunk_2015-11-03.patch,
>                      HADOOP-8065-trunk_2015-11-04.patch,
>                      patch.distcp.2012-02-10
>
>
> We would like to compress the data while transferring it from our source
> system to our target system. One way to do this is to write a map/reduce job
> that compresses the data before or after the transfer. This looks inefficient.
> Since distcp is already reading and writing the data, it would be better if it
> could compress the data at the same time.
> The flip side is that the distcp -update option cannot check the file size
> before copying data; it can only check for the existence of the file.
> So I propose that if the -compress option is given, the file size is not checked.
> Also, when we copy a file, the appropriate extension needs to be added to the
> file name, depending on the compression type.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
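Editor's sketch of the idea in the issue description: compress the bytes as they stream from source to target, and add an extension that matches the compression type. This is a local-filesystem illustration in Python using only the standard library, not the attached patch and not Hadoop's codec API. The function name `copy_compressed` and the `CODEC_EXTENSIONS` table are hypothetical.

```python
import gzip
import shutil
from pathlib import Path

# Hypothetical codec-name-to-extension table (an assumption, not from the patch).
CODEC_EXTENSIONS = {"gzip": ".gz"}


def copy_compressed(src: Path, dst_dir: Path, codec: str = "gzip") -> Path:
    """Copy src into dst_dir, compressing the data while it is being copied.

    The target file name is the source name plus the codec's extension,
    mirroring the proposal that the extension depend on the compression type.
    """
    if codec not in CODEC_EXTENSIONS:
        raise ValueError(f"unsupported codec: {codec}")
    dst = dst_dir / (src.name + CODEC_EXTENSIONS[codec])
    # Single pass: read raw bytes from the source, write gzip bytes to the target.
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

The point of the sketch is that no second job is needed: the copy already reads and writes every byte once, so the compression is applied in that same pass.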
[jira] [Updated] (HADOOP-8065) discp should have an option to compress data while copying.
[ https://issues.apache.org/jira/browse/HADOOP-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Veiss updated HADOOP-8065:
----------------------------------
    Status: Patch Available  (was: Open)

We needed this feature internally, so I updated the patch for CDH 5.4 and from there to current trunk. I've attached the trunk version of the patch.

This also includes some tests, and doesn't allow a compression codec to be specified for an update, as there's no easy way to tell if a file is different between source and destination if one side is compressed.
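Editor's sketch of the restriction described in the comment above. It assumes, as the issue description implies, that an update decides whether to re-copy by comparing file sizes. Once the target is compressed, the sizes differ even when the content is identical, so that comparison tells you nothing. The names `differs_by_size` and `validate_options` are hypothetical and do not come from the patch.

```python
import gzip
from typing import Optional


def differs_by_size(src_size: int, dst_size: int) -> bool:
    """Size-only heuristic: treat the file as changed when the sizes differ."""
    return src_size != dst_size


def validate_options(update: bool, compress_codec: Optional[str]) -> None:
    """Reject the combination of an update and a compression codec."""
    if update and compress_codec:
        raise ValueError("a compression codec cannot be combined with an update")


# Identical content on both sides, but the target copy is gzip-compressed.
payload = b"2015-11-04 INFO copy ok\n" * 1000
compressed = gzip.compress(payload)

# The size check reports a difference although the data round-trips unchanged.
size_check_says_changed = differs_by_size(len(payload), len(compressed))
content_is_identical = gzip.decompress(compressed) == payload
```

Here both `size_check_says_changed` and `content_is_identical` are true, which is the ambiguity the comment refers to. Rejecting the option pair up front avoids it.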
[jira] [Updated] (HADOOP-8065) discp should have an option to compress data while copying.
[ https://issues.apache.org/jira/browse/HADOOP-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Veiss updated HADOOP-8065:
----------------------------------
    Attachment: HADOOP-8065-trunk_2015-11-03.patch
[jira] [Updated] (HADOOP-8065) discp should have an option to compress data while copying.
[ https://issues.apache.org/jira/browse/HADOOP-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Antony updated HADOOP-8065:
----------------------------------
    Attachment: patch.distcp.2012-02-10

Attaching patch for distcp.
Added an additional argument, -compress_codec, which takes the compression codec class.

Sample command line:
$HADOOP_HOME/bin/hadoop distcp -compress_codec org.apache.hadoop.io.compress.GzipCodec -log hdfs://hadoopmasterflume1/tmp/suresh/log -overwrite hdfs://hadoopmasterflume1/tmp/distcp_src hdfs://hadoopmasterflume1/tmp/distcp_tgt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira