[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded
[ https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andras Bokor updated HDFS-11786:
    Issue Type: Improvement  (was: Bug)

> Add support to make copyFromLocal multi threaded
>
>                 Key: HDFS-11786
>                 URL: https://issues.apache.org/jira/browse/HDFS-11786
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>             Fix For: 3.0.0-beta1
>
>         Attachments: HDFS-11786.001.patch, HDFS-11786.002.patch,
>                      HDFS-11786.003.patch, HDFS-11786.004.patch, HDFS-11786.005.patch
>
> CopyFromLocal/Put is not currently multithreaded.
> When multiple files need to be uploaded to HDFS, a single thread reads
> each file and then copies its data to the cluster.
> This copy to HDFS can be made faster by uploading multiple files in parallel.
> I am attaching the initial patch so that I can get some initial feedback.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded
[ https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer updated HDFS-11786:
          Resolution: Fixed
        Hadoop Flags: Reviewed
       Fix Version/s: 3.0.0-beta1
    Target Version/s: 3.0.0-beta1
              Status: Resolved  (was: Patch Available)

[~msingh] Thank you for the contribution. I have committed this to the trunk.
[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded
[ https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Kumar Singh updated HDFS-11786:
    Attachment: HDFS-11786.005.patch

Thanks for the review, [~anu]. The latest patch should fix the checkstyle warnings as well.

Here is what the new help for the command will look like:
{code}
HW13605:multi_thread_upload msingh$ hadoop-dist/target/hadoop-3.0.0-alpha4-SNAPSHOT/bin/hdfs dfs -help copyFromLocal
-copyFromLocal [-f] [-p] [-l] [-d] [-t ] ... :
  Copy files from the local file system into fs. Copying fails if the file
  already exists, unless the -f flag is given.
  Flags:

  -p  Preserves access and modification times, ownership and the mode.
  -f  Overwrites the destination if it already exists.
  -t  Number of threads to be used, default is 1.
  -l  Allow DataNode to lazily persist the file to disk. Forces replication
      factor of 1. This flag will result in reduced durability. Use with care.
  -d  Skip creation of temporary file(._COPYING_).
{code}
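For readers who want a feel for what the -t option implies, here is a minimal sketch of copying several files in parallel with a fixed-size thread pool. It is not the code from the attached patches: the class name ParallelCopySketch and the method copyAll are hypothetical, and a local-to-local java.nio copy stands in for the HDFS upload so that the example runs without a cluster. The only details taken from the help text above are that the thread count is configurable and defaults to 1.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: a local-file stand-in for a multi-threaded upload. */
public class ParallelCopySketch {

    /**
     * Hypothetical helper: copies every source file into destDir using at most
     * threadCount worker threads and returns the number of files copied.
     */
    public static int copyAll(List<Path> sources, Path destDir, int threadCount)
            throws IOException, InterruptedException {
        // Assumption: anything below 1 falls back to a single thread,
        // mirroring the documented default of 1.
        int threads = Math.max(1, threadCount);
        Files.createDirectories(destDir);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one copy task per file; the pool bounds the parallelism.
            List<Future<Path>> pending = new ArrayList<>();
            for (Path src : sources) {
                Callable<Path> task = () -> Files.copy(
                        src,
                        destDir.resolve(src.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
                pending.add(pool.submit(task));
            }
            // Wait for every task and surface the first failure as an IOException.
            int copied = 0;
            for (Future<Path> result : pending) {
                try {
                    result.get();
                    copied++;
                } catch (ExecutionException e) {
                    throw new IOException("copy failed", e.getCause());
                }
            }
            return copied;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Build a few throwaway files so the sketch can run anywhere.
        Path srcDir = Files.createTempDirectory("sketch-src");
        Path dstDir = Files.createTempDirectory("sketch-dst");
        List<Path> sources = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            sources.add(Files.write(srcDir.resolve("file" + i + ".txt"),
                    ("payload " + i).getBytes()));
        }
        int copied = copyAll(sources, dstDir, 2);
        System.out.println("copied " + copied + " files to " + dstDir);
    }
}
```

With threadCount set to 1 the sketch degenerates to the sequential behaviour described in the issue, which is why a default of 1 keeps existing usage unchanged.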
[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded
[ https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Kumar Singh updated HDFS-11786:
    Attachment: HDFS-11786.004.patch
[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded
[ https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Kumar Singh updated HDFS-11786:
    Attachment: HDFS-11786.003.patch

[~anu] Thanks for the review. I have changed the option to "-t". Please have a look again.
[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded
[ https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Kumar Singh updated HDFS-11786:
    Summary: Add support to make copyFromLocal multi threaded  (was: Add a new command for multi threaded Put/CopyFromLocal)