[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded

2017-07-28 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HDFS-11786:

Issue Type: Improvement  (was: Bug)

> Add support to make copyFromLocal multi threaded
> 
>
>                 Key: HDFS-11786
>                 URL: https://issues.apache.org/jira/browse/HDFS-11786
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>             Fix For: 3.0.0-beta1
>
>         Attachments: HDFS-11786.001.patch, HDFS-11786.002.patch, 
> HDFS-11786.003.patch, HDFS-11786.004.patch, HDFS-11786.005.patch
>
>
> CopyFromLocal/Put is not currently multithreaded.
> When multiple files need to be uploaded to HDFS, a single thread reads each 
> file and copies its data to the cluster, one file at a time.
> This copy to HDFS can be made faster by uploading multiple files in parallel.
> I am attaching an initial patch to get some early feedback.
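
The idea in the description above, uploading several files at once instead of one after another, can be illustrated with a minimal Java sketch. This is not the code from the attached patches: the class name `ParallelCopySketch` and the method `copyAll` are hypothetical, and plain local file copies stand in for the writes to the cluster so that the example runs without a Hadoop installation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, NOT the HDFS-11786 implementation.
// Assumption: a local file copy stands in for the upload to the cluster.
class ParallelCopySketch {

  /**
   * Copies every source file into destDir using a fixed pool of worker
   * threads and returns the number of files copied.
   */
  static int copyAll(List<Path> sources, Path destDir, int threads)
      throws IOException, InterruptedException {
    if (threads < 1) {
      throw new IllegalArgumentException("threads must be >= 1");
    }
    Files.createDirectories(destDir);
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Submit one task per file; the pool limits how many run at once.
      List<Future<Path>> pending = new ArrayList<>();
      for (Path src : sources) {
        Callable<Path> task = () -> Files.copy(
            src, destDir.resolve(src.getFileName()),
            StandardCopyOption.REPLACE_EXISTING);
        pending.add(pool.submit(task));
      }
      // Wait for every copy and surface the first failure, if any.
      int copied = 0;
      for (Future<Path> result : pending) {
        try {
          result.get();
          copied++;
        } catch (ExecutionException e) {
          throw new IOException("copy failed", e.getCause());
        }
      }
      return copied;
    } finally {
      pool.shutdownNow();
    }
  }
}
```

With `threads` set to 1 the sketch behaves like the single-threaded copy described above; larger values let independent files be copied concurrently.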



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded

2017-07-16 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-11786:

      Resolution: Fixed
    Hadoop Flags: Reviewed
   Fix Version/s: 3.0.0-beta1
Target Version/s: 3.0.0-beta1
          Status: Resolved  (was: Patch Available)

[~msingh] Thank you for the contribution. I have committed this to the trunk.







[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded

2017-07-01 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-11786:
Attachment: HDFS-11786.005.patch

Thanks for the review, [~anu]. The latest patch should fix the checkstyle 
warnings as well.

Here is how the new help for the command will look:
{code}
HW13605:multi_thread_upload msingh$ hadoop-dist/target/hadoop-3.0.0-alpha4-SNAPSHOT/bin/hdfs dfs -help copyFromLocal
-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst> :
  Copy files from the local file system into fs. Copying fails if the file already
  exists, unless the -f flag is given.
  Flags:

  -p                 Preserves access and modification times, ownership and the
                     mode.
  -f                 Overwrites the destination if it already exists.
  -t <thread count>  Number of threads to be used, default is 1.
  -l                 Allow DataNode to lazily persist the file to disk. Forces
                     replication factor of 1. This flag will result in reduced
                     durability. Use with care.
  -d                 Skip creation of temporary file(<dst>._COPYING_).
{code}
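
To make the "default is 1" behaviour in the help text concrete, here is a small hedged Java sketch of how a `-t` value might be read from an argument list. It is not Hadoop's option parser: `ThreadCountOption` and `threadCount` are hypothetical names, and rejecting missing or non-positive values with an exception is an assumption made for the illustration.

```java
import java.util.List;

// Hypothetical sketch, NOT Hadoop's command-line parsing code.
class ThreadCountOption {

  // The help text states that one thread is used when -t is absent.
  static final int DEFAULT_THREADS = 1;

  /** Returns the value following "-t", or DEFAULT_THREADS if "-t" is absent. */
  static int threadCount(List<String> args) {
    int flag = args.indexOf("-t");
    if (flag < 0) {
      return DEFAULT_THREADS;
    }
    if (flag + 1 >= args.size()) {
      throw new IllegalArgumentException("-t requires a thread count");
    }
    int count;
    try {
      count = Integer.parseInt(args.get(flag + 1));
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("thread count must be an integer", e);
    }
    // Assumption for this sketch: values below 1 are rejected.
    if (count < 1) {
      throw new IllegalArgumentException("thread count must be >= 1");
    }
    return count;
  }
}
```

For example, an argument list such as `-f -t 4 localdir /dst` would yield 4, while the same list without `-t 4` would fall back to the default of 1.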







[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded

2017-07-01 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-11786:
Attachment: HDFS-11786.004.patch







[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded

2017-06-30 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-11786:
Attachment: HDFS-11786.003.patch

[~anu] Thanks for the review. I have changed the option to "-t". Please have a 
look again.







[jira] [Updated] (HDFS-11786) Add support to make copyFromLocal multi threaded

2017-06-30 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-11786:
Summary: Add support to make copyFromLocal multi threaded  (was: Add a new 
command for multi threaded Put/CopyFromLocal)



