[jira] [Created] (HADOOP-15221) Swift driver should not fail if JSONUtils reports UnknownPropertyException

2018-02-12 Thread Chen He (JIRA)
Chen He created HADOOP-15221:


 Summary: Swift driver should not fail if JSONUtils reports 
UnknownPropertyException
 Key: HADOOP-15221
 URL: https://issues.apache.org/jira/browse/HADOOP-15221
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/swift
Reporter: Chen He
Assignee: Chen He


org.apache.hadoop.fs.swift.exceptions.SwiftJsonMarshallingException: 
org.codehaus.jackson.map.exc.UnrecognizedPropertyException: Unrecognized field 
We know the system keeps evolving and new fields will be added. From a 
compatibility point of view, however, an extra field added to the JSON should 
be logged, but for robustness it should not lead to a failure.
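A minimal sketch of the "log, don't fail" idea, using the Jackson 1.x 
(org.codehaus.jackson) API named in the stack trace above; the class and 
helper names are illustrative, not the actual patch:

{code}
// Sketch only: tolerate unknown JSON fields instead of letting
// UnrecognizedPropertyException fail the whole operation.
import java.io.IOException;
import org.codehaus.jackson.map.DeserializationConfig;
import org.codehaus.jackson.map.ObjectMapper;

public class LenientJsonUtil {
  private static final ObjectMapper MAPPER = new ObjectMapper();
  static {
    // Unknown fields are skipped during binding rather than thrown on;
    // a DeserializationProblemHandler could additionally log each one.
    MAPPER.configure(
        DeserializationConfig.Feature.FAIL_ON_UNKNOWN_PROPERTIES, false);
  }

  public static <T> T toObject(String json, Class<T> clazz) throws IOException {
    return MAPPER.readValue(json, clazz);
  }
}
{code}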






[jira] [Updated] (HADOOP-14716) SwiftNativeFileSystem should not eat the exception when rename

2017-08-08 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-14716:
-
Attachment: HADOOP-14716-WIP.patch

WIP patch. 

> SwiftNativeFileSystem should not eat the exception when rename
> --
>
> Key: HADOOP-14716
> URL: https://issues.apache.org/jira/browse/HADOOP-14716
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.8.1, 3.0.0-alpha3
>Reporter: Chen He
>Assignee: Chen He
>Priority: Minor
> Attachments: HADOOP-14716-WIP.patch
>
>
> Currently, if "rename" will eat excpetions and return "false" in 
> SwiftNativeFileSystem. It is not easy for user to find root cause about why 
> rename failed. It has to, at least, write out some logs instead of directly 
> eats these exceptions.






[jira] [Commented] (HADOOP-14716) SwiftNativeFileSystem should not eat the exception when rename

2017-08-02 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16111387#comment-16111387
 ] 

Chen He commented on HADOOP-14716:
--

Thank you for the quick reply, [~steve_l]. IMHO, HADOOP-11452 is very helpful; 
in the meantime, I will come up with a patch. 

> SwiftNativeFileSystem should not eat the exception when rename
> --
>
> Key: HADOOP-14716
> URL: https://issues.apache.org/jira/browse/HADOOP-14716
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.8.1, 3.0.0-alpha3
>Reporter: Chen He
>Assignee: Chen He
>Priority: Minor
>
> Currently, if "rename" will eat excpetions and return "false" in 
> SwiftNativeFileSystem. It is not easy for user to find root cause about why 
> rename failed. It has to, at least, write out some logs instead of directly 
> eats these exceptions.






[jira] [Created] (HADOOP-14716) SwiftNativeFileSystem should not eat the exception when rename

2017-08-01 Thread Chen He (JIRA)
Chen He created HADOOP-14716:


 Summary: SwiftNativeFileSystem should not eat the exception when 
rename
 Key: HADOOP-14716
 URL: https://issues.apache.org/jira/browse/HADOOP-14716
 Project: Hadoop Common
  Issue Type: Bug
  Components: tools
Affects Versions: 3.0.0-alpha3, 2.8.1
Reporter: Chen He
Assignee: Chen He
Priority: Minor


Currently, if "rename" will eat excpetions and return "false" in 
SwiftNativeFileSystem. It is not easy for user to find root cause about why 
rename failed. It has to, at least, write out some logs instead of directly 
eats these exceptions.
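A hypothetical sketch of the idea (not the WIP patch attached above); the 
wrapper and its names are illustrative:

{code}
// Sketch: keep rename()'s false-on-failure contract, but log the cause
// instead of silently swallowing it.
import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class RenameLogging {
  private static final Log LOG = LogFactory.getLog(RenameLogging.class);

  /** Stand-in for the store-level rename call. */
  interface RenameOp {
    void run() throws IOException;
  }

  public static boolean renameWithLogging(String src, String dst, RenameOp op) {
    try {
      op.run();
      return true;
    } catch (IOException e) {
      LOG.warn("rename(" + src + ", " + dst + ") failed", e);
      return false;
    }
  }
}
{code}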






[jira] [Created] (HADOOP-14641) hadoop-openstack driver reports input stream leaking

2017-07-10 Thread Chen He (JIRA)
Chen He created HADOOP-14641:


 Summary: hadoop-openstack driver reports input stream leaking
 Key: HADOOP-14641
 URL: https://issues.apache.org/jira/browse/HADOOP-14641
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.3
Reporter: Chen He


[2017-07-07 14:51:07,052] ERROR Input stream is leaking handles by not being 
closed() properly: HttpInputStreamWithRelease working with https://url/logs 
released=false dataConsumed=false 
(org.apache.hadoop.fs.swift.snative.SwiftNativeInputStream:259)
[2017-07-07 14:51:07,052] DEBUG Releasing connection to https://url/logs:  
finalize() (org.apache.hadoop.fs.swift.http.HttpInputStreamWithRelease:101)
java.lang.Exception: stack
at 
org.apache.hadoop.fs.swift.http.HttpInputStreamWithRelease.<init>(HttpInputStreamWithRelease.java:71)
at 
org.apache.hadoop.fs.swift.http.SwiftRestClient$10.extractResult(SwiftRestClient.java:1523)
at 
org.apache.hadoop.fs.swift.http.SwiftRestClient$10.extractResult(SwiftRestClient.java:1520)
at 
org.apache.hadoop.fs.swift.http.SwiftRestClient.perform(SwiftRestClient.java:1406)
at 
org.apache.hadoop.fs.swift.http.SwiftRestClient.doGet(SwiftRestClient.java:1520)
at 
org.apache.hadoop.fs.swift.http.SwiftRestClient.getData(SwiftRestClient.java:679)
at 
org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.getObject(SwiftNativeFileSystemStore.java:276)
at 
org.apache.hadoop.fs.swift.snative.SwiftNativeInputStream.<init>(SwiftNativeInputStream.java:104)
at 
org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.open(SwiftNativeFileSystem.java:555)
at 
org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.open(SwiftNativeFileSystem.java:536)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
at 
com.oracle.kafka.connect.swift.SwiftStorage.exists(SwiftStorage.java:74)
at io.confluent.connect.hdfs.DataWriter.createDir(DataWriter.java:371)
at io.confluent.connect.hdfs.DataWriter.<init>(DataWriter.java:175)
at 
com.oracle.kafka.connect.swift.SwiftSinkTask.start(SwiftSinkTask.java:78)
at 
org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:231)
at 
org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:145)
at 
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
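The warning above fires when a stream returned by FileSystem.open() is 
finalized without being closed. A defensive caller-side pattern, sketched 
assuming Java 7+ try-with-resources:

{code}
// Sketch: try-with-resources releases the underlying HTTP connection on all
// paths, so HttpInputStreamWithRelease never reaches finalize() unclosed.
import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SafeOpen {
  public static int readFirstByte(FileSystem fs, Path path) throws IOException {
    try (FSDataInputStream in = fs.open(path)) {
      return in.read();
    }
  }
}
{code}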






[jira] [Commented] (HADOOP-12554) Swift client to read credentials from a credential provider

2016-11-04 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637831#comment-15637831
 ] 

Chen He commented on HADOOP-12554:
--

I tested it against an OpenStack object store. It does not work:
{quote}
httpclient.HttpMethodDirector: Unable to respond to any of these challenges: 
{token=Token}
{quote}
Maybe the README is not clear?

> Swift client to read credentials from a credential provider
> ---
>
> Key: HADOOP-12554
> URL: https://issues.apache.org/jira/browse/HADOOP-12554
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Steve Loughran
>Assignee: ramtin
>Priority: Minor
> Attachments: HADOOP-12554.001.patch, HADOOP-12554.002.patch
>
>
> As HADOOP-12548 is going to do for s3, Swift should be reading credentials, 
> particularly passwords, from a credential provider. 
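For reference, the mechanism the issue targets is Configuration.getPassword(), 
which consults the providers listed in hadoop.security.credential.provider.path 
before falling back to the plain config value. A minimal sketch; the property 
key shown is illustrative:

{code}
// Sketch: read a Swift password via the credential-provider API.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

public class SwiftPasswordLookup {
  public static String lookup(Configuration conf, String key) throws IOException {
    // e.g. key = "fs.swift.service.myprovider.password" (illustrative name)
    char[] pass = conf.getPassword(key);
    return pass == null ? null : new String(pass);
  }
}
{code}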






[jira] [Resolved] (HADOOP-13570) Hadoop Swift driver should use new Apache httpclient

2016-09-01 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He resolved HADOOP-13570.
--
Resolution: Duplicate

Duplicate of HADOOP-11614; closing it.

> Hadoop Swift driver should use new Apache httpclient
> 
>
> Key: HADOOP-13570
> URL: https://issues.apache.org/jira/browse/HADOOP-13570
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.3, 2.6.4
>Reporter: Chen He
>
> The current Hadoop openstack module is still using Apache HttpClient v1.x, 
> which is too old. We need to update it to a newer version to catch up in 
> performance.
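For context, a sketch of the kind of call site an upgrade to Apache 
HttpComponents HttpClient 4.x implies; names are from the 4.x API, and this is 
illustrative rather than driver code:

{code}
// Sketch: a GET with HttpClient 4.x; both client and response are Closeable.
import java.io.IOException;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class HttpClient4Get {
  public static int statusOf(String url) throws IOException {
    try (CloseableHttpClient client = HttpClients.createDefault();
         CloseableHttpResponse resp = client.execute(new HttpGet(url))) {
      return resp.getStatusLine().getStatusCode();
    }
  }
}
{code}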






[jira] [Commented] (HADOOP-13570) Hadoop Swift driver should use new Apache httpclient

2016-09-01 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15456269#comment-15456269
 ] 

Chen He commented on HADOOP-13570:
--

Hi [~steve_l], thank you for pointing out the duplication.  I will comment on 
11614 and close this one.


> Hadoop Swift driver should use new Apache httpclient
> 
>
> Key: HADOOP-13570
> URL: https://issues.apache.org/jira/browse/HADOOP-13570
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.3, 2.6.4
>Reporter: Chen He
>
> The current Hadoop openstack module is still using Apache HttpClient v1.x, 
> which is too old. We need to update it to a newer version to catch up in 
> performance.






[jira] [Updated] (HADOOP-13570) Hadoop Swift driver should use new Apache httpclient

2016-08-31 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-13570:
-
Summary: Hadoop Swift driver should use new Apache httpclient  (was: Hadoop 
swift Driver should use new Apache httpclient)

> Hadoop Swift driver should use new Apache httpclient
> 
>
> Key: HADOOP-13570
> URL: https://issues.apache.org/jira/browse/HADOOP-13570
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.3, 2.6.4
>Reporter: Chen He
>
> The current Hadoop openstack module is still using Apache HttpClient v1.x, 
> which is too old. We need to update it to a newer version to catch up in 
> performance.






[jira] [Created] (HADOOP-13570) Hadoop swift Driver should use new Apache httpclient

2016-08-31 Thread Chen He (JIRA)
Chen He created HADOOP-13570:


 Summary: Hadoop swift Driver should use new Apache httpclient
 Key: HADOOP-13570
 URL: https://issues.apache.org/jira/browse/HADOOP-13570
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 2.6.4, 2.7.3
Reporter: Chen He


The current Hadoop openstack module is still using Apache HttpClient v1.x, 
which is too old. We need to update it to a newer version to catch up in 
performance.






[jira] [Updated] (HADOOP-13570) Hadoop swift Driver should use new Apache httpclient

2016-08-31 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-13570:
-
Component/s: fs/swift

> Hadoop swift Driver should use new Apache httpclient
> 
>
> Key: HADOOP-13570
> URL: https://issues.apache.org/jira/browse/HADOOP-13570
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.3, 2.6.4
>Reporter: Chen He
>
> The current Hadoop openstack module is still using Apache HttpClient v1.x, 
> which is too old. We need to update it to a newer version to catch up in 
> performance.






[jira] [Updated] (HADOOP-13570) Hadoop swift Driver should use new Apache httpclient

2016-08-31 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-13570:
-
Description: Current Hadoop openstack module is still using apache 
httpclient v1.x. It is too old. We need to update it to a higher version to 
catch up in performance.  (was: Current Hadoop openstack module is still using 
apache httpclient v1.x. It is too old. We need to update it to a higher version 
to catch up the performance.)

> Hadoop swift Driver should use new Apache httpclient
> 
>
> Key: HADOOP-13570
> URL: https://issues.apache.org/jira/browse/HADOOP-13570
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.3, 2.6.4
>Reporter: Chen He
>
> The current Hadoop openstack module is still using Apache HttpClient v1.x, 
> which is too old. We need to update it to a newer version to catch up in 
> performance.






[jira] [Comment Edited] (HADOOP-9565) Add a Blobstore interface to add to blobstore FileSystems

2016-08-24 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435538#comment-15435538
 ] 

Chen He edited comment on HADOOP-9565 at 8/24/16 7:46 PM:
--

Hi [~steve_l], thank you for spending time on my question. The new version of 
FileOutputCommitter has algorithm 2, which does not do the serial rename of 
all tasks in commitJob. I just found the parameter. It should resolve our 
problem. 


was (Author: airbots):
Hi [~steve_l], thank you for spending time on my question. The new version of 
FileOutputCommitter has algorithm 2 which does not have serial rename of all 
task in commitJob. Just find the parameter. It should resolve our problem. 
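The parameter referenced above is the committer algorithm switch from 
MAPREDUCE-4815; a sketch of setting it:

{code}
// Sketch: algorithm version 2 commits task output directly to the
// destination, avoiding the serial rename pass in commitJob.
import org.apache.hadoop.conf.Configuration;

public class CommitterConfig {
  public static Configuration withV2Committer() {
    Configuration conf = new Configuration();
    conf.setInt("mapreduce.fileoutputcommitter.algorithm.version", 2);
    return conf;
  }
}
{code}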

> Add a Blobstore interface to add to blobstore FileSystems
> -
>
> Key: HADOOP-9565
> URL: https://issues.apache.org/jira/browse/HADOOP-9565
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/s3, fs/swift
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Pieter Reuse
> Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
> HADOOP-9565-003.patch, HADOOP-9565-004.patch, HADOOP-9565-005.patch, 
> HADOOP-9565-006.patch, HADOOP-9565-branch-2-007.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are 
> really blobstores, with different atomicity and consistency guarantees, by 
> adding a {{Blobstore}} interface to them. 
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming 
> that all blobstores implement a server-side copy operation as a substitute 
> for rename.






[jira] [Commented] (HADOOP-9565) Add a Blobstore interface to add to blobstore FileSystems

2016-08-24 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435538#comment-15435538
 ] 

Chen He commented on HADOOP-9565:
-

Hi [~steve_l], thank you for spending time on my question. The new version of 
FileOutputCommitter has algorithm 2, which does not do the serial rename of 
all tasks in commitJob. I just found the parameter. It should resolve our 
problem. 

> Add a Blobstore interface to add to blobstore FileSystems
> -
>
> Key: HADOOP-9565
> URL: https://issues.apache.org/jira/browse/HADOOP-9565
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/s3, fs/swift
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Pieter Reuse
> Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
> HADOOP-9565-003.patch, HADOOP-9565-004.patch, HADOOP-9565-005.patch, 
> HADOOP-9565-006.patch, HADOOP-9565-branch-2-007.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are 
> really blobstores, with different atomicity and consistency guarantees, by 
> adding a {{Blobstore}} interface to them. 
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming 
> that all blobstores implement a server-side copy operation as a substitute 
> for rename.






[jira] [Comment Edited] (HADOOP-9565) Add a Blobstore interface to add to blobstore FileSystems

2016-08-19 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428975#comment-15428975
 ] 

Chen He edited comment on HADOOP-9565 at 8/19/16 10:52 PM:
---

From our experiences, the main renaming overhead comes from 
"FileOutputCommitter.commitTask()", because it moves the files from the temp 
dir to the dest dir. Some frameworks may not care whether the final task files 
are under "dst/_temporary/0/_temporary/" or "dst/". Why don't we add a 
parameter such as "mapreduce.skip.task.commit" (default false), so that once a 
task is done, the output just stays in "dst/_temporary/0/_temporary/"? Then 
the next job or application just needs to take "dst/" as the input dir; it 
does not care whether the layout is deep or not. This avoids the atomic-write 
issue, provides compatibility, and avoids the rename overhead. If there is no 
objection, I am happy to create a JIRA to track that.


was (Author: airbots):
From our experiences, the main renaming overhead comes from 
"FileOutputCommitter.commitTask()". Because it moves the files from temp dir 
to dest dir. Some frameworks may not care whether the final task files are 
under "dst/_temporary/0/_temporary/" or "dst/". Why don't we add a parameter 
such as "mapreduce.skip.task.commit" parameter (default is false), so that 
once a task is done, the output just stay in "dst/_temporary/0/_temporary/". 
Then, the next job or application just need to take the "dst/" as input dir, 
they do not care about whether is is deep or not. It avoids the atomicwrite 
issue, provide compatibility, and avoid rename overhead. If there is no 
objection, I will create a JIRA to tracking that.

> Add a Blobstore interface to add to blobstore FileSystems
> -
>
> Key: HADOOP-9565
> URL: https://issues.apache.org/jira/browse/HADOOP-9565
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/s3, fs/swift
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Pieter Reuse
> Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
> HADOOP-9565-003.patch, HADOOP-9565-004.patch, HADOOP-9565-005.patch, 
> HADOOP-9565-006.patch, HADOOP-9565-branch-2-007.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are 
> really blobstores, with different atomicity and consistency guarantees, by 
> adding a {{Blobstore}} interface to them. 
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming 
> that all blobstores implement a server-side copy operation as a substitute 
> for rename.






[jira] [Commented] (HADOOP-9565) Add a Blobstore interface to add to blobstore FileSystems

2016-08-19 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428975#comment-15428975
 ] 

Chen He commented on HADOOP-9565:
-

From our experiences, the main renaming overhead comes from 
"FileOutputCommitter.commitTask()", because it moves the files from the temp 
dir to the dest dir. Some frameworks may not care whether the final task files 
are under "dst/_temporary/0/_temporary/" or "dst/". Why don't we add a 
parameter such as "mapreduce.skip.task.commit" (default false), so that once a 
task is done, the output just stays in "dst/_temporary/0/_temporary/"? Then 
the next job or application just needs to take "dst/" as the input dir; it 
does not care whether the layout is deep or not. This avoids the atomic-write 
issue, provides compatibility, and avoids the rename overhead. If there is no 
objection, I will create a JIRA to track that.
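Purely as a sketch of the proposal above; the property is hypothetical and was 
never added to Hadoop:

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class SkippableCommit {
  // "mapreduce.skip.task.commit" is the proposed, hypothetical key.
  public static void maybeCommit(Configuration conf, OutputCommitter committer,
      TaskAttemptContext context) throws IOException {
    if (conf.getBoolean("mapreduce.skip.task.commit", false)) {
      return;   // leave output under dst/_temporary/0/_temporary/, as proposed
    }
    committer.commitTask(context);   // the rename-heavy step
  }
}
{code}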

> Add a Blobstore interface to add to blobstore FileSystems
> -
>
> Key: HADOOP-9565
> URL: https://issues.apache.org/jira/browse/HADOOP-9565
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/s3, fs/swift
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Pieter Reuse
> Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
> HADOOP-9565-003.patch, HADOOP-9565-004.patch, HADOOP-9565-005.patch, 
> HADOOP-9565-006.patch, HADOOP-9565-branch-2-007.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are 
> really blobstores, with different atomicity and consistency guarantees, by 
> adding a {{Blobstore}} interface to them. 
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming 
> that all blobstores implement a server-side copy operation as a substitute 
> for rename.






[jira] [Commented] (HADOOP-11786) Fix Javadoc typos in org.apache.hadoop.fs.FileSystem

2016-08-17 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425479#comment-15425479
 ] 

Chen He commented on HADOOP-11786:
--

Thank you for the work, [~anu] and [~boky01]

> Fix Javadoc typos in org.apache.hadoop.fs.FileSystem
> 
>
> Key: HADOOP-11786
> URL: https://issues.apache.org/jira/browse/HADOOP-11786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.6.0
>Reporter: Chen He
>Assignee: Andras Bokor
>Priority: Trivial
>  Labels: newbie++
> Attachments: HADOOP-11786.patch
>
>
> /**
>  * Resets all statistics to 0.
>  *
>  * In order to reset, we add up all the thread-local statistics data, and
>  * set rootData to the negative of that.
>  *
>  * This may seem like a counterintuitive way to reset the statsitics.  Why
>  * can't we just zero out all the thread-local data?  Well, thread-local
>  * data can only be modified by the thread that owns it.  If we tried to
>  * modify the thread-local data from this thread, our modification might 
> get
>  * interleaved with a read-modify-write operation done by the thread that
>  * owns the data.  That would result in our update getting lost.
>  *
>  * The approach used here avoids this problem because it only ever reads
>  * (not writes) the thread-local data.  Both reads and writes to rootData
>  * are done under the lock, so we're free to modify rootData from any 
> thread
>  * that holds the lock.
>  */
> etc.






[jira] [Updated] (HADOOP-13211) Swift driver should have a configurable retry feature when encounter 5xx error

2016-07-20 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-13211:
-
Assignee: (was: Chen He)

> Swift driver should have a configurable retry feature when encounter 5xx error
> -
>
> Key: HADOOP-13211
> URL: https://issues.apache.org/jira/browse/HADOOP-13211
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.2
>Reporter: Chen He
>
> In the current code, if the Swift driver meets an HTTP 5xx, it throws an 
> exception and stops. As a driver, it would be more sophisticated if it could 
> retry a configurable number of times before reporting failure. There are two 
> reasons I can imagine:
> 1. If the server is really busy, it may drop some requests to avoid a DDoS 
> attack.
> 2. If the server is accidentally unavailable for a short period of time and 
> comes back again, we may not need to fail the whole driver. Just recording 
> the exception and retrying may be more flexible. 






[jira] [Commented] (HADOOP-13211) Swift driver should have a configurable retry feature when encounter 5xx error

2016-07-20 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387082#comment-15387082
 ] 

Chen He commented on HADOOP-13211:
--

Thank you for the update, [~ste...@apache.org]. If we want to avoid data loss 
on retry, how about a recursive retry? It looks a little bit ugly but won't 
fail. For the case where the object store server is dead, though, recursion is 
a nightmare.

> Swift driver should have a configurable retry feature when encounter 5xx error
> -
>
> Key: HADOOP-13211
> URL: https://issues.apache.org/jira/browse/HADOOP-13211
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.2
>Reporter: Chen He
>Assignee: Chen He
>
> In the current code, if the Swift driver meets an HTTP 5xx, it throws an 
> exception and stops. As a driver, it would be more sophisticated if it could 
> retry a configurable number of times before reporting failure. There are two 
> reasons I can imagine:
> 1. If the server is really busy, it may drop some requests to avoid a DDoS 
> attack.
> 2. If the server is accidentally unavailable for a short period of time and 
> comes back again, we may not need to fail the whole driver. Just recording 
> the exception and retrying may be more flexible. 






[jira] [Commented] (HADOOP-13211) Swift driver should have a configurable retry feature when encounter 5xx error

2016-05-31 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309263#comment-15309263
 ] 

Chen He commented on HADOOP-13211:
--

Thank you for the reply, [~ste...@apache.org]. 

IMHO, the hadoop openstack driver is a bridge between HDFS and the OpenStack 
object store. MR and other native Hadoop frameworks should be able to utilize 
the Hadoop IPC retry. With the increasing popularity of HDFS, other computing 
frameworks like Spark and in-memory storage systems like Tachyon are also 
using the hadoop-openstack driver. I am not sure whether the Hadoop IPC retry 
will trigger when Spark or other frameworks use the hadoop-openstack driver. 

Those frameworks have retry at the task level; however, retrying a task can be 
more costly than just retrying at the driver level. 

On data loss, it is a really good catch. If the server keeps failing and 
returning 5xx, the upload will finally fail. The object store is not a file 
system and may not guarantee file-system-level integrity. I can't figure out a 
scenario where data loss is caused by retry. Could you provide a suggestion? 

> Swift driver should have a configurable retry feature when encounter 5xx error
> -
>
> Key: HADOOP-13211
> URL: https://issues.apache.org/jira/browse/HADOOP-13211
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.2
>Reporter: Chen He
>Assignee: Chen He
>
> In the current code, if the Swift driver meets an HTTP 5xx, it throws an 
> exception and stops. As a driver, it would be more sophisticated if it could 
> retry a configurable number of times before reporting failure. There are two 
> reasons I can imagine:
> 1. If the server is really busy, it may drop some requests to avoid a DDoS 
> attack.
> 2. If the server is accidentally unavailable for a short period of time and 
> comes back again, we may not need to fail the whole driver. Just recording 
> the exception and retrying may be more flexible. 






[jira] [Created] (HADOOP-13211) Swift driver should have a configurable retry feature when encounter 5xx error

2016-05-26 Thread Chen He (JIRA)
Chen He created HADOOP-13211:


 Summary: Swift driver should have a configurable retry feature 
when encounter 5xx error
 Key: HADOOP-13211
 URL: https://issues.apache.org/jira/browse/HADOOP-13211
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.7.2
Reporter: Chen He
Assignee: Chen He


In the current code, if the Swift driver meets an HTTP 5xx, it throws an 
exception and stops. As a driver, it would be more sophisticated if it could 
retry a configurable number of times before reporting failure. There are two 
reasons I can imagine:

1. If the server is really busy, it may drop some requests to avoid a DDoS 
attack.

2. If the server is accidentally unavailable for a short period of time and 
comes back again, we may not need to fail the whole driver. Just recording the 
exception and retrying may be more flexible. 
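A sketch of the proposed behavior, assuming an idempotent request and a simple 
exponential backoff; this is illustrative, not driver code:

{code}
import java.util.concurrent.Callable;

public class RetryOn5xx {
  /** Retry while the HTTP status is 5xx, up to maxRetries extra attempts. */
  public static int executeWithRetry(Callable<Integer> request, int maxRetries)
      throws Exception {
    int status = request.call();
    for (int attempt = 1; attempt <= maxRetries && status >= 500; attempt++) {
      Thread.sleep(1000L << (attempt - 1));   // back off: 1s, 2s, 4s, ...
      status = request.call();
    }
    return status;   // caller reports failure if this is still 5xx
  }
}
{code}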






[jira] [Commented] (HADOOP-12057) swiftfs rename on partitioned file attempts to consolidate partitions

2016-04-28 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263341#comment-15263341
 ] 

Chen He commented on HADOOP-12057:
--

It automatically skips "unit tests" if the auth-keys.xml is not configured.

> swiftfs rename on partitioned file attempts to consolidate partitions
> -
>
> Key: HADOOP-12057
> URL: https://issues.apache.org/jira/browse/HADOOP-12057
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Reporter: David Dobbins
>Assignee: David Dobbins
> Attachments: HADOOP-12057-006.patch, HADOOP-12057-008.patch, 
> HADOOP-12057.007.patch, HADOOP-12057.patch, HADOOP-12057.patch, 
> HADOOP-12057.patch, HADOOP-12057.patch, HADOOP-12057.patch
>
>
> In the swift filesystem for openstack, a rename operation on a partitioned 
> file uses the swift COPY operation, which attempts to consolidate all of the 
> partitions into a single object.  This causes the rename to fail when the 
> total size of all the partitions exceeds the maximum object size for swift.  
> Since partitioned files are primarily created to allow a file to exceed the 
> maximum object size, this bug makes writing to swift extremely unreliable.






[jira] [Commented] (HADOOP-12057) swiftfs rename on partitioned file attempts to consolidate partitions

2016-04-28 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263308#comment-15263308
 ] 

Chen He commented on HADOOP-12057:
--

I think the reason the patch got a -1 is that it fails a unit test.
I agree it works in some cases.

Chen He@Oracle from Samsung Mega



> swiftfs rename on partitioned file attempts to consolidate partitions
> -
>
> Key: HADOOP-12057
> URL: https://issues.apache.org/jira/browse/HADOOP-12057
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Reporter: David Dobbins
>Assignee: David Dobbins
> Attachments: HADOOP-12057-006.patch, HADOOP-12057-008.patch, 
> HADOOP-12057.007.patch, HADOOP-12057.patch, HADOOP-12057.patch, 
> HADOOP-12057.patch, HADOOP-12057.patch, HADOOP-12057.patch
>
>
> In the swift filesystem for openstack, a rename operation on a partitioned 
> file uses the swift COPY operation, which attempts to consolidate all of the 
> partitions into a single object.  This causes the rename to fail when the 
> total size of all the partitions exceeds the maximum object size for swift.  
> Since partitioned files are primarily created to allow a file to exceed the 
> maximum object size, this bug makes writing to swift extremely unreliable.






[jira] [Commented] (HADOOP-12057) swiftfs rename on partitioned file attempts to consolidate partitions

2016-04-28 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263265#comment-15263265
 ] 

Chen He commented on HADOOP-12057:
--

Which object store are you talking about? It is also a server-side 
configuration. If your server's configured maximum object size is 10GB, you 
will not hit this problem. Try to get this info first, or try to copy a 1TB 
file to the object store and see what happens. :) I just suspect that no 
server will support such a large single object. Please feel free to provide 
feedback. Thanks!

> swiftfs rename on partitioned file attempts to consolidate partitions
> -
>
> Key: HADOOP-12057
> URL: https://issues.apache.org/jira/browse/HADOOP-12057
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Reporter: David Dobbins
>Assignee: David Dobbins
> Attachments: HADOOP-12057-006.patch, HADOOP-12057-008.patch, 
> HADOOP-12057.007.patch, HADOOP-12057.patch, HADOOP-12057.patch, 
> HADOOP-12057.patch, HADOOP-12057.patch, HADOOP-12057.patch
>
>
> In the swift filesystem for openstack, a rename operation on a partitioned 
> file uses the swift COPY operation, which attempts to consolidate all of the 
> partitions into a single object.  This causes the rename to fail when the 
> total size of all the partitions exceeds the maximum object size for swift.  
> Since partitioned files are primarily created to allow a file to exceed the 
> maximum object size, this bug makes writing to swift extremely unreliable.






[jira] [Commented] (HADOOP-13021) Hadoop swift driver unit test should use unique directory for each run

2016-04-25 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256638#comment-15256638
 ] 

Chen He commented on HADOOP-13021:
--

Thank you for the quick reply, [~steve_l]. I agree, we need to clean before and 
after tests. I will update the patch.

> Hadoop swift driver unit test should use unique directory for each run
> --
>
> Key: HADOOP-13021
> URL: https://issues.apache.org/jira/browse/HADOOP-13021
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Affects Versions: 2.7.2
>Reporter: Chen He
>Assignee: Chen He
>  Labels: unit-test
> Attachments: HADOOP-13021.001.patch
>
>
> Since all "unit test" in swift package are actually functionality test, it 
> requires server's information in the core-site.xml file. However, multiple 
> unit test runs on difference machines using the same core-site.xml file will 
> result in some unit tests failure. For example:
> In TestSwiftFileSystemBasicOps.java
> public void testMkDir() throws Throwable {
> Path path = new Path("/test/MkDir");
> fs.mkdirs(path);
> //success then -so try a recursive operation
> fs.delete(path, true);
>   }
> It is possible that machine A and B are running "mvn clean install" using 
> same core-site.xml file. However, machine A run testMkDir() first and delete 
> the dir, but machine B just tried to run fs.delete(path,true). It will report 
> failure. This is just an example. There are many similar cases in the unit 
> test sets. I would propose we use a unique dir for each unit test run instead 
> of using "Path path = new Path("/test/MkDir")" for all concurrent runs





[jira] [Updated] (HADOOP-13021) Hadoop swift driver unit test should use unique directory for each run

2016-04-24 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-13021:
-
Attachment: HADOOP-13021.001.patch

> Hadoop swift driver unit test should use unique directory for each run
> --
>
> Key: HADOOP-13021
> URL: https://issues.apache.org/jira/browse/HADOOP-13021
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Affects Versions: 2.7.2
>Reporter: Chen He
>Assignee: Chen He
>  Labels: unit-test
> Attachments: HADOOP-13021.001.patch
>
>
> Since all "unit test" in swift package are actually functionality test, it 
> requires server's information in the core-site.xml file. However, multiple 
> unit test runs on difference machines using the same core-site.xml file will 
> result in some unit tests failure. For example:
> In TestSwiftFileSystemBasicOps.java
> public void testMkDir() throws Throwable {
> Path path = new Path("/test/MkDir");
> fs.mkdirs(path);
> //success then -so try a recursive operation
> fs.delete(path, true);
>   }
> It is possible that machine A and B are running "mvn clean install" using 
> same core-site.xml file. However, machine A run testMkDir() first and delete 
> the dir, but machine B just tried to run fs.delete(path,true). It will report 
> failure. This is just an example. There are many similar cases in the unit 
> test sets. I would propose we use a unique dir for each unit test run instead 
> of using "Path path = new Path("/test/MkDir")" for all concurrent runs





[jira] [Commented] (HADOOP-13021) Hadoop swift driver unit test should use unique directory for each run

2016-04-22 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254431#comment-15254431
 ] 

Chen He commented on HADOOP-13021:
--

Thank you for the reply, [~ste...@apache.org]. I agree with you. However, 
there could be corner cases such as JVM crashes or unit tests terminated by an 
outage. Even if we set a different value for each machine, for example machine 
A having its own bucket, a previous outage can leave behind leftover 
directories or files, and the next unit test run is then inclined to report an 
error. 

I propose we use a timestamp for those hard-coded values. Combined with your 
suggestion, we can guarantee that every unit test run on every machine uses 
different hard-coded values. Then we may be a little bit safer than with the 
current solution.

> Hadoop swift driver unit test should use unique directory for each run
> --
>
> Key: HADOOP-13021
> URL: https://issues.apache.org/jira/browse/HADOOP-13021
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Affects Versions: 2.7.2
>Reporter: Chen He
>Assignee: Chen He
>  Labels: unit-test
>
> Since all "unit test" in swift package are actually functionality test, it 
> requires server's information in the core-site.xml file. However, multiple 
> unit test runs on difference machines using the same core-site.xml file will 
> result in some unit tests failure. For example:
> In TestSwiftFileSystemBasicOps.java
> public void testMkDir() throws Throwable {
> Path path = new Path("/test/MkDir");
> fs.mkdirs(path);
> //success then -so try a recursive operation
> fs.delete(path, true);
>   }
> It is possible that machine A and B are running "mvn clean install" using 
> same core-site.xml file. However, machine A run testMkDir() first and delete 
> the dir, but machine B just tried to run fs.delete(path,true). It will report 
> failure. This is just an example. There are many similar cases in the unit 
> test sets. I would propose we use a unique dir for each unit test run instead 
> of using "Path path = new Path("/test/MkDir")" for all concurrent runs





[jira] [Updated] (HADOOP-12291) Add support for nested groups in LdapGroupsMapping

2016-04-21 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12291:
-
Fix Version/s: (was: 2.8.0)

> Add support for nested groups in LdapGroupsMapping
> --
>
> Key: HADOOP-12291
> URL: https://issues.apache.org/jira/browse/HADOOP-12291
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.8.0
>Reporter: Gautam Gopalakrishnan
>Assignee: Esther Kundin
>  Labels: features, patch
> Attachments: HADOOP-12291.001.patch
>
>
> When using {{LdapGroupsMapping}} with Hadoop, nested groups are not 
> supported. So for example if user {{jdoe}} is part of group A which is a 
> member of group B, the group mapping currently returns only group A.
> Currently this facility is available with {{ShellBasedUnixGroupsMapping}} and 
> SSSD (or similar tools) but would be good to have this feature as part of 
> {{LdapGroupsMapping}} directly.





[jira] [Commented] (HADOOP-11786) Fix Javadoc typos in org.apache.hadoop.fs.FileSystem

2016-04-19 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249119#comment-15249119
 ] 

Chen He commented on HADOOP-11786:
--

Thank you for the patch, [~boky01]. I will review it this week.

> Fix Javadoc typos in org.apache.hadoop.fs.FileSystem
> 
>
> Key: HADOOP-11786
> URL: https://issues.apache.org/jira/browse/HADOOP-11786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.6.0
>Reporter: Chen He
>Assignee: Yanjun Wang
>Priority: Trivial
>  Labels: newbie++
> Attachments: HADOOP-11786.patch
>
>
> /**
>  * Resets all statistics to 0.
>  *
>  * In order to reset, we add up all the thread-local statistics data, and
>  * set rootData to the negative of that.
>  *
>  * This may seem like a counterintuitive way to reset the statsitics.  Why
>  * can't we just zero out all the thread-local data?  Well, thread-local
>  * data can only be modified by the thread that owns it.  If we tried to
>  * modify the thread-local data from this thread, our modification might 
> get
>  * interleaved with a read-modify-write operation done by the thread that
>  * owns the data.  That would result in our update getting lost.
>  *
>  * The approach used here avoids this problem because it only ever reads
>  * (not writes) the thread-local data.  Both reads and writes to rootData
>  * are done under the lock, so we're free to modify rootData from any 
> thread
>  * that holds the lock.
>  */
> etc.





[jira] [Created] (HADOOP-13021) Hadoop swift driver unit test should use unique directory each run

2016-04-13 Thread Chen He (JIRA)
Chen He created HADOOP-13021:


 Summary: Hadoop swift driver unit test should use unique directory 
each run
 Key: HADOOP-13021
 URL: https://issues.apache.org/jira/browse/HADOOP-13021
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/swift
Affects Versions: 2.7.2
Reporter: Chen He
Assignee: Chen He


Since all "unit test" in swift package are actually functionality test, it 
requires server's information in the core-site.xml file. However, multiple unit 
test runs on difference machines using the same core-site.xml file will result 
in some unit tests failure. For example:
In TestSwiftFileSystemBasicOps.java
public void testMkDir() throws Throwable {
Path path = new Path("/test/MkDir");
fs.mkdirs(path);
//success then -so try a recursive operation
fs.delete(path, true);
  }

It is possible that machine A and B are running "mvn clean install" using same 
core-site.xml file. However, machine A run testMkDir() first and delete the 
dir, but machine B just tried to run fs.delete(path,true). It will report 
failure. This is just an example. There are many similar cases in the unit test 
sets. I would propose we use a unique dir for each unit test run instead of 
using "Path path = new Path("/test/MkDir")" for all concurrent runs





[jira] [Updated] (HADOOP-13021) Hadoop swift driver unit test should use unique directory for each run

2016-04-13 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-13021:
-
Labels: unit-test  (was: )

> Hadoop swift driver unit test should use unique directory for each run
> --
>
> Key: HADOOP-13021
> URL: https://issues.apache.org/jira/browse/HADOOP-13021
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Affects Versions: 2.7.2
>Reporter: Chen He
>Assignee: Chen He
>  Labels: unit-test
>
> Since all "unit test" in swift package are actually functionality test, it 
> requires server's information in the core-site.xml file. However, multiple 
> unit test runs on difference machines using the same core-site.xml file will 
> result in some unit tests failure. For example:
> In TestSwiftFileSystemBasicOps.java
> public void testMkDir() throws Throwable {
> Path path = new Path("/test/MkDir");
> fs.mkdirs(path);
> //success then -so try a recursive operation
> fs.delete(path, true);
>   }
> It is possible that machine A and B are running "mvn clean install" using 
> same core-site.xml file. However, machine A run testMkDir() first and delete 
> the dir, but machine B just tried to run fs.delete(path,true). It will report 
> failure. This is just an example. There are many similar cases in the unit 
> test sets. I would propose we use a unique dir for each unit test run instead 
> of using "Path path = new Path("/test/MkDir")" for all concurrent runs





[jira] [Updated] (HADOOP-13021) Hadoop swift driver unit test should use unique directory for each run

2016-04-13 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-13021:
-
Summary: Hadoop swift driver unit test should use unique directory for each 
run  (was: Hadoop swift driver unit test should use unique directory each run)

> Hadoop swift driver unit test should use unique directory for each run
> --
>
> Key: HADOOP-13021
> URL: https://issues.apache.org/jira/browse/HADOOP-13021
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Affects Versions: 2.7.2
>Reporter: Chen He
>Assignee: Chen He
>  Labels: unit-test
>
> Since all "unit test" in swift package are actually functionality test, it 
> requires server's information in the core-site.xml file. However, multiple 
> unit test runs on difference machines using the same core-site.xml file will 
> result in some unit tests failure. For example:
> In TestSwiftFileSystemBasicOps.java
> public void testMkDir() throws Throwable {
> Path path = new Path("/test/MkDir");
> fs.mkdirs(path);
> //success then -so try a recursive operation
> fs.delete(path, true);
>   }
> It is possible that machine A and B are running "mvn clean install" using 
> same core-site.xml file. However, machine A run testMkDir() first and delete 
> the dir, but machine B just tried to run fs.delete(path,true). It will report 
> failure. This is just an example. There are many similar cases in the unit 
> test sets. I would propose we use a unique dir for each unit test run instead 
> of using "Path path = new Path("/test/MkDir")" for all concurrent runs





[jira] [Updated] (HADOOP-12501) Enable SwiftNativeFileSystem to ACLs

2016-03-10 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12501:
-
Summary: Enable SwiftNativeFileSystem to ACLs  (was: Enable 
SwiftNativeFileSystem to preserve user, group, permission)

> Enable SwiftNativeFileSystem to ACLs
> 
>
> Key: HADOOP-12501
> URL: https://issues.apache.org/jira/browse/HADOOP-12501
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>Assignee: Chen He
>
> Currently, if a user copies files/dirs from the local FS or HDFS to the 
> Swift object store, u/g/p (user/group/permission) is lost. There should be a 
> way to preserve u/g/p. It will benefit transfers of large numbers of 
> files/dirs between HDFS/local FS and the Swift object store. We also need to 
> be careful, since Hadoop prevents general users from changing u/g/p, 
> especially if Kerberos is enabled.  





[jira] [Updated] (HADOOP-12735) Fix typo in core-default.xml property

2016-01-23 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12735:
-
Affects Version/s: (was: 2.8.0)
   2.7.1

> Fix typo in core-default.xml property
> -
>
> Key: HADOOP-12735
> URL: https://issues.apache.org/jira/browse/HADOOP-12735
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Minor
>  Labels: supportability
>
> The property as defined in core-default.xml is
> bq.  hadoop.work.around.non.threadsafe.getpwuid
> But in NativeIO.java (the only place I can see a similar reference), the 
> property is defined as:
> bq. static final String WORKAROUND_NON_THREADSAFE_CALLS_KEY = 
> "hadoop.workaround.non.threadsafe.getpwuid";
> Note the extra period (.) in the word "workaround".
> Should the code be made to match the property or vice versa?





[jira] [Commented] (HADOOP-12735) Fix typo in core-default.xml property

2016-01-23 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113864#comment-15113864
 ] 

Chen He commented on HADOOP-12735:
--

2.8.0 is not released yet; changing to 2.7.1.


> Fix typo in core-default.xml property
> -
>
> Key: HADOOP-12735
> URL: https://issues.apache.org/jira/browse/HADOOP-12735
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Minor
>  Labels: supportability
>
> The property as defined in core-default.xml is
> bq.  hadoop.work.around.non.threadsafe.getpwuid
> But in NativeIO.java (the only place I can see a similar reference), the 
> property is defined as:
> bq. static final String WORKAROUND_NON_THREADSAFE_CALLS_KEY = 
> "hadoop.workaround.non.threadsafe.getpwuid";
> Note the extra period (.) in the word "workaround".
> Should the code be made to match the property or vice versa?





[jira] [Updated] (HADOOP-12623) Hadoop Swift driver should support more flexible container name than RFC952

2015-12-08 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12623:
-
Summary: Hadoop Swift driver should support more flexible container name 
than RFC952  (was: Swift should support more flexible container name than 
RFC952)

> Hadoop Swift driver should support more flexible container name than RFC952
> ---
>
> Key: HADOOP-12623
> URL: https://issues.apache.org/jira/browse/HADOOP-12623
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1, 2.6.2
>Reporter: Chen He
>
> Just a thought. 
> It would be great if the Hadoop swift driver could support more flexible 
> container names. The current Hadoop swift driver requires container names to 
> follow RFC 952 and will report an error if a container name does not obey 
> RFC 952:
> "Invalid swift hostname 'test.1.serviceName': hostname must in form 
> container.service"
> However, a user can use any other Swift object store client (cURL, 
> Cyberduck, JOSS, the swift python client, etc.) to upload data to the object 
> store, but the current hadoop swift driver cannot recognize containers whose 
> names do not follow RFC 952. 
> I dug into the source code and figured out that the cause is this code in 
> RestClientBindings.java:
>   public static String extractContainerName(URI uri) throws
>       SwiftConfigurationException {
>     return extractContainerName(uri.getHost());
>   }
> URI.java line 3143 gives "host = null". 
> We may need to find a better way to do the container name parsing.





[jira] [Created] (HADOOP-12623) Swift should support more flexible container name than RFC952

2015-12-07 Thread Chen He (JIRA)
Chen He created HADOOP-12623:


 Summary: Swift should support more flexible container name than 
RFC952
 Key: HADOOP-12623
 URL: https://issues.apache.org/jira/browse/HADOOP-12623
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.6.2, 2.7.1
Reporter: Chen He


Just a thought. 

It would be great if the Hadoop swift driver could support more flexible 
container names. The current Hadoop swift driver requires container names to 
follow RFC 952 and will report an error if a container name does not obey 
RFC 952:

"Invalid swift hostname 'test.1.serviceName': hostname must in form 
container.service"

However, a user can use any other Swift object store client (cURL, Cyberduck, 
JOSS, the swift python client, etc.) to upload data to the object store, but 
the current hadoop swift driver cannot recognize containers whose names do not 
follow RFC 952. 

I dug into the source code and figured out that the cause is this code in 
RestClientBindings.java:

  public static String extractContainerName(URI uri) throws
      SwiftConfigurationException {
    return extractContainerName(uri.getHost());
  }

URI.java line 3143 gives "host = null". 

We may need to find a better way to do the container name parsing.
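One possible direction, sketched as an assumption rather than the fix: fall 
back to the raw URI authority when getHost() rejects a name that is not 
RFC 952 compliant:

{code}
// Sketch: URI.getHost() returns null for names like "test.1.serviceName",
// but getAuthority() still carries the raw "container.service" string.
import java.net.URI;

public class ContainerNameParse {
  public static String hostOrAuthority(URI uri) {
    String host = uri.getHost();
    return host != null ? host : uri.getAuthority();
  }
}
{code}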





[jira] [Updated] (HADOOP-12551) Introduce FileNotFoundException for open and getFileStatus API's in WASB

2015-11-05 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12551:
-
Fix Version/s: (was: 2.8.0)

> Introduce FileNotFoundException for open and getFileStatus API's in WASB
> 
>
> Key: HADOOP-12551
> URL: https://issues.apache.org/jira/browse/HADOOP-12551
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Dushyanth
>Assignee: Dushyanth
>
> HADOOP-12533 introduced FileNotFoundException for the read and seek APIs in 
> WASB. The open and getFileStatus APIs currently throw FileNotFoundException 
> correctly when the file does not exist at the time the API is called, but do 
> not throw the same exception if another thread/process deletes the file 
> during their execution. This Jira fixes that behavior.
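A sketch of the intended behavior (illustrative names, not the actual WASB 
patch): the read path would translate a missing-blob storage error into 
FileNotFoundException, covering a concurrent delete:

{code}
// Sketch only: store.retrieve(key) stands in for the WASB back-end call.
try {
  return store.retrieve(key);
} catch (StorageException e) {
  if ("BlobNotFound".equals(e.getErrorCode())) {
    // the blob existed at open()/getFileStatus() time but was deleted since
    throw new FileNotFoundException(key + ": file deleted during operation");
  }
  throw new IOException("Azure store failure on " + key, e);
}
{code}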



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12551) Introduce FileNotFoundException for open and getFileStatus API's in WASB

2015-11-05 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12551:
-
Affects Version/s: (was: 2.8.0)
   2.7.1

> Introduce FileNotFoundException for open and getFileStatus API's in WASB
> 
>
> Key: HADOOP-12551
> URL: https://issues.apache.org/jira/browse/HADOOP-12551
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.1
>Reporter: Dushyanth
>Assignee: Dushyanth
>
> HADOOP-12533 introduced FileNotFoundException for the read and seek APIs in 
> WASB. The open and getFileStatus APIs currently throw FileNotFoundException 
> correctly when the file does not exist at the time the API is called, but do 
> not throw the same exception if another thread/process deletes the file 
> during their execution. This Jira fixes that behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12501) Enable SwiftNativeFileSystem to preserve user, group, permission

2015-10-26 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975120#comment-14975120
 ] 

Chen He commented on HADOOP-12501:
--

Thank you for the suggestion, [~steve_l]

"though I think it's dangerous as people may think those permissions may 
actually apply."

Actually, I have another idea: enable the Swift driver to do permission 
checks, so the blobstore looks more like a real filesystem. The idea of 
changing 'distcp' is a great solution. IMHO, it would be even more helpful if 
we found a way to let the '-p' option work for all filesystem 
implementations.  

> Enable SwiftNativeFileSystem to preserve user, group, permission
> 
>
> Key: HADOOP-12501
> URL: https://issues.apache.org/jira/browse/HADOOP-12501
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>Assignee: Chen He
>
> Currently, if a user copies a file/dir from the local FS or HDFS to the 
> Swift object store, the user/group/permission (u/g/p) information will be 
> gone. There should be a way to preserve u/g/p. It would benefit transfers of 
> large numbers of files/dirs between HDFS/local FS and the Swift object 
> store. We also need to be careful, since Hadoop prevents general users from 
> changing u/g/p, especially if Kerberos is enabled.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12501) Enable SwiftNativeFileSystem to preserve user, group, permission

2015-10-22 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969436#comment-14969436
 ] 

Chen He commented on HADOOP-12501:
--

Hi [~steve_l], thank you for the reply. The Swift server has its own access 
control mechanism in the backend. However, it may not satisfy the needs in 
some cases. For example:

A storage provider sells a service to company A, which has several types of 
users: admin, general user, etc. If the admin wants to back up or restore all 
files on HDFS to the Swift object store, the u/g/p data may be lost if the 
admin uses 'distcp' to copy the files (it is also possible that the admin 
copies data blocks and metadata instead of using 'distcp', in which case 
there is no need to preserve u/g/p). All u/g/p will disappear. My thought was 
to preserve the u/g/p somewhere in the metadata of each object; that would 
save the admin a lot of work recovering u/g/p in this case.
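A minimal sketch of that idea, using the commons-httpclient API the Swift 
driver is built on; the X-Object-Meta-* header names below are illustrative, 
not an agreed convention:

{code}
// Sketch only: persist user/group/permission as Swift object metadata on
// upload. Swift stores any X-Object-Meta-* header alongside the object.
PutMethod put = new PutMethod(objectUrl);   // objectUrl: assumed to be known
put.setRequestHeader("X-Object-Meta-Hadoop-Owner", status.getOwner());
put.setRequestHeader("X-Object-Meta-Hadoop-Group", status.getGroup());
put.setRequestHeader("X-Object-Meta-Hadoop-Permission",
    String.valueOf(status.getPermission().toShort()));
{code}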

> Enable SwiftNativeFileSystem to preserve user, group, permission
> 
>
> Key: HADOOP-12501
> URL: https://issues.apache.org/jira/browse/HADOOP-12501
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>Assignee: Chen He
>
> Currently, if a user copies a file/dir from the local FS or HDFS to the 
> Swift object store, the user/group/permission (u/g/p) information will be 
> gone. There should be a way to preserve u/g/p. It would benefit transfers of 
> large numbers of files/dirs between HDFS/local FS and the Swift object 
> store. We also need to be careful, since Hadoop prevents general users from 
> changing u/g/p, especially if Kerberos is enabled.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12501) Enable SwiftNativeFileSystem to preserve user, group, permission

2015-10-21 Thread Chen He (JIRA)
Chen He created HADOOP-12501:


 Summary: Enable SwiftNativeFileSystem to preserve user, group, 
permission
 Key: HADOOP-12501
 URL: https://issues.apache.org/jira/browse/HADOOP-12501
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.7.1
Reporter: Chen He
Assignee: Chen He


Currently, if a user copies a file/dir from the local FS or HDFS to the Swift 
object store, the user/group/permission (u/g/p) information will be gone. 
There should be a way to preserve u/g/p. It would benefit transfers of large 
numbers of files/dirs between HDFS/local FS and the Swift object store. We 
also need to be careful, since Hadoop prevents general users from changing 
u/g/p, especially if Kerberos is enabled.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12461) Swift driver should have the ability to renew token if it expired

2015-10-14 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12461:
-
Summary: Swift driver should have the ability to renew token if it expired  
(was: Swift driver should have the ability to renew token if server has timeout)

> Swift driver should have the ability to renew token if it expired
> -
>
> Key: HADOOP-12461
> URL: https://issues.apache.org/jira/browse/HADOOP-12461
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current swift driver will encounter authentication issue if swift server has 
> token timeout. It will be good if driver can automatically renew once it 
> expired. We met HTTP 401 error when transferring a 100gb file to swift object 
> store. Since the large file is chunked into 27 files, the server will ask 
> each chunk for token inspection. If server has timeout and 100GB file 
> transferring time is longer than this timeout, token will expire and the file 
> transferring will fail. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12461) Swift driver should have the ability to renew token if server has timeout

2015-10-14 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12461:
-
Description: Current swift driver will encounter authentication issue if 
swift server has token timeout. It will be good if driver can automatically 
renew once it expired. We met HTTP 401 error when transferring a 100gb file to 
swift object store. Since the large file is chunked into 27 files, the server 
will ask each chunk for token inspection. If server has timeout and 100GB file 
transferring time is longer than this timeout, token will expire and the file 
transferring will fail.   (was: Current swift driver will encounter 
authentication issue if swift server has token timeout. It will be good if 
driver can automatically renew once it expired.)

> Swift driver should have the ability to renew token if server has timeout
> -
>
> Key: HADOOP-12461
> URL: https://issues.apache.org/jira/browse/HADOOP-12461
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current swift driver will encounter authentication issue if swift server has 
> token timeout. It will be good if driver can automatically renew once it 
> expired. We met HTTP 401 error when transferring a 100gb file to swift object 
> store. Since the large file is chunked into 27 files, the server will ask 
> each chunk for token inspection. If server has timeout and 100GB file 
> transferring time is longer than this timeout, token will expire and the file 
> transferring will fail. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12471) Support Swift file (> 5GB) continuious uploading where there is a failure

2015-10-09 Thread Chen He (JIRA)
Chen He created HADOOP-12471:


 Summary: Support Swift file (> 5GB) continuious uploading where 
there is a failure
 Key: HADOOP-12471
 URL: https://issues.apache.org/jira/browse/HADOOP-12471
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.7.1
Reporter: Chen He


Current Swift FileSystem supports file larger than 5GB. 
File will be chunked as large as 4.6GB (configurable). For example, if there is 
a 46GB file "foo" in swift, 
Then the structure will look like:

foo/01
foo/02
foo/03
...
foo/10

User will not see those 0x files if they don't specify. That means, if use 
do:
\> hadoop fs -ls swift://container.serviceProvidor/foo

It only shows:
dwr-r--r--4.6GBfoo

However, in my test, if there is a failure, during uploading the foo file, the 
previous uploaded chunks will be left in the object store. It will be good to 
support continuous uploading based on previous leftover



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12471) Support Swift file (> 5GB) continuious uploading where there is a failure

2015-10-09 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949990#comment-14949990
 ] 

Chen He commented on HADOOP-12471:
--

First of all, I think we need a way to differentiate those failed leftover 
files from other files.

> Support Swift file (> 5GB) continuious uploading where there is a failure
> -
>
> Key: HADOOP-12471
> URL: https://issues.apache.org/jira/browse/HADOOP-12471
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current Swift FileSystem supports file larger than 5GB. 
> File will be chunked as large as 4.6GB (configurable). For example, if there 
> is a 46GB file "foo" in swift, 
> Then the structure will look like:
> foo/01
> foo/02
> foo/03
> ...
> foo/10
> User will not see those 0x files if they don't specify. That means, if 
> user does:
> \> hadoop fs -ls swift://container.serviceProvidor/foo
> It only shows:
> dwr-r--r--46GBfoo
> However, in my test, if there is a failure, during uploading the foo file, 
> the previous uploaded chunks will be left in the object store. It will be 
> good to support continuous uploading based on previous leftover



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12471) Support Swift file (> 5GB) continuious uploading where there is a failure

2015-10-09 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12471:
-
Description: 
Current Swift FileSystem supports file larger than 5GB. 
File will be chunked as large as 4.6GB (configurable). For example, if there is 
a 46GB file "foo" in swift, 
Then the structure will look like:

foo/01
foo/02
foo/03
...
foo/10

User will not see those 0x files if they don't specify. That means, if user 
does:
\> hadoop fs -ls swift://container.serviceProvidor/foo

It only shows:
dwr-r--r--4.6GBfoo

However, in my test, if there is a failure, during uploading the foo file, the 
previous uploaded chunks will be left in the object store. It will be good to 
support continuous uploading based on previous leftover

  was:
Current Swift FileSystem supports file larger than 5GB. 
File will be chunked as large as 4.6GB (configurable). For example, if there is 
a 46GB file "foo" in swift, 
Then the structure will look like:

foo/01
foo/02
foo/03
...
foo/10

User will not see those 0x files if they don't specify. That means, if use 
do:
\> hadoop fs -ls swift://container.serviceProvidor/foo

It only shows:
dwr-r--r--4.6GBfoo

However, in my test, if there is a failure, during uploading the foo file, the 
previous uploaded chunks will be left in the object store. It will be good to 
support continuous uploading based on previous leftover


> Support Swift file (> 5GB) continuious uploading where there is a failure
> -
>
> Key: HADOOP-12471
> URL: https://issues.apache.org/jira/browse/HADOOP-12471
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current Swift FileSystem supports file larger than 5GB. 
> File will be chunked as large as 4.6GB (configurable). For example, if there 
> is a 46GB file "foo" in swift, 
> Then the structure will look like:
> foo/01
> foo/02
> foo/03
> ...
> foo/10
> User will not see those 0x files if they don't specify. That means, if 
> user does:
> \> hadoop fs -ls swift://container.serviceProvidor/foo
> It only shows:
> dwr-r--r--4.6GBfoo
> However, in my test, if there is a failure, during uploading the foo file, 
> the previous uploaded chunks will be left in the object store. It will be 
> good to support continuous uploading based on previous leftover



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12471) Support Swift file (> 5GB) continuious uploading where there is a failure

2015-10-09 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12471:
-
Description: 
Current Swift FileSystem supports file larger than 5GB. 
File will be chunked as large as 4.6GB (configurable). For example, if there is 
a 46GB file "foo" in swift, 
Then the structure will look like:

foo/01
foo/02
foo/03
...
foo/10

User will not see those 0x files if they don't specify. That means, if user 
does:
\> hadoop fs -ls swift://container.serviceProvidor/foo

It only shows:
dwr-r--r--46GBfoo

However, in my test, if there is a failure, during uploading the foo file, the 
previous uploaded chunks will be left in the object store. It will be good to 
support continuous uploading based on previous leftover

  was:
Current Swift FileSystem supports file larger than 5GB. 
File will be chunked as large as 4.6GB (configurable). For example, if there is 
a 46GB file "foo" in swift, 
Then the structure will look like:

foo/01
foo/02
foo/03
...
foo/10

User will not see those 0x files if they don't specify. That means, if user 
does:
\> hadoop fs -ls swift://container.serviceProvidor/foo

It only shows:
dwr-r--r--4.6GBfoo

However, in my test, if there is a failure, during uploading the foo file, the 
previous uploaded chunks will be left in the object store. It will be good to 
support continuous uploading based on previous leftover


> Support Swift file (> 5GB) continuious uploading where there is a failure
> -
>
> Key: HADOOP-12471
> URL: https://issues.apache.org/jira/browse/HADOOP-12471
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current Swift FileSystem supports file larger than 5GB. 
> File will be chunked as large as 4.6GB (configurable). For example, if there 
> is a 46GB file "foo" in swift, 
> Then the structure will look like:
> foo/01
> foo/02
> foo/03
> ...
> foo/10
> User will not see those 0x files if they don't specify. That means, if 
> user does:
> \> hadoop fs -ls swift://container.serviceProvidor/foo
> It only shows:
> dwr-r--r--46GBfoo
> However, in my test, if there is a failure, during uploading the foo file, 
> the previous uploaded chunks will be left in the object store. It will be 
> good to support continuous uploading based on previous leftover



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12461) Swift driver should have the ability to renew token if server has timeout

2015-10-09 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949993#comment-14949993
 ] 

Chen He commented on HADOOP-12461:
--

I propose adding a new configuration parameter, e.g. 
"fs.swift.token.renew.interval", to let users configure the renewal interval. 
The Swift driver can then renew the token accordingly.
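A rough sketch of the renewal path (buildRequest() and authenticate() are 
hypothetical helpers, not existing driver methods):

{code}
// Sketch only: on HTTP 401 (expired token), re-authenticate once and retry
// with a freshly built request carrying the new token.
HttpMethod method = buildRequest();
int status = client.executeMethod(method);
if (status == HttpStatus.SC_UNAUTHORIZED) {
  authenticate();                 // fetch a fresh token from the auth service
  method.releaseConnection();
  method = buildRequest();        // rebuild with the new X-Auth-Token header
  status = client.executeMethod(method);
}
{code}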

> Swift driver should have the ability to renew token if server has timeout
> -
>
> Key: HADOOP-12461
> URL: https://issues.apache.org/jira/browse/HADOOP-12461
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current swift driver will encounter authentication issue if swift server has 
> token timeout. It will be good if driver can automatically renew once it 
> expired.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12471) Support Swift file (> 5GB) continuious uploading where there is a failure

2015-10-09 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950604#comment-14950604
 ] 

Chen He commented on HADOOP-12471:
--

Thank you for the comment, [~ste...@apache.org]. I agree with you. There 
should be a persistent lock in the metadata. 

Another observation: if one user is uploading a foo file and another user 
tries to delete it, the delete operation will succeed.

> Support Swift file (> 5GB) continuious uploading where there is a failure
> -
>
> Key: HADOOP-12471
> URL: https://issues.apache.org/jira/browse/HADOOP-12471
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current Swift FileSystem supports file larger than 5GB. 
> File will be chunked as large as 4.6GB (configurable). For example, if there 
> is a 46GB file "foo" in swift, 
> Then the structure will look like:
> foo/01
> foo/02
> foo/03
> ...
> foo/10
> User will not see those 0x files if they don't specify. That means, if 
> user does:
> \> hadoop fs -ls swift://container.serviceProvidor/foo
> It only shows:
> dwr-r--r--46GBfoo
> However, in my test, if there is a failure, during uploading the foo file, 
> the previous uploaded chunks will be left in the object store. It will be 
> good to support continuous uploading based on previous leftover



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12471) Support Swift file (> 5GB) continuious uploading where there is a failure

2015-10-09 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950885#comment-14950885
 ] 

Chen He commented on HADOOP-12471:
--

Hi [~arpitagarwal], I have met this problem before; it is caused by the 
renaming process. We should remove the renaming process; otherwise, files 
larger than 5GB will not be successfully renamed. 

> Support Swift file (> 5GB) continuious uploading where there is a failure
> -
>
> Key: HADOOP-12471
> URL: https://issues.apache.org/jira/browse/HADOOP-12471
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current Swift FileSystem supports file larger than 5GB. 
> File will be chunked as large as 4.6GB (configurable). For example, if there 
> is a 46GB file "foo" in swift, 
> Then the structure will look like:
> foo/01
> foo/02
> foo/03
> ...
> foo/10
> User will not see those 0x files if they don't specify. That means, if 
> user does:
> \> hadoop fs -ls swift://container.serviceProvidor/foo
> It only shows:
> dwr-r--r--46GBfoo
> However, in my test, if there is a failure, during uploading the foo file, 
> the previous uploaded chunks will be left in the object store. It will be 
> good to support continuous uploading based on previous leftover



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12109) Distcp of file > 5GB to swift fails with HTTP 413 error

2015-10-09 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12109:
-
Affects Version/s: 2.7.1

> Distcp of file > 5GB to swift fails with HTTP 413 error
> ---
>
> Key: HADOOP-12109
> URL: https://issues.apache.org/jira/browse/HADOOP-12109
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Affects Versions: 2.6.0, 2.7.1
>Reporter: Phil D'Amore
>
> Trying to use distcp to copy a file more than 5GB to swift fs results in a 
> stack like the following:
> 15/06/01 20:58:57 ERROR util.RetriableCommand: Failure in Retriable command: 
> Copying hdfs://xxx:8020/path/to/random-5Gplus.dat to swift://xxx/5Gplus.dat
> Invalid Response: Method COPY on 
> http://xxx:8080/v1/AUTH_fb7a8901dd8d4c8dba27f5e5d55a46a9/test/.distcp.tmp.attempt_local1097967418_0001_m_00_0
>  failed, status code: 413, status line: HTTP/1.1 413 Request Entity Too Large 
>  COPY 
> http://xxx:8080/v1/AUTH_fb7a8901dd8d4c8dba27f5e5d55a46a9/test/.distcp.tmp.attempt_local1097967418_0001_m_00_0
>  => 413 : Request Entity Too LargeThe body of your request 
> was too large for this server.
> at 
> org.apache.hadoop.fs.swift.http.SwiftRestClient.buildException(SwiftRestClient.java:1502)
> at 
> org.apache.hadoop.fs.swift.http.SwiftRestClient.perform(SwiftRestClient.java:1403)
> at 
> org.apache.hadoop.fs.swift.http.SwiftRestClient.copyObject(SwiftRestClient.java:923)
> at 
> org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.copyObject(SwiftNativeFileSystemStore.java:765)
> at 
> org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.rename(SwiftNativeFileSystemStore.java:617)
> at 
> org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.rename(SwiftNativeFileSystem.java:577)
> at 
> org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.promoteTmpToTarget(RetriableFileCopyCommand.java:220)
> at 
> org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:137)
> at 
> org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:100)
> at 
> org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
> at 
> org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:280)
> at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:252)
> at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> It looks like the problem actually occurs in the rename operation which 
> happens after the copy.  The rename is implemented as a copy/delete, and this 
> secondary copy looks like it's not done in a way that breaks up the file into 
> smaller chunks.  
> It looks like the following bug:
> https://bugs.launchpad.net/sahara/+bug/1428941
> It does not look like the fix for this is incorporated into hadoop's swift 
> client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12109) Distcp of file > 5GB to swift fails with HTTP 413 error

2015-10-09 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950891#comment-14950891
 ] 

Chen He commented on HADOOP-12109:
--

This is because the current Swift code does a copy plus a rename when 
uploading a file (>5GB) to the object store. We should avoid the rename 
process; otherwise it will always fail with HTTP 413. 
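A sketch of a rename that avoids the consolidation (listSegments, copyObject, 
putManifest, and deleteObject are hypothetical helpers around the Swift REST 
calls):

{code}
// Sketch only: COPY each segment server-side and write a new manifest,
// instead of issuing a single COPY on the manifest object, which merges all
// segments and trips the 5GB object limit with HTTP 413.
for (String segment : listSegments("foo/")) {
  copyObject(segment, segment.replaceFirst("^foo/", "bar/"));
  deleteObject(segment);
}
putManifest("bar", container + "/bar/"); // point the new name at bar/ segments
deleteObject("foo");                     // remove the old manifest
{code}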

> Distcp of file > 5GB to swift fails with HTTP 413 error
> ---
>
> Key: HADOOP-12109
> URL: https://issues.apache.org/jira/browse/HADOOP-12109
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/swift
>Affects Versions: 2.6.0, 2.7.1
>Reporter: Phil D'Amore
>
> Trying to use distcp to copy a file more than 5GB to swift fs results in a 
> stack like the following:
> 15/06/01 20:58:57 ERROR util.RetriableCommand: Failure in Retriable command: 
> Copying hdfs://xxx:8020/path/to/random-5Gplus.dat to swift://xxx/5Gplus.dat
> Invalid Response: Method COPY on 
> http://xxx:8080/v1/AUTH_fb7a8901dd8d4c8dba27f5e5d55a46a9/test/.distcp.tmp.attempt_local1097967418_0001_m_00_0
>  failed, status code: 413, status line: HTTP/1.1 413 Request Entity Too Large 
>  COPY 
> http://xxx:8080/v1/AUTH_fb7a8901dd8d4c8dba27f5e5d55a46a9/test/.distcp.tmp.attempt_local1097967418_0001_m_00_0
>  => 413 : Request Entity Too LargeThe body of your request 
> was too large for this server.
> at 
> org.apache.hadoop.fs.swift.http.SwiftRestClient.buildException(SwiftRestClient.java:1502)
> at 
> org.apache.hadoop.fs.swift.http.SwiftRestClient.perform(SwiftRestClient.java:1403)
> at 
> org.apache.hadoop.fs.swift.http.SwiftRestClient.copyObject(SwiftRestClient.java:923)
> at 
> org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.copyObject(SwiftNativeFileSystemStore.java:765)
> at 
> org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.rename(SwiftNativeFileSystemStore.java:617)
> at 
> org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.rename(SwiftNativeFileSystem.java:577)
> at 
> org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.promoteTmpToTarget(RetriableFileCopyCommand.java:220)
> at 
> org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:137)
> at 
> org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:100)
> at 
> org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
> at 
> org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:280)
> at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:252)
> at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> It looks like the problem actually occurs in the rename operation which 
> happens after the copy.  The rename is implemented as a copy/delete, and this 
> secondary copy looks like it's not done in a way that breaks up the file into 
> smaller chunks.  
> It looks like the following bug:
> https://bugs.launchpad.net/sahara/+bug/1428941
> It does not look like the fix for this is incorporated into hadoop's swift 
> client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12471) Support Swift file (> 5GB) continuious uploading where there is a failure

2015-10-09 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950877#comment-14950877
 ] 

Chen He commented on HADOOP-12471:
--

I am not sure how Swift performs the chunk operation in the beginning. 
However, the DLO flag is added once all chunks are successfully uploaded. If 
there is a failure, the DLO flag is never created, and the leftovers remain. 

I assume Swift does not know how many chunks there will be when a user 
uploads a large file. If that is the case, can we add another header flag in 
the beginning that identifies whether this large file succeeded or not? For 
example:
X-Object-Succeed-Flag
In the beginning, this flag will be false (or any value that can be changed 
later); once all chunks are successfully uploaded, we change it to true. If 
there is any failure in the middle, the flag remains false. Any request to a 
file whose flag is false will be rejected. 
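To make the proposal concrete, a sketch against the standard Swift DLO 
mechanism (X-Object-Manifest is the real DLO header; the success flag is the 
hypothetical marker proposed above):

{code}
// Sketch only: after all segments upload successfully, write the zero-byte
// DLO manifest and set the proposed success marker in the same PUT. If the
// upload dies earlier, neither header exists and the leftovers are detectable.
PutMethod manifest = new PutMethod(containerUrl + "/foo");
manifest.setRequestHeader("X-Object-Manifest", container + "/foo/");
manifest.setRequestHeader("X-Object-Meta-Succeed-Flag", "true");  // proposed
manifest.setRequestEntity(new ByteArrayRequestEntity(new byte[0]));
client.executeMethod(manifest);
{code}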

> Support Swift file (> 5GB) continuious uploading where there is a failure
> -
>
> Key: HADOOP-12471
> URL: https://issues.apache.org/jira/browse/HADOOP-12471
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current Swift FileSystem supports file larger than 5GB. 
> File will be chunked as large as 4.6GB (configurable). For example, if there 
> is a 46GB file "foo" in swift, 
> Then the structure will look like:
> foo/01
> foo/02
> foo/03
> ...
> foo/10
> User will not see those 0x files if they don't specify. That means, if 
> user does:
> \> hadoop fs -ls swift://container.serviceProvidor/foo
> It only shows:
> dwr-r--r--46GBfoo
> However, in my test, if there is a failure, during uploading the foo file, 
> the previous uploaded chunks will be left in the object store. It will be 
> good to support continuous uploading based on previous leftover



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12471) Support Swift file (> 5GB) continuious uploading where there is a failure

2015-10-09 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950906#comment-14950906
 ] 

Chen He commented on HADOOP-12471:
--

If the server is offline, the user will get HTTP 50x errors, which indicate 
that the problem is not the driver's and is outside the driver's scope.

> Support Swift file (> 5GB) continuious uploading where there is a failure
> -
>
> Key: HADOOP-12471
> URL: https://issues.apache.org/jira/browse/HADOOP-12471
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current Swift FileSystem supports file larger than 5GB. 
> File will be chunked as large as 4.6GB (configurable). For example, if there 
> is a 46GB file "foo" in swift, 
> Then the structure will look like:
> foo/01
> foo/02
> foo/03
> ...
> foo/10
> User will not see those 0x files if they don't specify. That means, if 
> user does:
> \> hadoop fs -ls swift://container.serviceProvidor/foo
> It only shows:
> dwr-r--r--46GBfoo
> However, in my test, if there is a failure, during uploading the foo file, 
> the previous uploaded chunks will be left in the object store. It will be 
> good to support continuous uploading based on previous leftover



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12471) Support Swift file (> 5GB) continuious uploading where there is a failure

2015-10-09 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950904#comment-14950904
 ] 

Chen He commented on HADOOP-12471:
--

I mean, a request to this failed file will report a warning and suggest that 
the user delete it.

> Support Swift file (> 5GB) continuious uploading where there is a failure
> -
>
> Key: HADOOP-12471
> URL: https://issues.apache.org/jira/browse/HADOOP-12471
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/swift
>Affects Versions: 2.7.1
>Reporter: Chen He
>
> Current Swift FileSystem supports file larger than 5GB. 
> File will be chunked as large as 4.6GB (configurable). For example, if there 
> is a 46GB file "foo" in swift, 
> Then the structure will look like:
> foo/01
> foo/02
> foo/03
> ...
> foo/10
> User will not see those 0x files if they don't specify. That means, if 
> user does:
> \> hadoop fs -ls swift://container.serviceProvidor/foo
> It only shows:
> dwr-r--r--46GBfoo
> However, in my test, if there is a failure, during uploading the foo file, 
> the previous uploaded chunks will be left in the object store. It will be 
> good to support continuous uploading based on previous leftover



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12461) Swift driver should have the ability to renew token if server has timeout

2015-10-06 Thread Chen He (JIRA)
Chen He created HADOOP-12461:


 Summary: Swift driver should have the ability to renew token if 
server has timeout
 Key: HADOOP-12461
 URL: https://issues.apache.org/jira/browse/HADOOP-12461
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/swift
Affects Versions: 2.7.1
Reporter: Chen He


Current swift driver will encounter authentication issue if swift server has 
token timeout. It will be good if driver can automatically renew once it 
expired.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12057) swiftfs rename on partitioned file attempts to consolidate partitions

2015-08-20 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706259#comment-14706259
 ] 

Chen He commented on HADOOP-12057:
--

Hi [~highlycaffeinated], add an auth-keys.xml file under the resource 
directory of the OpenStack module's unit tests; it will automatically trigger 
the OpenStack unit tests and give you more hints.
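For reference, a skeleton of such a file (property names follow the 
hadoop-openstack test documentation; values are placeholders):

{code}
<configuration>
  <property>
    <name>test.fs.swift.name</name>
    <value>swift://container.service/</value>
  </property>
  <property>
    <name>fs.swift.service.service.auth.url</name>
    <value>http://host:8080/auth/v1.0</value>
  </property>
  <!-- plus the matching username/password (or apikey) properties -->
</configuration>
{code}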

 swiftfs rename on partitioned file attempts to consolidate partitions
 -

 Key: HADOOP-12057
 URL: https://issues.apache.org/jira/browse/HADOOP-12057
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/swift
Reporter: David Dobbins
Assignee: David Dobbins
 Attachments: HADOOP-12057-006.patch, HADOOP-12057-008.patch, 
 HADOOP-12057.007.patch, HADOOP-12057.patch, HADOOP-12057.patch, 
 HADOOP-12057.patch, HADOOP-12057.patch, HADOOP-12057.patch


 In the swift filesystem for openstack, a rename operation on a partitioned 
 file uses the swift COPY operation, which attempts to consolidate all of the 
 partitions into a single object.  This causes the rename to fail when the 
 total size of all the partitions exceeds the maximum object size for swift.  
 Since partitioned files are primarily created to allow a file to exceed the 
 maximum object size, this bug makes writing to swift extremely unreliable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12343) Swift Driver should verify whether container name follows RFC952

2015-08-20 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12343:
-
Summary: Swift Driver should verify whether container name follows RFC952  
(was: Swift Driver should verify whether container and service name follows 
RFC952)

 Swift Driver should verify whether container name follows RFC952
 

 Key: HADOOP-12343
 URL: https://issues.apache.org/jira/browse/HADOOP-12343
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.7.1
Reporter: Chen He
Assignee: Chen He

 Swift driver reports "Invalid swift hostname 'null', hostname must in form 
 container.service" if the container name does not follow RFC952. However, the 
 container or service name is not 'null'. The error message should be clearer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12343) Swift Driver should verify whether container and service name follows RFC952

2015-08-20 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12343:
-
Summary: Swift Driver should verify whether container and service name 
follows RFC952  (was: Error message of Swift driver should be more clear when 
there is mal-format of hostname or service)

 Swift Driver should verify whether container and service name follows RFC952
 

 Key: HADOOP-12343
 URL: https://issues.apache.org/jira/browse/HADOOP-12343
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/swift
Affects Versions: 2.7.1
Reporter: Chen He
Assignee: Chen He
Priority: Minor

 Swift driver reports "Invalid swift hostname 'null', hostname must in form 
 container.service" if the container name does not follow RFC952. However, the 
 container or service name is not 'null'. The error message should be clearer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12343) Swift Driver should verify whether container and service name follows RFC952

2015-08-20 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12343:
-
Priority: Major  (was: Minor)

 Swift Driver should verify whether container and service name follows RFC952
 

 Key: HADOOP-12343
 URL: https://issues.apache.org/jira/browse/HADOOP-12343
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.7.1
Reporter: Chen He
Assignee: Chen He

 Swift driver reports "Invalid swift hostname 'null', hostname must in form 
 container.service" if the container name does not follow RFC952. However, the 
 container or service name is not 'null'. The error message should be clearer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12343) Swift Driver should verify whether container and service name follows RFC952

2015-08-20 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12343:
-
Issue Type: New Feature  (was: Bug)

 Swift Driver should verify whether container and service name follows RFC952
 

 Key: HADOOP-12343
 URL: https://issues.apache.org/jira/browse/HADOOP-12343
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.7.1
Reporter: Chen He
Assignee: Chen He
Priority: Minor

 Swift driver reports "Invalid swift hostname 'null', hostname must in form 
 container.service" if the container name does not follow RFC952. However, the 
 container or service name is not 'null'. The error message should be clearer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12343) Error message of Swift driver should be more clear when there is mal-format of hostname or service

2015-08-19 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12343:
-
Summary: Error message of Swift driver should be more clear when there is 
mal-format of hostname or service  (was: Error message of Swift driver should 
be more clear when there is mal-format of hostname and service)

 Error message of Swift driver should be more clear when there is mal-format 
 of hostname or service
 --

 Key: HADOOP-12343
 URL: https://issues.apache.org/jira/browse/HADOOP-12343
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/swift
Affects Versions: 2.7.1
Reporter: Chen He
Assignee: Chen He
Priority: Minor

 Swift driver reports "Invalid swift hostname 'null', hostname must in form 
 container.service" if the container name does not follow RFC952. However, the 
 container or service name is not 'null'. The error message should be clearer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12343) Error message of Swift driver should be more clear when there is mal-format of hostname and service

2015-08-19 Thread Chen He (JIRA)
Chen He created HADOOP-12343:


 Summary: Error message of Swift driver should be more clear when 
there is mal-format of hostname and service
 Key: HADOOP-12343
 URL: https://issues.apache.org/jira/browse/HADOOP-12343
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/swift
Affects Versions: 2.7.1
Reporter: Chen He
Assignee: Chen He


Swift driver reports "Invalid swift hostname 'null', hostname must in form 
container.service" if the container name does not follow RFC952. However, the 
container or service name is not 'null'. The error message should be clearer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12343) Error message of Swift driver should be more clear when there is mal-format of hostname and service

2015-08-19 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12343:
-
Priority: Minor  (was: Major)

 Error message of Swift driver should be more clear when there is mal-format 
 of hostname and service
 ---

 Key: HADOOP-12343
 URL: https://issues.apache.org/jira/browse/HADOOP-12343
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/swift
Affects Versions: 2.7.1
Reporter: Chen He
Assignee: Chen He
Priority: Minor

 Swift driver reports "Invalid swift hostname 'null', hostname must in form 
 container.service" if the container name does not follow RFC952. However, the 
 container or service name is not 'null'. The error message should be clearer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12291) Add support for nested groups in LdapGroupsMapping

2015-07-30 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648746#comment-14648746
 ] 

Chen He commented on HADOOP-12291:
--

+1 for the idea.

 Add support for nested groups in LdapGroupsMapping
 --

 Key: HADOOP-12291
 URL: https://issues.apache.org/jira/browse/HADOOP-12291
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Gautam Gopalakrishnan

 When using {{LdapGroupsMapping}} with Hadoop, nested groups are not 
 supported. So for example if user {{jdoe}} is part of group A which is a 
 member of group B, the group mapping currently returns only group A.
 Currently this facility is available with {{ShellBasedUnixGroupsMapping}} and 
 SSSD (or similar tools) but would be good to have this feature as part of 
 {{LdapGroupsMapping}} directly.
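For Active Directory specifically, one known approach (a hedged sketch, not 
what {{LdapGroupsMapping}} does today) is to resolve the nesting server-side 
with the LDAP_MATCHING_RULE_IN_CHAIN matching rule; ctx and searchControls 
are assumed to be an initialized DirContext and SearchControls:

{code}
// Sketch only: the 1.2.840.113556.1.4.1941 rule makes AD expand nested
// membership, so one query returns groups A and B for user jdoe.
String filter = "(&(objectClass=group)"
    + "(member:1.2.840.113556.1.4.1941:=CN=jdoe,OU=users,DC=example,DC=com))";
NamingEnumeration<SearchResult> results =
    ctx.search("DC=example,DC=com", filter, searchControls);
{code}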



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11762) Enable swift distcp to secure HDFS

2015-07-21 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635583#comment-14635583
 ] 

Chen He commented on HADOOP-11762:
--

Thanks, [~aw]

 Enable swift distcp to secure HDFS
 --

 Key: HADOOP-11762
 URL: https://issues.apache.org/jira/browse/HADOOP-11762
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/swift
Affects Versions: 2.3.0, 2.4.0, 2.5.0, 2.4.1, 2.5.1, 2.6.0
Reporter: Chen He
Assignee: Chen He
 Fix For: 3.0.0

 Attachments: HADOOP-11762.000.patch


 Even though we can use dfs -put or dfs -cp to move data between Swift and 
 secured HDFS, it is impractical for moving huge amounts of data, like 10TB 
 or larger.
 The current Hadoop code results in: java.lang.IllegalArgumentException: 
 java.net.UnknownHostException: container.swiftdomain 
 Since SwiftNativeFileSystem does not support the token feature right now, it 
 is reasonable to override the getCanonicalServiceName method like other 
 filesystem extensions (S3FileSystem, S3AFileSystem)
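A sketch of the override named above, mirroring the S3 connectors:

{code}
// Sketch only: returning null tells the delegation-token collection code
// that there is no token service for swift:// URIs, so secure-cluster jobs
// stop failing with UnknownHostException during token setup.
@Override
public String getCanonicalServiceName() {
  return null;   // SwiftNativeFileSystem issues no delegation tokens
}
{code}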



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-07-21 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635613#comment-14635613
 ] 

Chen He commented on HADOOP-12038:
--

Assigning back to [~ste...@apache.org], who is professional and has more 
experience writing unit tests for the Swift driver. 

 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Steve Loughran
Priority: Minor
 Attachments: HADOOP-12038.000.patch


 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 
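A sketch of the suggested guard (the backupFile field name is assumed from 
the stream's local temp file handling):

{code}
// Sketch only: warn about a failed delete only when the file actually exists.
if (backupFile.exists() && !backupFile.delete()) {
  LOG.warn("Could not delete " + backupFile);
}
{code}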



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-07-21 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12038:
-
Assignee: Steve Loughran  (was: Chen He)

 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Steve Loughran
Priority: Minor
 Attachments: HADOOP-12038.000.patch


 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-07-20 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633747#comment-14633747
 ] 

Chen He commented on HADOOP-12038:
--

Apologies, I will work on it tonight.

 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Attachments: HADOOP-12038.000.patch


 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10615) FileInputStream in JenkinsHash#main() is never closed

2015-07-15 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14628351#comment-14628351
 ] 

Chen He commented on HADOOP-10615:
--

Sure, thank you for the suggestion, [~ozawa].

 FileInputStream in JenkinsHash#main() is never closed
 -

 Key: HADOOP-10615
 URL: https://issues.apache.org/jira/browse/HADOOP-10615
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Chen He
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: HADOOP-10615-2.patch, HADOOP-10615.patch


 {code}
 FileInputStream in = new FileInputStream(args[0]);
 {code}
 The above FileInputStream is not closed upon exit of main.
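The standard fix pattern, as a sketch:

{code}
// try-with-resources closes the stream on every exit path, including
// exceptions thrown while hashing.
try (FileInputStream in = new FileInputStream(args[0])) {
  // ... read and hash the file contents ...
}
{code}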



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-10615) FileInputStream in JenkinsHash#main() is never closed

2015-07-15 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-10615:
-
Attachment: HADOOP-10615.003.patch

patch updated.

 FileInputStream in JenkinsHash#main() is never closed
 -

 Key: HADOOP-10615
 URL: https://issues.apache.org/jira/browse/HADOOP-10615
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Chen He
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: HADOOP-10615-2.patch, HADOOP-10615.003.patch, 
 HADOOP-10615.patch


 {code}
 FileInputStream in = new FileInputStream(args[0]);
 {code}
 The above FileInputStream is not closed upon exit of main.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-9565) Add a Blobstore interface to add to blobstore FileSystems

2015-06-25 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602430#comment-14602430
 ] 

Chen He commented on HADOOP-9565:
-

The ._COPYING_ mechanism actually has a problem. I created HDFS-8673 to track it.

 Add a Blobstore interface to add to blobstore FileSystems
 -

 Key: HADOOP-9565
 URL: https://issues.apache.org/jira/browse/HADOOP-9565
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs, fs/s3, fs/swift
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Steve Loughran
  Labels: BB2015-05-TBR
 Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
 HADOOP-9565-003.patch


 We can mark the fact that some {{FileSystem}} implementations are really 
 blobstores, with different atomicity and consistency guarantees, by adding a 
 {{Blobstore}} interface to them. 
 This could also be a place to add a {{Copy(Path,Path)}} method, assuming that 
 all blobstores implement a server-side copy operation as a substitute for 
 rename.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12046) Avoid creating ._COPYING_ temporary file when copying file to Swift file system

2015-06-21 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12046:
-
Attachment: Copy Large file to Swift using Hadoop Client.png

 Avoid creating ._COPYING_ temporary file when copying file to Swift file 
 system
 -

 Key: HADOOP-12046
 URL: https://issues.apache.org/jira/browse/HADOOP-12046
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
 Attachments: Copy Large file to Swift using Hadoop Client.png


 When copying a file from HDFS or the local FS to another file system 
 implementation, CommandWithDestination.java creates a temp file by adding 
 the suffix ._COPYING_. Once the file is successfully copied, it removes the 
 suffix via rename(): 
 {code}
 try {
   PathData tempTarget = target.suffix("._COPYING_");
   targetFs.setWriteChecksum(writeChecksum);
   targetFs.writeStreamToFile(in, tempTarget, lazyPersist);
   targetFs.rename(tempTarget, target);
 } finally {
   targetFs.close(); // last ditch effort to ensure temp file is removed
 }
 {code}
 This is not costly in HDFS. However, when copying to the Swift file system, 
 the rename is implemented by creating a new file, which is inefficient if 
 users copy a lot of files to Swift. In my tests, copying a 1GB file to Swift 
 took 10% more time. We should do the copy only once for the Swift file 
 system. Changes should be limited to the Swift driver level.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12046) Avoid creating ._COPYING_ temporary file when copying file to Swift file system

2015-06-21 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12046:
-
Attachment: (was: 屏幕快照 2015-06-21 下午10.38.34.png)

 Avoid creating ._COPYING_ temporary file when copying file to Swift file 
 system
 -

 Key: HADOOP-12046
 URL: https://issues.apache.org/jira/browse/HADOOP-12046
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He

 When copying a file from HDFS or the local FS to another file system 
 implementation, CommandWithDestination.java creates a temp file by adding 
 the suffix ._COPYING_. Once the file is successfully copied, it removes the 
 suffix via rename(): 
 {code}
 try {
   PathData tempTarget = target.suffix("._COPYING_");
   targetFs.setWriteChecksum(writeChecksum);
   targetFs.writeStreamToFile(in, tempTarget, lazyPersist);
   targetFs.rename(tempTarget, target);
 } finally {
   targetFs.close(); // last ditch effort to ensure temp file is removed
 }
 {code}
 This is not costly in HDFS. However, when copying to the Swift file system, 
 the rename is implemented by creating a new file, which is inefficient if 
 users copy a lot of files to Swift. In my tests, copying a 1GB file to Swift 
 took 10% more time. We should do the copy only once for the Swift file 
 system. Changes should be limited to the Swift driver level.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12046) Avoid creating ._COPYING_ temporary file when copying file to Swift file system

2015-06-21 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12046:
-
Attachment: 屏幕快照 2015-06-21 下午10.38.34.png

Attaching a diagram of the file copy process when a user tries to copy a file 
(larger than 5GB) from HDFS to Swift using the current Swift driver. 

 Avoid creating ._COPYING_ temporary file when copying file to Swift file 
 system
 -

 Key: HADOOP-12046
 URL: https://issues.apache.org/jira/browse/HADOOP-12046
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He

 When copying a file from HDFS or the local FS to another file system 
 implementation, CommandWithDestination.java creates a temp file by adding 
 the suffix ._COPYING_. Once the file is successfully copied, it removes the 
 suffix via rename(): 
 {code}
 try {
   PathData tempTarget = target.suffix("._COPYING_");
   targetFs.setWriteChecksum(writeChecksum);
   targetFs.writeStreamToFile(in, tempTarget, lazyPersist);
   targetFs.rename(tempTarget, target);
 } finally {
   targetFs.close(); // last ditch effort to ensure temp file is removed
 }
 {code}
 This is not costly in HDFS. However, when copying to the Swift file system, 
 the rename is implemented by creating a new file, which is inefficient if 
 users copy a lot of files to Swift. In my tests, copying a 1GB file to Swift 
 took 10% more time. We should do the copy only once for the Swift file 
 system. Changes should be limited to the Swift driver level.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-9565) Add a Blobstore interface to add to blobstore FileSystems

2015-06-13 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584724#comment-14584724
 ] 

Chen He commented on HADOOP-9565:
-

Thank you for the explanation, [~ste...@apache.org]. You are right: the 
._COPYING_ suffix is added by the CLI (distcp refers to this as well) and is 
hardcoded there. IMHO, it would be more flexible if we could choose not to 
add ._COPYING_ by setting a parameter like OBJECTSTORE_NO_RENAME_IN_COPY. 
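A sketch of how that could short-circuit the temp-file dance in 
CommandWithDestination (the flag is the suggestion above, not an existing 
Hadoop constant):

{code}
// Sketch only: write straight to the target on object stores and keep the
// temp-file + rename path for real filesystems.
if (noRenameInCopy) {           // e.g. driven by OBJECTSTORE_NO_RENAME_IN_COPY
  targetFs.writeStreamToFile(in, target, lazyPersist);
} else {
  PathData tempTarget = target.suffix("._COPYING_");
  targetFs.writeStreamToFile(in, tempTarget, lazyPersist);
  targetFs.rename(tempTarget, target);
}
{code}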

 Add a Blobstore interface to add to blobstore FileSystems
 -

 Key: HADOOP-9565
 URL: https://issues.apache.org/jira/browse/HADOOP-9565
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs, fs/s3, fs/swift
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Steve Loughran
  Labels: BB2015-05-TBR
 Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
 HADOOP-9565-003.patch


 We can make explicit the fact that some {{FileSystem}} implementations are 
 really blobstores, with different atomicity and consistency guarantees, by 
 adding a {{Blobstore}} interface to them. 
 This could also be a place to add a {{Copy(Path,Path)}} method, assuming that 
 all blobstores implement a server-side copy operation as a substitute for 
 rename.
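
A rough sketch of what such a marker interface could look like; the method 
names and shape are assumptions, not a settled API:

{code}
// Sketch of the proposed interface; not the final design from this issue.
public interface Blobstore {
  /** Server-side copy as a substitute for rename; assumed to exist on all blobstores. */
  void copy(Path source, Path dest) throws IOException;

  /** True if rename() on this store is really a copy-then-delete. */
  boolean renameIsCopy();
}
{code}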



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-9565) Add a Blobstore interface to add to blobstore FileSystems

2015-06-12 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584226#comment-14584226
 ] 

Chen He commented on HADOOP-9565:
-

Thank you for the contribution, [~ste...@apache.org]. I have a question about 
the copying process. Why do we have to add ._COPYING_ for Swift storage, which 
uses the file name to decide the location of file blocks?

Another potential problem is the rename process. It may cause a YARN timeout 
(10 minutes) if we use distcp to copy a large file.

 Add a Blobstore interface to add to blobstore FileSystems
 -

 Key: HADOOP-9565
 URL: https://issues.apache.org/jira/browse/HADOOP-9565
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs, fs/s3, fs/swift
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Steve Loughran
  Labels: BB2015-05-TBR
 Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch, 
 HADOOP-9565-003.patch


 We can make explicit the fact that some {{FileSystem}} implementations are 
 really blobstores, with different atomicity and consistency guarantees, by 
 adding a {{Blobstore}} interface to them. 
 This could also be a place to add a {{Copy(Path,Path)}} method, assuming that 
 all blobstores implement a server-side copy operation as a substitute for 
 rename.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12086) Swift driver reports NPE if user try to create a dir without name

2015-06-11 Thread Chen He (JIRA)
Chen He created HADOOP-12086:


 Summary: Swift driver reports NPE if user try to create a dir 
without name
 Key: HADOOP-12086
 URL: https://issues.apache.org/jira/browse/HADOOP-12086
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/swift
Affects Versions: 2.3.0
Reporter: Chen He
Assignee: Chen He


hadoop fs -mkdir swift://container.Provider/
-mkdir: Fatal internal error
java.lang.NullPointerException
at 
org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.makeAbsolute(SwiftNativeFileSystem.java:691)
at 
org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.getFileStatus(SwiftNativeFileSystem.java:197)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
at org.apache.hadoop.fs.shell.Mkdir.processNonexistentPath(Mkdir.java:73)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:262)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
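
One plausible guard at the driver level, sketched under the assumption that 
makeAbsolute() follows the usual pattern; the actual fix at 
SwiftNativeFileSystem.java:691 may differ:

{code}
// Sketch only: fail fast with a clear message instead of an NPE when the
// path has no name component. workingDir is the usual assumed field.
private Path makeAbsolute(Path path) {
  String name = (path == null) ? null : path.toUri().getPath();
  if (name == null || name.isEmpty()) {
    throw new IllegalArgumentException(
        "Cannot create a directory without a name: " + path);
  }
  return path.isAbsolute() ? path : new Path(workingDir, path);
}
{code}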



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-06-04 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573375#comment-14573375
 ] 

Chen He commented on HADOOP-12038:
--

Thank you very much, Steve. I will come up with a patch.

 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Attachments: HADOOP-12038.000.patch


 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 
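
A minimal sketch of the proposed guard, assuming the cleanup operates on a 
local java.io.File (the helper and logger names are illustrative):

{code}
// Sketch: only attempt the delete, and only warn, when the file still exists.
private void deleteIfExists(File file) {
  if (file != null && file.exists() && !file.delete()) {
    LOG.warn("Could not delete " + file);
  }
}
{code}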



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12046) Avoid creating ._COPYING_ temporary file when copying file to Swift file system

2015-06-01 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568155#comment-14568155
 ] 

Chen He commented on HADOOP-12046:
--

Thank you for the quick reply, [~steve_l]. I will read HADOOP-9565; it sounds 
interesting.

 Avoid creating ._COPYING_ temporary file when copying file to Swift file 
 system
 -

 Key: HADOOP-12046
 URL: https://issues.apache.org/jira/browse/HADOOP-12046
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/swift
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He

 When copying a file from HDFS or local storage to another file system 
 implementation, CommandWithDestination.java creates a temp file by adding the 
 suffix ._COPYING_. Once the file is successfully copied, it removes the 
 suffix via rename(). 
 try {
   PathData tempTarget = target.suffix("._COPYING_");
   targetFs.setWriteChecksum(writeChecksum);
   targetFs.writeStreamToFile(in, tempTarget, lazyPersist);
   targetFs.rename(tempTarget, target);
 } finally {
   targetFs.close(); // last ditch effort to ensure temp file is removed
 }
 It is cheap in HDFS. However, when copying to the Swift file system, the 
 rename is implemented as a copy to a new object, which is inefficient when 
 users copy many files to Swift. In my tests, copying a 1 GB file to Swift 
 took about 10% longer. We should write the data only once for the Swift 
 file system. Changes should be limited to the Swift driver level.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-05-31 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566906#comment-14566906
 ] 

Chen He commented on HADOOP-12038:
--

I didn't observe data loss when it reports this warning. 

 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Attachments: HADOOP-12038.000.patch


 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-05-31 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566900#comment-14566900
 ] 

Chen He commented on HADOOP-12038:
--

Thanks, [~steve_l]. 

Actually, the OpenStack community has another version of the Swift driver for 
Hadoop. It supports files larger than 5 GB; what I did was add those functions 
to the hadoop-openstack module. I don't know why the Hadoop community does not 
have a similar solution. The error was reported during my testing. 

The OpenStack driver is called Sahara. It breaks a file (larger than 5 GB) 
into configurable chunks (default 4.6 GB) and creates a manifest in the Swift 
file system that points to those chunks. However, since the Swift rename 
process creates a new file instead of changing the original file's name 
(because the Swift DHT hashes on the name), it is inefficient for large file 
copying. I resolved this issue and will file a separate JIRA with a patch 
later. 
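
For reference, the segmented-upload approach works roughly as below with 
Swift's dynamic large objects. openSegment() and putObject() are hypothetical 
helpers wrapping HTTP PUT, and the segment naming is illustrative, though the 
X-Object-Manifest header is the real OpenStack Swift convention:

{code}
// Sketch only: upload a >5GB file as fixed-size segments plus a manifest.
long segmentSize = 4600L * 1024 * 1024;  // roughly the 4.6 GB default chunk
int part = 0;
try (InputStream in = new FileInputStream(localFile)) {
  byte[] buf = new byte[8 * 1024 * 1024];
  long inSegment = 0;
  OutputStream seg = openSegment(container, object, part);  // PUT object/000000
  int n;
  while ((n = in.read(buf)) != -1) {
    if (inSegment + n > segmentSize) {  // roll over to the next segment
      seg.close();
      seg = openSegment(container, object, ++part);
      inSegment = 0;
    }
    seg.write(buf, 0, n);
    inSegment += n;
  }
  seg.close();
}
// Zero-byte manifest; a GET of "object" then streams the segments in order.
putObject(container, object, new byte[0],
    Collections.singletonMap("X-Object-Manifest", container + "/" + object + "/"));
{code}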


 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Attachments: HADOOP-12038.000.patch


 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12046) Avoid creating ._COPYING_ temporary file when copying file to Swift file system

2015-05-31 Thread Chen He (JIRA)
Chen He created HADOOP-12046:


 Summary: Avoid creating ._COPYING_ temporary file when copying 
file to Swift file system
 Key: HADOOP-12046
 URL: https://issues.apache.org/jira/browse/HADOOP-12046
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He


When copying a file from HDFS or local storage to another file system 
implementation, CommandWithDestination.java creates a temp file by adding the 
suffix ._COPYING_. Once the file is successfully copied, it removes the 
suffix via rename(). 

try {
  PathData tempTarget = target.suffix("._COPYING_");
  targetFs.setWriteChecksum(writeChecksum);
  targetFs.writeStreamToFile(in, tempTarget, lazyPersist);
  targetFs.rename(tempTarget, target);
} finally {
  targetFs.close(); // last ditch effort to ensure temp file is removed
}

It is cheap in HDFS. However, when copying to the Swift file system, the 
rename is implemented as a copy to a new object, which is inefficient when 
users copy many files to Swift. In my tests, copying a 1 GB file to Swift 
took about 10% longer. We should write the data only once for the Swift file 
system. Changes should be limited to the Swift driver level.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-05-29 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565178#comment-14565178
 ] 

Chen He commented on HADOOP-12038:
--

Hi [~steve_l], thank you for the comment. I should describe it more clearly.

The hadoop-openstack module has no unit tests, only functional tests, since 
they depend on a Swift server. If there is no Swift server, none of those 
tests in the hadoop-openstack module will be executed. 

I hit this issue when copying a large file to a Swift server. It returned 
this warning because the tmp file had already been deleted. 

I will try to add a unit test following the same pattern as the previous 
ones. 

 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Attachments: HADOOP-12038.000.patch


 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10661) Ineffective user/passsword check in FTPFileSystem#initialize()

2015-05-28 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562378#comment-14562378
 ] 

Chen He commented on HADOOP-10661:
--

Actually, if the user specifies user as "" and password as "", the current code:
Preconditions.checkState(userPasswdInfo.length > 1,
 "Invalid username / password");

will report "Invalid username / password" because userPasswdInfo.length is 0.

 Ineffective user/passsword check in FTPFileSystem#initialize()
 --

 Key: HADOOP-10661
 URL: https://issues.apache.org/jira/browse/HADOOP-10661
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Chen He
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: HADOOP-10661.patch, HADOOP-10661.patch


 Here is related code:
 {code}
   userAndPassword = (conf.get("fs.ftp.user." + host, null) + ":" + conf
   .get("fs.ftp.password." + host, null));
   if (userAndPassword == null) {
 throw new IOException("Invalid user/passsword specified");
   }
 {code}
 The intention seems to be checking that the username / password should not be 
 null.
 But due to the presence of the colon, the above check is not effective.
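
For comparison, an effective version would validate each part before 
concatenation; a sketch only, reusing the configuration keys from the snippet 
above:

{code}
// Sketch: check user and password separately instead of testing the
// concatenated string (which can never be null) against null.
String user = conf.get("fs.ftp.user." + host);
String password = conf.get("fs.ftp.password." + host);
if (user == null || user.isEmpty() || password == null) {
  throw new IOException("Invalid user/password specified for host " + host);
}
String userAndPassword = user + ":" + password;
{code}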



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10661) Ineffective user/passsword check in FTPFileSystem#initialize()

2015-05-28 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562379#comment-14562379
 ] 

Chen He commented on HADOOP-10661:
--

If we only provide the user but the passwd is "", it will also report invalid 
since userPasswdInfo.length is 1.

 Ineffective user/passsword check in FTPFileSystem#initialize()
 --

 Key: HADOOP-10661
 URL: https://issues.apache.org/jira/browse/HADOOP-10661
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Chen He
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: HADOOP-10661.patch, HADOOP-10661.patch


 Here is related code:
 {code}
   userAndPassword = (conf.get("fs.ftp.user." + host, null) + ":" + conf
   .get("fs.ftp.password." + host, null));
   if (userAndPassword == null) {
 throw new IOException("Invalid user/passsword specified");
   }
 {code}
 The intention seems to be checking that the username / password should not be 
 null.
 But due to the presence of the colon, the above check is not effective.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10661) Ineffective user/passsword check in FTPFileSystem#initialize()

2015-05-28 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562401#comment-14562401
 ] 

Chen He commented on HADOOP-10661:
--

Hi [~te...@apache.org], it looks like the current Preconditions check can 
guarantee that neither user nor passwd is "", but null is still allowed. Do we 
still need to fix this?

 Ineffective user/passsword check in FTPFileSystem#initialize()
 --

 Key: HADOOP-10661
 URL: https://issues.apache.org/jira/browse/HADOOP-10661
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Chen He
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: HADOOP-10661.patch, HADOOP-10661.patch


 Here is related code:
 {code}
   userAndPassword = (conf.get("fs.ftp.user." + host, null) + ":" + conf
   .get("fs.ftp.password." + host, null));
   if (userAndPassword == null) {
 throw new IOException("Invalid user/passsword specified");
   }
 {code}
 The intention seems to be checking that the username / password should not be 
 null.
 But due to the presence of the colon, the above check is not effective.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-05-27 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12038:
-
Status: Patch Available  (was: Open)

 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Attachments: HADOOP-12038.000.patch


 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-05-27 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12038:
-
Attachment: HADOOP-12038.000.patch

 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Attachments: HADOOP-12038.000.patch


 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-05-27 Thread Chen He (JIRA)
Chen He created HADOOP-12038:


 Summary: SwiftNativeOutputStream should check whether a file 
exists or not before deleting
 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Chen He
Assignee: Chen He
Priority: Minor


15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
/tmp/hadoop-root/output-3695386887711395289.tmp

It should check whether the file exists or not before deleting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-05-27 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-12038:
-
Affects Version/s: 2.7.0

 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
Priority: Minor

 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12038) SwiftNativeOutputStream should check whether a file exists or not before deleting

2015-05-27 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562246#comment-14562246
 ] 

Chen He commented on HADOOP-12038:
--

The change is simple and may not need a unit test.

 SwiftNativeOutputStream should check whether a file exists or not before 
 deleting
 -

 Key: HADOOP-12038
 URL: https://issues.apache.org/jira/browse/HADOOP-12038
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Attachments: HADOOP-12038.000.patch


 15/05/27 15:27:03 WARN snative.SwiftNativeOutputStream: Could not delete 
 /tmp/hadoop-root/output-3695386887711395289.tmp
 It should check whether the file exists or not before deleting. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10661) Ineffective user/passsword check in FTPFileSystem#initialize()

2015-05-27 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561976#comment-14561976
 ] 

Chen He commented on HADOOP-10661:
--

Sure, I will do it tonight. Thank you for reminding me, [~ted_yu].

 Ineffective user/passsword check in FTPFileSystem#initialize()
 --

 Key: HADOOP-10661
 URL: https://issues.apache.org/jira/browse/HADOOP-10661
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Chen He
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: HADOOP-10661.patch, HADOOP-10661.patch


 Here is related code:
 {code}
   userAndPassword = (conf.get("fs.ftp.user." + host, null) + ":" + conf
   .get("fs.ftp.password." + host, null));
   if (userAndPassword == null) {
 throw new IOException("Invalid user/passsword specified");
   }
 {code}
 The intention seems to be checking that the username / password should not be 
 null.
 But due to the presence of the colon, the above check is not effective.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11775) Fix Javadoc typos in hadoop-openstack module

2015-05-08 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535230#comment-14535230
 ] 

Chen He commented on HADOOP-11775:
--

Could anyone review this ticket? Thanks!

 Fix Javadoc typos in hadoop-openstack module
 

 Key: HADOOP-11775
 URL: https://issues.apache.org/jira/browse/HADOOP-11775
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Chen He
Assignee: Yanjun Wang
Priority: Trivial
 Attachments: HADOOP-11775.000.patch, HADOOP-11775.001.patch


 Some typos are listed below but not limited to:
 SwiftNativeFileSystemObject.java
   /**
* Initalize the filesystem store -this creates the REST client binding.
*
* @param fsURI URI of the filesystem, which is used to map to the 
 filesystem-specific
*  options in the configuration file
* @param configuration configuration
* @throws IOException on any failure.
*/
 SwiftNativeFileSystem.java
   /**
* Low level method to do a deep listing of all entries, not stopping
* at the next directory entry. This is to let tests be confident that
* recursive deletes c really are working.
* @param path path to recurse down
* @param newest ask for the newest data, potentially slower than not.
* @return a potentially empty array of file status
* @throws IOException any problem
*/
   /**
* Low-level operation to also set the block size for this operation
* @param path   the file name to open
* @param bufferSize the size of the buffer to be used.
* @param readBlockSize how big should the read blockk/buffer size be?
* @return the input stream
* @throws FileNotFoundException if the file is not found
* @throws IOException any IO problem
*/
 SwiftRestClient.java
   /**
* Converts Swift path to URI to make request.
* This is public for unit testing
*
* @param path path to object
* @param endpointURI damain url e.g. http://domain.com
* @return valid URI for object
*/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11775) Fix Javadoc typos in hadoop-openstack module

2015-05-08 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535908#comment-14535908
 ] 

Chen He commented on HADOOP-11775:
--

[~ajisakaa], would you mind taking a look at this ticket? Thank you. 

 Fix Javadoc typos in hadoop-openstack module
 

 Key: HADOOP-11775
 URL: https://issues.apache.org/jira/browse/HADOOP-11775
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Chen He
Assignee: Yanjun Wang
Priority: Trivial
  Labels: BB2015-05-RFC
 Attachments: HADOOP-11775.000.patch, HADOOP-11775.001.patch


 Some typos are listed below but not limited to:
 SwiftNativeFileSystemObject.java
   /**
* Initalize the filesystem store -this creates the REST client binding.
*
* @param fsURI URI of the filesystem, which is used to map to the 
 filesystem-specific
*  options in the configuration file
* @param configuration configuration
* @throws IOException on any failure.
*/
 SwiftNativeFileSystem.java
   /**
* Low level method to do a deep listing of all entries, not stopping
* at the next directory entry. This is to let tests be confident that
* recursive deletes c really are working.
* @param path path to recurse down
* @param newest ask for the newest data, potentially slower than not.
* @return a potentially empty array of file status
* @throws IOException any problem
*/
   /**
* Low-level operation to also set the block size for this operation
* @param path   the file name to open
* @param bufferSize the size of the buffer to be used.
* @param readBlockSize how big should the read blockk/buffer size be?
* @return the input stream
* @throws FileNotFoundException if the file is not found
* @throws IOException any IO problem
*/
 SwiftRestClient.java
   /**
* Converts Swift path to URI to make request.
* This is public for unit testing
*
* @param path path to object
* @param endpointURI damain url e.g. http://domain.com
* @return valid URI for object
*/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11811) Fix typos in hadoop-project/pom.xml

2015-04-07 Thread Chen He (JIRA)
Chen He created HADOOP-11811:


 Summary: Fix typos in hadoop-project/pom.xml
 Key: HADOOP-11811
 URL: https://issues.apache.org/jira/browse/HADOOP-11811
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Chen He
Priority: Trivial


<!-- These 2 versions are defined here becuase they are used -->
<!-- JDIFF generation from embedded ant in the antrun plugin -->

etc. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11786) Fix Javadoc typos in org.apache.hadoop.fs.FileSystem

2015-04-01 Thread Chen He (JIRA)
Chen He created HADOOP-11786:


 Summary: Fix Javadoc typos in org.apache.hadoop.fs.FileSystem
 Key: HADOOP-11786
 URL: https://issues.apache.org/jira/browse/HADOOP-11786
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Chen He
Assignee: Yanjun Wang
Priority: Trivial


/**
 * Resets all statistics to 0.
 *
 * In order to reset, we add up all the thread-local statistics data, and
 * set rootData to the negative of that.
 *
 * This may seem like a counterintuitive way to reset the statsitics.  Why
 * can't we just zero out all the thread-local data?  Well, thread-local
 * data can only be modified by the thread that owns it.  If we tried to
 * modify the thread-local data from this thread, our modification might get
 * interleaved with a read-modify-write operation done by the thread that
 * owns the data.  That would result in our update getting lost.
 *
 * The approach used here avoids this problem because it only ever reads
 * (not writes) the thread-local data.  Both reads and writes to rootData
 * are done under the lock, so we're free to modify rootData from any thread
 * that holds the lock.
 */

etc.
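
The reset-by-negation trick that this Javadoc describes can be illustrated as 
follows; a sketch only, with field and method names assumed rather than taken 
from the real FileSystem.Statistics:

{code}
// Sketch of the reset described above: sum the thread-local counters
// (read-only, so no race with their owning threads) and store the negative
// sum in rootData, which is guarded by this lock, so totals report as zero.
public synchronized void reset() {
  long sum = 0;
  for (StatisticsData data : allThreadData) {  // assumed per-thread collection
    sum += data.getBytesRead();
  }
  rootData.setBytesRead(-sum);
}
{code}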



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11775) Fix Javadoc typos in hadoop-openstack module

2015-03-31 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HADOOP-11775:
-
Summary: Fix Javadoc typos in hadoop-openstack module  (was: Fix typos in 
Swift related Javadoc)

 Fix Javadoc typos in hadoop-openstack module
 

 Key: HADOOP-11775
 URL: https://issues.apache.org/jira/browse/HADOOP-11775
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Chen He
Assignee: Yanjun Wang
Priority: Trivial
 Attachments: HADOOP-11775.000.patch


 Some typos are listed below but not limited to:
 SwiftNativeFileSystemObject.java
   /**
* Initalize the filesystem store -this creates the REST client binding.
*
* @param fsURI URI of the filesystem, which is used to map to the 
 filesystem-specific
*  options in the configuration file
* @param configuration configuration
* @throws IOException on any failure.
*/
 SwiftNativeFileSystem.java
   /**
* Low level method to do a deep listing of all entries, not stopping
* at the next directory entry. This is to let tests be confident that
* recursive deletes c really are working.
* @param path path to recurse down
* @param newest ask for the newest data, potentially slower than not.
* @return a potentially empty array of file status
* @throws IOException any problem
*/
   /**
* Low-level operation to also set the block size for this operation
* @param path   the file name to open
* @param bufferSize the size of the buffer to be used.
* @param readBlockSize how big should the read blockk/buffer size be?
* @return the input stream
* @throws FileNotFoundException if the file is not found
* @throws IOException any IO problem
*/
 SwiftRestClient.java
   /**
* Converts Swift path to URI to make request.
* This is public for unit testing
*
* @param path path to object
* @param endpointURI damain url e.g. http://domain.com
* @return valid URI for object
*/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11775) Fix typos in Swift related Javadoc

2015-03-31 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388928#comment-14388928
 ] 

Chen He commented on HADOOP-11775:
--

Thank you for the patch, [~kelwyj]. I think this JIRA also includes fixing 
typos in all Javadocs under the hadoop-openstack project, such as:
org.apache.hadoop.fs.swift.http.ExceptionDiags.java
/**
 * Variant of Hadoop Netutils exception wrapping with URI awareness and
 * available in branch-1 too.
 */
org.apache.hadoop.fs.swift.http.HttpBodyContent.java
  /**
   * build a body response
   * @param inputStream input stream from the operatin
   * @param contentLength length of content; may be -1 for don't know
   */

etc. 

Looking forward to seeing your updated patch. 

 Fix typos in Swift related Javadoc
 --

 Key: HADOOP-11775
 URL: https://issues.apache.org/jira/browse/HADOOP-11775
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Chen He
Assignee: Yanjun Wang
Priority: Trivial
 Attachments: HADOOP-11775.000.patch


 Some typos are listed below but not limited to:
 SwiftNativeFileSystemObject.java
   /**
* Initalize the filesystem store -this creates the REST client binding.
*
* @param fsURI URI of the filesystem, which is used to map to the 
 filesystem-specific
*  options in the configuration file
* @param configuration configuration
* @throws IOException on any failure.
*/
 SwiftNativeFileSystem.java
   /**
* Low level method to do a deep listing of all entries, not stopping
* at the next directory entry. This is to let tests be confident that
* recursive deletes c really are working.
* @param path path to recurse down
* @param newest ask for the newest data, potentially slower than not.
* @return a potentially empty array of file status
* @throws IOException any problem
*/
   /**
* Low-level operation to also set the block size for this operation
* @param path   the file name to open
* @param bufferSize the size of the buffer to be used.
* @param readBlockSize how big should the read blockk/buffer size be?
* @return the input stream
* @throws FileNotFoundException if the file is not found
* @throws IOException any IO problem
*/
 SwiftRestClient.java
   /**
* Converts Swift path to URI to make request.
* This is public for unit testing
*
* @param path path to object
* @param endpointURI damain url e.g. http://domain.com
* @return valid URI for object
*/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

