[jira] [Assigned] (HDFS-11947) When constructing a thread name, BPOfferService may print a bogus warning message
[ https://issues.apache.org/jira/browse/HDFS-11947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang reassigned HDFS-11947: -- Assignee: Weiwei Yang > When constructing a thread name, BPOfferService may print a bogus warning > message > -- > > Key: HDFS-11947 > URL: https://issues.apache.org/jira/browse/HDFS-11947 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo Nicholas Sze >Assignee: Weiwei Yang >Priority: Minor > > HDFS-11558 tries to get Block pool ID for constructing thread names. When > the service is not yet registered with NN, it prints the bogus warning "Block > pool ID needed, but service not yet registered with NN" with stack trace. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
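The fix direction suggested by the report above can be sketched as follows. This is a hedged illustration, not the actual Hadoop `BPOfferService` code: the method and variable names (`threadName`, `nnAddr`) are made up for the example. The idea is to fall back to a name the service already knows (the NameNode address) while the block pool ID is still unavailable, instead of triggering the "service not yet registered" warning.

```java
// Hedged sketch (not the real DataNode code): build a thread name without
// touching the block pool ID before registration completes.
public class ThreadNameSketch {
    // In the real DataNode the block pool ID is null until the service has
    // registered with the NameNode; "nnAddr" stands in for the NameNode
    // address the BPOfferService already holds.
    static String threadName(String blockPoolId, String nnAddr) {
        if (blockPoolId != null) {
            return "DataXceiver for block pool " + blockPoolId;
        }
        // Not registered yet: use the address rather than provoking a
        // "Block pool ID needed, but service not yet registered with NN"
        // warning with a stack trace.
        return "DataXceiver for NN " + nnAddr;
    }
}
```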
[jira] [Updated] (HDFS-11729) Improve NNStorageRetentionManager failure handling.
[ https://issues.apache.org/jira/browse/HDFS-11729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-11729: --- Status: Patch Available (was: Open) > Improve NNStorageRetentionManager failure handling. > --- > > Key: HDFS-11729 > URL: https://issues.apache.org/jira/browse/HDFS-11729 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Weiwei Yang > Attachments: HDFS-11729.001.patch, HDFS-11729.002.patch > > > Currently {{NNStorageRetentionManager}} will simply skip a storage directory > if a problem is detected. Since checkpoint saving does not go through the > same set of checks, this can lead to the space exhaustion seen in HDFS-11714. > Instead of ignoring errors, it should handle them properly. One potential > improvement is to catch the exception and report the storage directory > failure using {{NNStorage.reportErrorsOnDirectories()}}. > {{attemptRestoreRemovedStorage()}} will need extra checks, e.g. existence of > a VERSION file.
[jira] [Commented] (HDFS-11729) Improve NNStorageRetentionManager failure handling.
[ https://issues.apache.org/jira/browse/HDFS-11729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042290#comment-16042290 ] Weiwei Yang commented on HDFS-11729: Patch submitted based on the first comment, test case added. Please kindly review. Thanks.
[jira] [Updated] (HDFS-11729) Improve NNStorageRetentionManager failure handling.
[ https://issues.apache.org/jira/browse/HDFS-11729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-11729: --- Attachment: HDFS-11729.002.patch
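The handling proposed for HDFS-11729 — catch per-directory failures during purging and report them instead of skipping the directory — can be sketched like this. This is a minimal illustration, not the actual `NNStorageRetentionManager` code: the `Purger` interface and `purgeAll` method are invented for the example; in Hadoop the collected failures would be passed to `NNStorage.reportErrorsOnDirectories()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch: collect failing storage directories instead of
// silently skipping them.
public class RetentionSketch {
    interface Purger { void purge(String dir) throws Exception; }

    static List<String> purgeAll(List<String> dirs, Purger purger) {
        List<String> failed = new ArrayList<>();
        for (String dir : dirs) {
            try {
                purger.purge(dir);
            } catch (Exception e) {
                failed.add(dir);  // record the failure, don't ignore it
            }
        }
        // The caller would report these, e.g. via
        // NNStorage.reportErrorsOnDirectories() in the real code.
        return failed;
    }
}
```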
[jira] [Commented] (HDFS-11779) Ozone: KSM: add listBuckets
[ https://issues.apache.org/jira/browse/HDFS-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042280#comment-16042280 ] Anu Engineer commented on HDFS-11779: - Thanks, looking at it right now. > Ozone: KSM: add listBuckets > --- > > Key: HDFS-11779 > URL: https://issues.apache.org/jira/browse/HDFS-11779 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Weiwei Yang > Attachments: HDFS-11779-HDFS-7240.001.patch, > HDFS-11779-HDFS-7240.002.patch, HDFS-11779-HDFS-7240.003.patch, > HDFS-11779-HDFS-7240.004.patch, HDFS-11779-HDFS-7240.005.patch, > HDFS-11779-HDFS-7240.006.patch, HDFS-11779-HDFS-7240.007.patch, > HDFS-11779-HDFS-7240.008.patch, HDFS-11779-HDFS-7240.009.patch, > HDFS-11779-HDFS-7240.010.patch, HDFS-11779-HDFS-7240.011.patch, > HDFS-11779-HDFS-7240.012.patch, HDFS-11779-HDFS-7240.013.patch, > HDFS-11779-HDFS-7240.014.patch > > > Lists buckets of a given volume. Similar to listVolumes, paging is supported > via prevKey, prefix and maxKeys.
[jira] [Commented] (HDFS-11779) Ozone: KSM: add listBuckets
[ https://issues.apache.org/jira/browse/HDFS-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042278#comment-16042278 ] Weiwei Yang commented on HDFS-11779: Rebase done, v14 patch uploaded. Thanks.
[jira] [Updated] (HDFS-11779) Ozone: KSM: add listBuckets
[ https://issues.apache.org/jira/browse/HDFS-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-11779: --- Attachment: HDFS-11779-HDFS-7240.014.patch
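The prevKey/prefix/maxKeys paging described for listBuckets can be sketched over a sorted set. This is a hedged illustration of the paging contract only — the real KSM iterates its metadata store, and the class and method names here are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hedged sketch of listBuckets-style paging: resume strictly after
// prevKey, filter by prefix, and cap the page at maxKeys.
public class ListBucketsSketch {
    static List<String> list(TreeSet<String> buckets, String prevKey,
                             String prefix, int maxKeys) {
        List<String> page = new ArrayList<>();
        // tailSet(prevKey, false) starts strictly after the boundary key,
        // so consecutive pages never repeat it.
        Iterable<String> scan =
            (prevKey == null) ? buckets : buckets.tailSet(prevKey, false);
        for (String name : scan) {
            if (prefix != null && !name.startsWith(prefix)) continue;
            page.add(name);
            if (page.size() >= maxKeys) break;
        }
        return page;
    }
}
```

A second page is fetched by passing the last key of the previous page as prevKey.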
[jira] [Commented] (HDFS-11777) Ozone: KSM: add deleteBucket
[ https://issues.apache.org/jira/browse/HDFS-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042277#comment-16042277 ] Nandakumar commented on HDFS-11777: --- Thanks [~xyao] for the review and commit. > Ozone: KSM: add deleteBucket > > > Key: HDFS-11777 > URL: https://issues.apache.org/jira/browse/HDFS-11777 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Nandakumar > Fix For: HDFS-7240 > > Attachments: HDFS-11777-HDFS-7240.000.patch > > > Allows a bucket to be deleted if there are no keys in the bucket.
[jira] [Commented] (HDFS-11880) Ozone: KSM: Remove protobuf formats from KSM wrappers
[ https://issues.apache.org/jira/browse/HDFS-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042274#comment-16042274 ] Nandakumar commented on HDFS-11880: --- Thanks [~xyao] for the review and commit. > Ozone: KSM: Remove protobuf formats from KSM wrappers > - > > Key: HDFS-11880 > URL: https://issues.apache.org/jira/browse/HDFS-11880 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Nandakumar >Assignee: Nandakumar > Fix For: HDFS-7240 > > Attachments: HDFS-11880-HDFS-7240.000.patch, > HDFS-11880-HDFS-7240.001.patch > > > KSM wrappers like KsmBucketInfo and KsmBucketArgs are using protobuf formats > such as StorageTypeProto and OzoneAclInfo; this jira is to remove the > dependency and use {{StorageType}} and {{OzoneAcl}} instead.
[jira] [Assigned] (HDFS-11946) Ozone: Containers in different datanodes are mapped to the same location
[ https://issues.apache.org/jira/browse/HDFS-11946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandakumar reassigned HDFS-11946: - Assignee: Nandakumar (was: Anu Engineer) > Ozone: Containers in different datanodes are mapped to the same location > > > Key: HDFS-11946 > URL: https://issues.apache.org/jira/browse/HDFS-11946 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Tsz Wo Nicholas Sze >Assignee: Nandakumar > > This is a problem in unit tests. Containers with the same container name in > different datanodes are mapped to the same local path. As a result, > the first datanode will succeed in creating the container file but > the remaining datanodes will fail to create the container file with > FileAlreadyExistsException.
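One plausible fix direction for the collision above is to make a container's on-disk location a function of the datanode's own storage root rather than of the container name alone, so two datanodes in one MiniCluster never share a path. The sketch below illustrates that idea only; the path layout and names (`containerFile`, `datanodeRoot`) are assumptions for the example, not the actual Ozone code.

```java
import java.nio.file.Path;

// Hedged sketch: scope each container file under the per-datanode
// storage root so identical container names on different datanodes
// resolve to different local paths.
public class ContainerPathSketch {
    static Path containerFile(Path datanodeRoot, String containerName) {
        // Two datanodes with distinct roots can now both create "c1"
        // without hitting FileAlreadyExistsException.
        return datanodeRoot.resolve("containers")
                           .resolve(containerName + ".container");
    }
}
```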
[jira] [Commented] (HDFS-11779) Ozone: KSM: add listBuckets
[ https://issues.apache.org/jira/browse/HDFS-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042258#comment-16042258 ] Weiwei Yang commented on HDFS-11779: Thanks [~anu], the conflicts are caused by HDFS-11880; let me rebase to the latest code base and upload a new patch.
[jira] [Updated] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liaoyuxiangqin updated HDFS-11943: -- Status: Patch Available (was: Open) > Warn log frequently print to screen in doEncode function on > AbstractNativeRawEncoder class > --- > > Key: HDFS-11943 > URL: https://issues.apache.org/jira/browse/HDFS-11943 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, native >Affects Versions: 3.0.0-alpha4 > Environment: cluster: 3 nodes > os:(Red Hat 2.6.33.20, Red Hat 3.10.0-514.6.1.el7.x86_64, > Ubuntu4.4.0-31-generic) > hadoop version: hadoop-3.0.0-alpha4 > erasure coding: XOR-2-1-64k and enabled Intel ISA-L > hadoop fs -put file / >Reporter: liaoyuxiangqin >Assignee: liaoyuxiangqin >Priority: Minor > Attachments: HDFS-11943.002.patch, HDFS-11943.patch > > Original Estimate: 0.05h > Remaining Estimate: 0.05h > > When I write a file to HDFS in the above environment, the HDFS client > frequently prints the "use direct ByteBuffer inputs/outputs" warning from the > doEncode function to the screen; detailed information as follows: > 2017-06-07 15:20:42,856 WARN rawcoder.AbstractNativeRawEncoder: > convertToByteBufferState is invoked, not efficiently. Please use direct > ByteBuffer inputs/outputs
[jira] [Updated] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liaoyuxiangqin updated HDFS-11943: -- Status: Open (was: Patch Available)
[jira] [Updated] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liaoyuxiangqin updated HDFS-11943: -- Attachment: HDFS-11943.002.patch
[jira] [Commented] (HDFS-11779) Ozone: KSM: add listBuckets
[ https://issues.apache.org/jira/browse/HDFS-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042250#comment-16042250 ] Anu Engineer commented on HDFS-11779: - [~cheersyang] I am not able to apply this patch. Would you please verify it on your end? I will be online for some time more in case you are uploading a new patch.
[jira] [Comment Edited] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042247#comment-16042247 ] liaoyuxiangqin edited comment on HDFS-11943 at 6/8/17 6:07 AM: --- Hi, thanks for reviewing the report, [~andrew.wang]. I'd be glad to LOG at debug in a more reasonable way that doesn't affect performance. was (Author: liaoyuxiangqin): Hi, thanks for your review the report [~andrew.wang]]. I'd glad to LOG at debug in a more reasonable way, and don't affect performance.
[jira] [Comment Edited] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042247#comment-16042247 ] liaoyuxiangqin edited comment on HDFS-11943 at 6/8/17 6:06 AM: --- Hi, thanks for your review the report [~andrew.wang]]. I'd glad to LOG at debug in a more reasonable way, and don't affect performance. was (Author: liaoyuxiangqin): Hi, thanks for your review the report [~andrew wang]. I'd glad to LOG at debug in a more reasonable way, and don't affect performance.
[jira] [Comment Edited] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042247#comment-16042247 ] liaoyuxiangqin edited comment on HDFS-11943 at 6/8/17 6:05 AM: --- Hi, thanks for your review the report [~andrew wang]. I'd glad to LOG at debug in a more reasonable way, and don't affect performance. was (Author: liaoyuxiangqin): Hi, thanks for your review the report andrew wang. I'd glad to LOG at debug in a more reasonable way, and don't affect performance.
[jira] [Comment Edited] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042247#comment-16042247 ] liaoyuxiangqin edited comment on HDFS-11943 at 6/8/17 6:03 AM: --- Hi, thanks for your review the report andrew wang. I'd glad to LOG at debug in a more reasonable way, and don't affect performance. was (Author: liaoyuxiangqin): Hi, thanks for your review the report https://issues.apache.org/jira/secure/ViewProfile.jspa?name=andrew.wang. I'd glad to LOG at debug in a more reasonable way, and don't affect performance.
[jira] [Commented] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042247#comment-16042247 ] liaoyuxiangqin commented on HDFS-11943: --- Hi, thanks for your review the report https://issues.apache.org/jira/secure/ViewProfile.jspa?name=andrew.wang. I'd glad to LOG at debug in a more reasonable way, and don't affect performance.
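The change discussed for HDFS-11943 — demote the per-call message from warn to debug so a tight encode loop no longer floods the console — can be sketched with a "warn once, then debug" pattern. This is a hedged illustration of the pattern only; the actual patch to `AbstractNativeRawEncoder` may differ, and the class and method names here are invented.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hedged sketch: emit a single WARN the first time the inefficient path
// is taken, then DEBUG on every later call, instead of WARN per call.
public class LogOnceSketch {
    private static final AtomicBoolean warned = new AtomicBoolean(false);

    // Returns the level the message should be logged at for this call.
    static String levelForThisCall() {
        // compareAndSet flips the flag exactly once, so exactly one caller
        // gets "WARN" even under concurrency.
        return warned.compareAndSet(false, true) ? "WARN" : "DEBUG";
    }
}
```

With a real logger the DEBUG branch would also be guarded by `isDebugEnabled()` so repeated calls cost almost nothing when debug logging is off.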
[jira] [Commented] (HDFS-11949) Add testcase for ensuring that FsShell cann't move file to the target directory that file exists
[ https://issues.apache.org/jira/browse/HDFS-11949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042202#comment-16042202 ] Hadoop QA commented on HDFS-11949: -- (x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 13m 29s | trunk passed |
| +1 | compile | 14m 4s | trunk passed |
| +1 | checkstyle | 0m 35s | trunk passed |
| +1 | mvnsite | 1m 5s | trunk passed |
| -1 | findbugs | 1m 24s | hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. |
| +1 | javadoc | 0m 50s | trunk passed |
| +1 | mvninstall | 0m 42s | the patch passed |
| +1 | compile | 10m 22s | the patch passed |
| -1 | javac | 10m 22s | root generated 1 new + 787 unchanged - 0 fixed = 788 total (was 787) |
| -0 | checkstyle | 0m 34s | hadoop-common-project/hadoop-common: The patch generated 2 new + 38 unchanged - 0 fixed = 40 total (was 38) |
| +1 | mvnsite | 1m 2s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 30s | the patch passed |
| +1 | javadoc | 0m 49s | the patch passed |
| -1 | unit | 7m 53s | hadoop-common in the patch failed. |
| +1 | asflicense | 0m 34s | The patch does not generate ASF License warnings. |
| | | 56m 16s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.net.TestDNS |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11949 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871984/HDFS-11949.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 14892935880e 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5672ae7 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/19830/artifact/patchprocess/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/19830/artifact/patchprocess/diff-compile-javac-root.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/19830/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/19830/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19830/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19830/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Updated] (HDFS-11949) Add testcase for ensuring that FsShell can't move a file to a target directory where the file exists
[ https://issues.apache.org/jira/browse/HDFS-11949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] legend updated HDFS-11949: -- Status: Patch Available (was: Open) > Add testcase for ensuring that FsShell can't move a file to a target > directory where the file exists > > > Key: HDFS-11949 > URL: https://issues.apache.org/jira/browse/HDFS-11949 > Project: Hadoop HDFS > Issue Type: Test > Components: test >Affects Versions: 3.0.0-alpha4 >Reporter: legend >Priority: Minor > Attachments: HDFS-11949.patch > > > moveFromLocal returns an error when moving a file to a target directory where the > file already exists. So we need to add a test case to check it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11949) Add testcase for ensuring that FsShell can't move a file to a target directory where the file exists
[ https://issues.apache.org/jira/browse/HDFS-11949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] legend updated HDFS-11949: -- Attachment: HDFS-11949.patch > Add testcase for ensuring that FsShell can't move a file to a target > directory where the file exists > > > Key: HDFS-11949 > URL: https://issues.apache.org/jira/browse/HDFS-11949 > Project: Hadoop HDFS > Issue Type: Test > Components: test >Affects Versions: 3.0.0-alpha4 >Reporter: legend >Priority: Minor > Attachments: HDFS-11949.patch > > > moveFromLocal returns an error when moving a file to a target directory where the > file already exists. So we need to add a test case to check it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-11949) Add testcase for ensuring that FsShell can't move a file to a target directory where the file exists
legend created HDFS-11949: - Summary: Add testcase for ensuring that FsShell can't move a file to a target directory where the file exists Key: HDFS-11949 URL: https://issues.apache.org/jira/browse/HDFS-11949 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 3.0.0-alpha4 Reporter: legend Priority: Minor moveFromLocal returns an error when moving a file to a target directory where the file already exists. So we need to add a test case to check it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
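The semantics the proposed test case needs to pin down can be illustrated locally: a move onto an existing target should fail and leave both files untouched. This is a local-filesystem analogy using java.nio (not the actual Hadoop test, which would run FsShell's moveFromLocal against HDFS); the class name is illustrative.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Local-filesystem sketch of the behavior the HDFS-11949 test should assert:
// moving a file onto an existing target (no REPLACE_EXISTING option) is
// rejected, the source survives, and the target is not overwritten.
public class MoveSemantics {
    public static boolean moveFails(Path src, Path dst) {
        try {
            Files.move(src, dst); // no REPLACE_EXISTING: must not overwrite
            return false;         // move succeeded
        } catch (FileAlreadyExistsException e) {
            return true;          // move rejected because target exists
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fsshell-demo");
        Path src = Files.write(dir.resolve("src.txt"), "new".getBytes());
        Path dst = Files.write(dir.resolve("dst.txt"), "old".getBytes());
        if (!moveFails(src, dst))
            throw new AssertionError("expected the move to fail");
        // source must be left intact and target unchanged after the failed move
        if (!Files.exists(src))
            throw new AssertionError("source was consumed");
        if (!new String(Files.readAllBytes(dst)).equals("old"))
            throw new AssertionError("target was overwritten");
        System.out.println("move onto existing target rejected, as expected");
    }
}
```

A real test would additionally check moveFromLocal's exit code and error message, but those details are not specified in the issue.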
[jira] [Updated] (HDFS-11918) Ozone: Encapsulate KSM metadata key for better (de)serialization
[ https://issues.apache.org/jira/browse/HDFS-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-11918: --- Status: Open (was: Patch Available) > Ozone: Encapsulate KSM metadata key for better (de)serialization > > > Key: HDFS-11918 > URL: https://issues.apache.org/jira/browse/HDFS-11918 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Critical > Attachments: HDFS-11918-HDFS-7240.001.patch > > > There are multiple types of keys stored in the KSM database > # Volume Key > # Bucket Key > # Object Key > # User Key > Currently they are represented as plain strings with some conventions, such as > # /volume > # /volume/bucket > # /volume/bucket/key > # $user > This approach makes it difficult to parse volume/bucket/keys from the KSM > database. Propose to encapsulate these types of keys into protobuf messages, > and take advantage of protobuf to serialize (deserialize) classes to byte > arrays (and vice versa). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11918) Ozone: Encapsulate KSM metadata key into protobuf messages for better (de)serialization
[ https://issues.apache.org/jira/browse/HDFS-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042115#comment-16042115 ] Weiwei Yang commented on HDFS-11918: I agree. On second thought, we could still use strings as KSM DB keys (instead of protobuf messages), but add some encapsulation classes for the read/write. This way we could have a higher-level API in {{MetadataManagerImpl}} instead of directly manipulating the database, and have a central place to parse a string into volume/bucket/objectKey instances, where we can have better UT coverage of the different combinations. For now, let me cancel the patch and revisit this if necessary. Thanks [~anu], [~xyao]. > Ozone: Encapsulate KSM metadata key into protobuf messages for better > (de)serialization > --- > > Key: HDFS-11918 > URL: https://issues.apache.org/jira/browse/HDFS-11918 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Critical > Attachments: HDFS-11918-HDFS-7240.001.patch > > > There are multiple types of keys stored in the KSM database > # Volume Key > # Bucket Key > # Object Key > # User Key > Currently they are represented as plain strings with some conventions, such as > # /volume > # /volume/bucket > # /volume/bucket/key > # $user > This approach makes it difficult to parse volume/bucket/keys from the KSM > database. Propose to encapsulate these types of keys into protobuf messages, > and take advantage of protobuf to serialize (deserialize) classes to byte > arrays (and vice versa). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
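The alternative described in the comment above (keep plain-string DB keys, but centralize construction and parsing in encapsulation classes) could look roughly like this. All names here (KsmDbKey, parse, toDbString) are hypothetical illustrations, not the actual KSM API; user keys ($user) are omitted for brevity.

```java
// Hypothetical sketch of a central place to parse a KSM DB key string into
// volume/bucket/objectKey parts and to rebuild the string form, so callers
// never manipulate raw key strings directly.
public class KsmDbKey {
    public final String volume; // always present
    public final String bucket; // null for volume keys
    public final String key;    // null for volume and bucket keys

    private KsmDbKey(String volume, String bucket, String key) {
        this.volume = volume;
        this.bucket = bucket;
        this.key = key;
    }

    // Accepts /volume, /volume/bucket or /volume/bucket/key.
    public static KsmDbKey parse(String dbKey) {
        if (!dbKey.startsWith("/")) {
            throw new IllegalArgumentException("not a volume-rooted key: " + dbKey);
        }
        // limit=3 keeps any further '/' characters inside the object key name
        String[] parts = dbKey.substring(1).split("/", 3);
        String volume = parts[0];
        String bucket = parts.length > 1 ? parts[1] : null;
        String key = parts.length > 2 ? parts[2] : null;
        return new KsmDbKey(volume, bucket, key);
    }

    public String toDbString() {
        StringBuilder sb = new StringBuilder("/").append(volume);
        if (bucket != null) sb.append('/').append(bucket);
        if (key != null) sb.append('/').append(key);
        return sb.toString();
    }
}
```

Round-tripping through parse/toDbString keeps the on-disk format unchanged, which also sidesteps the ordering concern raised later in the thread.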
[jira] [Updated] (HDFS-11918) Ozone: Encapsulate KSM metadata key for better (de)serialization
[ https://issues.apache.org/jira/browse/HDFS-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-11918: --- Summary: Ozone: Encapsulate KSM metadata key for better (de)serialization (was: Ozone: Encapsulate KSM metadata key into protobuf messages for better (de)serialization) > Ozone: Encapsulate KSM metadata key for better (de)serialization > > > Key: HDFS-11918 > URL: https://issues.apache.org/jira/browse/HDFS-11918 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Critical > Attachments: HDFS-11918-HDFS-7240.001.patch > > > There are multiple types of keys stored in the KSM database > # Volume Key > # Bucket Key > # Object Key > # User Key > Currently they are represented as plain strings with some conventions, such as > # /volume > # /volume/bucket > # /volume/bucket/key > # $user > This approach makes it difficult to parse volume/bucket/keys from the KSM > database. Propose to encapsulate these types of keys into protobuf messages, > and take advantage of protobuf to serialize (deserialize) classes to byte > arrays (and vice versa). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-11948) Ozone: change TestRatisManager to check cluster with data
Tsz Wo Nicholas Sze created HDFS-11948: -- Summary: Ozone: change TestRatisManager to check cluster with data Key: HDFS-11948 URL: https://issues.apache.org/jira/browse/HDFS-11948 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze TestRatisManager first creates multiple Ratis clusters. Then it changes the membership and closes some clusters. However, it does not test the clusters with data. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11947) When constructing a thread name, BPOfferService may print a bogus warning message
[ https://issues.apache.org/jira/browse/HDFS-11947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-11947: --- Summary: When constructing a thread name, BPOfferService may print a bogus warning message (was: BPOfferService prints an invalid warning message "Block pool ID needed, but service not yet registered with NN") Description: HDFS-11558 tries to get Block pool ID for constructing thread names. When the service is not yet registered with NN, it prints the bogus warning "Block pool ID needed, but service not yet registered with NN" with stack trace. > When constructing a thread name, BPOfferService may print a bogus warning > message > -- > > Key: HDFS-11947 > URL: https://issues.apache.org/jira/browse/HDFS-11947 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo Nicholas Sze >Priority: Minor > > HDFS-11558 tries to get Block pool ID for constructing thread names. When > the service is not yet registered with NN, it prints the bogus warning "Block > pool ID needed, but service not yet registered with NN" with stack trace. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-11947) BPOfferService prints an invalid warning message "Block pool ID needed, but service not yet registered with NN"
Tsz Wo Nicholas Sze created HDFS-11947: -- Summary: BPOfferService prints an invalid warning message "Block pool ID needed, but service not yet registered with NN" Key: HDFS-11947 URL: https://issues.apache.org/jira/browse/HDFS-11947 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-11946) Ozone: Containers in different datanodes are mapped to the same location
[ https://issues.apache.org/jira/browse/HDFS-11946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042105#comment-16042105 ] Tsz Wo Nicholas Sze edited comment on HDFS-11946 at 6/8/17 2:39 AM: For reproducing the problem, run TestOzoneContainerRatis.testBothGetandPutSmallFileRatisNetty. It is easier to understand the log if we add a message for "Created a new container" and comment out the single node test as shown below {code} diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java index 7d0e75667c..f7b191c887 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java @@ -218,6 +218,9 @@ public static void verifyIsNewContainer(File containerFile, File metadataFile) log.error("creation of a new container file failed. File: {}", containerFile.toPath()); throw new IOException("creation of a new container file failed."); +} else { + log.info("Created a new container. File: {}", + containerFile.toPath()); } if (!metadataFile.createNewFile()) { diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java index f77e731d45..8cbaf569fd 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java @@ -130,7 +130,7 @@ private static void runTestBothGetandPutSmallFileRatis( @Test public void testBothGetandPutSmallFileRatisNetty() throws Exception { -runTestBothGetandPutSmallFileRatis(SupportedRpcType.NETTY, 1); +//runTestBothGetandPutSmallFileRatis(SupportedRpcType.NETTY, 1); runTestBothGetandPutSmallFileRatis(SupportedRpcType.NETTY, 3); } {code}
was (Author: szetszwo): For reproducing the problem, run TestOzoneContainerRatis.testBothGetandPutSmallFileRatisNetty. It is easier to understand the log if we add a message for "Created of a new container" and comment out the single node test as shown below {code} diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java index 7d0e75667c..f7b191c887 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java @@ -218,6 +218,9 @@ public static void verifyIsNewContainer(File containerFile, File metadataFile) log.error("creation of a new container file failed. File: {}", containerFile.toPath()); throw new IOException("creation of a new container file failed."); +} else { + log.info("Created of a new container. File: {}", + containerFile.toPath()); } if (!metadataFile.createNewFile()) { diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java index f77e731d45..8cbaf569fd 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java @@ -130,7 +130,7 @@ private static void runTestBothGetandPutSmallFileRatis( @Test public void testBothGetandPutSmallFileRatisNetty() throws Exception { -runTestBothGetandPutSmallFileRatis(SupportedRpcType.NETTY, 1); +//runTestBothGetandPutSmallFileRatis(SupportedRpcType.NETTY, 1); runTestBothGetandPutSmallFileRatis(SupportedRpcType.NETTY, 3); } {code}
> Ozone: Containers in different datanodes are mapped to the same location > > > Key: HDFS-11946 > URL: https://issues.apache.org/jira/browse/HDFS-11946 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Tsz Wo Nicholas Sz
[jira] [Comment Edited] (HDFS-11946) Ozone: Containers in different datanodes are mapped to the same location
[ https://issues.apache.org/jira/browse/HDFS-11946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042098#comment-16042098 ] Tsz Wo Nicholas Sze edited comment on HDFS-11946 at 6/8/17 2:38 AM: Here is an example. The first datanode 127.0.0.1:58976 was able to create container f3972a31-3587-4baf-b1dd-eb3d41d5aad2 but the other datanodes 127.0.0.1:58966 and 127.0.0.1:58971 failed with "container already exists on disk". It seems that the container paths are independent of datanode information. - container path: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container {code} 2017-06-08 10:18:42,712 [StateMachineUpdater-127.0.0.1:58976] INFO - Created a new container. File: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container 2017-06-08 10:18:42,736 [StateMachineUpdater-127.0.0.1:58966] ERROR - container already exists on disk. File: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container 2017-06-08 10:18:42,736 [StateMachineUpdater-127.0.0.1:58971] ERROR - container already exists on disk. File: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container {code} {code} 2017-06-08 10:18:42,738 [StateMachineUpdater-127.0.0.1:58971] ERROR - Creation of container failed. Name: f3972a31-3587-4baf-b1dd-eb3d41d5aad2, we might need to cleanup partially created artifacts. org.apache.hadoop.fs.FileAlreadyExistsException: container already exists on disk. at org.apache.hadoop.ozone.container.common.helpers.ContainerUtils.verifyIsNewContainer(ContainerUtils.java:198) at org.apache.hadoop.ozone.container.common.impl.ContainerManagerImpl.writeContainerInfo(ContainerManagerImpl.java:325) at org.apache.hadoop.ozone.container.common.impl.ContainerManagerImpl.createContainer(ContainerManagerImpl.java:263) at org.apache.hadoop.ozone.container.common.impl.Dispatcher.handleCreateContainer(Dispatcher.java:395) at org.apache.hadoop.ozone.container.common.impl.Dispatcher.containerProcessHandler(Dispatcher.java:156) at org.apache.hadoop.ozone.container.common.impl.Dispatcher.dispatch(Dispatcher.java:103) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatch(ContainerStateMachine.java:94) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.applyTransaction(ContainerStateMachine.java:81) at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:913) at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:142) at java.lang.Thread.run(Thread.java:748) {code}
was (Author: szetszwo): Here is an example. The first datanode 127.0.0.1:58976 was able to create container f3972a31-3587-4baf-b1dd-eb3d41d5aad2 but the other datanodes 127.0.0.1:58966 and 127.0.0.1:58971 failed with "container already exists on disk". It seems that the container paths are independent of datanode information.
- container path: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container {code} 2017-06-08 10:18:42,712 [StateMachineUpdater-127.0.0.1:58976] INFO - Created of a new container. File: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container 2017-06-08 10:18:42,736 [StateMachineUpdater-127.0.0.1:58966] ERROR - container already exists on disk. File: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container 2017-06-08 10:18:42,736 [StateMachineUpdater-127.0.0.1:58971] ERROR - container already exists on disk. File: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/
[jira] [Commented] (HDFS-11946) Ozone: Containers in different datanodes are mapped to the same location
[ https://issues.apache.org/jira/browse/HDFS-11946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042105#comment-16042105 ] Tsz Wo Nicholas Sze commented on HDFS-11946: For reproducing the problem, run TestOzoneContainerRatis.testBothGetandPutSmallFileRatisNetty. It is easier to understand the log if we add a message for "Created of a new container" and comment out the single node test as shown below {code} diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java index 7d0e75667c..f7b191c887 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java @@ -218,6 +218,9 @@ public static void verifyIsNewContainer(File containerFile, File metadataFile) log.error("creation of a new container file failed. File: {}", containerFile.toPath()); throw new IOException("creation of a new container file failed."); +} else { + log.info("Created of a new container. File: {}", + containerFile.toPath()); } if (!metadataFile.createNewFile()) { diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java index f77e731d45..8cbaf569fd 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java +++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerRatis.java @@ -130,7 +130,7 @@ private static void runTestBothGetandPutSmallFileRatis( @Test public void testBothGetandPutSmallFileRatisNetty() throws Exception { -runTestBothGetandPutSmallFileRatis(SupportedRpcType.NETTY, 1); +//runTestBothGetandPutSmallFileRatis(SupportedRpcType.NETTY, 1); runTestBothGetandPutSmallFileRatis(SupportedRpcType.NETTY, 3); } {code}
> Ozone: Containers in different datanodes are mapped to the same location > > > Key: HDFS-11946 > URL: https://issues.apache.org/jira/browse/HDFS-11946 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Tsz Wo Nicholas Sze >Assignee: Anu Engineer > > This is a problem in unit tests. Containers with the same container name in > different datanodes are mapped to the same local path location. As a result, > the first datanode will be able to succeed creating the container file but > the remaining datanodes will fail to create the container file with > FileAlreadyExistsException. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
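The root cause described in this thread is that the container file path is derived from the container name alone, so every datanode in a single-JVM MiniOzoneCluster resolves the same local file. A sketch of the collision and of one possible disambiguation (adding a per-datanode path component); the method names are hypothetical, not the actual Ozone code:

```java
import java.io.File;

// Illustration of HDFS-11946: deriving the container path only from the
// container name makes all datanodes sharing a test root collide on the
// same file, while adding a per-datanode component keeps paths distinct.
public class ContainerPaths {
    static File sharedPath(File root, String containerName) {
        // path depends only on the container name: collides across datanodes
        return new File(root, containerName + ".container");
    }

    static File perDatanodePath(File root, String datanodeId, String containerName) {
        // one subdirectory per datanode disambiguates the local layout
        return new File(new File(root, datanodeId), containerName + ".container");
    }

    public static void main(String[] args) {
        File root = new File("scm/repository");
        String container = "f3972a31-3587-4baf-b1dd-eb3d41d5aad2";
        if (!sharedPath(root, container).equals(sharedPath(root, container)))
            throw new AssertionError("expected identical (colliding) paths");
        File dn1 = perDatanodePath(root, "127.0.0.1-58976", container);
        File dn2 = perDatanodePath(root, "127.0.0.1-58966", container);
        if (dn1.equals(dn2))
            throw new AssertionError("per-datanode paths should differ");
        System.out.println("per-datanode paths are distinct");
    }
}
```

The real fix would have to thread a datanode-specific root through the container location logic; the issue notes this only matters for unit tests where datanodes share a filesystem.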
[jira] [Commented] (HDFS-11946) Ozone: Containers in different datanodes are mapped to the same location
[ https://issues.apache.org/jira/browse/HDFS-11946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042098#comment-16042098 ] Tsz Wo Nicholas Sze commented on HDFS-11946: Here is an example. The first datanode 127.0.0.1:58976 was able to create container f3972a31-3587-4baf-b1dd-eb3d41d5aad2 but the other datanodes 127.0.0.1:58966 and 127.0.0.1:58971 failed with "container already exists on disk". It seems that the container paths are independent of datanode information. - container path: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container {code} 2017-06-08 10:18:42,712 [StateMachineUpdater-127.0.0.1:58976] INFO - Created of a new container. File: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container 2017-06-08 10:18:42,736 [StateMachineUpdater-127.0.0.1:58966] ERROR - container already exists on disk. File: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container 2017-06-08 10:18:42,736 [StateMachineUpdater-127.0.0.1:58971] ERROR - container already exists on disk. File: /Users/szetszwo/hadoop/t2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/MiniOzoneClusteraf64005d-677e-4bb8-a54b-c03c94896214/5d170ac6-dbc3-41e9-aa86-dc9d1416453b/scm/repository/f3972a31-3587-4baf-b1dd-eb3d41d5aad2.container {code} {code} 2017-06-08 10:18:42,738 [StateMachineUpdater-127.0.0.1:58971] ERROR - Creation of container failed. Name: f3972a31-3587-4baf-b1dd-eb3d41d5aad2, we might need to cleanup partially created artifacts. 
org.apache.hadoop.fs.FileAlreadyExistsException: container already exists on disk. at org.apache.hadoop.ozone.container.common.helpers.ContainerUtils.verifyIsNewContainer(ContainerUtils.java:198) at org.apache.hadoop.ozone.container.common.impl.ContainerManagerImpl.writeContainerInfo(ContainerManagerImpl.java:325) at org.apache.hadoop.ozone.container.common.impl.ContainerManagerImpl.createContainer(ContainerManagerImpl.java:263) at org.apache.hadoop.ozone.container.common.impl.Dispatcher.handleCreateContainer(Dispatcher.java:395) at org.apache.hadoop.ozone.container.common.impl.Dispatcher.containerProcessHandler(Dispatcher.java:156) at org.apache.hadoop.ozone.container.common.impl.Dispatcher.dispatch(Dispatcher.java:103) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatch(ContainerStateMachine.java:94) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.applyTransaction(ContainerStateMachine.java:81) at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:913) at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:142) at java.lang.Thread.run(Thread.java:748) {code} > Ozone: Containers in different datanodes are mapped to the same location > > > Key: HDFS-11946 > URL: https://issues.apache.org/jira/browse/HDFS-11946 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Tsz Wo Nicholas Sze >Assignee: Anu Engineer > > This is a problem in unit tests. Containers with the same container name in > different datanodes are mapped to the same local path location. As a result, > the first datanode will be able to succeed creating the container file but > the remaining datanodes will fail to create the container file with > FileAlreadyExistsException. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-11946) Ozone: Containers in different datanodes are mapped to the same location
Tsz Wo Nicholas Sze created HDFS-11946: -- Summary: Ozone: Containers in different datanodes are mapped to the same location Key: HDFS-11946 URL: https://issues.apache.org/jira/browse/HDFS-11946 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Tsz Wo Nicholas Sze Assignee: Anu Engineer This is a problem in unit tests. Containers with the same container name in different datanodes are mapped to the same local path location. For example, As a result, the first datanode will be able to succeed creating the container file but the remaining datanodes will fail to create the container file with FileAlreadyExistsException. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11946) Ozone: Containers in different datanodes are mapped to the same location
[ https://issues.apache.org/jira/browse/HDFS-11946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-11946: --- Description: This is a problem in unit tests. Containers with the same container name in different datanodes are mapped to the same local path location. As a result, the first datanode will be able to succeed creating the container file but the remaining datanodes will fail to create the container file with FileAlreadyExistsException. (was: This is a problem in unit tests. Containers with the same container name in different datanodes are mapped to the same local path location. For example, As a result, the first datanode will be able to succeed creating the container file but the remaining datanodes will fail to create the container file with FileAlreadyExistsException.) > Ozone: Containers in different datanodes are mapped to the same location > > > Key: HDFS-11946 > URL: https://issues.apache.org/jira/browse/HDFS-11946 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Tsz Wo Nicholas Sze >Assignee: Anu Engineer > > This is a problem in unit tests. Containers with the same container name in > different datanodes are mapped to the same local path location. As a result, > the first datanode will be able to succeed creating the container file but > the remaining datanodes will fail to create the container file with > FileAlreadyExistsException. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11918) Ozone: Encapsulate KSM metadata key into protobuf messages for better (de)serialization
[ https://issues.apache.org/jira/browse/HDFS-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042067#comment-16042067 ] Xiaoyu Yao commented on HDFS-11918: --- Thanks [~cheersyang] for proposing and posting the patches for this. I agree with [~anu]'s comment above. In addition, I also have performance concerns about changing the key schema for the KSM DB. LevelDB stores entries sorted lexicographically by key. With serialized keys, it is not clear to me whether we will lose some locality for large sequential reads of records, such as reading keys with a certain prefix. I would suggest we revisit this after other major KSM work is finished. > Ozone: Encapsulate KSM metadata key into protobuf messages for better > (de)serialization > --- > > Key: HDFS-11918 > URL: https://issues.apache.org/jira/browse/HDFS-11918 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Critical > Attachments: HDFS-11918-HDFS-7240.001.patch > > > There are multiple types of keys stored in the KSM database > # Volume Key > # Bucket Key > # Object Key > # User Key > Currently they are represented as plain strings with some conventions, such as > # /volume > # /volume/bucket > # /volume/bucket/key > # $user > This approach makes it difficult to parse volume/bucket/keys from the KSM > database. Propose to encapsulate these types of keys into protobuf messages, > and take advantage of protobuf to serialize (deserialize) classes to byte > arrays (and vice versa). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
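The locality argument in the comment above can be made concrete: because LevelDB keeps keys in lexicographic order, plain-string keys that share a prefix are physically adjacent and a whole bucket can be read with one range scan. A small sketch using TreeMap as a stand-in for the sorted key space (not LevelDB itself):

```java
import java.util.SortedMap;
import java.util.TreeMap;

// TreeMap stands in for LevelDB's lexicographically sorted key space: all
// entries under one bucket share a string prefix, so a single half-open
// range scan [prefix, upper-bound) retrieves exactly that bucket's keys.
public class PrefixScanDemo {
    public static void main(String[] args) {
        SortedMap<String, String> db = new TreeMap<>();
        db.put("/vol1/bucketA/key1", "v1");
        db.put("/vol1/bucketA/key2", "v2");
        db.put("/vol1/bucketB/key1", "v3");
        db.put("/vol2/bucketA/key1", "v4");

        // appending a max character to the prefix gives an upper bound that
        // covers every key starting with the prefix and nothing beyond it
        String prefix = "/vol1/bucketA/";
        SortedMap<String, String> bucket = db.subMap(prefix, prefix + '\uffff');
        if (bucket.size() != 2)
            throw new AssertionError("expected 2 keys, got " + bucket.size());
        System.out.println("keys under " + prefix + ": " + bucket.keySet());
    }
}
```

Opaque serialized keys would generally not preserve this prefix adjacency, which is the locality loss the comment worries about.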
[jira] [Comment Edited] (HDFS-11797) BlockManager#createLocatedBlocks() can throw ArrayIndexOutOfBoundsException when corrupt replicas are inconsistent
[ https://issues.apache.org/jira/browse/HDFS-11797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042052#comment-16042052 ] Jing Zhao edited comment on HDFS-11797 at 6/8/17 1:29 AM: -- I have not checked the details, but is it related to HDFS-11445 (more specifically, this [comment|https://issues.apache.org/jira/browse/HDFS-11445?focusedCommentId=15898236&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15898236]) ? was (Author: jingzhao): I have not checked the details, but is it related to HDFS-11445 (more specifically, this [comment|https://issues.apache.org/jira/browse/HDFS-11445?focusedCommentId=15898236&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15898236]). > BlockManager#createLocatedBlocks() can throw ArrayIndexOutofBoundsException > when corrupt replicas are inconsistent > -- > > Key: HDFS-11797 > URL: https://issues.apache.org/jira/browse/HDFS-11797 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Critical > Attachments: HDFS-11797.001.patch > > > The calculation for {{numMachines}} can be too less (causing > ArrayIndexOutOfBoundsException) or too many (causing NPE (HDFS-9958)) if data > structures find inconsistent number of corrupt replicas. This was earlier > found related to failed storages. This JIRA tracks a change that works for > all possible cases of inconsistencies. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-11797) BlockManager#createLocatedBlocks() can throw ArrayIndexOutOfBoundsException when corrupt replicas are inconsistent
[ https://issues.apache.org/jira/browse/HDFS-11797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042052#comment-16042052 ] Jing Zhao edited comment on HDFS-11797 at 6/8/17 1:29 AM: -- I have not checked the details, but is it related to HDFS-11445 (more specifically, this [comment|https://issues.apache.org/jira/browse/HDFS-11445?focusedCommentId=15898236&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15898236]). was (Author: jingzhao): I have not checked the details, but is it related to HDFS-11445? > BlockManager#createLocatedBlocks() can throw ArrayIndexOutofBoundsException > when corrupt replicas are inconsistent > -- > > Key: HDFS-11797 > URL: https://issues.apache.org/jira/browse/HDFS-11797 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Critical > Attachments: HDFS-11797.001.patch > > > The calculation for {{numMachines}} can be too less (causing > ArrayIndexOutOfBoundsException) or too many (causing NPE (HDFS-9958)) if data > structures find inconsistent number of corrupt replicas. This was earlier > found related to failed storages. This JIRA tracks a change that works for > all possible cases of inconsistencies. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11797) BlockManager#createLocatedBlocks() can throw ArrayIndexOutOfBoundsException when corrupt replicas are inconsistent
[ https://issues.apache.org/jira/browse/HDFS-11797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042052#comment-16042052 ] Jing Zhao commented on HDFS-11797: -- I have not checked the details, but is it related to HDFS-11445? > BlockManager#createLocatedBlocks() can throw ArrayIndexOutofBoundsException > when corrupt replicas are inconsistent > -- > > Key: HDFS-11797 > URL: https://issues.apache.org/jira/browse/HDFS-11797 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla >Priority: Critical > Attachments: HDFS-11797.001.patch > > > The calculation for {{numMachines}} can be too less (causing > ArrayIndexOutOfBoundsException) or too many (causing NPE (HDFS-9958)) if data > structures find inconsistent number of corrupt replicas. This was earlier > found related to failed storages. This JIRA tracks a change that works for > all possible cases of inconsistencies. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042046#comment-16042046 ] Hadoop QA commented on HDFS-11851: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} 
| {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 31s{color} | {color:green} hadoop-hdfs-native-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 16m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HDFS-11851 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871963/HDFS-11851.005.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux d76918db1020 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5672ae7 | | Default Java | 1.8.0_131 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19829/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19829/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > getGlobalJNIEnv() may deadlock if exception is thrown > - > > Key: HDFS-11851 > URL: https://issues.apache.org/jira/browse/HDFS-11851 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 3.0.0-alpha4 >Reporter: Henry Robinson >Assignee: Sailesh Mukil >Priority: Blocker > Attachments: HDFS-11851.000.patch, HDFS-11851.001.patch, > HDFS-11851.002.patch, HDFS-11851.003.patch, HDFS-11851.004.patch, > HDFS-11851.005.patch > > > HDFS-11529 introduced a deadlock into {{getGlobalJNIEnv()}} if an exception > is thrown. 
{{getGlobalJNIEnv()}} holds {{jvmMutex}}, but > {{printExceptionAndFree()}} will eventually try to acquire that lock in > {{setTLSExceptionStrings()}}. > The exception might get caught from {{loadFileSystems}}: > {code} > jthr = invokeMethod(env, NULL, STATIC, NULL, > "org/apache/hadoop/fs/FileSystem", > "loadFileSystems", "()V"); > if (jthr) { > printExceptionAndFree(env, jthr, PRINT_EXC_ALL, > "loadFileSystems"); > } > } > {code} > and here's the relevant parts of the stack trace from where I call this API > in Impala, which uses {{libhdfs}}: > {code} > #0 __lll_lock_wait () at > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 > #1
[jira] [Resolved] (HDFS-11940) Throw a NoSuchMethodError exception when testing TestDFSPacket
[ https://issues.apache.org/jira/browse/HDFS-11940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] legend resolved HDFS-11940. --- Resolution: Auto Closed > Throw an NoSuchMethodError exception when testing TestDFSPacket > > > Key: HDFS-11940 > URL: https://issues.apache.org/jira/browse/HDFS-11940 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 3.0.0-alpha3 > Environment: org.apache.maven.surefire 2.17 > jdk 1.8 >Reporter: legend > > Throw an exception when I run TestDFSPacket. Details are listed below. > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on > project hadoop-hdfs-client: There are test failures. > [ERROR] > [ERROR] Please refer to > /home/hadoop/GitHub/hadoop/hadoop-hdfs-project/hadoop-hdfs-client/target/surefire-reports > for the individual test results. > [ERROR] -> [Help 1] > org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute > goal org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) > on project hadoop-hdfs-client: There are test failures. > Please refer to > /home/hadoop/GitHub/hadoop/hadoop-hdfs-project/hadoop-hdfs-client/target/surefire-reports > for the individual test results. 
> at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212) > at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) > at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) > at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116) > at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80) > at > org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51) > at > org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128) > at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307) > at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193) > at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106) > at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863) > at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288) > at org.apache.maven.cli.MavenCli.main(MavenCli.java:199) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289) > at > org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229) > at > org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415) > at > org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356) > Caused by: org.apache.maven.plugin.MojoFailureException: There are test > failures. > Please refer to > /home/hadoop/GitHub/hadoop/hadoop-hdfs-project/hadoop-hdfs-client/target/surefire-reports > for the individual test results. 
> at > org.apache.maven.plugin.surefire.SurefireHelper.reportExecution(SurefireHelper.java:82) > at > org.apache.maven.plugin.surefire.SurefirePlugin.handleSummary(SurefirePlugin.java:195) > at > org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:861) > at > org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:729) > at > org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134) > at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207) > ... 20 more -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10865) DatanodeManager adds nodes twice to NetworkTopology
[ https://issues.apache.org/jira/browse/HDFS-10865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042028#comment-16042028 ] Hadoop QA commented on HDFS-10865: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | 
{color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 34s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 90m 34s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HDFS-10865 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828770/HDFS-10865.000.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 9c7d860b20ca 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5672ae7 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/19828/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19828/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19828/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Datanodemanager adds nodes twice to NetworkTopology > --- > > Key: HDFS-10865 > URL: https://issues.apache.org/jira/browse/HDFS-10865 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Ver
[jira] [Commented] (HDFS-11943) Warn log frequently printed to screen in doEncode function of AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042018#comment-16042018 ] Kai Zheng commented on HDFS-11943: -- I think the message itself is very clear about what it means. It complains in {{doEncode/doDecode}} regarding the input/output buffers each time. As the call is so frequent I guess we can just remind at the first occurrence time? Is there any method call stack so we can have an idea why/where on-heap bytebuffers were passed to the native coder? It does affect performance. > Warn log frequently print to screen in doEncode function on > AbstractNativeRawEncoder class > --- > > Key: HDFS-11943 > URL: https://issues.apache.org/jira/browse/HDFS-11943 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, native >Affects Versions: 3.0.0-alpha4 > Environment: cluster: 3 nodes > os:(Red Hat 2.6.33.20, Red Hat 3.10.0-514.6.1.el7.x86_64, > Ubuntu4.4.0-31-generic) > hadoop version: hadoop-3.0.0-alpha4 > erasure coding: XOR-2-1-64k and enabled Intel ISA-L > hadoop fs -put file / >Reporter: liaoyuxiangqin >Assignee: liaoyuxiangqin >Priority: Minor > Attachments: HDFS-11943.patch > > Original Estimate: 0.05h > Remaining Estimate: 0.05h > > when i write file to hdfs on above environment, the hdfs client frequently > print warn log of use direct ByteBuffer inputs/outputs in doEncode function > to screen, detail information as follows: > 2017-06-07 15:20:42,856 WARN rawcoder.AbstractNativeRawEncoder: > convertToByteBufferState is invoked, not efficiently. Please use direct > ByteBuffer inputs/outputs -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
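Kai Zheng's suggestion to "remind at the first occurrence" is a standard log-once pattern: flip a flag atomically the first time the slow path is hit and skip the logger on every later call. A minimal thread-safe sketch (names are illustrative, not the actual AbstractNativeRawEncoder code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class WarnOnce {
    private static final AtomicBoolean warned = new AtomicBoolean(false);
    static int warnCount = 0;  // stands in for the logger in this sketch

    // Emit the warning only on the first occurrence; compareAndSet ensures
    // exactly one thread wins even under concurrent encode calls.
    static void warnConvertToByteBufferState() {
        if (warned.compareAndSet(false, true)) {
            // e.g. LOG.warn("convertToByteBufferState is invoked, not efficiently...")
            warnCount++;
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            warnConvertToByteBufferState();  // hot path: 999 calls log nothing
        }
        System.out.println(warnCount);
    }
}
```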
[jira] [Commented] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042017#comment-16042017 ] Sailesh Mukil commented on HDFS-11851: -- [~jzhuge] Thanks! Oddly enough, it didn't throw that warning locally. I incorrectly assumed it may have been a portability issue. I made that change in the latest patch. > getGlobalJNIEnv() may deadlock if exception is thrown > - > > Key: HDFS-11851 > URL: https://issues.apache.org/jira/browse/HDFS-11851 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 3.0.0-alpha4 >Reporter: Henry Robinson >Assignee: Sailesh Mukil >Priority: Blocker > Attachments: HDFS-11851.000.patch, HDFS-11851.001.patch, > HDFS-11851.002.patch, HDFS-11851.003.patch, HDFS-11851.004.patch, > HDFS-11851.005.patch > > > HDFS-11529 introduced a deadlock into {{getGlobalJNIEnv()}} if an exception > is thrown. {{getGlobalJNIEnv()}} holds {{jvmMutex}}, but > {{printExceptionAndFree()}} will eventually try to acquire that lock in > {{setTLSExceptionStrings()}}. 
> The exception might get caught from {{loadFileSystems}}: > {code} > jthr = invokeMethod(env, NULL, STATIC, NULL, > "org/apache/hadoop/fs/FileSystem", > "loadFileSystems", "()V"); > if (jthr) { > printExceptionAndFree(env, jthr, PRINT_EXC_ALL, > "loadFileSystems"); > } > } > {code} > and here's the relevant parts of the stack trace from where I call this API > in Impala, which uses {{libhdfs}}: > {code} > #0 __lll_lock_wait () at > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 > #1 0x74a8d657 in _L_lock_909 () from > /lib/x86_64-linux-gnu/libpthread.so.0 > #2 0x74a8d480 in __GI___pthread_mutex_lock (mutex=0x47ce960 > ) at ../nptl/pthread_mutex_lock.c:79 > #3 0x02f06056 in mutexLock (m=) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/mutexes.c:28 > #4 0x02efe817 in setTLSExceptionStrings (rootCause=0x0, > stackTrace=0x0) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:581 > #5 0x02f065d7 in printExceptionAndFreeV (env=0x513c1e8, > exc=0x508a8c0, noPrintFlags=, fmt=0x34349cf "loadFileSystems", > ap=0x7fffb660) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:183 > #6 0x02f0683d in printExceptionAndFree (env=, > exc=, noPrintFlags=, fmt=) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:213 > #7 0x02eff60f in getGlobalJNIEnv () at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:463 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sailesh Mukil updated HDFS-11851: - Attachment: HDFS-11851.005.patch > getGlobalJNIEnv() may deadlock if exception is thrown > - > > Key: HDFS-11851 > URL: https://issues.apache.org/jira/browse/HDFS-11851 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 3.0.0-alpha4 >Reporter: Henry Robinson >Assignee: Sailesh Mukil >Priority: Blocker > Attachments: HDFS-11851.000.patch, HDFS-11851.001.patch, > HDFS-11851.002.patch, HDFS-11851.003.patch, HDFS-11851.004.patch, > HDFS-11851.005.patch > > > HDFS-11529 introduced a deadlock into {{getGlobalJNIEnv()}} if an exception > is thrown. {{getGlobalJNIEnv()}} holds {{jvmMutex}}, but > {{printExceptionAndFree()}} will eventually try to acquire that lock in > {{setTLSExceptionStrings()}}. > The exception might get caught from {{loadFileSystems}}: > {code} > jthr = invokeMethod(env, NULL, STATIC, NULL, > "org/apache/hadoop/fs/FileSystem", > "loadFileSystems", "()V"); > if (jthr) { > printExceptionAndFree(env, jthr, PRINT_EXC_ALL, > "loadFileSystems"); > } > } > {code} > and here's the relevant parts of the stack trace from where I call this API > in Impala, which uses {{libhdfs}}: > {code} > #0 __lll_lock_wait () at > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 > #1 0x74a8d657 in _L_lock_909 () from > /lib/x86_64-linux-gnu/libpthread.so.0 > #2 0x74a8d480 in __GI___pthread_mutex_lock (mutex=0x47ce960 > ) at ../nptl/pthread_mutex_lock.c:79 > #3 0x02f06056 in mutexLock (m=) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/mutexes.c:28 > #4 0x02efe817 in setTLSExceptionStrings (rootCause=0x0, > stackTrace=0x0) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:581 > #5 0x02f065d7 in printExceptionAndFreeV 
(env=0x513c1e8, > exc=0x508a8c0, noPrintFlags=, fmt=0x34349cf "loadFileSystems", > ap=0x7fffb660) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:183 > #6 0x02f0683d in printExceptionAndFree (env=, > exc=, noPrintFlags=, fmt=) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:213 > #7 0x02eff60f in getGlobalJNIEnv () at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:463 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sailesh Mukil updated HDFS-11851: - Status: Patch Available (was: Open) > getGlobalJNIEnv() may deadlock if exception is thrown > - > > Key: HDFS-11851 > URL: https://issues.apache.org/jira/browse/HDFS-11851 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 3.0.0-alpha4 >Reporter: Henry Robinson >Assignee: Sailesh Mukil >Priority: Blocker > Attachments: HDFS-11851.000.patch, HDFS-11851.001.patch, > HDFS-11851.002.patch, HDFS-11851.003.patch, HDFS-11851.004.patch, > HDFS-11851.005.patch > > > HDFS-11529 introduced a deadlock into {{getGlobalJNIEnv()}} if an exception > is thrown. {{getGlobalJNIEnv()}} holds {{jvmMutex}}, but > {{printExceptionAndFree()}} will eventually try to acquire that lock in > {{setTLSExceptionStrings()}}. > The exception might get caught from {{loadFileSystems}}: > {code} > jthr = invokeMethod(env, NULL, STATIC, NULL, > "org/apache/hadoop/fs/FileSystem", > "loadFileSystems", "()V"); > if (jthr) { > printExceptionAndFree(env, jthr, PRINT_EXC_ALL, > "loadFileSystems"); > } > } > {code} > and here's the relevant parts of the stack trace from where I call this API > in Impala, which uses {{libhdfs}}: > {code} > #0 __lll_lock_wait () at > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 > #1 0x74a8d657 in _L_lock_909 () from > /lib/x86_64-linux-gnu/libpthread.so.0 > #2 0x74a8d480 in __GI___pthread_mutex_lock (mutex=0x47ce960 > ) at ../nptl/pthread_mutex_lock.c:79 > #3 0x02f06056 in mutexLock (m=) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/mutexes.c:28 > #4 0x02efe817 in setTLSExceptionStrings (rootCause=0x0, > stackTrace=0x0) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:581 > #5 0x02f065d7 in printExceptionAndFreeV 
(env=0x513c1e8, > exc=0x508a8c0, noPrintFlags=, fmt=0x34349cf "loadFileSystems", > ap=0x7fffb660) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:183 > #6 0x02f0683d in printExceptionAndFree (env=, > exc=, noPrintFlags=, fmt=) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:213 > #7 0x02eff60f in getGlobalJNIEnv () at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:463 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041940#comment-16041940 ] John Zhuge commented on HDFS-11851: --- [~sailesh] The "cc" failure for patch 003 is caused by a typo in my sample code. The type of {{jvmMutexAttr}} should be {{pthread_mutexattr_t}} not {{mutex}}: {code} pthread_mutexattr_t jvmMutexAttr; {code} > getGlobalJNIEnv() may deadlock if exception is thrown > - > > Key: HDFS-11851 > URL: https://issues.apache.org/jira/browse/HDFS-11851 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 3.0.0-alpha4 >Reporter: Henry Robinson >Assignee: Sailesh Mukil >Priority: Blocker > Attachments: HDFS-11851.000.patch, HDFS-11851.001.patch, > HDFS-11851.002.patch, HDFS-11851.003.patch, HDFS-11851.004.patch > > > HDFS-11529 introduced a deadlock into {{getGlobalJNIEnv()}} if an exception > is thrown. {{getGlobalJNIEnv()}} holds {{jvmMutex}}, but > {{printExceptionAndFree()}} will eventually try to acquire that lock in > {{setTLSExceptionStrings()}}. 
> The exception might get caught from {{loadFileSystems}}: > {code} > jthr = invokeMethod(env, NULL, STATIC, NULL, > "org/apache/hadoop/fs/FileSystem", > "loadFileSystems", "()V"); > if (jthr) { > printExceptionAndFree(env, jthr, PRINT_EXC_ALL, > "loadFileSystems"); > } > } > {code} > and here's the relevant parts of the stack trace from where I call this API > in Impala, which uses {{libhdfs}}: > {code} > #0 __lll_lock_wait () at > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 > #1 0x74a8d657 in _L_lock_909 () from > /lib/x86_64-linux-gnu/libpthread.so.0 > #2 0x74a8d480 in __GI___pthread_mutex_lock (mutex=0x47ce960 > ) at ../nptl/pthread_mutex_lock.c:79 > #3 0x02f06056 in mutexLock (m=) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/mutexes.c:28 > #4 0x02efe817 in setTLSExceptionStrings (rootCause=0x0, > stackTrace=0x0) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:581 > #5 0x02f065d7 in printExceptionAndFreeV (env=0x513c1e8, > exc=0x508a8c0, noPrintFlags=, fmt=0x34349cf "loadFileSystems", > ap=0x7fffb660) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:183 > #6 0x02f0683d in printExceptionAndFree (env=, > exc=, noPrintFlags=, fmt=) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:213 > #7 0x02eff60f in getGlobalJNIEnv () at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:463 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
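The fix direction discussed in this thread, John Zhuge's sample making {{jvmMutex}} recursive via a {{pthread_mutexattr_t}} with {{PTHREAD_MUTEX_RECURSIVE}}, works because a recursive mutex lets the owning thread relock without deadlocking against itself. The same property can be illustrated in Java with {{ReentrantLock}} (a sketch of the lock re-entry, not the libhdfs code):

```java
import java.util.concurrent.locks.ReentrantLock;

public class ReentrantDemo {
    static final ReentrantLock lock = new ReentrantLock();

    // Models printExceptionAndFree()/setTLSExceptionStrings(): runs on the
    // error path while the caller still holds the lock, and needs the lock.
    static boolean reportError() {
        // With a non-reentrant mutex this acquire is where the hang occurs;
        // tryLock succeeds here because ReentrantLock allows re-entry by
        // the owning thread.
        boolean acquired = lock.tryLock();
        if (acquired) {
            lock.unlock();
        }
        return acquired;
    }

    // Models getGlobalJNIEnv(): takes the lock, hits an error, reports it
    // before releasing.
    static boolean envWithError() {
        lock.lock();
        try {
            return reportError();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(envWithError());
    }
}
```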
[jira] [Updated] (HDFS-10865) Datanodemanager adds nodes twice to NetworkTopology
[ https://issues.apache.org/jira/browse/HDFS-10865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Inigo Goiri updated HDFS-10865: --- Status: Patch Available (was: In Progress) > Datanodemanager adds nodes twice to NetworkTopology > --- > > Key: HDFS-10865 > URL: https://issues.apache.org/jira/browse/HDFS-10865 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.7.3 >Reporter: Inigo Goiri >Assignee: Inigo Goiri > Attachments: HDFS-10865.000.patch > > > {{DatanodeManager}} tries to add datanodes to the {{NetworkTopology}} twice > in {{registerDatanode()}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11861) ipc.Client.Connection#sendRpcRequest should log request name
[ https://issues.apache.org/jira/browse/HDFS-11861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041873#comment-16041873 ] Hudson commented on HDFS-11861: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11843 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11843/]) HDFS-11861. ipc.Client.Connection#sendRpcRequest should log request (jzhuge: rev 5672ae7b37ce75086a1cb5bb9a388288fc913eb7) * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java > ipc.Client.Connection#sendRpcRequest should log request name > > > Key: HDFS-11861 > URL: https://issues.apache.org/jira/browse/HDFS-11861 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ipc >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > Labels: supportability > Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2 > > Attachments: HDFS-11861.001.patch > > > {{ipc.Client.Connection#sendRpcRequest}} only logs the call id. > {code} > if (LOG.isDebugEnabled()) > LOG.debug(getName() + " sending #" + call.id); > {code} > It'd be much more helpful to log request name for several benefits: > * Find out which requests sent to which target > * Correlate with the debug log in {{ipc.Server.Handler}} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11861) ipc.Client.Connection#sendRpcRequest should log request name
[ https://issues.apache.org/jira/browse/HDFS-11861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-11861: -- Resolution: Fixed Fix Version/s: 2.8.2 3.0.0-alpha4 2.9.0 Status: Resolved (was: Patch Available) Committed to trunk, branch-2, and branch-2.8. Thanks [~xiaochen] for the review. > ipc.Client.Connection#sendRpcRequest should log request name > > > Key: HDFS-11861 > URL: https://issues.apache.org/jira/browse/HDFS-11861 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ipc >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Trivial > Labels: supportability > Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2 > > Attachments: HDFS-11861.001.patch > > > {{ipc.Client.Connection#sendRpcRequest}} only logs the call id. > {code} > if (LOG.isDebugEnabled()) > LOG.debug(getName() + " sending #" + call.id); > {code} > It'd be much more helpful to log the request name for several benefits: > * Find out which requests were sent to which target > * Correlate with the debug log in {{ipc.Server.Handler}} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
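The shape of the improvement is easy to sketch. The names below ({{Call}}, {{rpcRequest}}) mirror {{ipc.Client}} but are simplified stand-ins, so treat this as an illustration of the logging change rather than the committed diff:

```java
// Illustrative sketch: append the request itself to the existing
// "sending #<id>" debug line so client and server logs can be correlated.
// Call/rpcRequest are simplified stand-ins for the real ipc.Client types.
public class SendRpcLogSketch {

    static final class Call {
        final int id;
        final Object rpcRequest;  // in ipc.Client this is a Writable
        Call(int id, Object rpcRequest) {
            this.id = id;
            this.rpcRequest = rpcRequest;
        }
    }

    // before: connectionName + " sending #" + call.id
    // after:  also include the request, whose toString names the RPC
    static String debugLine(String connectionName, Call call) {
        return connectionName + " sending #" + call.id + " " + call.rpcRequest;
    }

    public static void main(String[] args) {
        Call call = new Call(42, "getFileInfo");
        System.out.println(debugLine("IPC Client connection to nn1:8020", call));
    }
}
```

With a line like this, grepping the call id or the request name on either side of the connection finds the matching entry in the other log.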
[jira] [Commented] (HDFS-11303) Hedged read might hang infinitely if read data from all DN failed
[ https://issues.apache.org/jira/browse/HDFS-11303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041806#comment-16041806 ] John Zhuge commented on HDFS-11303: --- [~zhangchen] Thanks for the contribution. Could you please fix the checkstyle error and investigate whether the unit tests are related? > Hedged read might hang infinitely if read data from all DN failed > -- > > Key: HDFS-11303 > URL: https://issues.apache.org/jira/browse/HDFS-11303 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 3.0.0-alpha1 >Reporter: Chen Zhang >Assignee: Chen Zhang > Attachments: HDFS-11303-001.patch, HDFS-11303-001.patch > > > Hedged read first reads from one DN; if that read times out, it then reads from other DNs > simultaneously. > If reads from all DNs fail, this bug leaves the future list non-empty (the > first timed-out request is left in the list), and the client hangs in the loop infinitely -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
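The hang described in the issue is a generic pitfall with hedged requests. A self-contained toy model (not {{DFSInputStream}}'s actual code) shows the bookkeeping that avoids it: every completed future, failed or not, must be removed from the pending list so the drain loop can terminate:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class HedgedReadSketch {

    /** Returns the first successful result, or throws if every read failed. */
    public static int firstSuccessful(ExecutorService pool,
                                      List<Callable<Integer>> reads)
            throws InterruptedException {
        CompletionService<Integer> cs = new ExecutorCompletionService<>(pool);
        List<Future<Integer>> pending = new ArrayList<>();
        for (Callable<Integer> read : reads) {
            pending.add(cs.submit(read));
        }
        // The loop is bounded by the pending list: every completion, success
        // or failure, is removed. Leaving a failed future in the list (the
        // bug described above) would make this loop spin forever.
        while (!pending.isEmpty()) {
            Future<Integer> done = cs.take();  // next completed read
            pending.remove(done);
            try {
                return done.get();             // first success wins
            } catch (ExecutionException e) {
                // this replica failed; keep waiting on the others
            }
        }
        throw new IllegalStateException("read from all DNs failed");
    }
}
```

The point of the sketch is the exit condition: when all reads fail, the list drains to empty and the method throws, instead of hanging on a future nobody will ever complete.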
[jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
[ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041804#comment-16041804 ] Andrew Wang commented on HDFS-11882: Thanks for working on this Akira. I looked at this part of the code again and there are things that could be improved, and that I don't understand. {code} /** * Get the number of acked stripes. An acked stripe means at least data block * number size cells of the stripe were acked. */ private long getNumAckedStripes() { {code} Although it says that "at least data block number size cells", the method doesn't check this. I think this is okay though since the callers validate that there are enough healthy streamers, but the javadoc minimally should be updated, or additional checks added. [Walter's intent though was to only count full stripes|https://issues.apache.org/jira/browse/HDFS-9342?focusedCommentId=15072472&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15072472], but I don't understand why this is correct. When closing a file at a non-stripe boundary, the last stripe is necessarily not a full stripe. In this situation, shouldn't the length be (as getNumAckedStripes alludes) the stripe that has minimally numDataBlocks cells? What's the effect of truncating the file length to the last full stripe? Does this truncate the file? When updatePipeline is called while closing the file, I don't see any logic to rewrite that last partial stripe. I'm hoping some of these questions can be investigated via unit test. Ping [~walter.k.su] / [~zhz] for inputs. > Client fails if acknowledged size is greater than bytes sent > > > Key: HDFS-11882 > URL: https://issues.apache.org/jira/browse/HDFS-11882 > Project: Hadoop HDFS > Issue Type: Bug > Components: erasure-coding, test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: HDFS-11882.01.patch > > > Some tests of erasure coding fails by the following exception. 
The following > test was removed by HDFS-11823, however, this type of error can happen in > real cluster. > {noformat} > Running > org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure > Tests run: 14, Failures: 0, Errors: 1, Skipped: 10, Time elapsed: 89.086 sec > <<< FAILURE! - in > org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure > testMultipleDatanodeFailure56(org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure) > Time elapsed: 38.831 sec <<< ERROR! > java.lang.IllegalStateException: null > at > com.google.common.base.Preconditions.checkState(Preconditions.java:129) > at > org.apache.hadoop.hdfs.DFSStripedOutputStream.updatePipeline(DFSStripedOutputStream.java:780) > at > org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:664) > at > org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1034) > at > org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at > org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at > org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:472) > at > org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTestWithMultipleFailure(TestDFSStripedOutputStreamWithFailure.java:381) > at > org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56(TestDFSStripedOutputStreamWithFailure.java:245) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apach
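Andrew's question about {{getNumAckedStripes()}} can be made concrete with a toy model of the semantics the javadoc states (each streamer writes one cell per stripe; a stripe counts as acked once at least {{numDataBlocks}} of its cells are acked). This is an illustration of the definition only, not the {{DFSStripedOutputStream}} implementation:

```java
import java.util.Arrays;

public class AckedStripesSketch {

    /**
     * Toy model: ackedCells[i] is how many cells streamer i has had acked,
     * one cell per stripe. Stripe s is acked iff at least numDataBlocks
     * streamers have ackedCells[i] > s, so the number of acked stripes is
     * the numDataBlocks-th largest per-streamer count.
     */
    public static long numAckedStripes(long[] ackedCells, int numDataBlocks) {
        long[] sorted = ackedCells.clone();
        Arrays.sort(sorted);  // ascending
        return sorted[sorted.length - numDataBlocks];
    }
}
```

For example, with per-streamer acked counts {5, 5, 4, 3} and numDataBlocks = 3, stripes 0 through 3 each have at least three acked cells, so four stripes count as acked; the partial-stripe question above is exactly whether a close at a non-stripe boundary may legitimately extend the file length into stripe 4.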
[jira] [Updated] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sailesh Mukil updated HDFS-11851: - Attachment: HDFS-11851.004.patch > getGlobalJNIEnv() may deadlock if exception is thrown > - > > Key: HDFS-11851 > URL: https://issues.apache.org/jira/browse/HDFS-11851 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 3.0.0-alpha4 >Reporter: Henry Robinson >Assignee: Sailesh Mukil >Priority: Blocker > Attachments: HDFS-11851.000.patch, HDFS-11851.001.patch, > HDFS-11851.002.patch, HDFS-11851.003.patch, HDFS-11851.004.patch > > > HDFS-11529 introduced a deadlock into {{getGlobalJNIEnv()}} if an exception > is thrown. {{getGlobalJNIEnv()}} holds {{jvmMutex}}, but > {{printExceptionAndFree()}} will eventually try to acquire that lock in > {{setTLSExceptionStrings()}}. > The exception might get caught from {{loadFileSystems}}: > {code} > jthr = invokeMethod(env, NULL, STATIC, NULL, > "org/apache/hadoop/fs/FileSystem", > "loadFileSystems", "()V"); > if (jthr) { > printExceptionAndFree(env, jthr, PRINT_EXC_ALL, > "loadFileSystems"); > } > } > {code} > and here's the relevant parts of the stack trace from where I call this API > in Impala, which uses {{libhdfs}}: > {code} > #0 __lll_lock_wait () at > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 > #1 0x74a8d657 in _L_lock_909 () from > /lib/x86_64-linux-gnu/libpthread.so.0 > #2 0x74a8d480 in __GI___pthread_mutex_lock (mutex=0x47ce960 > ) at ../nptl/pthread_mutex_lock.c:79 > #3 0x02f06056 in mutexLock (m=) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/mutexes.c:28 > #4 0x02efe817 in setTLSExceptionStrings (rootCause=0x0, > stackTrace=0x0) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:581 > #5 0x02f065d7 in printExceptionAndFreeV (env=0x513c1e8, > exc=0x508a8c0, 
noPrintFlags=, fmt=0x34349cf "loadFileSystems", > ap=0x7fffb660) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:183 > #6 0x02f0683d in printExceptionAndFree (env=, > exc=, noPrintFlags=, fmt=) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:213 > #7 0x02eff60f in getGlobalJNIEnv () at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:463 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sailesh Mukil updated HDFS-11851: - Status: Open (was: Patch Available) > getGlobalJNIEnv() may deadlock if exception is thrown > - > > Key: HDFS-11851 > URL: https://issues.apache.org/jira/browse/HDFS-11851 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 3.0.0-alpha4 >Reporter: Henry Robinson >Assignee: Sailesh Mukil >Priority: Blocker > Attachments: HDFS-11851.000.patch, HDFS-11851.001.patch, > HDFS-11851.002.patch, HDFS-11851.003.patch, HDFS-11851.004.patch > > > HDFS-11529 introduced a deadlock into {{getGlobalJNIEnv()}} if an exception > is thrown. {{getGlobalJNIEnv()}} holds {{jvmMutex}}, but > {{printExceptionAndFree()}} will eventually try to acquire that lock in > {{setTLSExceptionStrings()}}. > The exception might get caught from {{loadFileSystems}}: > {code} > jthr = invokeMethod(env, NULL, STATIC, NULL, > "org/apache/hadoop/fs/FileSystem", > "loadFileSystems", "()V"); > if (jthr) { > printExceptionAndFree(env, jthr, PRINT_EXC_ALL, > "loadFileSystems"); > } > } > {code} > and here's the relevant parts of the stack trace from where I call this API > in Impala, which uses {{libhdfs}}: > {code} > #0 __lll_lock_wait () at > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 > #1 0x74a8d657 in _L_lock_909 () from > /lib/x86_64-linux-gnu/libpthread.so.0 > #2 0x74a8d480 in __GI___pthread_mutex_lock (mutex=0x47ce960 > ) at ../nptl/pthread_mutex_lock.c:79 > #3 0x02f06056 in mutexLock (m=) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/mutexes.c:28 > #4 0x02efe817 in setTLSExceptionStrings (rootCause=0x0, > stackTrace=0x0) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:581 > #5 0x02f065d7 in printExceptionAndFreeV (env=0x513c1e8, > 
exc=0x508a8c0, noPrintFlags=, fmt=0x34349cf "loadFileSystems", > ap=0x7fffb660) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:183 > #6 0x02f0683d in printExceptionAndFree (env=, > exc=, noPrintFlags=, fmt=) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:213 > #7 0x02eff60f in getGlobalJNIEnv () at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:463 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11647) Add -E option in hdfs "count" command to show erasure policy summarization
[ https://issues.apache.org/jira/browse/HDFS-11647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041757#comment-16041757 ] Lei (Eddy) Xu commented on HDFS-11647: -- Hey, [~luhuichun] Thanks a lot for working on this. Looks good overall. Some minor comments: * Should {{ContentSummary#ecPolicy}} have a default value? Does the user expect {{getErasureCodingPolicy}} to return {{null}}? * {code} getErasureCodingPolicy() == right.getErasureCodingPolicy() && {code} Because the ec policy is a string, we should check null && {{.equals()}} here. Also, please fix the indent of this line and in {{hashCode}}. * In {{Count.java}}, please add spaces on both sides of symbols like "+" and "=". You could take a look at the [hadoop code style|https://wiki.apache.org/hadoop/CodeReviewChecklist] for reference. * We can use StringBuilder in {{processOptions}}. * {{hdfs.proto}}, why don't you name {{optional string redundancyPolicy = 13;}} as {{ecPolicy}} as used everywhere else? It'd be nice to keep them consistent. * {{ContentSummaryComputationContext#REPLICATED}} please change the signature to {{public static final String...}} for consistency. * {{public String getErasureCodingPolicyName(INode inode) }}, we might want to be consistent with the signature between modules. In some places, we call it {{ecPolicy()}}, in some places it is {{getErasureCodingPolicy}}. * {code} } catch (IOException ioe) { LOG.warn("Encountered error getting ec policy for " + inode.getFullPathName(), ioe); return ""; } {code} If the {{IOE}} is HDFS related, we should throw it instead of ignoring it here. {code} if (isStriped()) { String ecPolicyName = summary.getErasureCodingPolicyName(this); } {code} This has no side-effect on {{summary}}? Thanks! 
> Add -E option in hdfs "count" command to show erasure policy summarization > -- > > Key: HDFS-11647 > URL: https://issues.apache.org/jira/browse/HDFS-11647 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: SammiChen >Assignee: luhuichun > Labels: hdfs-ec-3.0-nice-to-have > Attachments: HADOOP-11647.patch > > > Add -E option in hdfs "count" command to show erasure policy summarization -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
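The null-safety point in the review above is worth spelling out: because the policy is a {{String}} that may be unset, comparing with {{==}} checks reference identity, and a bare {{.equals()}} can NPE. A sketch with simplified names (not the actual {{ContentSummary}} code) using {{java.util.Objects}}, which handles both concerns:

```java
import java.util.Objects;

// Simplified stand-in for ContentSummary: the ec policy field is a String
// that may be null when no erasure coding policy is set.
class EcSummarySketch {
    private final String ecPolicy;

    EcSummarySketch(String ecPolicy) {
        this.ecPolicy = ecPolicy;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EcSummarySketch)) return false;
        EcSummarySketch right = (EcSummarySketch) o;
        // Objects.equals is null-safe and compares contents, unlike '=='
        return Objects.equals(ecPolicy, right.ecPolicy);
    }

    @Override
    public int hashCode() {
        // Objects.hash tolerates a null field
        return Objects.hash(ecPolicy);
    }
}
```

{{Objects.equals(a, b)}} returns true when both are null and false when exactly one is, which is the behavior the review asks for in both {{equals}} and {{hashCode}}.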
[jira] [Updated] (HDFS-11880) Ozone: KSM: Remove protobuf formats from KSM wrappers
[ https://issues.apache.org/jira/browse/HDFS-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-11880: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-7240 Target Version/s: HDFS-7240 Status: Resolved (was: Patch Available) Thanks [~nandakumar131] for the contribution. I've committed the fix to the feature branch. > Ozone: KSM: Remove protobuf formats from KSM wrappers > - > > Key: HDFS-11880 > URL: https://issues.apache.org/jira/browse/HDFS-11880 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Nandakumar >Assignee: Nandakumar > Fix For: HDFS-7240 > > Attachments: HDFS-11880-HDFS-7240.000.patch, > HDFS-11880-HDFS-7240.001.patch > > > KSM wrappers like KsmBucketInfo and KsmBucketArgs are using protobuf formats > such as StorageTypeProto and OzoneAclInfo, this jira is to remove the > dependency and use {{StorageType}} and {{OzoneAcl}} instead. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11880) Ozone: KSM: Remove protobuf formats from KSM wrappers
[ https://issues.apache.org/jira/browse/HDFS-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-11880: -- Summary: Ozone: KSM: Remove protobuf formats from KSM wrappers (was: Ozone: KSM: Remove protobuf formats such as StorageTypeProto and OzoneAclInfo from KSM wrappers) > Ozone: KSM: Remove protobuf formats from KSM wrappers > - > > Key: HDFS-11880 > URL: https://issues.apache.org/jira/browse/HDFS-11880 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Nandakumar >Assignee: Nandakumar > Attachments: HDFS-11880-HDFS-7240.000.patch, > HDFS-11880-HDFS-7240.001.patch > > > KSM wrappers like KsmBucketInfo and KsmBucketArgs are using protobuf formats > such as StorageTypeProto and OzoneAclInfo, this jira is to remove the > dependency and use {{StorageType}} and {{OzoneAcl}} instead. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11880) Ozone: KSM: Remove protobuf formats such as StorageTypeProto and OzoneAclInfo from KSM wrappers
[ https://issues.apache.org/jira/browse/HDFS-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041714#comment-16041714 ] Xiaoyu Yao commented on HDFS-11880: --- Thanks [~nandakumar131] for the patch. The latest patch looks good to me. +1. Only one minor javadoc issue: KsmBucketInfo.java line 94/110, which I will update at commit time. I noticed that OzoneConsts.java is moved into hadoop-hdfs-client with this change. Some of the constants are used only by the KSM/SCM server. We might consider having an OzoneClientConsts.java for those client-only constants. We can do that in a follow-up JIRA. > Ozone: KSM: Remove protobuf formats such as StorageTypeProto and OzoneAclInfo > from KSM wrappers > --- > > Key: HDFS-11880 > URL: https://issues.apache.org/jira/browse/HDFS-11880 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Nandakumar >Assignee: Nandakumar > Attachments: HDFS-11880-HDFS-7240.000.patch, > HDFS-11880-HDFS-7240.001.patch > > > KSM wrappers like KsmBucketInfo and KsmBucketArgs are using protobuf formats > such as StorageTypeProto and OzoneAclInfo, this jira is to remove the > dependency and use {{StorageType}} and {{OzoneAcl}} instead. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11646) Add -E option in 'ls' to list erasure coding policy of each file and directory if applicable
[ https://issues.apache.org/jira/browse/HDFS-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041711#comment-16041711 ] Lei (Eddy) Xu commented on HDFS-11646: -- Hi, [~luhuichun] Thanks a lot for working on this. Some minor nits: * Could you limit the patch to the lines that are actually changed? A few diffs are due to re-formatting of the code. * In {{processPath}}, does {{lineFormat}} change due to the value of {{displayPolicy}}? Could you also add tests for both cases? Thanks! > Add -E option in 'ls' to list erasure coding policy of each file and > directory if applicable > > > Key: HDFS-11646 > URL: https://issues.apache.org/jira/browse/HDFS-11646 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Reporter: SammiChen >Assignee: luhuichun > Labels: hdfs-ec-3.0-nice-to-have > Attachments: HADOOP-11646.patch > > > Add -E option in "ls" to show erasure coding policy of file and directory, > leverage the "number_of_replicas " column. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11924) FSPermissionChecker.checkTraverse doesn't pass FsAction access properly
[ https://issues.apache.org/jira/browse/HDFS-11924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated HDFS-11924: --- Target Version/s: 2.8.2 Fix Version/s: (was: 2.8.1) Please leave the fix-version field alone for a committer to set it at commit time. Updating it myself for now. > FSPermissionChecker.checkTraverse doesn't pass FsAction access properly > --- > > Key: HDFS-11924 > URL: https://issues.apache.org/jira/browse/HDFS-11924 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.8.0 >Reporter: Zsombor Gegesy > Labels: hdfs, hdfspermission > Attachments: > 0001-HDFS-11924-Pass-FsAction-to-the-external-AccessContr.patch > > > In 2.7.1, during file access check, the AccessControlEnforcer is called with > the access parameter filled with FsAction values. > A thread dump in this case: > {code} > FSPermissionChecker.checkPermission(INodesInPath, boolean, FsAction, > FsAction, FsAction, FsAction, boolean) line: 189 > FSDirectory.checkPermission(FSPermissionChecker, INodesInPath, boolean, > FsAction, FsAction, FsAction, FsAction, boolean) line: 1698 > FSDirectory.checkPermission(FSPermissionChecker, INodesInPath, boolean, > FsAction, FsAction, FsAction, FsAction) line: 1682 > FSDirectory.checkPathAccess(FSPermissionChecker, INodesInPath, > FsAction) line: 1656 > FSNamesystem.appendFileInternal(FSPermissionChecker, INodesInPath, > String, String, boolean, boolean) line: 2668 > FSNamesystem.appendFileInt(String, String, String, boolean, boolean) > line: 2985 > FSNamesystem.appendFile(String, String, String, EnumSet, > boolean) line: 2952 > NameNodeRpcServer.append(String, String, EnumSetWritable) > line: 653 > ClientNamenodeProtocolServerSideTranslatorPB.append(RpcController, > ClientNamenodeProtocolProtos$AppendRequestProto) line: 421 > > ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(Descriptors$MethodDescriptor, > RpcController, Message) line: not available > 
ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(RPC$Server, String, > Writable, long) line: 616 > ProtobufRpcEngine$Server(RPC$Server).call(RPC$RpcKind, String, > Writable, long) line: 969 > Server$Handler$1.run() line: 2049 > Server$Handler$1.run() line: 2045 > AccessController.doPrivileged(PrivilegedExceptionAction, > AccessControlContext) line: not available [native method] > Subject.doAs(Subject, PrivilegedExceptionAction) line: 422 > UserGroupInformation.doAs(PrivilegedExceptionAction) line: 1657 > {code} > However, in 2.8.0 this value is changed to null, because > FSPermissionChecker.checkTraverse(FSPermissionChecker pc, INodesInPath iip, > boolean resolveLink) couldn't pass the required information, so it simply > uses 'null'. > This is a regression between 2.7.1 and 2.8.0, because an external > AccessControlEnforcer couldn't work properly -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041561#comment-16041561 ] Mingliang Liu commented on HDFS-11945: -- I'm +1 on the patch. Minor comments: # The {{internalLeaseHolder}} value should be concatenated with "_" instead of a space # The last test statement: {code} assertFalse(holder.equals(lm.getInternalLeaseHolder())); {code} Better to use: {code} assertNotEquals("some meaningful message", holder, lm.getInternalLeaseHolder()); {code} > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11945.trunk.patch > > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. > However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries not to be retried > for a long time. We have seen it happening for days. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041535#comment-16041535 ] Kihwal Lee commented on HDFS-11945: --- The failed tests all pass when I run them. {noformat} --- T E S T S --- Running org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 88.762 sec - in org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 170.511 sec - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 106.453 sec - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 Results : Tests run: 32, Failures: 0, Errors: 0, Skipped: 0 {noformat} > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11945.trunk.patch > > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. 
> However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries not to be retried > for a long time. We have seen it happening for days. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
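The starvation mechanism in the description can be modeled in a few lines. This toy model (hypothetical names, not the actual LeaseManager code) shows why a single shared namenode holder delays retries, and why a per-recovery holder, e.g. one with a timestamp suffix joined by "_" as suggested above, avoids it:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of lease renewal: a lease belongs to a holder string, and its
// hard-limit retry fires one hour after that holder's last renewal.
public class LeaseRetrySketch {
    static final long HARD_LIMIT_MS = 60L * 60 * 1000;  // 1 hour

    final Map<String, Long> lastRenewal = new HashMap<>();  // holder -> time

    void renew(String holder, long now) {
        lastRenewal.put(holder, now);
    }

    /** A recovery is retried once the holder's renewal is a hard limit old. */
    long nextRetryTime(String holder) {
        return lastRenewal.get(holder) + HARD_LIMIT_MS;
    }
}
```

With one shared holder, the namenode takes over file A's lease at t = 0 (retry due at t = 60 min); recovering file B at t = 50 min renews the same holder, pushing A's retry to t = 110 min, and so on indefinitely. With a distinct holder per recovery, renewing B's lease leaves A's retry time untouched.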
[jira] [Commented] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041482#comment-16041482 ] Hadoop QA commented on HDFS-11851: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 35s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 17s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-native-client generated 36 new + 0 unchanged - 0 fixed = 36 total (was 0) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | 
{color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 43s{color} | {color:green} hadoop-hdfs-native-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 20m 59s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HDFS-11851 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871882/HDFS-11851.003.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 2931ba9d027e 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 24181f5 | | Default Java | 1.8.0_131 | | cc | https://builds.apache.org/job/PreCommit-HDFS-Build/19827/artifact/patchprocess/diff-compile-cc-hadoop-hdfs-project_hadoop-hdfs-native-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19827/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19827/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. 
> getGlobalJNIEnv() may deadlock if exception is thrown > - > > Key: HDFS-11851 > URL: https://issues.apache.org/jira/browse/HDFS-11851 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 3.0.0-alpha4 >Reporter: Henry Robinson >Assignee: Sailesh Mukil >Priority: Blocker > Attachments: HDFS-11851.000.patch, HDFS-11851.001.patch, > HDFS-11851.002.patch, HDFS-11851.003.patch > > > HDFS-11529 introduced a deadlock into {{getGlobalJNIEnv()}} if an exception > is thrown. {{getGlobalJNIEnv()}} holds {{jvmMutex}}, but > {{printExceptionAndFree()}} will eventually try to acquire that lock in > {{setTLSExceptionStrings()}}. > The exception might get caught from {{loadFileSystems}}: > {code} > jthr = invokeMethod(env, NULL, STATIC, NULL, > "org/apache/hadoop/fs/FileSystem", > "loadFileSystems", "()V"); > if (jthr) { > printExceptionAndFree(env, jthr, PRINT_EXC_ALL, > "loadFileSystems"); > } > } > {code} > and here's the relevant parts of the stack trace from where I call this API in Impala, which uses {{libhdfs}}.
[jira] [Commented] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041475#comment-16041475 ] Hadoop QA commented on HDFS-11945: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
33s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 75 unchanged - 2 fixed = 75 total (was 77) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 31s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 89m 2s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HDFS-11945 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871868/HDFS-11945.trunk.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 3a0e209ce470 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 24181f5 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19826/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19826/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19826/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11945.trunk.patch
[jira] [Comment Edited] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041420#comment-16041420 ] Sailesh Mukil edited comment on HDFS-11851 at 6/7/17 7:02 PM: -- [~andrew.wang] [~jzhuge] Sorry this fell off my radar. Will post the updated patch shortly. Thanks for the review John. I went with the second method of using the static mutex initializer as it seems a lot more clean to do it that way. was (Author: sailesh): [~andrew.wang] [~jzhuge] Sorry this fell off my radar. Will post the updated patch shortly. Thanks for the review John. > getGlobalJNIEnv() may deadlock if exception is thrown > - > > Key: HDFS-11851 > URL: https://issues.apache.org/jira/browse/HDFS-11851 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 3.0.0-alpha4 >Reporter: Henry Robinson >Assignee: Sailesh Mukil >Priority: Blocker > Attachments: HDFS-11851.000.patch, HDFS-11851.001.patch, > HDFS-11851.002.patch, HDFS-11851.003.patch > > > HDFS-11529 introduced a deadlock into {{getGlobalJNIEnv()}} if an exception > is thrown. {{getGlobalJNIEnv()}} holds {{jvmMutex}}, but > {{printExceptionAndFree()}} will eventually try to acquire that lock in > {{setTLSExceptionStrings()}}. 
> The exception might get caught from {{loadFileSystems}}: > {code} > jthr = invokeMethod(env, NULL, STATIC, NULL, > "org/apache/hadoop/fs/FileSystem", > "loadFileSystems", "()V"); > if (jthr) { > printExceptionAndFree(env, jthr, PRINT_EXC_ALL, > "loadFileSystems"); > } > } > {code} > and here's the relevant parts of the stack trace from where I call this API > in Impala, which uses {{libhdfs}}: > {code} > #0 __lll_lock_wait () at > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 > #1 0x74a8d657 in _L_lock_909 () from > /lib/x86_64-linux-gnu/libpthread.so.0 > #2 0x74a8d480 in __GI___pthread_mutex_lock (mutex=0x47ce960 > ) at ../nptl/pthread_mutex_lock.c:79 > #3 0x02f06056 in mutexLock (m=) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/mutexes.c:28 > #4 0x02efe817 in setTLSExceptionStrings (rootCause=0x0, > stackTrace=0x0) at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:581 > #5 0x02f065d7 in printExceptionAndFreeV (env=0x513c1e8, > exc=0x508a8c0, noPrintFlags=, fmt=0x34349cf "loadFileSystems", > ap=0x7fffb660) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:183 > #6 0x02f0683d in printExceptionAndFree (env=, > exc=, noPrintFlags=, fmt=) > at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:213 > #7 0x02eff60f in getGlobalJNIEnv () at > /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:463 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sailesh Mukil updated HDFS-11851: - Status: Open (was: Patch Available)
[jira] [Updated] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sailesh Mukil updated HDFS-11851: - Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sailesh Mukil updated HDFS-11851: - Attachment: HDFS-11851.003.patch
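The attached patch itself is not quoted in this thread, so the exact fix is not shown here; but the "static mutex initializer" idiom the review comments refer to is the POSIX `PTHREAD_MUTEX_INITIALIZER`, which makes a mutex valid from program start with no init call and therefore no initialization ordering or racing to worry about. A generic sketch of the idiom (hypothetical names, not the patch):

```c
#include <pthread.h>

/* No pthread_mutex_init()/pthread_mutex_destroy() pair needed: a statically
 * initialized mutex is usable from any code path from program start, which
 * removes init-ordering and init-race concerns for global locks. */
static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;
static int counter;

static int bump_counter(void) {
    pthread_mutex_lock(&counter_mutex);
    int v = ++counter;  /* the critical section guarded by the static mutex */
    pthread_mutex_unlock(&counter_mutex);
    return v;
}
```

Whether HDFS-11851.003.patch applies the idiom exactly this way can only be confirmed by reading the patch.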
[jira] [Commented] (HDFS-11851) getGlobalJNIEnv() may deadlock if exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041420#comment-16041420 ] Sailesh Mukil commented on HDFS-11851: -- [~andrew.wang] [~jzhuge] Sorry this fell off my radar. Will post the updated patch shortly. Thanks for the review John.
[jira] [Comment Edited] (HDFS-11711) DN should not delete the block On "Too many open files" Exception
[ https://issues.apache.org/jira/browse/HDFS-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041353#comment-16041353 ] Wei-Chiu Chuang edited comment on HDFS-11711 at 6/7/17 6:37 PM: [~brahmareddy] sorry i didn't make myself clear. To begin with, this behavior was caused by HDFS-8492, which throws FileNotFoundException("BlockId " + blockId + " is not valid."). I was just thinking that "Too many open files" error is thrown within Java library, so there's no guarantee this would be compatible between different operating systems, or across different Java versions, or different JVM/JDK implementation. IMHO, the more compatible approach would be that we check if FNFE has "BlockId " + blockId + " is not valid.", and only delete the block when that's the case. Edit: HDFS-3100 throws FileNotFoundException("Meta-data not found for " + block) when meta file checksum is not found. So this should be checked as well. Or, it should just throw a new type of exception in these two cases. was (Author: jojochuang): [~brahmareddy] sorry i didn't make myself clear. To begin with, this behavior was caused by HDFS-8492, which throws FileNotFoundException("BlockId " + blockId + " is not valid."). I was just thinking that "Too many open files" error is thrown within Java library, so there's no guarantee this would be compatible between different operating systems, or across different Java versions, or different JVM/JDK implementation. IMHO, the more compatible approach would be that we check if FNFE has "BlockId " + blockId + " is not valid.", and only delete the block when that's the case. Edit: HDFS-3100 throws FileNotFoundException("Meta-data not found for " + block) when meta file checksum is not found. So this should be checked as well. 
> DN should not delete the block On "Too many open files" Exception > - > > Key: HDFS-11711 > URL: https://issues.apache.org/jira/browse/HDFS-11711 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Critical > Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2 > > Attachments: HDFS-11711-002.patch, HDFS-11711-003.patch, > HDFS-11711-004.patch, HDFS-11711-branch-2-002.patch, > HDFS-11711-branch-2-003.patch, HDFS-11711.patch > > > *Seen the following scenario in one of our customer environment* > * while jobclient writing {{"job.xml"}} there are pipeline failures and > written to only one DN. > * when mapper reading the {{"job.xml"}}, DN got {{"Too many open files"}} (as > system exceed limit) and block got deleted. Hence mapper failed to read and > job got failed.
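The compatibility-minded check proposed in the comment above — delete the replica only when the FileNotFoundException carries one of the two known "invalid block" messages, and never for environmental errors like "Too many open files" — can be sketched as follows. The real code is Java inside the DataNode; this C fragment is only an illustration of the message-matching logic, with invented names:

```c
#include <string.h>

/* Illustrative only: mirror of the proposed check. Return nonzero when the
 * exception message matches one of the messages that genuinely indicate a
 * missing/invalid replica (from HDFS-8492 and HDFS-3100), so deletion is
 * safe; return 0 for anything else, e.g. "Too many open files". */
static int should_delete_block(const char *exception_msg) {
    return strstr(exception_msg, "is not valid.") != NULL
        || strstr(exception_msg, "Meta-data not found for") != NULL;
}
```

As the comment itself notes, matching on message strings is fragile; introducing a dedicated exception type for these two cases would make the intent explicit.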
[jira] [Comment Edited] (HDFS-11711) DN should not delete the block On "Too many open files" Exception
[ https://issues.apache.org/jira/browse/HDFS-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041353#comment-16041353 ] Wei-Chiu Chuang edited comment on HDFS-11711 at 6/7/17 6:21 PM: [~brahmareddy] sorry i didn't make myself clear. To begin with, this behavior was caused by HDFS-8492, which throws FileNotFoundException("BlockId " + blockId + " is not valid."). I was just thinking that "Too many open files" error is thrown within Java library, so there's no guarantee this would be compatible between different operating systems, or across different Java versions, or different JVM/JDK implementation. IMHO, the more compatible approach would be that we check if FNFE has "BlockId " + blockId + " is not valid.", and only delete the block when that's the case. Edit: HDFS-3100 throws FileNotFoundException("Meta-data not found for " + block) when meta file checksum is not found. So this should be checked as well. was (Author: jojochuang): [~brahmareddy] sorry i didn't make myself clear. To begin with, this behavior was caused by HDFS-8492, which throws FileNotFoundException("BlockId " + blockId + " is not valid."). I was just thinking that "Too many open files" error is thrown within Java library, so there's no guarantee this would be compatible between different operating systems, or across different Java versions, or different JVM/JDK implementation. IMHO, the more compatible approach would be that we check if FNFE has "BlockId " + blockId + " is not valid.", and only delete the block when that's the case. 
[jira] [Commented] (HDFS-11711) DN should not delete the block On "Too many open files" Exception
[ https://issues.apache.org/jira/browse/HDFS-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041353#comment-16041353 ] Wei-Chiu Chuang commented on HDFS-11711: [~brahmareddy] sorry i didn't make myself clear. To begin with, this behavior was caused by HDFS-8492, which throws FileNotFoundException("BlockId " + blockId + " is not valid."). I was just thinking that "Too many open files" error is thrown within Java library, so there's no guarantee this would be compatible between different operating systems, or across different Java versions, or different JVM/JDK implementation. IMHO, the more compatible approach would be that we check if FNFE has "BlockId " + blockId + " is not valid.", and only delete the block when that's the case.
[jira] [Updated] (HDFS-11711) DN should not delete the block On "Too many open files" Exception
[ https://issues.apache.org/jira/browse/HDFS-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-11711: --- Fix Version/s: (was: 2.8.1)
[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Attachment: HDFS-11945.trunk.patch > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11945.trunk.patch > > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. > However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries to be not retried > for a long time. We have seen it happening for days.
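The starvation described in the HDFS-11945 report can be reduced to a toy timing model: because all internally recovered files sit under a single namenode-held lease, a recovery attempt for any of them renews that shared lease and defers the retry of every earlier failed recovery by another hard-limit period. This sketch is not NameNode code; the structure and constants are a simplified model of the mechanism described above:

```c
/* Toy model of the shared namenode-held lease. Times are in minutes and the
 * one-hour hard limit from the description above is HARD_LIMIT. */
#define HARD_LIMIT 60

typedef struct {
    int last_renewal;  /* last time the holder's lease was renewed */
} lease_t;

/* A lease becomes recoverable again only once the hard limit has elapsed. */
static int is_expired(const lease_t *l, int now) {
    return now - l->last_renewal >= HARD_LIMIT;
}

/* Any recovery attempt renews the namenode's shared lease as a whole,
 * pushing back the retry of every previously failed recovery under it. */
static void attempt_recovery(lease_t *nn_lease, int now) {
    nn_lease->last_renewal = now;
}
```

In this model, a recovery that fails at t=0 would be retried at t=60; but if some other client's lease expires at t=59 and triggers a recovery attempt, the shared lease is renewed and the original retry slips to t=119 — and keeps slipping as long as other expirations arrive within each hour, which matches the "not retried for days" observation.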
[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Status: Patch Available (was: Open) > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11945.trunk.patch > > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. > However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries to be not retried > for a long time. We have seen it happening for days. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Attachment: (was: HDFS-11945.trunk.patch) > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. > However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries to be not retried > for a long time. We have seen it happening for days. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Status: Open (was: Patch Available) > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. > However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries to be not retried > for a long time. We have seen it happening for days. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Status: Patch Available (was: Open) > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11945.trunk.patch > > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. > However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries to be not retried > for a long time. We have seen it happening for days. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-11945: -- Attachment: HDFS-11945.trunk.patch > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Attachments: HDFS-11945.trunk.patch > > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. > However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries to be not retried > for a long time. We have seen it happening for days. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11812) fix spelling mistake in TestFsVolumeList.java
[ https://issues.apache.org/jira/browse/HDFS-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041284#comment-16041284 ] Ravi Prakash commented on HDFS-11812: - Hi Chencan! Thanks a lot for your contribution. I appreciate your spirit for contributing. However, there is usually some overhead for managing JIRA, reviewing changes, committing them, writing release notes, etc. To amortize that effort, we expect patches to be quite a lot more substantial than a single spelling mistake in a comment in a test file. I do understand that just going through the process of contributing makes you more familiar with it, even for trivial changes. However, I'd encourage you to make some more substantive changes, e.g. fix a failing test or even a bug, just to familiarize yourself with the process. I look forward to such contributions from you in the future. Please forgive me for marking this JIRA as WON'T FIX. > fix spelling mistake in TestFsVolumeList.java > -- > > Key: HDFS-11812 > URL: https://issues.apache.org/jira/browse/HDFS-11812 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: chencan > Attachments: HADOOP-11812.patch > > > We found a spelling mistake in TestFsVolumeList.java: // Mock > reservedForReplcas should be // Mock reservedForReplicas. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11812) fix spelling mistake in TestFsVolumeList.java
[ https://issues.apache.org/jira/browse/HDFS-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-11812: Resolution: Won't Fix Status: Resolved (was: Patch Available) > fix spelling mistake in TestFsVolumeList.java > -- > > Key: HDFS-11812 > URL: https://issues.apache.org/jira/browse/HDFS-11812 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: chencan > Attachments: HADOOP-11812.patch > > > We found a spelling mistake in TestFsVolumeList.java: // Mock > reservedForReplcas should be // Mock reservedForReplicas. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041246#comment-16041246 ] Andrew Wang commented on HDFS-11943: Ping [~Sammi] / [~drankye]. Also, I'm wondering why this log fires so often in the specified environment? > Warn log frequently print to screen in doEncode function on > AbstractNativeRawEncoder class > --- > > Key: HDFS-11943 > URL: https://issues.apache.org/jira/browse/HDFS-11943 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, native >Affects Versions: 3.0.0-alpha4 > Environment: cluster: 3 nodes > os:(Red Hat 2.6.33.20, Red Hat 3.10.0-514.6.1.el7.x86_64, > Ubuntu4.4.0-31-generic) > hadoop version: hadoop-3.0.0-alpha4 > erasure coding: XOR-2-1-64k and enabled Intel ISA-L > hadoop fs -put file / >Reporter: liaoyuxiangqin >Assignee: liaoyuxiangqin >Priority: Minor > Attachments: HDFS-11943.patch > > Original Estimate: 0.05h > Remaining Estimate: 0.05h > > When I write a file to HDFS in the above environment, the HDFS client frequently > prints the "use direct ByteBuffer inputs/outputs" warning from the doEncode function > to the screen; detail information as follows: > 2017-06-07 15:20:42,856 WARN rawcoder.AbstractNativeRawEncoder: > convertToByteBufferState is invoked, not efficiently. Please use direct > ByteBuffer inputs/outputs -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041243#comment-16041243 ] Andrew Wang commented on HDFS-11943: Hi, thanks for the JIRA and the patch [~liaoyuxiangqin]! One suggestion: do you mind changing this to use the PerformanceAdvisory LOG at debug? e.g. we have these logs for encryption: {code} public OpensslSecureRandom() { if (!nativeEnabled) { PerformanceAdvisory.LOG.debug("Build does not support openssl, " + "falling back to Java SecureRandom."); fallback = new java.security.SecureRandom(); } } {code} We don't need to guard with {{isDebugEnabled}} since it's an SLF4J logger. > Warn log frequently print to screen in doEncode function on > AbstractNativeRawEncoder class > --- > > Key: HDFS-11943 > URL: https://issues.apache.org/jira/browse/HDFS-11943 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, native >Affects Versions: 3.0.0-alpha4 > Environment: cluster: 3 nodes > os:(Red Hat 2.6.33.20, Red Hat 3.10.0-514.6.1.el7.x86_64, > Ubuntu4.4.0-31-generic) > hadoop version: hadoop-3.0.0-alpha4 > erasure coding: XOR-2-1-64k and enabled Intel ISA-L > hadoop fs -put file / >Reporter: liaoyuxiangqin >Assignee: liaoyuxiangqin >Priority: Minor > Attachments: HDFS-11943.patch > > Original Estimate: 0.05h > Remaining Estimate: 0.05h > > When I write a file to HDFS in the above environment, the HDFS client frequently > prints the "use direct ByteBuffer inputs/outputs" warning from the doEncode function > to the screen; detail information as follows: > 2017-06-07 15:20:42,856 WARN rawcoder.AbstractNativeRawEncoder: > convertToByteBufferState is invoked, not efficiently. Please use direct > ByteBuffer inputs/outputs -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
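The pattern suggested in the comment above — routing a hot-path "this is slow" hint through one shared advisory logger at debug level instead of a per-call WARN — can be sketched without Hadoop on the classpath. The classes below are stand-ins, not the real `PerformanceAdvisory` or `AbstractNativeRawEncoder`:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal stand-in for an SLF4J-style logger with a fixed level.
class StubLogger {
    final boolean debugEnabled;
    final AtomicInteger emitted = new AtomicInteger();

    StubLogger(boolean debugEnabled) { this.debugEnabled = debugEnabled; }

    // As with SLF4J, the call is cheap when the level is off, so no
    // isDebugEnabled() guard is needed at the call site.
    void debug(String msg) {
        if (debugEnabled) emitted.incrementAndGet();
    }
}

// Stand-in for a PerformanceAdvisory-style shared logger for performance hints.
class Advisory {
    static StubLogger LOG = new StubLogger(false); // debug is off by default
}

class EncoderSketch {
    // Before the proposed change this path logged WARN on every call;
    // with the advisory pattern it stays silent unless debug is enabled.
    void convertToByteBufferState() {
        Advisory.LOG.debug("convertToByteBufferState is invoked, not efficiently. "
            + "Please use direct ByteBuffer inputs/outputs");
    }
}
```

The design point is that the advisory is still discoverable — an operator chasing slow erasure coding can enable debug on the advisory logger — but a normal `hadoop fs -put` no longer floods the console.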
[jira] [Assigned] (HDFS-11943) Warn log frequently print to screen in doEncode function on AbstractNativeRawEncoder class
[ https://issues.apache.org/jira/browse/HDFS-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-11943: -- Assignee: liaoyuxiangqin > Warn log frequently print to screen in doEncode function on > AbstractNativeRawEncoder class > --- > > Key: HDFS-11943 > URL: https://issues.apache.org/jira/browse/HDFS-11943 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, native >Affects Versions: 3.0.0-alpha4 > Environment: cluster: 3 nodes > os:(Red Hat 2.6.33.20, Red Hat 3.10.0-514.6.1.el7.x86_64, > Ubuntu4.4.0-31-generic) > hadoop version: hadoop-3.0.0-alpha4 > erasure coding: XOR-2-1-64k and enabled Intel ISA-L > hadoop fs -put file / >Reporter: liaoyuxiangqin >Assignee: liaoyuxiangqin >Priority: Minor > Attachments: HDFS-11943.patch > > Original Estimate: 0.05h > Remaining Estimate: 0.05h > > When I write a file to HDFS in the above environment, the HDFS client frequently > prints the "use direct ByteBuffer inputs/outputs" warning from the doEncode function > to the screen; detail information as follows: > 2017-06-07 15:20:42,856 WARN rawcoder.AbstractNativeRawEncoder: > convertToByteBufferState is invoked, not efficiently. Please use direct > ByteBuffer inputs/outputs -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10785) libhdfs++: Implement the rest of the tools
[ https://issues.apache.org/jira/browse/HDFS-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041228#comment-16041228 ] Hadoop QA commented on HDFS-10785: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 49s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 10s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 10s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_131 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 17s{color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_131 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 28s{color} | {color:green} the patch passed 
with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 50s{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 8s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 8s{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 6s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK v1.7.0_131. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 58m 4s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:5ae34ac | | JIRA Issue | HDFS-10785 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871851/HDFS-10785.HDFS-8707.013.patch | | Optional Tests | asflicense compile cc mvnsite javac unit javadoc mvninstall | | uname | Linux 31fd20670b56 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-8707 / 5ae34ac | | Default Java | 1.7.0_131 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_131 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_131 | | JDK v1.7.0_131 Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/19824/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/19824/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > libhdfs++: Implement the rest of the tools >
[jira] [Commented] (HDFS-11804) KMS client needs retry logic
[ https://issues.apache.org/jira/browse/HDFS-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041227#comment-16041227 ] Hadoop QA commented on HDFS-11804: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 36s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 45s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 22s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 21s{color} | {color:orange} root: The patch generated 7 new + 27 unchanged - 0 fixed = 34 total (was 27) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 16s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 41s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}187m 46s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.TestAclsEndToEnd | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HDFS-11804 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871836/HDFS-11804-trunk-3.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 87ee4abaacb5 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5ec7163 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/19823/artifact/patchprocess/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/19823/artifact/patchprocess/diff-checkstyle-root.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/19823/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://
[jira] [Commented] (HDFS-9807) Add an optional StorageID to writes
[ https://issues.apache.org/jira/browse/HDFS-9807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041207#comment-16041207 ] Andrew Wang commented on HDFS-9807: --- I believe this (or HDFS-6708) broke backwards compatibility with older clients. I'm running a 2.6.0-ish client against a 3.0.0-alpha4-ish DN, and am seeing this in the DN log: {noformat} 2017-06-06 23:27:22,568 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block token verification failed: op=WRITE_BLOCK, remoteAddress=/172.28.208.200:53900, message=Block token with StorageIDs [DS-c0f24154-a39b-4941-93cd-5b8323067ba2] not valid for access with StorageIDs [] {noformat} The client then blacklists each DN when it gets this error, until it runs out of DNs. Dunno if this retry behavior is fixed in newer clients; InvalidBlockTokenExceptions are kind of fatal. What is the expected behavior here? > Add an optional StorageID to writes > --- > > Key: HDFS-9807 > URL: https://issues.apache.org/jira/browse/HDFS-9807 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0-alpha2 >Reporter: Chris Douglas >Assignee: Ewan Higgs > Fix For: 3.0.0-alpha4 > > Attachments: HDFS-9807.001.patch, HDFS-9807.002.patch, > HDFS-9807.003.patch, HDFS-9807.004.patch, HDFS-9807.005.patch, > HDFS-9807.006.patch, HDFS-9807.007.patch, HDFS-9807.008.patch, > HDFS-9807.009.patch, HDFS-9807.010.patch > > > The {{BlockPlacementPolicy}} considers specific storages, but when the > replica is written the DN {{VolumeChoosingPolicy}} is unaware of any > preference or constraints from other policies affecting placement. This > limits heterogeneity to the declared storage types, which are treated as > fungible within the target DN. It should be possible to influence or > constrain the DN policy to select a particular storage. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
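One plausible shape of the compatibility rule implied by the log above can be sketched as follows. This is entirely an assumption for illustration — the real verification lives in the DataNode's block token code, and the class and method below are hypothetical: a side that carries no StorageIDs predates the feature and should be treated as unconstrained, rather than as matching nothing.

```java
import java.util.List;

// Hypothetical sketch of StorageID-aware block token checking; this is
// not the actual BlockTokenSecretManager logic.
class StorageIdCheck {
    // Pre-HDFS-9807 clients send no StorageIDs, so an empty list on either
    // side should mean "no constraint" rather than "no match"; otherwise a
    // 2.6-era client is rejected exactly as the DN log above shows.
    static boolean allowed(List<String> tokenIds, List<String> requestedIds) {
        if (tokenIds.isEmpty() || requestedIds.isEmpty()) {
            return true; // legacy side: no StorageID constraint to enforce
        }
        return tokenIds.containsAll(requestedIds); // token must cover the request
    }
}
```

Under such a rule the old client in the log, which presented an empty StorageID list, would be accepted instead of blacklisting every DataNode until the write fails.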
[jira] [Updated] (HDFS-11777) Ozone: KSM: add deleteBucket
[ https://issues.apache.org/jira/browse/HDFS-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-11777: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-7240 Target Version/s: HDFS-7240 Status: Resolved (was: Patch Available) Thanks [~nandakumar131] for the contribution. I've committed the patch to the feature branch. > Ozone: KSM: add deleteBucket > > > Key: HDFS-11777 > URL: https://issues.apache.org/jira/browse/HDFS-11777 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Nandakumar > Fix For: HDFS-7240 > > Attachments: HDFS-11777-HDFS-7240.000.patch > > > Allows a bucket to be deleted if there are no keys in the bucket. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11777) Ozone: KSM: add deleteBucket
[ https://issues.apache.org/jira/browse/HDFS-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041191#comment-16041191 ] Xiaoyu Yao commented on HDFS-11777: --- Thanks [~nandakumar131] for working on this. The patch looks good to me. +1. There is a wildcard import change in TestBucketManagerImpl that I plan to fix at commit time. > Ozone: KSM: add deleteBucket > > > Key: HDFS-11777 > URL: https://issues.apache.org/jira/browse/HDFS-11777 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Nandakumar > Attachments: HDFS-11777-HDFS-7240.000.patch > > > Allows a bucket to be deleted if there are no keys in the bucket. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11726) [SPS] : StoragePolicySatisfier should not select same storage type as source and destination in same datanode.
[ https://issues.apache.org/jira/browse/HDFS-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041101#comment-16041101 ] Rakesh R commented on HDFS-11726: - Hi [~umamaheswararao], I've observed {{TestPersistentStoragePolicySatisfier}} test failures for the last few builds [Build_19665|https://builds.apache.org/job/PreCommit-HDFS-Build/19665/testReport/], [Build_19694|https://builds.apache.org/job/PreCommit-HDFS-Build/19694/testReport/] and I think there is no relation with this patch. Shall I commit this patch and analyse the test failure separately? > [SPS] : StoragePolicySatisfier should not select same storage type as source > and destination in same datanode. > -- > > Key: HDFS-11726 > URL: https://issues.apache.org/jira/browse/HDFS-11726 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: HDFS-10285 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore > Attachments: HDFS-11726-HDFS-10285.001.patch, > HDFS-11726-HDFS-10285.002.patch > > > {code} > 2017-04-30 16:12:28,569 [BlockMoverTask-0] INFO > datanode.StoragePolicySatisfyWorker (Worker.java:moveBlock(248)) - Start > moving block:blk_1073741826_1002 from src:127.0.0.1:41699 to > destin:127.0.0.1:41699 to satisfy storageType, sourceStoragetype:ARCHIVE and > destinStoragetype:ARCHIVE > {code} > {code} > 2017-04-30 16:12:28,571 [DataXceiver for client /127.0.0.1:36428 [Replacing > block BP-1409501412-127.0.1.1-1493548923222:blk_1073741826_1002 from > 6c7aa66e-a778-43d5-89f6-053d5f6b35bc]] INFO datanode.DataNode > (DataXceiver.java:replaceBlock(1202)) - opReplaceBlock > BP-1409501412-127.0.1.1-1493548923222:blk_1073741826_1002 received exception > org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Replica > FinalizedReplica, blk_1073741826_1002, FINALIZED > getNumBytes() = 1024 > getBytesOnDisk() = 1024 > getVisibleLength()= 1024 > getVolume() = > 
/home/sachin/software/hadoop/HDFS-10285/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data7 > getBlockURI() = > file:/home/sachin/software/hadoop/HDFS-10285/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data7/current/BP-1409501412-127.0.1.1-1493548923222/current/finalized/subdir0/subdir0/blk_1073741826 > already exists on storage ARCHIVE > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
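The failure above amounts to a missing precondition on move selection, which can be captured in a small sketch. The names here are made up for illustration and are not the real StoragePolicySatisfier API: a block move whose source and target share both the datanode and the storage type is a no-op that the DataNode rejects with ReplicaAlreadyExistsException, so the satisfier should never schedule it.

```java
// Hypothetical move-selection guard; not the actual SPS implementation.
class ReplicaLocation {
    final String datanodeUuid;
    final String storageType; // e.g. "DISK", "ARCHIVE"

    ReplicaLocation(String datanodeUuid, String storageType) {
        this.datanodeUuid = datanodeUuid;
        this.storageType = storageType;
    }

    // A move is only useful if it changes the datanode or the storage type;
    // otherwise the target DN already holds the finalized replica and the
    // opReplaceBlock fails, as in the log above.
    static boolean isUsefulMove(ReplicaLocation src, ReplicaLocation dst) {
        return !(src.datanodeUuid.equals(dst.datanodeUuid)
                 && src.storageType.equals(dst.storageType));
    }
}
```

Filtering candidate targets through such a guard before dispatching the move avoids the ARCHIVE-to-ARCHIVE same-node transfer seen in the report.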
[jira] [Updated] (HDFS-11743) Revert the incompatible fsck reporting output in HDFS-7933 from branch-2.7
[ https://issues.apache.org/jira/browse/HDFS-11743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-11743: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.4 Status: Resolved (was: Patch Available) Thanks [~ajisakaa] and [~brahmareddy] for the reviews! I just committed the patch to branch-2.7. > Revert the incompatible fsck reporting output in HDFS-7933 from branch-2.7 > -- > > Key: HDFS-11743 > URL: https://issues.apache.org/jira/browse/HDFS-11743 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang >Priority: Blocker > Labels: release-blocker > Fix For: 2.7.4 > > Attachments: HDFS-11743-branch-2.7.00.patch, > HDFS-11743-branch-2.7.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10785) libhdfs++: Implement the rest of the tools
[ https://issues.apache.org/jira/browse/HDFS-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anatoli Shein updated HDFS-10785: - Attachment: HDFS-10785.HDFS-8707.013.patch Resubmitting the latest version of the patch to be tested with the CI system. > libhdfs++: Implement the rest of the tools > -- > > Key: HDFS-10785 > URL: https://issues.apache.org/jira/browse/HDFS-10785 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-10785.HDFS-8707.000.patch, > HDFS-10785.HDFS-8707.001.patch, HDFS-10785.HDFS-8707.002.patch, > HDFS-10785.HDFS-8707.003.patch, HDFS-10785.HDFS-8707.004.patch, > HDFS-10785.HDFS-8707.005.patch, HDFS-10785.HDFS-8707.006.patch, > HDFS-10785.HDFS-8707.007.patch, HDFS-10785.HDFS-8707.008.patch, > HDFS-10785.HDFS-8707.009.patch, HDFS-10785.HDFS-8707.010.patch, > HDFS-10785.HDFS-8707.011.patch, HDFS-10785.HDFS-8707.012.patch, > HDFS-10785.HDFS-8707.013.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11930) libhdfs++: Docker script fails while trying to download JDK 7.
[ https://issues.apache.org/jira/browse/HDFS-11930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041080#comment-16041080 ] James Clampffer commented on HDFS-11930: Committed the HDFS-11930.HDFS-8707.001.patch (one without intentional failure) to HDFS-8707. Thanks [~anatoli.shein]! > libhdfs++: Docker script fails while trying to download JDK 7. > -- > > Key: HDFS-11930 > URL: https://issues.apache.org/jira/browse/HDFS-11930 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-11930.HDFS-8707.000.patch, > HDFS-11930.HDFS-8707.001.patch, HDFS-11930.HDFS-8707.002.patch > > > The CI system Apache Yetus crashes with: > Docker failed to build yetus/hadoop:78fc6b6 > The error appears to be caused by: > Location: > http://download.oracle.com/otn-pub/java/jdk/7u80-b15/jdk-7u80-linux-x64.tar.gz?AuthParam=1495630848_6203b920eab45659a5ec46a07cf84db9 > [following] > --2017-05-24 12:58:48-- > http://download.oracle.com/otn-pub/java/jdk/7u80-b15/jdk-7u80-linux-x64.tar.gz?AuthParam=1495630848_6203b920eab45659a5ec46a07cf84db9 > Connecting to download.oracle.com (download.oracle.com)|23.59.189.81|:80... > [0m[91mconnected. > HTTP request sent, awaiting response... [0m[91m404 Not Found > [0m[91m2017-05-24 12:58:49 ERROR 404: Not Found. > [0m[91m > [0m[91mdownload failed > Oracle JDK 7 is NOT installed. > [0m[91mdpkg: error processing package oracle-java7-installer (--configure): > subprocess installed post-installation script returned error exit status 1 > It looks like the docker scripts need to be updated to resolve this issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11930) libhdfs++: Docker script fails while trying to download JDK 7.
[ https://issues.apache.org/jira/browse/HDFS-11930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-11930: --- Resolution: Fixed Status: Resolved (was: Patch Available) > libhdfs++: Docker script fails while trying to download JDK 7. > -- > > Key: HDFS-11930 > URL: https://issues.apache.org/jira/browse/HDFS-11930 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Anatoli Shein >Assignee: Anatoli Shein > Attachments: HDFS-11930.HDFS-8707.000.patch, > HDFS-11930.HDFS-8707.001.patch, HDFS-11930.HDFS-8707.002.patch > > > The CI system Apache Yetus crashes with: > Docker failed to build yetus/hadoop:78fc6b6 > The error appears to be caused by: > Location: > http://download.oracle.com/otn-pub/java/jdk/7u80-b15/jdk-7u80-linux-x64.tar.gz?AuthParam=1495630848_6203b920eab45659a5ec46a07cf84db9 > [following] > --2017-05-24 12:58:48-- > http://download.oracle.com/otn-pub/java/jdk/7u80-b15/jdk-7u80-linux-x64.tar.gz?AuthParam=1495630848_6203b920eab45659a5ec46a07cf84db9 > Connecting to download.oracle.com (download.oracle.com)|23.59.189.81|:80... > [0m[91mconnected. > HTTP request sent, awaiting response... [0m[91m404 Not Found > [0m[91m2017-05-24 12:58:49 ERROR 404: Not Found. > [0m[91m > [0m[91mdownload failed > Oracle JDK 7 is NOT installed. > [0m[91mdpkg: error processing package oracle-java7-installer (--configure): > subprocess installed post-installation script returned error exit status 1 > It looks like the docker scripts need to be updated to resolve this issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16040927#comment-16040927 ] Kihwal Lee edited comment on HDFS-11945 at 6/7/17 3:18 PM: --- We could change the namenode lease holder ID every hour. Normally there will be only a brief moment of two being active in the system. Multiple ones can be active if there are failures. If the ID is suffixed with a timestamp or date string, the log message for recovery will show how old the leases are. The major cause of lease recovery failures is datanodes having problems during block recoveries. One interesting case is when the namenode throws "server too busy" to datanodes. A {{commitBlockSynchronization()}} call can fail for this reason and won't be retried. HADOOP-14035 will mitigate this particular case. was (Author: kihwal): We could change the namenode lease holder ID every hour. Normally there will be only a brief moment of two being active in the system. Multiple ones can be active if there are failures. If the ID is suffixed with a timestamp or date string, the log message for recovery will show how old the leases are. The major cause of lease recovery failures is datanodes having problems during block recoveries. One interesting case is when the namenode throws "server too busy" to datanodes. A {{commitBlockSynchronization()}} call can fail for this reason and won't be retried. > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. 
As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. > However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries to be not retried > for a long time. We have seen it happening for days. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned HDFS-11945: - Assignee: Kihwal Lee > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee >Assignee: Kihwal Lee > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. > However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries to be not retried > for a long time. We have seen it happening for days. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11743) Revert the incompatible fsck reporting output in HDFS-7933 from branch-2.7
[ https://issues.apache.org/jira/browse/HDFS-11743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16040920#comment-16040920 ] Brahma Reddy Battula commented on HDFS-11743: - [~zhz] thanks for updating the patch. +1 on the latest patch. I believe the {{TestDecommissioningStatus}} failure is unrelated; it looks like a flaky test, failing 2 of the 6 times I ran the whole class. It appears the incomplete (rbw) blocks are not written in the failure (FBR-triggered) case; I will dig more. > Revert the incompatible fsck reporting output in HDFS-7933 from branch-2.7 > -- > > Key: HDFS-11743 > URL: https://issues.apache.org/jira/browse/HDFS-11743 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang >Priority: Blocker > Labels: release-blocker > Attachments: HDFS-11743-branch-2.7.00.patch, > HDFS-11743-branch-2.7.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11880) Ozone: KSM: Remove protobuf formats such as StorageTypeProto and OzoneAclInfo from KSM wrappers
[ https://issues.apache.org/jira/browse/HDFS-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16040943#comment-16040943 ] Hadoop QA commented on HDFS-11880: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 29s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 48s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 39s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 42s{color} | {color:green} HDFS-7240 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 44s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 55s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-7240 has 10 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s{color} | {color:green} HDFS-7240 passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 31s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 39s{color} | {color:orange} hadoop-hdfs-project: The patch generated 3 new + 7 unchanged - 4 fixed = 10 total (was 11) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 17s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 59s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}108m 48s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 | | | hadoop.ozone.scm.TestContainerSQLCli | | | hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure | | Timed out junit tests | org.apache.hadoop.ozone.container.ozoneimpl.TestRatisManager | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HDFS-11880 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871828/HDFS-11880-HDFS-7240.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux dfc0e0939b9c 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | HDFS-7240 / 9961fa3 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/19822/artifact/patchprocess
[jira] [Commented] (HDFS-11945) Internal lease recovery may not be retried for a long time
[ https://issues.apache.org/jira/browse/HDFS-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16040927#comment-16040927 ] Kihwal Lee commented on HDFS-11945: --- We could change the namenode lease holder ID every hour. Normally there will be only a brief moment of two being active in the system. Multiple ones can be active if there are failures. If the ID is suffixed with a timestamp or date string, the log message for recovery will show how old the leases are. The major cause of lease recovery failures is datanodes having problems during block recoveries. One interesting case is when the namenode throws "server too busy" to datanodes. A {{commitBlockSynchronization()}} call can fail for this reason and won't be retried. > Internal lease recovery may not be retried for a long time > -- > > Key: HDFS-11945 > URL: https://issues.apache.org/jira/browse/HDFS-11945 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Kihwal Lee > > Lease is assigned per client who is identified by its holder ID or client ID, > thus a renewal or an expiration of a lease affects all files being written by > the client. > When a client/writer dies without closing a file, its lease expires in one > hour (hard limit) and the namenode tries to recover the lease. As a part of > the process, the namenode takes the ownership of the lease and renews it. If > the recovery does not finish successfully, the lease will expire in one hour > and the namenode will try again to recover the lease. > However, if a file system has another lease expiring within the hour, the > recovery attempt for the lease will push forward the expiration of the lease > held by the namenode. This causes failed lease recoveries to be not retried > for a long time. We have seen it happening for days. 
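The hourly-rotated holder ID proposed in the comment above can be sketched as follows. This is a hypothetical illustration only: the base name, separator, and timestamp format are assumptions, not the actual HDFS constants.

```java
// Hypothetical sketch of an hourly-rotated namenode lease holder ID. The
// base name "HDFS_NameNode" and the hour-granularity suffix format are
// assumptions for illustration, not the real HDFS constants.
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class NnLeaseHolder {
    // Hour-granularity UTC suffix, e.g. "2017060715" for 2017-06-07 15:xx UTC.
    static final DateTimeFormatter HOUR =
        DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);
    static final String BASE = "HDFS_NameNode"; // assumed base holder ID

    // The holder ID changes once per hour, so a lease still held under an
    // old ID is visibly stale in the recovery log messages.
    static String holderIdAt(Instant now) {
        return BASE + "-" + HOUR.format(now);
    }

    public static void main(String[] args) {
        // E.g. at the time of the comment above (2017-06-07 15:18 UTC):
        System.out.println(holderIdAt(Instant.parse("2017-06-07T15:18:00Z")));
        // prints HDFS_NameNode-2017060715
    }
}
```

With such a scheme, a recovery attempt scheduled an hour or more ago no longer renews the current holder's lease, so another lease expiring within the hour cannot keep pushing the failed recovery's retry forward indefinitely.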