[jira] [Updated] (HDFS-7933) fsck should also report decommissioning replicas.
[ https://issues.apache.org/jira/browse/HDFS-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-7933: --- Hadoop Flags: Incompatible change,Reviewed (was: Reviewed) Marking this as an incompatible change since the output of fsck is now different. fsck should also report decommissioning replicas. -- Key: HDFS-7933 URL: https://issues.apache.org/jira/browse/HDFS-7933 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Jitendra Nath Pandey Assignee: Xiaoyu Yao Fix For: 2.8.0 Attachments: HDFS-7933.00.patch, HDFS-7933.01.patch, HDFS-7933.02.patch, HDFS-7933.03.patch Fsck doesn't count replicas that are on decommissioning nodes. If a block has all of its replicas on decommissioning nodes, it will be marked as missing, which is alarming for admins, although the system will re-replicate them before the nodes are decommissioned. The fsck output should also show decommissioning replicas along with the live replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
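To illustrate the reporting change being discussed, here is a minimal, self-contained sketch of an fsck-style summary that counts decommissioning replicas separately instead of treating the block as missing. The class and method names are illustrative assumptions, not the actual HDFS-7933 patch:
{code}
// Illustrative only: a block with zero live but nonzero decommissioning
// replicas is reported with its counts, not flagged as missing.
public class FsckReplicaSketch {
  static String report(String blk, long len, int live, int decommissioning) {
    if (live + decommissioning == 0) {
      return blk + " len=" + len + " MISSING!";
    }
    return blk + " len=" + len + " repl=" + live
        + " decommissioning=" + decommissioning;
  }

  public static void main(String[] args) {
    // All three replicas on decommissioning nodes: an alarming
    // "MISSING" before, an informative count after.
    System.out.println(report("blk_1", 11526, 0, 3));
  }
}
{code}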
[jira] [Updated] (HDFS-7866) Erasure coding: NameNode manages EC schemas
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-7866: Attachment: HDFS-7866-v2.patch Updated the patch, adding a test. Looks like there's not much work left to be done here. Erasure coding: NameNode manages EC schemas --- Key: HDFS-7866 URL: https://issues.apache.org/jira/browse/HDFS-7866 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7866-v1.patch, HDFS-7866-v2.patch This is to extend the NameNode to load, list and sync predefined EC schemas in an authorized and controlled approach. The provided facilities will be used to implement DFSAdmin commands so an admin can list the available EC schemas and choose some of them for target EC zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493702#comment-14493702 ] Xinwei Qin commented on HDFS-7859: --- OK, I will track it. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8092) dfs -count -q should not consider snapshots under REM_QUOTA
[ https://issues.apache.org/jira/browse/HDFS-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493693#comment-14493693 ] Archana T commented on HDFS-8092: - Hi [~aw] The first two columns of the hdfs count cmd refer to {{QUOTA REM_QUOTA}}, i.e. the NameQuota. The issue I observed is on the NameQuota, not the SpaceQuota. AFAIK, the {{REM_QUOTA}} should not go to negative values. I think the name quota is not considered for snapshot creation; according to HDFS-4091, the max number of snapshots created for a folder is 65K. dfs -count -q should not consider snapshots under REM_QUOTA --- Key: HDFS-8092 URL: https://issues.apache.org/jira/browse/HDFS-8092 Project: Hadoop HDFS Issue Type: Bug Components: snapshots, tools Reporter: Archana T Assignee: Rakesh R Priority: Minor dfs -count -q should not consider snapshots under the remaining quota. List of operations performed:
1. hdfs dfs -mkdir /Dir1
2. hdfs dfsadmin -setQuota 2 /Dir1
3. hadoop fs -count -q -h -v /Dir1
QUOTA {color:red}REM_QUOTA{color} SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
2 {color:red}1{color} none inf 1 0 0 /Dir1
4. hdfs dfs -put hdfs /Dir1/f1
5. hadoop fs -count -q -h -v /Dir1
QUOTA {color:red}REM_QUOTA{color} SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
2 {color:red}0{color} none inf 1 1 11.4 K /Dir1
6. hdfs dfsadmin -allowSnapshot /Dir1
7. hdfs dfs -createSnapshot /Dir1
8. hadoop fs -count -q -h -v /Dir1
QUOTA {color:red}REM_QUOTA{color} SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
2 {color:red}-1{color} none inf 2 1 11.4 K /Dir1
Whenever a snapshot is created, the value of REM_QUOTA gets decremented. Since snapshot creation is not counted against the quota of the respective directory, dfs -count should not decrement the REM_QUOTA value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
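To make the reported numbers concrete, here is a small self-contained sketch (plain Java, using the values from the steps above) of the remaining-name-quota arithmetic, REM_QUOTA = quota - (directoryCount + fileCount), where the snapshot bumps directoryCount and drives the result negative:
{code}
public class RemQuotaDemo {
  public static void main(String[] args) {
    long quota = 2;
    // Step 3: only /Dir1 itself is counted.
    System.out.println(quota - (1 + 0)); // 1
    // Step 5: /Dir1 plus one file f1.
    System.out.println(quota - (1 + 1)); // 0
    // Step 8: the snapshot raises directoryCount from 1 to 2,
    // so REM_QUOTA = 2 - (2 + 1) = -1, the negative value reported.
    System.out.println(quota - (2 + 1)); // -1
  }
}
{code}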
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493695#comment-14493695 ] Kai Zheng commented on HDFS-7859: - Note I have updated the patch in HDFS-7866 to align with this. Once that one gets in, this one can be rebased and go in as well. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493699#comment-14493699 ] Xinwei Qin commented on HDFS-7859: --- [~drankye], thanks for your comments.
{quote} 1. Looks like this couples with HDFS-7866. Maybe I could commit HDFS-7866 first and then this gets all the remaining work done. Will it work for you this way? {quote}
Yes, committing HDFS-7866 first is better.
bq. 2. What methods can ECSchemaManager call to make it happen?
Some methods like {{logAddECSchema()}} in {{FSEditLog.java}} are missing; I will add them in the next patch.
bq. 3. In ECSchemaManager, new methods like addECSchema are not necessarily public.
I will change them to package-private (friendly) access.
bq. 4. Are we supporting the two formats? Please add Javadoc to explain them, thanks.
Yes, two formats are supported. These methods are only called during NameNode startup or checkpointing, and which one is called depends on the FSImage format. I will add detailed Javadoc to them.
bq. 5. Would you have separate issue(s) for the following?
I will create a new issue for it.
Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
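For context, a hypothetical sketch of what such a {{logAddECSchema()}} hook in {{FSEditLog}} might look like, following the pattern of the existing log* methods; the op class and its accessors are assumptions, since the actual method is still missing from the patch:
{code}
// Hypothetical sketch only -- not the HDFS-7859 patch. Follows the
// usual FSEditLog pattern: build an op, then append it to the log.
void logAddECSchema(ECSchema schema, boolean toLogRpcIds) {
  // AddECSchemaOp is an assumed op class; getInstance()/setECSchema()
  // mirror how other FSEditLogOp subclasses are constructed.
  AddECSchemaOp op = AddECSchemaOp.getInstance(cache.get())
      .setECSchema(schema);
  logRpcIds(op, toLogRpcIds);
  logEdit(op);
}
{code}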
[jira] [Updated] (HDFS-8120) Erasure coding: created util class to analyze striped block groups
[ https://issues.apache.org/jira/browse/HDFS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8120: Attachment: HDFS-8120.003.patch Thanks Jing for the comment on {{BlockManager}}. The 003 patch makes the suggested change. I will add a new unit test in the next rev. Also thanks for [~libo-intel]'s suggestion; I changed the variable name. I also tried updating the {{writeParityCellsForLastStripe}} logic to use the new {{getInternalBlockLength}} method to calculate the size of the last parity cell, but it caused test failures. You can look at the commented-out code in the method to see what I tried. Could you take a look and suggest how to update it? I think we should make the calculation consistent with the util class. Thanks! Erasure coding: created util class to analyze striped block groups -- Key: HDFS-8120 URL: https://issues.apache.org/jira/browse/HDFS-8120 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8120.000.patch, HDFS-8120.001.patch, HDFS-8120.002.patch, HDFS-8120.003.patch The patch adds logic for calculating the size of individual blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
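For readers following along, here is a rough, self-contained sketch of the kind of calculation such a util class performs. It assumes an RS(6,3)-style schema with 64 KB cells dealt round-robin; it is illustrative, not the actual {{getInternalBlockLength}} from the patch:
{code}
public class StripedBlockMath {
  static final int DATA_BLOCKS = 6;        // assumed RS(6,3) schema
  static final int CELL_SIZE = 64 * 1024;  // assumed 64 KB striping cell

  /**
   * Length of internal block i of a group holding groupSize bytes of
   * data. Cells are dealt round-robin across the data blocks; each
   * parity block is as long as the longest data block (block 0).
   */
  static long internalBlockLength(long groupSize, int i) {
    long stripeSize = (long) DATA_BLOCKS * CELL_SIZE;
    long fullStripes = groupSize / stripeSize;
    long rem = groupSize - fullStripes * stripeSize;
    if (i >= DATA_BLOCKS) {
      i = 0; // parity blocks match the longest data block
    }
    long lastCell = Math.min(Math.max(rem - (long) i * CELL_SIZE, 0), CELL_SIZE);
    return fullStripes * CELL_SIZE + lastCell;
  }

  public static void main(String[] args) {
    long groupSize = 100 * 1024; // 100 KB of data, less than one stripe
    System.out.println(internalBlockLength(groupSize, 0)); // 65536
    System.out.println(internalBlockLength(groupSize, 1)); // 36864
    System.out.println(internalBlockLength(groupSize, 2)); // 0
    System.out.println(internalBlockLength(groupSize, 6)); // parity: 65536
  }
}
{code}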
[jira] [Resolved] (HDFS-7842) Blocks missed while performing downgrade immediately after rolling back the cluster.
[ https://issues.apache.org/jira/browse/HDFS-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] J.Andreina resolved HDFS-7842. -- Resolution: Duplicate Closing this issue as it has already been fixed as part of HDFS-7645. Blocks missed while performing downgrade immediately after rolling back the cluster. Key: HDFS-7842 URL: https://issues.apache.org/jira/browse/HDFS-7842 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: J.Andreina Assignee: J.Andreina Priority: Critical Performing a downgrade immediately after rolling back the cluster will replace the blocks from trash. Since the block id for the files created before rollback will be the same as for the file created before downgrade, the namenode will get into safemode, as the block size reported from the Datanode will be different from the one in the block map (corrupted blocks). Steps to reproduce:
{noformat}
Step 1: Prepare rolling upgrade using hdfs dfsadmin -rollingUpgrade prepare
Step 2: Shutdown SNN and NN
Step 3: Start NN with the hdfs namenode -rollingUpgrade started option.
Step 4: Executed hdfs dfsadmin -shutdownDatanode DATANODE_HOST:IPC_PORT upgrade and restarted Datanode
Step 5: Create File_1 of size 11526
Step 6: Shutdown both NN and DN
Step 7: Start NNs with the hdfs namenode -rollingUpgrade rollback option. Start DNs with the -rollback option.
Step 8: Prepare rolling upgrade using hdfs dfsadmin -rollingUpgrade prepare
Step 9: Shutdown SNN and NN
Step 10: Start NN with the hdfs namenode -rollingUpgrade started option.
Step 11: Executed hdfs dfsadmin -shutdownDatanode DATANODE_HOST:IPC_PORT upgrade and restarted Datanode
Step 12: Add file File_2 with size 6324 (which has the same blockid as the previously created File_1 with block size 11526)
Step 13: Shutdown both NN and DN
Step 14: Start NNs with the hdfs namenode -rollingUpgrade downgrade option. Start DNs normally.
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7866) Erasure coding: NameNode manages EC schemas
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493725#comment-14493725 ] Kai Zheng commented on HDFS-7866: - I thought it would be good to have this in the patch; HDFS-7859 can remove them when adding the code. Sounds good? Erasure coding: NameNode manages EC schemas --- Key: HDFS-7866 URL: https://issues.apache.org/jira/browse/HDFS-7866 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7866-v1.patch, HDFS-7866-v2.patch This is to extend the NameNode to load, list and sync predefined EC schemas in an authorized and controlled approach. The provided facilities will be used to implement DFSAdmin commands so an admin can list the available EC schemas and choose some of them for target EC zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8123) Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file
[ https://issues.apache.org/jira/browse/HDFS-8123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493742#comment-14493742 ] Rakesh R commented on HDFS-8123: Thanks [~drankye] for the comments.
bq. Somehow I feel maybe we could use erasurecoding instead of erasurecode for the new proto file? What would others think?
I agree with you, {{erasurecoding}} is the better one.
bq. It might not be good to have the following in the new proto file right now? I'm not sure.
I think it's not required to add this section; I will remove it. I had copied the license and header from {{hdfs.proto}}.
Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file --- Key: HDFS-8123 URL: https://issues.apache.org/jira/browse/HDFS-8123 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8123-001.patch, HDFS-8123-002.patch, HDFS-8123-003.patch While reviewing the code I've noticed EC-related proto messages are getting added into {{hdfs.proto}}. IMHO, for better maintainability of the erasure code feature, it's good to move these to a separate {{erasurecode.proto}} file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7866) Erasure coding: NameNode manages EC schemas
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493755#comment-14493755 ] Vinayakumar B commented on HDFS-7866: - Hi [~xinwei], Thanks for working on this. Patch looks good. Some nits.
1. namesystem is unused now. Let it be added whenever actually used.
{code}
+  /**
+   * The FSNamesystem that contains this ECSchemaManager.
+   */
+  private final FSNamesystem namesystem;
{code}
2. Can remove the "):" in the log message. I think you wanted to put a sad smiley ( :( ), but the logger won't understand your feelings ;)
{code}
+      LOG.warn("A schema {} is updated but will be ignored as not " +
+          "supported yet):", schema.getSchemaName());
{code}
3. Can also update the TODO in {{FSNameSystem#getECSchemas()}} to return the loaded schemas.
4. Rename {{TestSchemaManager}} to {{TestECSchemaManager}}.
Erasure coding: NameNode manages EC schemas --- Key: HDFS-7866 URL: https://issues.apache.org/jira/browse/HDFS-7866 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7866-v1.patch, HDFS-7866-v2.patch This is to extend the NameNode to load, list and sync predefined EC schemas in an authorized and controlled approach. The provided facilities will be used to implement DFSAdmin commands so an admin can list the available EC schemas and choose some of them for target EC zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8127) NameNode Failover during HA upgrade can cause DataNode to finalize upgrade
[ https://issues.apache.org/jira/browse/HDFS-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493818#comment-14493818 ] Hadoop QA commented on HDFS-8127: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12725153/HDFS-8127.001.patch against trunk revision b9b832a.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10272//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10272//console This message is automatically generated. NameNode Failover during HA upgrade can cause DataNode to finalize upgrade -- Key: HDFS-8127 URL: https://issues.apache.org/jira/browse/HDFS-8127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Blocker Attachments: HDFS-8127.000.patch, HDFS-8127.001.patch Currently for HA upgrade (enabled by HDFS-5138), we use {{-bootstrapStandby}} to initialize the standby NameNode. The standby NameNode does not have the {{previous}} directory, thus it does not know that the cluster is in the upgrade state. If an NN failover happens, then in response to block reports the new ANN will tell the DNs to finalize the upgrade, making it impossible to roll back again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8140) ECSchema support for offline EditsVisitor over an OEV XML file
Xinwei Qin created HDFS-8140: - Summary: ECSchema support for offline EditsVisitor over an OEV XML file Key: HDFS-8140 URL: https://issues.apache.org/jira/browse/HDFS-8140 Project: Hadoop HDFS Issue Type: Task Affects Versions: HDFS-7285 Reporter: Xinwei Qin Assignee: Xinwei Qin Make the ECSchema info in the editlog support the offline EditsVisitor over an OEV XML file; this is not implemented in HDFS-7859. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8141) Use Certificate as a zone key
Abhijit C Patil created HDFS-8141: - Summary: Use Certificate as a zone key Key: HDFS-8141 URL: https://issues.apache.org/jira/browse/HDFS-8141 Project: Hadoop HDFS Issue Type: Improvement Components: encryption Affects Versions: 2.5.2 Reporter: Abhijit C Patil Priority: Minor Hi, I was looking at HDFS encryption: the way an encrypted zone is created is by first creating a key and then using the created key to create a secure zone. If we were able to create a secure zone by supplying a certificate with a private key, that would make the zone more secure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7349) Support DFS command for the EC encoding
[ https://issues.apache.org/jira/browse/HDFS-7349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7349: Attachment: HDFS-7349-005.patch Attached the updated patch. Changes:
1. Moved {{ECCli}} to the {{tools.erasurecode}} package.
2. Renamed the CLI name to {{erasurecode}} from {{ec}}, so the CLI now has to be invoked as {{hdfs erasurecode}}.
3. Renamed {{getECZoneInfo}} to {{getErasureCodingZoneInfo}} throughout.
4. Added one more command, {{ListECSchemas}}.
Support DFS command for the EC encoding --- Key: HDFS-7349 URL: https://issues.apache.org/jira/browse/HDFS-7349 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-7349-001.patch, HDFS-7349-002.patch, HDFS-7349-003.patch, HDFS-7349-004.patch, HDFS-7349-005.patch Support implementation of the following commands: *hdfs dfs -convertToEC path*: converts all blocks under this path to EC form (if not already in EC form, and if they can be coded). *hdfs dfs -convertToRep path*: converts all blocks under this path to replicated form. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7993) Incorrect descriptions in fsck when nodes are decommissioned
[ https://issues.apache.org/jira/browse/HDFS-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493729#comment-14493729 ] Vinayakumar B commented on HDFS-7993: - The patch needs a rebase after HDFS-7933.
bq. Change from report.append( repl= + liveReplicas); to report.append( repl= + totalReplicas);
In the current output, {{blk_x len=y repl=3 \[dn1, dn2, dn3, dn4\]}}, the count {{repl=3}} gives exactly the count of live replicas, excluding decommission(ing/ed) ones. So I think leaving it as is would be better.
bq. Instead of using DatanodeInfo to find replica details, we can use NumberReplicas instead.
As discussed above, this jira is to add the detail/state about each replica, not just the overall count, which is not available in {{NumberReplicas}}.
bq. If we need to count stale datanodes, we can add another field to NumberReplicas for that.
I think this count will be there for a long time, since the block report interval is long. IMO, if necessary, it may go in a follow-up jira.
Incorrect descriptions in fsck when nodes are decommissioned Key: HDFS-7993 URL: https://issues.apache.org/jira/browse/HDFS-7993 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Ming Ma Assignee: J.Andreina Attachments: HDFS-7993.1.patch, HDFS-7993.2.patch When you run fsck with -files or -racks, you will get something like below if one of the replicas is decommissioned. {noformat} blk_x len=y repl=3 [dn1, dn2, dn3, dn4] {noformat} That is because in NamenodeFsck, the repl count comes from the live replicas count, while the actual nodes come from LocatedBlock, which includes decommissioned nodes. Another issue in NamenodeFsck is that BlockPlacementPolicy's verifyBlockPlacement verifies a LocatedBlock that includes decommissioned nodes. However, it seems better to exclude the decommissioned nodes in the verification, just like how fsck excludes decommissioned nodes when it checks for under-replicated blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
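Purely as an illustration of the per-replica detail being discussed (the exact output format is what this jira is still deciding), such a report could annotate each node with its replica state, e.g.:
{noformat}
blk_x len=y repl=3 [dn1 LIVE, dn2 LIVE, dn3 LIVE, dn4 DECOMMISSIONING]
{noformat}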
[jira] [Commented] (HDFS-7866) Erasure coding: NameNode manages EC schemas
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493727#comment-14493727 ] Xinwei Qin commented on HDFS-7866: --- OK, that sounds good. Erasure coding: NameNode manages EC schemas --- Key: HDFS-7866 URL: https://issues.apache.org/jira/browse/HDFS-7866 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7866-v1.patch, HDFS-7866-v2.patch This is to extend the NameNode to load, list and sync predefined EC schemas in an authorized and controlled approach. The provided facilities will be used to implement DFSAdmin commands so an admin can list the available EC schemas and choose some of them for target EC zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7937) Erasure Coding: INodeFile quota computation unit tests
[ https://issues.apache.org/jira/browse/HDFS-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated HDFS-7937: - Attachment: HDFS-7937.8.patch Erasure Coding: INodeFile quota computation unit tests -- Key: HDFS-7937 URL: https://issues.apache.org/jira/browse/HDFS-7937 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Sasaki Assignee: Kai Sasaki Priority: Minor Attachments: HDFS-7937.1.patch, HDFS-7937.2.patch, HDFS-7937.3.patch, HDFS-7937.4.patch, HDFS-7937.5.patch, HDFS-7937.6.patch, HDFS-7937.7.patch, HDFS-7937.8.patch Unit test for [HDFS-7826|https://issues.apache.org/jira/browse/HDFS-7826] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8123) Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file
[ https://issues.apache.org/jira/browse/HDFS-8123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493715#comment-14493715 ] Kai Zheng commented on HDFS-8123: - 1. Somehow I feel maybe we could use {{erasurecoding}} instead of {{erasurecode}} for the new proto file? What would others think? 2. It might not be good to have the following in the new proto file right now? I'm not sure.
{code}
These .proto interfaces are private and stable.
{code}
[~vinayrpet], I'm not familiar with this aspect. Would you take a look at this and give your comments? Thanks.
Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file --- Key: HDFS-8123 URL: https://issues.apache.org/jira/browse/HDFS-8123 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8123-001.patch, HDFS-8123-002.patch, HDFS-8123-003.patch While reviewing the code I've noticed EC-related proto messages are getting added into {{hdfs.proto}}. IMHO, for better maintainability of the erasure code feature, it's good to move these to a separate {{erasurecode.proto}} file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7866) Erasure coding: NameNode manages EC schemas
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493720#comment-14493720 ] Xinwei Qin commented on HDFS-7866: --- Hi, [~drankye]
{code}
+/**
+ * TODO: HDFS-7859 persist into NameNode
+ * load persistent schemas from image and editlog, which is done only once
+ * during NameNode startup. This can be done here or in a separate method.
+ */
{code}
This annotation can be removed. Loading persistent schemas from the fsimage and editlog is now done in the {{loadECSchemas()}} or {{loadState()}} method, and these methods are called during NameNode startup.
Erasure coding: NameNode manages EC schemas --- Key: HDFS-7866 URL: https://issues.apache.org/jira/browse/HDFS-7866 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7866-v1.patch, HDFS-7866-v2.patch This is to extend the NameNode to load, list and sync predefined EC schemas in an authorized and controlled approach. The provided facilities will be used to implement DFSAdmin commands so an admin can list the available EC schemas and choose some of them for target EC zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
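A rough sketch of the split described above (the loader names come from the comment; the format flag and dispatch shape are assumptions for illustration):
{code}
// Illustrative only: which loader runs depends on the fsimage format.
// Both paths execute only at NameNode startup or while checkpointing.
void loadSchemas(boolean protobufImage, DataInput in) throws IOException {
  if (protobufImage) {
    loadState(in);       // newer, protobuf-based fsimage section
  } else {
    loadECSchemas(in);   // legacy image format
  }
}
{code}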
[jira] [Commented] (HDFS-8120) Erasure coding: created util class to analyze striped block groups
[ https://issues.apache.org/jira/browse/HDFS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493774#comment-14493774 ] Li Bo commented on HDFS-8120: - Hi Zhe, I updated the branch code and found it fails to build. The trunk code builds successfully. Could you check this problem, which is possibly caused by the code merging? Erasure coding: created util class to analyze striped block groups -- Key: HDFS-8120 URL: https://issues.apache.org/jira/browse/HDFS-8120 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8120.000.patch, HDFS-8120.001.patch, HDFS-8120.002.patch, HDFS-8120.003.patch The patch adds logic for calculating the size of individual blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7934) During Rolling upgrade rollback ,standby namenode startup fails.
[ https://issues.apache.org/jira/browse/HDFS-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] J.Andreina updated HDFS-7934: - Attachment: HDFS-7934.2.patch Uploaded the patch modifying the steps for rolling upgrade rollback. Please review. During rolling upgrade rollback, standby namenode startup fails. Key: HDFS-7934 URL: https://issues.apache.org/jira/browse/HDFS-7934 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Assignee: J.Andreina Priority: Critical Attachments: HDFS-7934.1.patch, HDFS-7934.2.patch During rolling upgrade rollback, standby namenode startup fails while loading edits when there is no local copy of the edits created after the upgrade (which has already been removed by the active namenode from the journal manager and from the active's local storage). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7701) Support reporting per storage type quota and usage with hadoop/hdfs shell
[ https://issues.apache.org/jira/browse/HDFS-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493942#comment-14493942 ] Hudson commented on HDFS-7701: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #163 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/163/]) HDFS-7701. Support reporting per storage type quota and usage with hadoop/hdfs shell. (Contributed by Peter Shi) (arp: rev 18a3dad44afd8061643fffc5bbe50fa66e47b72c)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Count.java
* hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ContentSummary.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java
Support reporting per storage type quota and usage with hadoop/hdfs shell - Key: HDFS-7701 URL: https://issues.apache.org/jira/browse/HDFS-7701 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Peter Shi Fix For: 2.8.0 Attachments: HDFS-7701.01.patch, HDFS-7701.02.patch, HDFS-7701.03.patch, HDFS-7701.04.patch, HDFS-7701.05.patch, HDFS-7701.06.patch hadoop fs -count -q or hdfs dfs -count -q currently shows name space/disk space quota and remaining quota information. With HDFS-7584, we want to display per storage type quota and its remaining information as well. The current output format as shown below may not easily accommodate 6 more columns = 3 (existing storage types) * 2 (quota/remaining quota). With new storage types added in the future, this will make the output even more crowded. There are also compatibility issues, as we don't want to break any existing scripts monitoring hadoop fs -count -q output.
$ hadoop fs -count -q -v /test
QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
none inf 524288000 524266569 1 15 21431 /test
Propose to add a -t parameter to display ONLY the storage type quota information of the directory separately. This way, existing scripts will work as-is without using the -t parameter.
1) When -t is not followed by a specific storage type, quota and usage information for all storage types will be displayed.
$ hadoop fs -count -q -t -h -v /test
SSD_QUOTA REM_SSD_QUOTA DISK_QUOTA REM_DISK_QUOTA ARCHIVAL_QUOTA REM_ARCHIVAL_QUOTA PATHNAME
512MB 256MB none inf none inf /test
2) If -t is followed by a storage type, only the quota and remaining quota of the storage type is displayed.
$ hadoop fs -count -q -t SSD -h -v /test
SSD_QUOTA REM_SSD_QUOTA PATHNAME
512 MB 256 MB /test
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8117) More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data
[ https://issues.apache.org/jira/browse/HDFS-8117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493933#comment-14493933 ] Hudson commented on HDFS-8117: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #163 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/163/]) HDFS-8117. More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data. Contributed by Zhe Zhang. (wang: rev d60e22152ac098da103fd37fb81f8758e68d1efa)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestSimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSmallBlock.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data -- Key: HDFS-8117 URL: https://issues.apache.org/jira/browse/HDFS-8117 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: 3.0.0 Attachments: HDFS-8117-branch2.patch, HDFS-8117.000.patch, HDFS-8117.001.patch, HDFS-8117.002.patch, HDFS-8117.003.patch Currently {{SimulatedFSDataset}} uses a single {{DEFAULT_DATABYTE}} to simulate _all_ block content. This is not accurate because the return of this byte just means the read request has hit an arbitrary position in an arbitrary simulated block. This JIRA aims to improve it with a more accurate verification. When position {{p}} of a simulated block {{b}} is accessed, the returned byte is {{b}}'s block ID plus {{p}}, modulo the max value of a byte. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
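The verification rule in the description is easy to state in code. A minimal sketch of the patterned byte (the method name and masking details are illustrative, not necessarily the exact patch):
{code}
// Patterned simulated data: the byte at position p of block b depends
// on both the block ID and p, so a read can be verified against the
// exact (block, offset) it claims to come from.
static byte simulatedByte(long blockId, long p) {
  byte firstByte = (byte) (blockId & Byte.MAX_VALUE);
  return (byte) ((firstByte + p) % (Byte.MAX_VALUE + 1));
}
{code}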
[jira] [Commented] (HDFS-8083) Separate the client write conf from DFSConfigKeys
[ https://issues.apache.org/jira/browse/HDFS-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493936#comment-14493936 ] Hudson commented on HDFS-8083: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #163 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/163/]) HDFS-8083. Move dfs.client.write.* conf from DFSConfigKeys to HdfsClientConfigKeys.Write. (szetszwo: rev 7fc50e2525b8b8fe36d92e283a68eeeb09c63d21)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/ReplaceDatanodeOnFailure.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/EnumSetParam.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestByteArrayManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientExcludedNodes.java
Separate the client write conf from DFSConfigKeys - Key: HDFS-8083 URL: https://issues.apache.org/jira/browse/HDFS-8083 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Fix For: 2.8.0 Attachments: h8083_20150410.patch As part of HDFS-8050, move the dfs.client.write.* conf from DFSConfigKeys to a new class HdfsClientConfigKeys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8111) NPE thrown when invalid FSImage filename given for hdfs oiv_legacy cmd
[ https://issues.apache.org/jira/browse/HDFS-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493943#comment-14493943 ] Hudson commented on HDFS-8111: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #163 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/163/]) HDFS-8111. NPE thrown when invalid FSImage filename given for 'hdfs oiv_legacy' cmd ( Contributed by surendra singh lilhore ) (vinayakumarb: rev 14384f5b5142a98a10ce4bffadeb13e89bda9365)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewer.java
NPE thrown when invalid FSImage filename given for hdfs oiv_legacy cmd Key: HDFS-8111 URL: https://issues.apache.org/jira/browse/HDFS-8111 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.6.0 Reporter: Archana T Assignee: surendra singh lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8111.patch NPE thrown when an invalid filename is given as the argument for the hdfs oiv_legacy command:
{code}
./hdfs oiv_legacy -i /home/hadoop/hadoop/hadoop-3.0.0/dfs/name/current/fsimage_00042 -o fsimage.txt
Exception in thread "main" java.lang.NullPointerException
    at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.go(OfflineImageViewer.java:140)
    at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.main(OfflineImageViewer.java:260)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8092) dfs -count -q should not consider snapshots under REM_QUOTA
[ https://issues.apache.org/jira/browse/HDFS-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493997#comment-14493997 ] Rakesh R commented on HDFS-8092: Thanks [~archanat] for reporting this. Thanks [~aw] for the interest and the comments.
bq. Snapshots should most definitely be considered part of the quota calculation. They are not free and do take up space.
My observation is that snapshots are not considered while verifying the quota, but the {{ContentSummary}} object has the logic of calculating the remaining quota as follows. Here, the {{directoryCount}} includes the snapshots and causes the {{REM_QUOTA}} to evaluate to a negative number, which is misleading, isn't it?
{code}
if (quota > 0) {
  quotaStr = formatSize(quota, hOption);
  quotaRem = formatSize(quota - (directoryCount + fileCount), hOption);
}
{code}
Snapshots have default quota limits. Presently there is no way (command) to set the quota value for snapshots.
{code}
DirectorySnapshottableFeature.java
  /** Number of snapshots allowed. */
  private int snapshotQuota = SNAPSHOT_LIMIT;
{code}
dfs -count -q should not consider snapshots under REM_QUOTA --- Key: HDFS-8092 URL: https://issues.apache.org/jira/browse/HDFS-8092 Project: Hadoop HDFS Issue Type: Bug Components: snapshots, tools Reporter: Archana T Assignee: Rakesh R Priority: Minor dfs -count -q should not consider snapshots under the remaining quota. List of operations performed:
1. hdfs dfs -mkdir /Dir1
2. hdfs dfsadmin -setQuota 2 /Dir1
3. hadoop fs -count -q -h -v /Dir1
QUOTA {color:red}REM_QUOTA{color} SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
2 {color:red}1{color} none inf 1 0 0 /Dir1
4. hdfs dfs -put hdfs /Dir1/f1
5. hadoop fs -count -q -h -v /Dir1
QUOTA {color:red}REM_QUOTA{color} SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
2 {color:red}0{color} none inf 1 1 11.4 K /Dir1
6. hdfs dfsadmin -allowSnapshot /Dir1
7. hdfs dfs -createSnapshot /Dir1
8. hadoop fs -count -q -h -v /Dir1
QUOTA {color:red}REM_QUOTA{color} SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
2 {color:red}-1{color} none inf 2 1 11.4 K /Dir1
Whenever a snapshot is created, the value of REM_QUOTA gets decremented. Since snapshot creation is not counted against the quota of the respective directory, dfs -count should not decrement the REM_QUOTA value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
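For what it's worth, a minimal sketch of the kind of fix this implies, assuming hypothetical snapshot-exclusive counters (the real {{ContentSummary}} fields may differ):
{code}
// Hypothetical sketch: exclude snapshot-only items before computing
// the remaining name quota. snapshotDirectoryCount/snapshotFileCount
// are assumed counters, shown for illustration only.
static long remainingNameQuota(long quota, long directoryCount,
    long fileCount, long snapshotDirectoryCount, long snapshotFileCount) {
  long counted = (directoryCount - snapshotDirectoryCount)
      + (fileCount - snapshotFileCount);
  return quota - counted;
}
{code}
With the Step 8 numbers above, this would report 2 - ((2 - 1) + (1 - 0)) = 0 instead of -1.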
[jira] [Commented] (HDFS-7866) Erasure coding: NameNode manages EC schemas
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493863#comment-14493863 ] Kai Zheng commented on HDFS-7866: - Hi [~vinayrpet], thanks for your good comments. Patch updated. Would you review it one more time? Thanks. Erasure coding: NameNode manages EC schemas --- Key: HDFS-7866 URL: https://issues.apache.org/jira/browse/HDFS-7866 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7866-v1.patch, HDFS-7866-v2.patch, HDFS-7866-v3.patch This is to extend the NameNode to load, list and sync predefined EC schemas in an authorized and controlled approach. The provided facilities will be used to implement DFSAdmin commands so an admin can list the available EC schemas and choose some of them for target EC zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8127) NameNode Failover during HA upgrade can cause DataNode to finalize upgrade
[ https://issues.apache.org/jira/browse/HDFS-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493870#comment-14493870 ] Vinayakumar B commented on HDFS-8127: - The same, or a slightly different, problem could occur with rolling upgrade together with bootstrapStandby. The impact may not be as serious, since datanodes will not finalize if they don't find the rolling upgrade status. NameNode Failover during HA upgrade can cause DataNode to finalize upgrade -- Key: HDFS-8127 URL: https://issues.apache.org/jira/browse/HDFS-8127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Blocker Attachments: HDFS-8127.000.patch, HDFS-8127.001.patch Currently for HA upgrade (enabled by HDFS-5138), we use {{-bootstrapStandby}} to initialize the standby NameNode. The standby NameNode does not have the {{previous}} directory, thus it does not know that the cluster is in the upgrade state. If an NN failover happens, then in response to block reports the new ANN will tell the DNs to finalize the upgrade, making it impossible to roll back again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8033) Erasure coding: stateful (non-positional) read from files in striped layout
[ https://issues.apache.org/jira/browse/HDFS-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493886#comment-14493886 ] GAO Rui commented on HDFS-8033: --- Zhe, thank you very much for your help. I understand read() and pread() now. Erasure coding: stateful (non-positional) read from files in striped layout --- Key: HDFS-8033 URL: https://issues.apache.org/jira/browse/HDFS-8033 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8033.000.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8117) More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data
[ https://issues.apache.org/jira/browse/HDFS-8117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493980#comment-14493980 ] Hudson commented on HDFS-8117: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #154 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/154/]) HDFS-8117. More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data. Contributed by Zhe Zhang. (wang: rev d60e22152ac098da103fd37fb81f8758e68d1efa)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestSimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSmallBlock.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data -- Key: HDFS-8117 URL: https://issues.apache.org/jira/browse/HDFS-8117 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: 3.0.0 Attachments: HDFS-8117-branch2.patch, HDFS-8117.000.patch, HDFS-8117.001.patch, HDFS-8117.002.patch, HDFS-8117.003.patch Currently {{SimulatedFSDataset}} uses a single {{DEFAULT_DATABYTE}} to simulate _all_ block content. This is not accurate because the return of this byte just means the read request has hit an arbitrary position in an arbitrary simulated block. This JIRA aims to improve it with a more accurate verification. When position {{p}} of a simulated block {{b}} is accessed, the returned byte is {{b}}'s block ID plus {{p}}, modulo the max value of a byte. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7701) Support reporting per storage type quota and usage with hadoop/hdfs shell
[ https://issues.apache.org/jira/browse/HDFS-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493989#comment-14493989 ] Hudson commented on HDFS-7701: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #154 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/154/]) HDFS-7701. Support reporting per storage type quota and usage with hadoop/hdfs shell. (Contributed by Peter Shi) (arp: rev 18a3dad44afd8061643fffc5bbe50fa66e47b72c)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ContentSummary.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Count.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
Support reporting per storage type quota and usage with hadoop/hdfs shell - Key: HDFS-7701 URL: https://issues.apache.org/jira/browse/HDFS-7701 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Peter Shi Fix For: 2.8.0 Attachments: HDFS-7701.01.patch, HDFS-7701.02.patch, HDFS-7701.03.patch, HDFS-7701.04.patch, HDFS-7701.05.patch, HDFS-7701.06.patch hadoop fs -count -q or hdfs dfs -count -q currently shows name space/disk space quota and remaining quota information. With HDFS-7584, we want to display per storage type quota and its remaining information as well. The current output format as shown below may not easily accommodate 6 more columns = 3 (existing storage types) * 2 (quota/remaining quota). With new storage types added in the future, this will make the output even more crowded. There are also compatibility issues, as we don't want to break any existing scripts monitoring hadoop fs -count -q output.
$ hadoop fs -count -q -v /test
QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
none inf 524288000 524266569 1 15 21431 /test
Propose to add a -t parameter to display ONLY the storage type quota information of the directory separately. This way, existing scripts will work as-is without using the -t parameter.
1) When -t is not followed by a specific storage type, quota and usage information for all storage types will be displayed.
$ hadoop fs -count -q -t -h -v /test
SSD_QUOTA REM_SSD_QUOTA DISK_QUOTA REM_DISK_QUOTA ARCHIVAL_QUOTA REM_ARCHIVAL_QUOTA PATHNAME
512MB 256MB none inf none inf /test
2) If -t is followed by a storage type, only the quota and remaining quota of the storage type is displayed.
$ hadoop fs -count -q -t SSD -h -v /test
SSD_QUOTA REM_SSD_QUOTA PATHNAME
512 MB 256 MB /test
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8117) More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data
[ https://issues.apache.org/jira/browse/HDFS-8117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493966#comment-14493966 ] Hudson commented on HDFS-8117: -- FAILURE: Integrated in Hadoop-Yarn-trunk #897 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/897/]) HDFS-8117. More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data. Contributed by Zhe Zhang. (wang: rev d60e22152ac098da103fd37fb81f8758e68d1efa)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestSimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSmallBlock.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java
More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data -- Key: HDFS-8117 URL: https://issues.apache.org/jira/browse/HDFS-8117 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: 3.0.0 Attachments: HDFS-8117-branch2.patch, HDFS-8117.000.patch, HDFS-8117.001.patch, HDFS-8117.002.patch, HDFS-8117.003.patch Currently {{SimulatedFSDataset}} uses a single {{DEFAULT_DATABYTE}} to simulate _all_ block content. This is not accurate because the return of this byte just means the read request has hit an arbitrary position in an arbitrary simulated block. This JIRA aims to improve it with a more accurate verification. When position {{p}} of a simulated block {{b}} is accessed, the returned byte is {{b}}'s block ID plus {{p}}, modulo the max value of a byte. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8111) NPE thrown when invalid FSImage filename given for hdfs oiv_legacy cmd
[ https://issues.apache.org/jira/browse/HDFS-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493976#comment-14493976 ] Hudson commented on HDFS-8111: -- FAILURE: Integrated in Hadoop-Yarn-trunk #897 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/897/]) HDFS-8111. NPE thrown when invalid FSImage filename given for 'hdfs oiv_legacy' cmd ( Contributed by surendra singh lilhore ) (vinayakumarb: rev 14384f5b5142a98a10ce4bffadeb13e89bda9365)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
NPE thrown when invalid FSImage filename given for hdfs oiv_legacy cmd Key: HDFS-8111 URL: https://issues.apache.org/jira/browse/HDFS-8111 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.6.0 Reporter: Archana T Assignee: surendra singh lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8111.patch NPE thrown when an invalid filename is given as the argument for the hdfs oiv_legacy command:
{code}
./hdfs oiv_legacy -i /home/hadoop/hadoop/hadoop-3.0.0/dfs/name/current/fsimage_00042 -o fsimage.txt
Exception in thread "main" java.lang.NullPointerException
    at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.go(OfflineImageViewer.java:140)
    at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.main(OfflineImageViewer.java:260)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7866) Erasure coding: NameNode manages EC schemas
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-7866: Attachment: HDFS-7866-v3.patch Updated the patch according to the review comments. Erasure coding: NameNode manages EC schemas --- Key: HDFS-7866 URL: https://issues.apache.org/jira/browse/HDFS-7866 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7866-v1.patch, HDFS-7866-v2.patch, HDFS-7866-v3.patch This is to extend the NameNode to load, list and sync predefined EC schemas in an authorized and controlled approach. The provided facilities will be used to implement DFSAdmin commands so an admin can list the available EC schemas and choose some of them for target EC zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7349) Support DFS command for the EC encoding
[ https://issues.apache.org/jira/browse/HDFS-7349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493867#comment-14493867 ] Kai Zheng commented on HDFS-7349: - I'm OK with these renamings. Maybe we can revisit them together in HDFS-8129 sometime later. What do you think, [~umamaheswararao]? Thanks. Support DFS command for the EC encoding --- Key: HDFS-7349 URL: https://issues.apache.org/jira/browse/HDFS-7349 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-7349-001.patch, HDFS-7349-002.patch, HDFS-7349-003.patch, HDFS-7349-004.patch, HDFS-7349-005.patch Support implementation of the following commands: *hdfs dfs -convertToEC path*: converts all blocks under this path to EC form (if not already in EC form, and if they can be coded). *hdfs dfs -convertToRep path*: converts all blocks under this path to replicated form. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7866) Erasure coding: NameNode manages EC schemas
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493922#comment-14493922 ] Vinayakumar B commented on HDFS-7866: - Some more nits.
1. The following changes are not required. Why have the same value in two places, once in a variable and once in the map? In fact, you need to remove it from the map when assigning it to the variable.
{code}
+    if (options == null) {
+      options = new HashMap<String, String>();
+    }
+    options.put(CODEC_NAME_KEY, codecName);
+    options.put(NUM_DATA_UNITS_KEY, String.valueOf(numDataUnits));
+    options.put(NUM_PARITY_UNITS_KEY, String.valueOf(numParityUnits));
+    this.options = Collections.unmodifiableMap(options);
{code}
2. {{ECSchemaManager#reloadPredefined()}} should be called on initialization, maybe in the constructor.
Erasure coding: NameNode manages EC schemas --- Key: HDFS-7866 URL: https://issues.apache.org/jira/browse/HDFS-7866 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7866-v1.patch, HDFS-7866-v2.patch, HDFS-7866-v3.patch This is to extend the NameNode to load, list and sync predefined EC schemas in an authorized and controlled approach. The provided facilities will be used to implement DFSAdmin commands so an admin can list the available EC schemas and choose some of them for target EC zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
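To spell out the reviewer's suggestion, a sketch of the corrected construction (assuming the constructor shape implied by the diff above; illustrative, not the committed code):
{code}
// Keep each value in exactly one place: the dedicated fields hold
// codecName/numDataUnits/numParityUnits, so strip those keys out of
// the generic options map instead of duplicating them there.
Map<String, String> opts = (options == null)
    ? new HashMap<String, String>()
    : new HashMap<String, String>(options);
opts.remove(CODEC_NAME_KEY);
opts.remove(NUM_DATA_UNITS_KEY);
opts.remove(NUM_PARITY_UNITS_KEY);
this.options = Collections.unmodifiableMap(opts);
{code}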
[jira] [Commented] (HDFS-8117) More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data
[ https://issues.apache.org/jira/browse/HDFS-8117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493950#comment-14493950 ] Hudson commented on HDFS-8117: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2095 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2095/]) HDFS-8117. More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data. Contributed by Zhe Zhang. (wang: rev d60e22152ac098da103fd37fb81f8758e68d1efa)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSmallBlock.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestSimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data -- Key: HDFS-8117 URL: https://issues.apache.org/jira/browse/HDFS-8117 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: 3.0.0 Attachments: HDFS-8117-branch2.patch, HDFS-8117.000.patch, HDFS-8117.001.patch, HDFS-8117.002.patch, HDFS-8117.003.patch Currently {{SimulatedFSDataset}} uses a single {{DEFAULT_DATABYTE}} to simulate _all_ block content. This is not accurate because the return of this byte just means the read request has hit an arbitrary position in an arbitrary simulated block. This JIRA aims to improve it with a more accurate verification. When position {{p}} of a simulated block {{b}} is accessed, the returned byte is {{b}}'s block ID plus {{p}}, modulo the max value of a byte. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8083) Separate the client write conf from DFSConfigKeys
[ https://issues.apache.org/jira/browse/HDFS-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493953#comment-14493953 ] Hudson commented on HDFS-8083: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2095 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2095/]) HDFS-8083. Move dfs.client.write.* conf from DFSConfigKeys to HdfsClientConfigKeys.Write. (szetszwo: rev 7fc50e2525b8b8fe36d92e283a68eeeb09c63d21) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestByteArrayManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/ReplaceDatanodeOnFailure.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/EnumSetParam.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientExcludedNodes.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java Separate the client write conf from DFSConfigKeys - Key: HDFS-8083 URL: https://issues.apache.org/jira/browse/HDFS-8083 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Fix For: 2.8.0 Attachments: h8083_20150410.patch A part of HDFS-8050, move dfs.client.write.* conf from DFSConfigKeys to a new class HdfsClientConfigKeys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8111) NPE thrown when invalid FSImage filename given for hdfs oiv_legacy cmd
[ https://issues.apache.org/jira/browse/HDFS-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493960#comment-14493960 ] Hudson commented on HDFS-8111: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2095 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2095/]) HDFS-8111. NPE thrown when invalid FSImage filename given for 'hdfs oiv_legacy' cmd ( Contributed by surendra singh lilhore ) (vinayakumarb: rev 14384f5b5142a98a10ce4bffadeb13e89bda9365) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewer.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NPE thrown when invalid FSImage filename given for hdfs oiv_legacy cmd Key: HDFS-8111 URL: https://issues.apache.org/jira/browse/HDFS-8111 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.6.0 Reporter: Archana T Assignee: surendra singh lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8111.patch An NPE is thrown when an invalid filename is given as the argument to the hdfs oiv_legacy command {code} ./hdfs oiv_legacy -i /home/hadoop/hadoop/hadoop-3.0.0/dfs/name/current/fsimage_00042 -o fsimage.txt Exception in thread "main" java.lang.NullPointerException at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.go(OfflineImageViewer.java:140) at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.main(OfflineImageViewer.java:260) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
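For illustration, the fail-fast check involved might look like this sketch (a hypothetical helper; the actual change is in {{OfflineImageViewer}} and may differ):
{code}
import java.io.File;
import java.io.FileNotFoundException;

class ImageInputCheck {
  // Fail fast with a descriptive error instead of letting a null stream
  // propagate into go() and surface as a bare NullPointerException.
  static void checkInputImage(String inputFileName)
      throws FileNotFoundException {
    File imageFile = new File(inputFileName);
    if (!imageFile.isFile()) {
      throw new FileNotFoundException(
          "FSImage file " + inputFileName + " does not exist.");
    }
  }
}
{code}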
[jira] [Commented] (HDFS-7701) Support reporting per storage type quota and usage with hadoop/hdfs shell
[ https://issues.apache.org/jira/browse/HDFS-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493959#comment-14493959 ] Hudson commented on HDFS-7701: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2095 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2095/]) HDFS-7701. Support reporting per storage type quota and usage with hadoop/hdfs shell. (Contributed by Peter Shi) (arp: rev 18a3dad44afd8061643fffc5bbe50fa66e47b72c) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Count.java * hadoop-common-project/hadoop-common/src/test/resources/testConf.xml * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ContentSummary.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Support reporting per storage type quota and usage with hadoop/hdfs shell - Key: HDFS-7701 URL: https://issues.apache.org/jira/browse/HDFS-7701 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Peter Shi Fix For: 2.8.0 Attachments: HDFS-7701.01.patch, HDFS-7701.02.patch, HDFS-7701.03.patch, HDFS-7701.04.patch, HDFS-7701.05.patch, HDFS-7701.06.patch hadoop fs -count -q or hdfs dfs -count -q currently shows name space/disk space quota and remaining quota information. With HDFS-7584, we want to display per storage type quota and its remaining information as well. The current output format as shown below may not easily accommodate 6 more columns = 3 (existing storage types) * 2 (quota/remaining quota). With new storage types added in future, this will make the output even more crowded. There are also compatibility issues as we don't want to break any existing scripts monitoring hadoop fs -count -q output. $ hadoop fs -count -q -v /test QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME none inf 524288000 5242665691 15 21431 /test Propose to add a -t parameter to display ONLY the storage type quota information of the directory separately. This way, existing scripts will work as-is without using the -t parameter. 1) When -t is not followed by a specific storage type, quota and usage information for all storage types will be displayed. $ hadoop fs -count -q -t -h -v /test SSD_QUOTA REM_SSD_QUOTA DISK_QUOTA REM_DISK_QUOTA ARCHIVAL_QUOTA REM_ARCHIVAL_QUOTA PATHNAME 512MB 256MB none inf none inf /test 2) If -t is followed by a storage type, only the quota and remaining quota of the storage type is displayed. $ hadoop fs -count -q -t SSD -h -v /test SSD_QUOTA REM_SSD_QUOTA PATHNAME 512 MB 256 MB /test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
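For illustration, the header selection the proposal describes could be sketched as follows (a hypothetical helper, not the committed {{Count}}/{{ContentSummary}} code):
{code}
import java.util.Arrays;
import java.util.List;

class StorageTypeHeaderSketch {
  // Build the header row for "count -q -t [type]": every storage type
  // when none is named, otherwise only the requested ones.
  static String header(List<String> requestedTypes) {
    List<String> types = requestedTypes.isEmpty()
        ? Arrays.asList("SSD", "DISK", "ARCHIVAL") : requestedTypes;
    StringBuilder sb = new StringBuilder();
    for (String t : types) {
      sb.append(t).append("_QUOTA REM_").append(t).append("_QUOTA ");
    }
    return sb.append("PATHNAME").toString();
  }
}
{code}
Called with an empty list it yields the all-types header of example 1; called with a single type it yields the narrow header of example 2.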
[jira] [Commented] (HDFS-7701) Support reporting per storage type quota and usage with hadoop/hdfs shell
[ https://issues.apache.org/jira/browse/HDFS-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493975#comment-14493975 ] Hudson commented on HDFS-7701: -- FAILURE: Integrated in Hadoop-Yarn-trunk #897 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/897/]) HDFS-7701. Support reporting per storage type quota and usage with hadoop/hdfs shell. (Contributed by Peter Shi) (arp: rev 18a3dad44afd8061643fffc5bbe50fa66e47b72c) * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Count.java * hadoop-common-project/hadoop-common/src/test/resources/testConf.xml * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ContentSummary.java Support reporting per storage type quota and usage with hadoop/hdfs shell - Key: HDFS-7701 URL: https://issues.apache.org/jira/browse/HDFS-7701 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Peter Shi Fix For: 2.8.0 Attachments: HDFS-7701.01.patch, HDFS-7701.02.patch, HDFS-7701.03.patch, HDFS-7701.04.patch, HDFS-7701.05.patch, HDFS-7701.06.patch hadoop fs -count -q or hdfs dfs -count -q currently shows name space/disk space quota and remaining quota information. With HDFS-7584, we want to display per storage type quota and its remaining information as well. The current output format as shown below may not easily accommodate 6 more columns = 3 (existing storage types) * 2 (quota/remaining quota). With new storage types added in future, this will make the output even more crowded. There are also compatibility issues as we don't want to break any existing scripts monitoring hadoop fs -count -q output. $ hadoop fs -count -q -v /test QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME none inf 524288000 5242665691 15 21431 /test Propose to add a -t parameter to display ONLY the storage type quota information of the directory separately. This way, existing scripts will work as-is without using the -t parameter. 1) When -t is not followed by a specific storage type, quota and usage information for all storage types will be displayed. $ hadoop fs -count -q -t -h -v /test SSD_QUOTA REM_SSD_QUOTA DISK_QUOTA REM_DISK_QUOTA ARCHIVAL_QUOTA REM_ARCHIVAL_QUOTA PATHNAME 512MB 256MB none inf none inf /test 2) If -t is followed by a storage type, only the quota and remaining quota of the storage type is displayed. $ hadoop fs -count -q -t SSD -h -v /test SSD_QUOTA REM_SSD_QUOTA PATHNAME 512 MB 256 MB /test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8083) Separate the client write conf from DFSConfigKeys
[ https://issues.apache.org/jira/browse/HDFS-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493983#comment-14493983 ] Hudson commented on HDFS-8083: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #154 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/154/]) HDFS-8083. Move dfs.client.write.* conf from DFSConfigKeys to HdfsClientConfigKeys.Write. (szetszwo: rev 7fc50e2525b8b8fe36d92e283a68eeeb09c63d21) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientExcludedNodes.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/EnumSetParam.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestByteArrayManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/ReplaceDatanodeOnFailure.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java Separate the client write conf from DFSConfigKeys - Key: HDFS-8083 URL: https://issues.apache.org/jira/browse/HDFS-8083 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Fix For: 2.8.0 Attachments: h8083_20150410.patch A part of HDFS-8050, move dfs.client.write.* conf from DFSConfigKeys to a new class HdfsClientConfigKeys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8111) NPE thrown when invalid FSImage filename given for hdfs oiv_legacy cmd
[ https://issues.apache.org/jira/browse/HDFS-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493990#comment-14493990 ] Hudson commented on HDFS-8111: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #154 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/154/]) HDFS-8111. NPE thrown when invalid FSImage filename given for 'hdfs oiv_legacy' cmd ( Contributed by surendra singh lilhore ) (vinayakumarb: rev 14384f5b5142a98a10ce4bffadeb13e89bda9365) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewer.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NPE thrown when invalid FSImage filename given for hdfs oiv_legacy cmd Key: HDFS-8111 URL: https://issues.apache.org/jira/browse/HDFS-8111 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.6.0 Reporter: Archana T Assignee: surendra singh lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8111.patch An NPE is thrown when an invalid filename is given as the argument to the hdfs oiv_legacy command {code} ./hdfs oiv_legacy -i /home/hadoop/hadoop/hadoop-3.0.0/dfs/name/current/fsimage_00042 -o fsimage.txt Exception in thread "main" java.lang.NullPointerException at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.go(OfflineImageViewer.java:140) at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.main(OfflineImageViewer.java:260) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8083) Separate the client write conf from DFSConfigKeys
[ https://issues.apache.org/jira/browse/HDFS-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493969#comment-14493969 ] Hudson commented on HDFS-8083: -- FAILURE: Integrated in Hadoop-Yarn-trunk #897 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/897/]) HDFS-8083. Move dfs.client.write.* conf from DFSConfigKeys to HdfsClientConfigKeys.Write. (szetszwo: rev 7fc50e2525b8b8fe36d92e283a68eeeb09c63d21) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestByteArrayManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientExcludedNodes.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/EnumSetParam.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/ReplaceDatanodeOnFailure.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Separate the client write conf from DFSConfigKeys - Key: HDFS-8083 URL: https://issues.apache.org/jira/browse/HDFS-8083 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Fix For: 2.8.0 Attachments: h8083_20150410.patch A part of HDFS-8050, move dfs.client.write.* conf from DFSConfigKeys to a new class HdfsClientConfigKeys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6666) Abort NameNode and DataNode startup if security is enabled but block access token is not enabled.
[ https://issues.apache.org/jira/browse/HDFS-6666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6666: Affects Version/s: (was: 2.5.0) (was: 3.0.0) 2.7.1 Abort NameNode and DataNode startup if security is enabled but block access token is not enabled. - Key: HDFS-6666 URL: https://issues.apache.org/jira/browse/HDFS-6666 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode, security Affects Versions: 2.7.1 Reporter: Chris Nauroth Assignee: Vijay Bhat Priority: Minor Fix For: 2.8.0 Attachments: HDFS-6666.001.patch, HDFS-6666.002.patch, HDFS-6666.003.patch, HDFS-6666.004.patch, HDFS-6666.005.patch Currently, if security is enabled by setting hadoop.security.authentication to kerberos, but HDFS block access tokens are disabled by setting dfs.block.access.token.enable to false (which is the default), then the NameNode logs an error and proceeds, and the DataNode proceeds without even logging an error. This jira proposes that it's invalid to turn on security but not turn on block access tokens, and that it would be better to fail fast and abort the daemons during startup if this happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6666) Abort NameNode and DataNode startup if security is enabled but block access token is not enabled.
[ https://issues.apache.org/jira/browse/HDFS-6666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6666: Resolution: Fixed Fix Version/s: 2.8.0 Target Version/s: 2.8.0 (was: 3.0.0, 2.6.0) Status: Resolved (was: Patch Available) +1 for patch v005. I committed this to trunk and branch-2. {{TestSecureNameNode}} does not exist on branch-2, so I simply removed that when I cherry-picked. Vijay, thank you for the patch. Arpit, thank you for help with the code review. Abort NameNode and DataNode startup if security is enabled but block access token is not enabled. - Key: HDFS-6666 URL: https://issues.apache.org/jira/browse/HDFS-6666 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode, security Affects Versions: 2.7.1 Reporter: Chris Nauroth Assignee: Vijay Bhat Priority: Minor Fix For: 2.8.0 Attachments: HDFS-6666.001.patch, HDFS-6666.002.patch, HDFS-6666.003.patch, HDFS-6666.004.patch, HDFS-6666.005.patch Currently, if security is enabled by setting hadoop.security.authentication to kerberos, but HDFS block access tokens are disabled by setting dfs.block.access.token.enable to false (which is the default), then the NameNode logs an error and proceeds, and the DataNode proceeds without even logging an error. This jira proposes that it's invalid to turn on security but not turn on block access tokens, and that it would be better to fail fast and abort the daemons during startup if this happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8055) NullPointerException when topology script is missing.
[ https://issues.apache.org/jira/browse/HDFS-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-8055: Resolution: Fixed Fix Version/s: 2.8.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) +1 for patch v003. I committed this to trunk and branch-2. Anu, thank you for tracking down the bug and providing a fix. Arpit, thank you for the help with the code review. NullPointerException when topology script is missing. - Key: HDFS-8055 URL: https://issues.apache.org/jira/browse/HDFS-8055 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Anu Engineer Assignee: Anu Engineer Fix For: 2.8.0 Attachments: hdfs-8055.001.patch, hdfs-8055.002.patch, hdfs-8055.003.patch We've received reports that the NameNode can get a NullPointerException when the topology script is missing. This issue tracks investigating whether or not we can improve the validation logic and give a more informative error message. Here is a sample stack trace: Getting NPE from HDFS: 2015-02-06 23:02:12,250 ERROR [pool-4-thread-1] util.HFileV1Detector: Got exception while reading trailer for file:hdfs://hqhd02nm01.pclc0.merkle.local:8020/hbase/.META./1028785192/info/1490a396aea448b693da563f76a28486 org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:359) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1789) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) at org.apache.hadoop.ipc.Client.call(Client.java:1468) at org.apache.hadoop.ipc.Client.call(Client.java:1399) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) at com.sun.proxy.$Proxy14.getBlockLocations(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:254) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1220) at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1210) at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1200) at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:271) at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:238) at org.apache.hadoop.hdfs.DFSInputStream.init(DFSInputStream.java:231) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1498) at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:302) at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:298) at
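For illustration, the kind of guard the fix calls for might look like this sketch (a hypothetical helper; the committed change in {{DatanodeManager}} may be structured differently):
{code}
import java.util.List;

class TopologyResolveSketch {
  // Guard against a missing or broken topology script: a null or empty
  // resolution result should surface a clear configuration error rather
  // than an NPE deep inside sortLocatedBlocks().
  static String resolveOrFail(List<String> resolved, String host) {
    if (resolved == null || resolved.isEmpty() || resolved.get(0) == null) {
      throw new IllegalStateException("Topology script failed to resolve "
          + host + "; check net.topology.script.file.name");
    }
    return resolved.get(0);
  }
}
{code}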
[jira] [Updated] (HDFS-6666) Abort NameNode and DataNode startup if security is enabled but block access token is not enabled.
[ https://issues.apache.org/jira/browse/HDFS-6666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6666: Release Note: NameNode and DataNode now abort during startup if attempting to run in secure mode, but block access tokens are not enabled by setting configuration property dfs.block.access.token.enable to true in hdfs-site.xml. Previously, this case logged a warning, because this would be an insecure configuration. (was: The patch has the following changes: * Abort namenode and datanode startup if kerberos is enabled but block tokens are not enabled. * Test case that verifies the appropriate exception is thrown when the cluster is brought up with kerberos enabled and block tokens disabled (using Chris N's suggestion in the comments)) Abort NameNode and DataNode startup if security is enabled but block access token is not enabled. - Key: HDFS-6666 URL: https://issues.apache.org/jira/browse/HDFS-6666 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode, security Affects Versions: 2.7.1 Reporter: Chris Nauroth Assignee: Vijay Bhat Priority: Minor Fix For: 2.8.0 Attachments: HDFS-6666.001.patch, HDFS-6666.002.patch, HDFS-6666.003.patch, HDFS-6666.004.patch, HDFS-6666.005.patch Currently, if security is enabled by setting hadoop.security.authentication to kerberos, but HDFS block access tokens are disabled by setting dfs.block.access.token.enable to false (which is the default), then the NameNode logs an error and proceeds, and the DataNode proceeds without even logging an error. This jira proposes that it's invalid to turn on security but not turn on block access tokens, and that it would be better to fail fast and abort the daemons during startup if this happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7889) Subclass DFSOutputStream to support writing striping layout files
[ https://issues.apache.org/jira/browse/HDFS-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494495#comment-14494495 ] Zhe Zhang commented on HDFS-7889: - As we go deeper in testing, I've found the following issues and questions about {{StripedDataStreamer#locateFollowingBlock}}: # {{hasCommittedBlock}} is initially {{false}}, but once it becomes {{true}}, it never goes back to {{false}}. What's the purpose of this flag? # Why are we always polling the first located block, instead of the i-th? {code} for (int i = 1; i < HdfsConstants.NUM_DATA_BLOCKS; i++) { try { LocatedBlock finishedLocatedBlock = stripedBlocks.get(0).poll(30, TimeUnit.SECONDS); {code} # Why do we need the above loop at all? Shouldn't we always commit {{block.getNumBytes() * NUM_DATA_BLOCKS}}? Let's clarify these questions here and see if we need to do a follow-on for this logic. Subclass DFSOutputStream to support writing striping layout files - Key: HDFS-7889 URL: https://issues.apache.org/jira/browse/HDFS-7889 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Fix For: HDFS-7285 Attachments: HDFS-7889-001.patch, HDFS-7889-002.patch, HDFS-7889-003.patch, HDFS-7889-004.patch, HDFS-7889-005.patch, HDFS-7889-006.patch, HDFS-7889-007.patch, HDFS-7889-008.patch, HDFS-7889-009.patch, HDFS-7889-010.patch, HDFS-7889-011.patch, HDFS-7889-012.patch, HDFS-7889-013.patch, HDFS-7889-014.patch After HDFS-7888, we can subclass {{DFSOutputStream}} to support writing striping layout files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
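For reference on question 2, polling the i-th queue rather than always index 0 could be modeled like this (a self-contained sketch of the suggested change, not the patch itself):
{code}
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class StripedPollSketch {
  // Poll the queue belonging to streamer i (the block we are actually
  // waiting for), instead of always polling the queue at index 0.
  static <T> T pollForStreamer(List<BlockingQueue<T>> stripedBlocks, int i)
      throws InterruptedException {
    return stripedBlocks.get(i).poll(30, TimeUnit.SECONDS);
  }
}
{code}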
[jira] [Commented] (HDFS-8120) Erasure coding: created util class to analyze striped block groups
[ https://issues.apache.org/jira/browse/HDFS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494354#comment-14494354 ] Zhe Zhang commented on HDFS-8120: - The branch builds OK on my local machine. Our nightly Jenkins job also ran well. Erasure coding: created util class to analyze striped block groups -- Key: HDFS-8120 URL: https://issues.apache.org/jira/browse/HDFS-8120 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8120.000.patch, HDFS-8120.001.patch, HDFS-8120.002.patch, HDFS-8120.003.patch The patch adds logic for calculating the size of individual blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
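For illustration, the core calculation such a util class performs, the size of the i-th data block in a round-robin cell layout, can be sketched as follows (names and structure are illustrative; the patch's actual util class may differ):
{code}
class StripedBlockSizeSketch {
  // Size of the idx-th data block in a block group: full stripes give
  // every data block an equal share (one cell per stripe); the remaining
  // bytes fill cells in order from index 0, possibly ending mid-cell.
  static long internalBlockLength(long groupSize, int cellSize,
      int numDataUnits, int idxInGroup) {
    long stripeSize = (long) cellSize * numDataUnits;
    long fullStripes = groupSize / stripeSize;
    long base = fullStripes * cellSize;
    long remainder = groupSize - fullStripes * stripeSize;
    long extraCells = remainder / cellSize;  // complete extra cells
    long lastCell = remainder % cellSize;    // partial final cell
    if (idxInGroup < extraCells) {
      return base + cellSize;
    } else if (idxInGroup == extraCells) {
      return base + lastCell;
    }
    return base;
  }
}
{code}
For example, with 64 KB cells, 6 data units, and a 200 KB group, blocks 0-2 hold 64, 64, and 8 KB respectively and the rest hold nothing beyond the (zero) full-stripe base.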
[jira] [Commented] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494390#comment-14494390 ] Zhe Zhang commented on HDFS-7949: - Thanks Rakesh for the update! The latest patch looks good overall. It needs a minor rebase. The below now needs to specify a schema. [~drankye] and [~vinayrpet]: can we use a null for the schema here? I tried and the test failed. Maybe the hard-coded value was based on the old default schema (3+2)? Now we have (6+3). We should update the hard-coded assert value to link to the system default schema. {code} fs.getClient().getNamenode().createErasureCodingZone("/eczone"); {code} WebImageViewer needs to support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Priority: Minor Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch The file size calculation in WebImageViewer should be changed when the blocks of the file are striped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8055) NullPointerException when topology script is missing.
[ https://issues.apache.org/jira/browse/HDFS-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494464#comment-14494464 ] Hudson commented on HDFS-8055: -- FAILURE: Integrated in Hadoop-trunk-Commit #7583 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7583/]) HDFS-8055. NullPointerException when topology script is missing. Contributed by Anu Engineer. (cnauroth: rev fef596df038112cbbc86c4dc49314e274fca0190) * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/topology-broken-script.cmd * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/topology-broken-script.sh * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/topology-script.sh * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/topology-script.cmd NullPointerException when topology script is missing. - Key: HDFS-8055 URL: https://issues.apache.org/jira/browse/HDFS-8055 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Anu Engineer Assignee: Anu Engineer Fix For: 2.8.0 Attachments: hdfs-8055.001.patch, hdfs-8055.002.patch, hdfs-8055.003.patch We've received reports that the NameNode can get a NullPointerException when the topology script is missing. This issue tracks investigating whether or not we can improve the validation logic and give a more informative error message. Here is a sample stack trace: Getting NPE from HDFS: 2015-02-06 23:02:12,250 ERROR [pool-4-thread-1] util.HFileV1Detector: Got exception while reading trailer for file:hdfs://hqhd02nm01.pclc0.merkle.local:8020/hbase/.META./1028785192/info/1490a396aea448b693da563f76a28486 org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:359) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1789) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) at org.apache.hadoop.ipc.Client.call(Client.java:1468) at org.apache.hadoop.ipc.Client.call(Client.java:1399) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) at com.sun.proxy.$Proxy14.getBlockLocations(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:254) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1220) at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1210) at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1200) at
[jira] [Commented] (HDFS-6666) Abort NameNode and DataNode startup if security is enabled but block access token is not enabled.
[ https://issues.apache.org/jira/browse/HDFS-6666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494415#comment-14494415 ] Hudson commented on HDFS-6666: -- FAILURE: Integrated in Hadoop-trunk-Commit #7582 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7582/]) HDFS-6666. Abort NameNode and DataNode startup if security is enabled but block access token is not enabled. Contributed by Vijay Bhat. (cnauroth: rev d45aa7647b1fecf81860ec7b563085be2af99a0b) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecureNameNode.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/SaslDataTransferTestCase.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Abort NameNode and DataNode startup if security is enabled but block access token is not enabled. - Key: HDFS-6666 URL: https://issues.apache.org/jira/browse/HDFS-6666 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode, security Affects Versions: 2.7.1 Reporter: Chris Nauroth Assignee: Vijay Bhat Priority: Minor Fix For: 2.8.0 Attachments: HDFS-6666.001.patch, HDFS-6666.002.patch, HDFS-6666.003.patch, HDFS-6666.004.patch, HDFS-6666.005.patch Currently, if security is enabled by setting hadoop.security.authentication to kerberos, but HDFS block access tokens are disabled by setting dfs.block.access.token.enable to false (which is the default), then the NameNode logs an error and proceeds, and the DataNode proceeds without even logging an error. This jira proposes that it's invalid to turn on security but not turn on block access tokens, and that it would be better to fail fast and abort the daemons during startup if this happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
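For illustration, the fail-fast behavior this change introduces boils down to a check like the following sketch (the configuration key {{dfs.block.access.token.enable}} is real; the class structure here is an illustrative assumption):
{code}
import java.io.IOException;

class SecurityStartupCheck {
  // Abort startup when Kerberos security is on but block access tokens
  // are off: that combination is insecure and should fail fast rather
  // than be logged and ignored.
  static void check(boolean securityEnabled, boolean blockTokenEnabled)
      throws IOException {
    if (securityEnabled && !blockTokenEnabled) {
      throw new IOException("Security is enabled but block access tokens "
          + "(dfs.block.access.token.enable) are not enabled; refusing to "
          + "start.");
    }
  }
}
{code}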
[jira] [Commented] (HDFS-7993) Incorrect descriptions in fsck when nodes are decommissioned
[ https://issues.apache.org/jira/browse/HDFS-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494363#comment-14494363 ] Hadoop QA commented on HDFS-7993: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12725215/HDFS-7993.3.patch against trunk revision b5a0b24. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10274//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10274//console This message is automatically generated. Incorrect descriptions in fsck when nodes are decommissioned Key: HDFS-7993 URL: https://issues.apache.org/jira/browse/HDFS-7993 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Ming Ma Assignee: J.Andreina Attachments: HDFS-7993.1.patch, HDFS-7993.2.patch, HDFS-7993.3.patch When you run fsck with -files or -racks, you will get something like below if one of the replicas is decommissioned. {noformat} blk_x len=y repl=3 [dn1, dn2, dn3, dn4] {noformat} That is because in NamenodeFsck, the repl count comes from the live replicas count, while the actual nodes come from LocatedBlock, which includes decommissioned nodes. Another issue in NamenodeFsck is that BlockPlacementPolicy's verifyBlockPlacement verifies a LocatedBlock that includes decommissioned nodes. However, it seems better to exclude the decommissioned nodes in the verification, just like how fsck excludes decommissioned nodes when it checks for under-replicated blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7993) Incorrect descriptions in fsck when nodes are decommissioned
[ https://issues.apache.org/jira/browse/HDFS-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494372#comment-14494372 ] Ming Ma commented on HDFS-7993: --- Thanks, Vinay. bq. In the current output, blk_x len=y repl=3 [dn1, dn2, dn3, dn4], the count repl=3 exactly gives the count of live replicas excluding decommission(ing/ed). So i think leaving it as is would be better. Maybe we can change the label from {{repl}} to {{live repl}}? That would address the confusion others might have. bq. As discussed above, this jira is to add the detail/state about each replica, not just the overall count, which is not available in NumberReplicas. Good point. bq. I think this count will be there for long time, since the block report interval is long. IMO If necessary may go in followup jira It would also be useful to show stale replica content: after a NN failover, any over-replication won't be counted as excess replicas until the next block report, so running fsck will show these to-be-excess replicas as live replicas. Incorrect descriptions in fsck when nodes are decommissioned Key: HDFS-7993 URL: https://issues.apache.org/jira/browse/HDFS-7993 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Ming Ma Assignee: J.Andreina Attachments: HDFS-7993.1.patch, HDFS-7993.2.patch, HDFS-7993.3.patch When you run fsck with -files or -racks, you will get something like below if one of the replicas is decommissioned. {noformat} blk_x len=y repl=3 [dn1, dn2, dn3, dn4] {noformat} That is because in NamenodeFsck, the repl count comes from the live replicas count, while the actual nodes come from LocatedBlock, which includes decommissioned nodes. Another issue in NamenodeFsck is that BlockPlacementPolicy's verifyBlockPlacement verifies a LocatedBlock that includes decommissioned nodes. However, it seems better to exclude the decommissioned nodes in the verification, just like how fsck excludes decommissioned nodes when it checks for under-replicated blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8127) NameNode Failover during HA upgrade can cause DataNode to finalize upgrade
[ https://issues.apache.org/jira/browse/HDFS-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494489#comment-14494489 ] Jing Zhao commented on HDFS-8127: - RollingUpgrade does not have this issue because instead of using the bootstrapstandby command, we use an editlog transaction to sync SBN with ANN. In RollingUpgrade, both SBN and ANN should have the same view about the rollingupgrade state. {{bootstrapstandby}} is only used for {{-rollingupgrade rollback}} but I think that should be fine. NameNode Failover during HA upgrade can cause DataNode to finalize upgrade -- Key: HDFS-8127 URL: https://issues.apache.org/jira/browse/HDFS-8127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Blocker Attachments: HDFS-8127.000.patch, HDFS-8127.001.patch Currently for HA upgrade (enabled by HDFS-5138), we use {{-bootstrapStandby}} to initialize the standby NameNode. The standby NameNode does not have the {{previous}} directory thus it does not know that the cluster is in the upgrade state. If NN failover happens, as response of block reports, the new ANN will tell DNs to finalize the upgrade thus make it impossible to rollback again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8142) DistributedFileSystem#EncryptionZones should resolve given path relative to workingDir
Rakesh R created HDFS-8142: -- Summary: DistributedFileSystem#EncryptionZones should resolve given path relative to workingDir Key: HDFS-8142 URL: https://issues.apache.org/jira/browse/HDFS-8142 Project: Hadoop HDFS Issue Type: Bug Reporter: Rakesh R Assignee: Rakesh R Presently {{DFS#createEncryptionZone}} and {{DFS#getEZForPath}} APIs are not resolving the given path relative to the {{workingDir}}. This jira is to discuss and provide the implementation of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
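For illustration, the expected resolution behavior can be sketched with a small helper (illustrative only; the actual patch may instead reuse {{FileSystem}}'s own relative-path handling):
{code}
import org.apache.hadoop.fs.Path;

class WorkingDirSketch {
  // Resolve a possibly-relative path against the current working
  // directory before passing it to the encryption zone RPCs, so that
  // "zone1" under workingDir /user/foo means /user/foo/zone1.
  static Path resolve(Path workingDir, Path path) {
    return path.isAbsolute() ? path : new Path(workingDir, path);
  }
}
{code}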
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Status: Patch Available (was: In Progress) HDFS client gets errors trying to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 Attachments: HDFS-8078.patch 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + ":" + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr() assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. (From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as a client error: -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
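For illustration, the two fixes the description points at, bracket-quoting IPv6 literals when assembling host:port and splitting on the last colon when parsing, can be sketched as (a hypothetical helper, not the attached patch):
{code}
class HostPortSketch {
  // Wrap IPv6 literals in brackets so the resulting "host:port" string
  // stays parseable, e.g. [2401:db00::1]:50010.
  static String toHostPort(String ipAddr, int port) {
    String host = ipAddr.contains(":") ? "[" + ipAddr + "]" : ipAddr;
    return host + ":" + port;
  }

  // Split on the last colon so IPv6 literals keep their internal colons;
  // lastIndexOf also avoids the cost of String.split.
  static String[] splitHostPort(String hostPort) {
    int idx = hostPort.lastIndexOf(':');
    if (idx < 0) {
      throw new IllegalArgumentException("No port in: " + hostPort);
    }
    return new String[] { hostPort.substring(0, idx),
                          hostPort.substring(idx + 1) };
  }
}
{code}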
[jira] [Updated] (HDFS-8127) NameNode Failover during HA upgrade can cause DataNode to finalize upgrade
[ https://issues.apache.org/jira/browse/HDFS-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8127: -- Hadoop Flags: Reviewed +1 the new patch looks good. NameNode Failover during HA upgrade can cause DataNode to finalize upgrade -- Key: HDFS-8127 URL: https://issues.apache.org/jira/browse/HDFS-8127 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Blocker Attachments: HDFS-8127.000.patch, HDFS-8127.001.patch Currently for HA upgrade (enabled by HDFS-5138), we use {{-bootstrapStandby}} to initialize the standby NameNode. The standby NameNode does not have the {{previous}} directory thus it does not know that the cluster is in the upgrade state. If NN failover happens, as response of block reports, the new ANN will tell DNs to finalize the upgrade thus make it impossible to rollback again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8142) DistributedFileSystem#EncryptionZones should resolve given path relative to workingDir
[ https://issues.apache.org/jira/browse/HDFS-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494551#comment-14494551 ] Rakesh R commented on HDFS-8142: Attached a patch (containing a unit test to simulate the case) that resolves the path relative to the {{workingDir}}. Please review the scenario and the patch fixing the same. Thanks! DistributedFileSystem#EncryptionZones should resolve given path relative to workingDir -- Key: HDFS-8142 URL: https://issues.apache.org/jira/browse/HDFS-8142 Project: Hadoop HDFS Issue Type: Bug Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8142-001.patch Presently {{DFS#createEncryptionZone}} and {{DFS#getEZForPath}} APIs are not resolving the given path relative to the {{workingDir}}. This jira is to discuss and provide the implementation of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8117) More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data
[ https://issues.apache.org/jira/browse/HDFS-8117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494201#comment-14494201 ] Hudson commented on HDFS-8117: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #164 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/164/]) HDFS-8117. More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data. Contributed by Zhe Zhang. (wang: rev d60e22152ac098da103fd37fb81f8758e68d1efa) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestSimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSmallBlock.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data -- Key: HDFS-8117 URL: https://issues.apache.org/jira/browse/HDFS-8117 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: 3.0.0 Attachments: HDFS-8117-branch2.patch, HDFS-8117.000.patch, HDFS-8117.001.patch, HDFS-8117.002.patch, HDFS-8117.003.patch Currently {{SimulatedFSDataset}} uses a single {{DEFAULT_DATABYTE}} to simulate _all_ block content. This is not accurate because the return of this byte just means the read request has hit an arbitrary position in an arbitrary simulated block. This JIRA aims to improve it with a more accurate verification. When position {{p}} of a simulated block {{b}} is accessed, the returned byte is {{b}}'s block ID plus {{p}}, modulo the max value of a byte. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8083) Separate the client write conf from DFSConfigKeys
[ https://issues.apache.org/jira/browse/HDFS-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494273#comment-14494273 ] Hudson commented on HDFS-8083: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2113 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2113/]) HDFS-8083. Move dfs.client.write.* conf from DFSConfigKeys to HdfsClientConfigKeys.Write. (szetszwo: rev 7fc50e2525b8b8fe36d92e283a68eeeb09c63d21) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/ReplaceDatanodeOnFailure.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientExcludedNodes.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestByteArrayManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/EnumSetParam.java Separate the client write conf from DFSConfigKeys - Key: HDFS-8083 URL: https://issues.apache.org/jira/browse/HDFS-8083 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Fix For: 2.8.0 Attachments: h8083_20150410.patch A part of HDFS-8050, move dfs.client.write.* conf from DFSConfigKeys to a new class HdfsClientConfigKeys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7701) Support reporting per storage type quota and usage with hadoop/hdfs shell
[ https://issues.apache.org/jira/browse/HDFS-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494280#comment-14494280 ] Hudson commented on HDFS-7701: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2113 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2113/]) HDFS-7701. Support reporting per storage type quota and usage with hadoop/hdfs shell. (Contributed by Peter Shi) (arp: rev 18a3dad44afd8061643fffc5bbe50fa66e47b72c) * hadoop-common-project/hadoop-common/src/test/resources/testConf.xml * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ContentSummary.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Count.java Support reporting per storage type quota and usage with hadoop/hdfs shell - Key: HDFS-7701 URL: https://issues.apache.org/jira/browse/HDFS-7701 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Peter Shi Fix For: 2.8.0 Attachments: HDFS-7701.01.patch, HDFS-7701.02.patch, HDFS-7701.03.patch, HDFS-7701.04.patch, HDFS-7701.05.patch, HDFS-7701.06.patch hadoop fs -count -q or hdfs dfs -count -q currently shows name space/disk space quota and remaining quota information. With HDFS-7584, we want to display per storage type quota and its remaining information as well. The current output format as shown below may not easily accommodate 6 more columns = 3 (existing storage types) * 2 (quota/remaining quota). With new storage types added in future, this will make the output even more crowded. There are also compatibility issues as we don't want to break any existing scripts monitoring hadoop fs -count -q output. $ hadoop fs -count -q -v /test QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME none inf 524288000 5242665691 15 21431 /test Propose to add a -t parameter to display ONLY the storage type quota information of the directory separately. This way, existing scripts will work as-is without using the -t parameter. 1) When -t is not followed by a specific storage type, quota and usage information for all storage types will be displayed. $ hadoop fs -count -q -t -h -v /test SSD_QUOTA REM_SSD_QUOTA DISK_QUOTA REM_DISK_QUOTA ARCHIVAL_QUOTA REM_ARCHIVAL_QUOTA PATHNAME 512MB 256MB none inf none inf /test 2) If -t is followed by a storage type, only the quota and remaining quota of the storage type is displayed. $ hadoop fs -count -q -t SSD -h -v /test SSD_QUOTA REM_SSD_QUOTA PATHNAME 512 MB 256 MB /test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8117) More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data
[ https://issues.apache.org/jira/browse/HDFS-8117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494270#comment-14494270 ] Hudson commented on HDFS-8117: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2113 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2113/]) HDFS-8117. More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data. Contributed by Zhe Zhang. (wang: rev d60e22152ac098da103fd37fb81f8758e68d1efa) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestSimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSmallBlock.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java More accurate verification in SimulatedFSDataset: replace DEFAULT_DATABYTE with patterned data -- Key: HDFS-8117 URL: https://issues.apache.org/jira/browse/HDFS-8117 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: 3.0.0 Attachments: HDFS-8117-branch2.patch, HDFS-8117.000.patch, HDFS-8117.001.patch, HDFS-8117.002.patch, HDFS-8117.003.patch Currently {{SimulatedFSDataset}} uses a single {{DEFAULT_DATABYTE}} to simulate _all_ block content. This is not accurate because the return of this byte just means the read request has hit an arbitrary position in an arbitrary simulated block. This JIRA aims to improve it with a more accurate verification. When position {{p}} of a simulated block {{b}} is accessed, the returned byte is {{b}}'s block ID plus {{p}}, modulo the max value of a byte. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8132) Namenode Startup Failing When we add Jcarder.jar in class Path
[ https://issues.apache.org/jira/browse/HDFS-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494330#comment-14494330 ] Brahma Reddy Battula commented on HDFS-8132: [~tlipcon] any pointers to this issue..? Namenode Startup Failing When we add Jcarder.jar in class Path -- Key: HDFS-8132 URL: https://issues.apache.org/jira/browse/HDFS-8132 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula *{color:blue}Namenode while Startup Args{color}* ( Just added the jcarder args) exec /home/hdfs/jdk1.7.0_72/bin/java -Dproc_namenode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender {color:red}-javaagent:/opt/Jcarder/jcarder.jar=outputdir=/opt/Jcarder/Output/nn-jcarder{color} -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.hdfs.server.namenode.NameNode Setting outputdir to /opt/Jcarder/Output/nn-jcarder Starting JCarder (2.0.0/6) agent Opening for writing: /opt/Jcarder/Output/nn-jcarder/jcarder_events.db Opening for writing: /opt/Jcarder/Output/nn-jcarder/jcarder_contexts.db Not instrumenting standard library classes (AWT, Swing, etc.) JCarder agent initialized *{color:red}ERROR{color}* {noformat} Exception in thread main java.lang.VerifyError: Expecting a stackmap frame at branch target 21 Exception Details: Location: org/apache/hadoop/hdfs/server/namenode/NameNode.createHAState(Lorg/apache/hadoop/hdfs/server/common/HdfsServerConstants$StartupOption;)Lorg/apache/hadoop/hdfs/server/namenode/ha/HAState; @4: ifeq Reason: Expected stackmap frame at this location. Bytecode: 000: 2ab4 02d2 9900 112b b203 08a5 000a 2bb2 010: 030b a600 07b2 030d b0b2 030f b0 at java.lang.Class.getDeclaredMethods0(Native Method) at java.lang.Class.privateGetDeclaredMethods(Class.java:2615) at java.lang.Class.getMethod0(Class.java:2856) at java.lang.Class.getMethod(Class.java:1668) at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494) at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7934) During Rolling upgrade rollback, standby namenode startup fails.
[ https://issues.apache.org/jira/browse/HDFS-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494294#comment-14494294 ] Hadoop QA commented on HDFS-7934: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12725207/HDFS-7934.2.patch against trunk revision b5a0b24. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10273//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10273//console This message is automatically generated. During Rolling upgrade rollback, standby namenode startup fails. Key: HDFS-7934 URL: https://issues.apache.org/jira/browse/HDFS-7934 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Assignee: J.Andreina Priority: Critical Attachments: HDFS-7934.1.patch, HDFS-7934.2.patch During rolling upgrade rollback, standby namenode startup fails while loading edits, when there is no local copy of the edits created after the upgrade (which have already been removed by the Active Namenode from the journal manager and from the Active's local storage). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7701) Support reporting per storage type quota and usage with hadoop/hdfs shell
[ https://issues.apache.org/jira/browse/HDFS-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494212#comment-14494212 ] Hudson commented on HDFS-7701: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #164 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/164/]) HDFS-7701. Support reporting per storage type quota and usage with hadoop/hdfs shell. (Contributed by Peter Shi) (arp: rev 18a3dad44afd8061643fffc5bbe50fa66e47b72c) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ContentSummary.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Count.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java * hadoop-common-project/hadoop-common/src/test/resources/testConf.xml Support reporting per storage type quota and usage with hadoop/hdfs shell - Key: HDFS-7701 URL: https://issues.apache.org/jira/browse/HDFS-7701 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Peter Shi Fix For: 2.8.0 Attachments: HDFS-7701.01.patch, HDFS-7701.02.patch, HDFS-7701.03.patch, HDFS-7701.04.patch, HDFS-7701.05.patch, HDFS-7701.06.patch hadoop fs -count -q or hdfs dfs -count -q currently shows name space/disk space quota and remaining quota information. With HDFS-7584, we want to display per storage type quota and its remaining information as well. The current output format as shown below may not easily accommodate 6 more columns = 3 (existing storage types) * 2 (quota/remaining quota). With new storage types added in the future, this will make the output even more crowded. There are also compatibility issues, as we don't want to break any existing scripts monitoring hadoop fs -count -q output. $ hadoop fs -count -q -v /test QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME none inf 524288000 5242665691 15 21431 /test Propose to add a -t parameter to display ONLY the storage type quota information of the directory, separately. This way, existing scripts will work as-is without the -t parameter. 1) When -t is not followed by a specific storage type, quota and usage information for all storage types will be displayed. $ hadoop fs -count -q -t -h -v /test SSD_QUOTA REM_SSD_QUOTA DISK_QUOTA REM_DISK_QUOTA ARCHIVAL_QUOTA REM_ARCHIVAL_QUOTA PATHNAME 512MB 256MB none inf none inf /test 2) If -t is followed by a storage type, only the quota and remaining quota of that storage type are displayed. $ hadoop fs -count -q -t SSD -h -v /test SSD_QUOTA REM_SSD_QUOTA PATHNAME 512 MB 256 MB /test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8083) Separate the client write conf from DFSConfigKeys
[ https://issues.apache.org/jira/browse/HDFS-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494205#comment-14494205 ] Hudson commented on HDFS-8083: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #164 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/164/]) HDFS-8083. Move dfs.client.write.* conf from DFSConfigKeys to HdfsClientConfigKeys.Write. (szetszwo: rev 7fc50e2525b8b8fe36d92e283a68eeeb09c63d21) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientExcludedNodes.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestByteArrayManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/ReplaceDatanodeOnFailure.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/EnumSetParam.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java Separate the client write conf from DFSConfigKeys - Key: HDFS-8083 URL: https://issues.apache.org/jira/browse/HDFS-8083 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Fix For: 2.8.0 Attachments: h8083_20150410.patch A part of HDFS-8050, move dfs.client.write.* conf from DFSConfigKeys to a new class HdfsClientConfigKeys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495498#comment-14495498 ] Tsz Wo Nicholas Sze commented on HDFS-7859: --- The patch under this JIRA handles saving / loading these default schemas in fsimage. I think this is necessary even without loading custom schemas from XML. Otherwise we cannot guarantee the NameNode which loads the fsimage has the same default schemas as the NameNode which saved it. It is obviously even more necessary when we add custom schemas ... I think we should not persist anything to the NN before we have a clear design, since we don't know what to persist. For example, should we persist a schema ID? We are not able to answer this question since we don't even know if a schema should have an ID. If we change the layout later on, it requires a cluster upgrade for the new layout, and we have to support the old layout for backward compatibility. For now, I suggest just hard-coding the only (6,3)-Reed-Solomon schema. We don't even need the xml file. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495546#comment-14495546 ] Tsz Wo Nicholas Sze commented on HDFS-7859: --- ... schema name for the ID purpose. ... There are a few choices: # Using the schema name as the ID # A schema name and a separate numeric ID # Multiple schema names and a numeric ID Why use #1? Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HDFS-8120) Erasure coding: created util class to analyze striped block groups
[ https://issues.apache.org/jira/browse/HDFS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8120: Comment: was deleted (was: bq. which make DFSOutputStream#completeFile fail Yes, the completeFile call should fail here when no real data is written.) Erasure coding: created util class to analyze striped block groups -- Key: HDFS-8120 URL: https://issues.apache.org/jira/browse/HDFS-8120 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8120.000.patch, HDFS-8120.001.patch, HDFS-8120.002.patch, HDFS-8120.003.patch, HDFS-8120.004.patch The patch adds logic of calculating size of individual blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7889) Subclass DFSOutputStream to support writing striping layout files
[ https://issues.apache.org/jira/browse/HDFS-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495614#comment-14495614 ] Li Bo commented on HDFS-7889: - Hi Zhe, please see the following explanation of the related code. The first (leading) streamer is responsible for committing block groups. Before committing, the first streamer needs to wait for the other streamers to finish writing their blocks and then count the total number of bytes written in this block group. Because streamers only share {{stripedBlocks}}, when an ordinary streamer finishes writing its block, it has to report its work to the leading streamer. It sends a LocatedBlock object (containing how many bytes it has written for its block) to the blocking queue of the leading streamer (i.e. {{stripedBlocks\[0\]}}). The leading streamer waits on the queue and collects the other streamers' reports. An ordinary streamer could just send an Integer to the leading streamer; I chose LocatedBlock because it may be more convenient for the error handling in HDFS-7786. bq. hasCommittedBlock is initially false. But once becoming true, it will never be false again. What's the purpose of this flag? An ordinary streamer sends its report to the leading streamer in {{endBlock}} when it finishes writing a block. The leading streamer at first just requests a block group from the NN; when it has to request another block group, it must first commit the old one. So {{hasCommittedBlock}} will be true after the first request. bq. Why are we always polling the first located block, instead of the i_th? {{stripedBlocks.get(0)}} is the blocking queue of the leading streamer; the leading streamer needs to get the results of the other streamers' work before committing the block group to the NN. bq. Shouldn't we always commit block.getNumBytes() * NUM_DATA_BLOCKS? The size of the last block group may be smaller than {{block.getNumBytes() * NUM_DATA_BLOCKS}}; {{StripedDataStreamer#countTrailingBlockGroupBytes()}} is used to count the written bytes of the last block group. For a preceding, full block group, the leading streamer has to wait for the slowest streamer to finish writing. Otherwise, if the leading streamer commits {{block.getNumBytes() * NUM_DATA_BLOCKS}} bytes to the NN before the slow streamers finish, and one streamer fails after that, the error handling will be complicated. The above solution may not be the best, but it works for now. If you have a better solution, we can discuss and optimize the related logic. Subclass DFSOutputStream to support writing striping layout files - Key: HDFS-7889 URL: https://issues.apache.org/jira/browse/HDFS-7889 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Fix For: HDFS-7285 Attachments: HDFS-7889-001.patch, HDFS-7889-002.patch, HDFS-7889-003.patch, HDFS-7889-004.patch, HDFS-7889-005.patch, HDFS-7889-006.patch, HDFS-7889-007.patch, HDFS-7889-008.patch, HDFS-7889-009.patch, HDFS-7889-010.patch, HDFS-7889-011.patch, HDFS-7889-012.patch, HDFS-7889-013.patch, HDFS-7889-014.patch After HDFS-7888, we can subclass {{DFSOutputStream}} to support writing striping layout files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
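As a hedged illustration of the reporting pattern Li Bo describes (names like {{NUM_DATA_BLOCKS}} and the use of plain byte counts are assumptions, not the HDFS-7889 code, which passes LocatedBlock objects through {{stripedBlocks\[0\]}}): ordinary streamers put their per-block byte counts into the leading streamer's blocking queue, and the leading streamer drains one report per remaining data block before committing the block group.
{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only of the coordination pattern described above.
class StreamerCoordinationSketch {
  static final int NUM_DATA_BLOCKS = 6; // assumed (6,3) Reed-Solomon layout
  // Plays the role of stripedBlocks[0]: the leading streamer's inbox.
  final BlockingQueue<Long> leaderQueue = new LinkedBlockingQueue<>();

  // Called by an ordinary streamer in endBlock() when its block is done.
  void reportBlockDone(long bytesWritten) throws InterruptedException {
    leaderQueue.put(bytesWritten);
  }

  // Called by the leading streamer before committing the block group:
  // wait for every other streamer's report, then sum the group's bytes.
  long collectGroupBytes(long leaderBytes) throws InterruptedException {
    long total = leaderBytes;
    for (int i = 1; i < NUM_DATA_BLOCKS; i++) {
      total += leaderQueue.take(); // blocks until streamer i reports
    }
    return total;
  }
}
{code}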
[jira] [Updated] (HDFS-8123) Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file
[ https://issues.apache.org/jira/browse/HDFS-8123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8123: --- Attachment: HDFS-8123-004 Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file --- Key: HDFS-8123 URL: https://issues.apache.org/jira/browse/HDFS-8123 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8123-001.patch, HDFS-8123-002.patch, HDFS-8123-003.patch While reviewing the code I've noticed EC related proto messages are getting added into {{hdfs.proto}}. IMHO, for better maintainability of the erasure code feature, its good to move this to a separate {{erasurecode.proto}} file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8123) Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file
[ https://issues.apache.org/jira/browse/HDFS-8123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8123: --- Attachment: (was: HDFS-8123-004) Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file --- Key: HDFS-8123 URL: https://issues.apache.org/jira/browse/HDFS-8123 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8123-001.patch, HDFS-8123-002.patch, HDFS-8123-003.patch While reviewing the code I've noticed EC related proto messages are getting added into {{hdfs.proto}}. IMHO, for better maintainability of the erasure code feature, its good to move this to a separate {{erasurecode.proto}} file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8144) Split TestLazyPersistFiles into multiple tests
[ https://issues.apache.org/jira/browse/HDFS-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495174#comment-14495174 ] Xiaoyu Yao commented on HDFS-8144: -- [~arpitagarwal], the patch looks pretty good to me. I just have two comments: 1. We don't need @Test (timeout=30) for each test case because the base class LazyPersistTestCase.java has already set the timeout via a Rule, as follows. {code} @Rule public Timeout timeout = new Timeout(30); {code} 2. Nit: extra empty line at TestLazyWriter.java: 278 Split TestLazyPersistFiles into multiple tests -- Key: HDFS-8144 URL: https://issues.apache.org/jira/browse/HDFS-8144 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 2.7.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8144.01.patch TestLazyPersistFiles has grown too large and includes both NN and DN tests. We can split up related tests into smaller files to keep the test case manageable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
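For context, a minimal sketch of how a class-level JUnit {{Timeout}} Rule covers every test method, so subclasses need no per-method timeout (the class name and timeout value here are hypothetical, not the HDFS-8144 code):
{code}
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;

// Illustrative only: the Rule applies to each @Test method in this class
// and in subclasses, replacing per-method @Test(timeout=...) settings.
public class TimeoutRuleSketch {
  @Rule
  public Timeout timeout = new Timeout(300000); // JUnit 4 Timeout takes milliseconds

  @Test
  public void runsUnderClassLevelTimeout() {
    // no per-test timeout annotation needed
  }
}
{code}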
[jira] [Created] (HDFS-8145) Fix the editlog corruption exposed by failed TestAddStripedBlocks
Jing Zhao created HDFS-8145: --- Summary: Fix the editlog corruption exposed by failed TestAddStripedBlocks Key: HDFS-8145 URL: https://issues.apache.org/jira/browse/HDFS-8145 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao {{TestAddStripedBlocks}} failed with some editlog corruption. After some debugging, I can see at least two issues: # DFSStripedOutputStream tries to send out an empty packet to close the block even when 0 bytes have been written # Because of the above, NN tries to close the file. This exposes another bug in {{BlockInfoStriped}}, which writes its data/parity block numbers into the close editlog but does not read them back while loading. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495553#comment-14495553 ] Tsz Wo Nicholas Sze commented on HDFS-7859: --- ... We would persist the whole schema object ... How can we be sure that the schema object format won't change? Since we do not yet support add/delete/update/rename schema operations, we don't need to persist anything in the NN at this moment. We will support some of these schema operations down the road. We may persist schemas at that time. Sound good? Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8144) Split TestLazyPersistFiles into multiple tests
[ https://issues.apache.org/jira/browse/HDFS-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495562#comment-14495562 ] Hadoop QA commented on HDFS-8144: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12725417/HDFS-8144.01.patch against trunk revision fddd552. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/10278//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/10278//console This message is automatically generated. Split TestLazyPersistFiles into multiple tests -- Key: HDFS-8144 URL: https://issues.apache.org/jira/browse/HDFS-8144 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 2.7.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8144.01.patch TestLazyPersistFiles has grown too large and includes both NN and DN tests. We can split up related tests into smaller files to keep the test case manageable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8120) Erasure coding: created util class to analyze striped block groups
[ https://issues.apache.org/jira/browse/HDFS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495584#comment-14495584 ] Li Bo commented on HDFS-8120: - bq. the DFSStripedOutputStream should not allocate a new block in NameNode When {{DFSStripedOutputStream}} requests a block group from the NN, it also doesn't know how many blocks in this block group will be empty. The client may write just 1 byte, which makes most of the blocks in this block group empty. In {{DFSOutputStream}}, if zero bytes have been written to a block, {{DFSOutputStream#close()}} will not send an empty packet to the DN. The two situations are a little different. Erasure coding: created util class to analyze striped block groups -- Key: HDFS-8120 URL: https://issues.apache.org/jira/browse/HDFS-8120 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8120.000.patch, HDFS-8120.001.patch, HDFS-8120.002.patch, HDFS-8120.003.patch, HDFS-8120.004.patch The patch adds logic of calculating size of individual blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8123) Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file
[ https://issues.apache.org/jira/browse/HDFS-8123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8123: --- Attachment: HDFS-8123-004.patch Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file --- Key: HDFS-8123 URL: https://issues.apache.org/jira/browse/HDFS-8123 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8123-001.patch, HDFS-8123-002.patch, HDFS-8123-003.patch, HDFS-8123-004.patch While reviewing the code I've noticed EC related proto messages are getting added into {{hdfs.proto}}. IMHO, for better maintainability of the erasure code feature, its good to move this to a separate {{erasurecode.proto}} file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7687) Change fsck to support EC files
[ https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495178#comment-14495178 ] Takanobu Asanuma commented on HDFS-7687: Thanks for your comments, Nicholas. I will write the patches with your suggestion. Change fsck to support EC files --- Key: HDFS-7687 URL: https://issues.apache.org/jira/browse/HDFS-7687 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Takanobu Asanuma We need to change fsck so that it can detect under replicated and corrupted EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495442#comment-14495442 ] Zhe Zhang commented on HDFS-7859: - [~szetszwo] / [~drankye]: The [phasing plan | https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14391207page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14391207] I posted might be a little confusing in regard to schemas. My apologies. In the offline meetup on 03/31, we didn't reach a clear conclusion on how much of the schema work to include before merging. Therefore I left it in phase I, but marked it as optional. My thought was that we could make a better decision after observing how fast the work could proceed. Up to this point I think this thread is going pretty well, and it seems we can have a multi-schema implementation when other HDFS-7285 tasks are done (see details below). Good [questions | https://issues.apache.org/jira/browse/HDFS-7859?focusedCommentId=14494933page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14494933] on schema design. I think we eventually need to answer them in the broader scope of HDFS-7337. IIUC HDFS-7859 / HDFS-7866 are not touching most of the tricky scenarios. Based on Kai's latest [comment | https://issues.apache.org/jira/browse/HDFS-7866?focusedCommentId=14494050page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14494050], HDFS-7866 will mostly handle _default_ schemas embedded in the {{ECSchemaManager}} code. The patch under this JIRA handles saving / loading these default schemas in fsimage. I think this is necessary even without loading custom schemas from XML. Otherwise we cannot guarantee the NameNode which loads the fsimage has the same default schemas as the NameNode which saved it. It is obviously even more necessary when we add custom schemas. The logic in the patch is quite straightforward; it's mostly about serializing / deserializing schemas. So here's my proposal: # Shrink this patch to get rid of the logic for modifying and removing schemas ({{ECSchemaManager#modifyECSchema}} and {{OP_MODIFY_EC_SCHEMA}}). # Repurpose HDFS-7866 to focus on loading custom schemas from site xml files. [~szetszwo], [~drankye], [~vinayrpet]: let me know if you agree with the above. If we are all synced on this, how about moving this JIRA back to HDFS-7285 and keeping HDFS-7866 under HDFS-8031? Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-7949: --- Attachment: HDFS-7949-004.patch WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Priority: Minor Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495556#comment-14495556 ] Kai Zheng commented on HDFS-7859: - bq. Using schema name as ID As we would not make it heavyweight, and so don't have a field like {{description}} for an {{ECSchema}}, a friendly name like {{RS-6-3}} makes more sense than a numeric ID. Users should clearly understand a schema before using it to create any zone, and the name will help with identifying it. bq. We don't even need the xml file. Yeah, if we were to define a schema through a command by specifying the schema parameters, that should also be OK. I don't have a strong preference about that. Any file format, or even not using a file, would also work, I guess. We talked about this in the meetup, and it looks like we synced on an XML file. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8120) Erasure coding: created util class to analyze striped block groups
[ https://issues.apache.org/jira/browse/HDFS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495568#comment-14495568 ] Li Bo commented on HDFS-8120: - When completing a block group, the NN will check whether the number of replicas of this block group reaches the minimum. Currently the minimum number is the number of data blocks. If DFSStripedOutputStream doesn't send out an empty packet for an empty block, the number of replicas of this block group may be smaller than the number of data blocks, which makes {{DFSOutputStream#completeFile}} fail. Erasure coding: created util class to analyze striped block groups -- Key: HDFS-8120 URL: https://issues.apache.org/jira/browse/HDFS-8120 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8120.000.patch, HDFS-8120.001.patch, HDFS-8120.002.patch, HDFS-8120.003.patch, HDFS-8120.004.patch The patch adds logic of calculating size of individual blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8123) Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file
[ https://issues.apache.org/jira/browse/HDFS-8123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495172#comment-14495172 ] Rakesh R commented on HDFS-8123: Attached patch addressing [~drankye]'s comments. Hi [~vinayrpet], Please review the patch when you get a chance, it would be great to see your feedback. Thanks! Erasure Coding: Better to move EC related proto messages to a separate 'erasurecode.proto' file --- Key: HDFS-8123 URL: https://issues.apache.org/jira/browse/HDFS-8123 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8123-001.patch, HDFS-8123-002.patch, HDFS-8123-003.patch, HDFS-8123-004.patch While reviewing the code I've noticed EC related proto messages are getting added into {{hdfs.proto}}. IMHO, for better maintainability of the erasure code feature, its good to move this to a separate {{erasurecode.proto}} file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495454#comment-14495454 ] Rakesh R commented on HDFS-7949: Thanks [~zhz]. I've attached a new patch using null for the schema. Also, updated the hard-coded assert value. Kindly review! {code} fs.getClient().getNamenode().createErasureCodingZone("/eczone", null); {code} WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Priority: Minor Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495493#comment-14495493 ] Kai Zheng commented on HDFS-7859: - Hi [~zhz], Thanks for taking care of this; your suggestion looks reasonable to me and sounds like a more solid base for the merge. To summarize further: 1. This issue, HDFS-7859, would provide two system-defined schemas in Java code: one is the system default schema (rs-6-3), already there; the other is a new one, suggested as rs-10-4. It also ensures the two schemas will be persisted in the image/editlog for later querying. 2. The remaining gaps will be handled as follow-on work in HDFS-7866, mainly how to customize site-specific schemas through an XML file. The design will also be updated. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8120) Erasure coding: created util class to analyze striped block groups
[ https://issues.apache.org/jira/browse/HDFS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495520#comment-14495520 ] Jing Zhao commented on HDFS-8120: - Thanks Zhe. The latest patch looks pretty good to me. Some nits: # Let's write the following code using if-else:
{code}
+boolean wrongSize = storedBlock.getNumBytes() != reported.getNumBytes();
+if (storedBlock.isStriped()) {
{code}
i.e.,
{code}
boolean wrongSize;
if (storedBlock.isStriped()) {
  // ...
} else {
  // ...
}
{code}
# In {{DFSTestUtil#createStripedFile}}, instead of using null to indicate no need to create the directory and EC zone, it may be better to use an additional parameter {{toMkdir}}. # Nit: need to remove 2 spaces before @Test.
{code}
- // @Test
+@Test
public void TestFileMoreThanABlockGroup2() throws IOException {
{code}
Besides, {{TestAddStripedBlocks}} failed with some editlog corruption. After some debugging, I can see at least two issues: # DFSStripedOutputStream tries to send out an empty packet to close the block even when 0 bytes have been written # Because of the above bug, NN tries to close the file. This exposes another bug in {{BlockInfoStriped}}, which writes its data/parity block numbers into the close editlog but does not read them back while loading. I will create another jira to fix this. Erasure coding: created util class to analyze striped block groups -- Key: HDFS-8120 URL: https://issues.apache.org/jira/browse/HDFS-8120 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8120.000.patch, HDFS-8120.001.patch, HDFS-8120.002.patch, HDFS-8120.003.patch, HDFS-8120.004.patch The patch adds logic of calculating size of individual blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495517#comment-14495517 ] Kai Zheng commented on HDFS-7859: - bq. we don't know what to persist. For example, should we persist schema ID? We are not able to answer this question since we don't even know if a schema should have an ID. That's not true. We have {{ECSchema}} defined, and it uses the schema name for the ID purpose. We would persist the whole schema object. Although the on-going work isn't reflected in the design doc, we did do that following our related discussion. In the meetup with [~zhz] and [~jingzhao], we covered this aspect, and even your questions, already. It's my mistake that I didn't put it down clearly and update the doc accordingly. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495560#comment-14495560 ] Kai Zheng commented on HDFS-7859: - bq. How can we be sure that the schema object format won't change? Good question. In the {{ECSchema}} class, in addition to the common parameters widely used by typical erasure codecs, an {{options}} map is also included, so potentially any complex codec can use it to hold its own specific parameters or key-value pairs; such parameters are left to the corresponding erasure coders to interpret. We try to make it flexible enough to avoid such a change, but in case it needs to change anyway, I thought that's supported (I mean via the image layout version). Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859.001.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
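A minimal sketch, with assumed field names (this is not the actual ECSchema class), of the schema shape Kai describes: common codec parameters plus an open-ended options map for codec-specific key-value pairs that the coders interpret.
{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the schema shape discussed above.
class ECSchemaSketch {
  private final String schemaName;   // friendly name used as the ID, e.g. "RS-6-3"
  private final int numDataUnits;    // e.g. 6
  private final int numParityUnits;  // e.g. 3
  // Codec-specific parameters, interpreted by the corresponding coders;
  // this is the extension point meant to absorb future format changes.
  private final Map<String, String> options;

  ECSchemaSketch(String name, int dataUnits, int parityUnits,
                 Map<String, String> options) {
    this.schemaName = name;
    this.numDataUnits = dataUnits;
    this.numParityUnits = parityUnits;
    this.options = Collections.unmodifiableMap(new HashMap<>(options));
  }
}
{code}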
[jira] [Commented] (HDFS-8132) Namenode Startup Failing When we add Jcarder.jar in class Path
[ https://issues.apache.org/jira/browse/HDFS-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495504#comment-14495504 ] Todd Lipcon commented on HDFS-8132: --- Yea, I'm guessing the same issue -- probably getting confused by some Java7 bytecode. I'm guessing that jcarder needs to be updated to use a new version of the 'asm' dependency - it's on a very old one (asm 2.2.2). The ASM changelog indicates that 4.0 was the first version with full support for Java 7. Namenode Startup Failing When we add Jcarder.jar in class Path -- Key: HDFS-8132 URL: https://issues.apache.org/jira/browse/HDFS-8132 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula *{color:blue}Namenode while Startup Args{color}* ( Just added the jcarder args) exec /home/hdfs/jdk1.7.0_72/bin/java -Dproc_namenode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender {color:red}-javaagent:/opt/Jcarder/jcarder.jar=outputdir=/opt/Jcarder/Output/nn-jcarder{color} -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.hdfs.server.namenode.NameNode Setting outputdir to /opt/Jcarder/Output/nn-jcarder Starting JCarder (2.0.0/6) agent Opening for writing: /opt/Jcarder/Output/nn-jcarder/jcarder_events.db Opening for writing: /opt/Jcarder/Output/nn-jcarder/jcarder_contexts.db Not instrumenting standard library classes (AWT, Swing, etc.) JCarder agent initialized *{color:red}ERROR{color}* {noformat} Exception in thread main java.lang.VerifyError: Expecting a stackmap frame at branch target 21 Exception Details: Location: org/apache/hadoop/hdfs/server/namenode/NameNode.createHAState(Lorg/apache/hadoop/hdfs/server/common/HdfsServerConstants$StartupOption;)Lorg/apache/hadoop/hdfs/server/namenode/ha/HAState; @4: ifeq Reason: Expected stackmap frame at this location. Bytecode: 000: 2ab4 02d2 9900 112b b203 08a5 000a 2bb2 010: 030b a600 07b2 030d b0b2 030f b0 at java.lang.Class.getDeclaredMethods0(Native Method) at java.lang.Class.privateGetDeclaredMethods(Class.java:2615) at java.lang.Class.getMethod0(Class.java:2856) at java.lang.Class.getMethod(Class.java:1668) at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494) at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8132) Namenode Startup Failing When we add Jcarder.jar in class Path
[ https://issues.apache.org/jira/browse/HDFS-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495528#comment-14495528 ] Brahma Reddy Battula commented on HDFS-8132: [~cnauroth] and [~tlipcon] thanks a lot for your pointers. It seems JCarder needs to be updated. Going to close this issue. Any further comments? Namenode Startup Failing When we add Jcarder.jar in class Path -- Key: HDFS-8132 URL: https://issues.apache.org/jira/browse/HDFS-8132 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula *{color:blue}Namenode while Startup Args{color}* ( Just added the jcarder args) exec /home/hdfs/jdk1.7.0_72/bin/java -Dproc_namenode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender {color:red}-javaagent:/opt/Jcarder/jcarder.jar=outputdir=/opt/Jcarder/Output/nn-jcarder{color} -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.hdfs.server.namenode.NameNode Setting outputdir to /opt/Jcarder/Output/nn-jcarder Starting JCarder (2.0.0/6) agent Opening for writing: /opt/Jcarder/Output/nn-jcarder/jcarder_events.db Opening for writing: /opt/Jcarder/Output/nn-jcarder/jcarder_contexts.db Not instrumenting standard library classes (AWT, Swing, etc.) JCarder agent initialized *{color:red}ERROR{color}* {noformat} Exception in thread main java.lang.VerifyError: Expecting a stackmap frame at branch target 21 Exception Details: Location: org/apache/hadoop/hdfs/server/namenode/NameNode.createHAState(Lorg/apache/hadoop/hdfs/server/common/HdfsServerConstants$StartupOption;)Lorg/apache/hadoop/hdfs/server/namenode/ha/HAState; @4: ifeq Reason: Expected stackmap frame at this location. Bytecode: 000: 2ab4 02d2 9900 112b b203 08a5 000a 2bb2 010: 030b a600 07b2 030d b0b2 030f b0 at java.lang.Class.getDeclaredMethods0(Native Method) at java.lang.Class.privateGetDeclaredMethods(Class.java:2615) at java.lang.Class.getMethod0(Class.java:2856) at java.lang.Class.getMethod(Class.java:1668) at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494) at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8120) Erasure coding: created util class to analyze striped block groups
[ https://issues.apache.org/jira/browse/HDFS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495571#comment-14495571 ] Jing Zhao commented on HDFS-8120: - bq. which make DFSOutputStream#completeFile fail Yes, the completeFile call should fail here when no real data is written. Erasure coding: created util class to analyze striped block groups -- Key: HDFS-8120 URL: https://issues.apache.org/jira/browse/HDFS-8120 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8120.000.patch, HDFS-8120.001.patch, HDFS-8120.002.patch, HDFS-8120.003.patch, HDFS-8120.004.patch The patch adds logic of calculating size of individual blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8120) Erasure coding: created util class to analyze striped block groups
[ https://issues.apache.org/jira/browse/HDFS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495576#comment-14495576 ] Jing Zhao commented on HDFS-8120: - bq. which make DFSOutputStream#completeFile fail. More accurately, the question here is that when no real data is written, the {{DFSStripedOutputStream}} should not allocate a new block in the NameNode. If no block is allocated, the NN will not wait for block-received reports from the DN, and thus completeFile will not get blocked or fail. Erasure coding: created util class to analyze striped block groups -- Key: HDFS-8120 URL: https://issues.apache.org/jira/browse/HDFS-8120 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8120.000.patch, HDFS-8120.001.patch, HDFS-8120.002.patch, HDFS-8120.003.patch, HDFS-8120.004.patch The patch adds logic of calculating size of individual blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
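A hedged sketch of the guard Jing describes (the class and method names are hypothetical, not the HDFS patch): allocate a block group only when real data has been written, so completeFile() has no block-received reports to wait for.
{code}
// Illustrative pseudologic only.
class ZeroByteCloseSketch {
  private boolean blockAllocated = false;

  // Stand-in for the addBlock RPC to the NameNode.
  private void allocateBlockGroup() { blockAllocated = true; }

  // Close path: skip allocation entirely when nothing was written, so the
  // NameNode never waits on reports for a block group that was never used.
  void close(long bytesWritten) {
    if (bytesWritten > 0 && !blockAllocated) {
      allocateBlockGroup();
    }
    // completeFile(): with no block allocated, the NN completes immediately.
  }
}
{code}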
[jira] [Updated] (HDFS-8144) Split TestLazyPersistFiles into multiple tests
[ https://issues.apache.org/jira/browse/HDFS-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8144: Attachment: HDFS-8144.02.patch Thanks for the review [~xyao]. v2 patch addresses your feedback. Split TestLazyPersistFiles into multiple tests -- Key: HDFS-8144 URL: https://issues.apache.org/jira/browse/HDFS-8144 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 2.7.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8144.01.patch, HDFS-8144.02.patch TestLazyPersistFiles has grown too large and includes both NN and DN tests. We can split up related tests into smaller files to keep the test case manageable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8132) Namenode Startup Failing When we add Jcarder.jar in class Path
[ https://issues.apache.org/jira/browse/HDFS-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495698#comment-14495698 ] Todd Lipcon commented on HDFS-8132: --- Nope. Feel free to send me a pull request for https://github.com/toddlipcon/jcarder/tree/lockclasses (this is the branch we were using internally until we switched to java7) Namenode Startup Failing When we add Jcarder.jar in class Path -- Key: HDFS-8132 URL: https://issues.apache.org/jira/browse/HDFS-8132 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula *{color:blue}Namenode while Startup Args{color}* ( Just added the jcarder args) exec /home/hdfs/jdk1.7.0_72/bin/java -Dproc_namenode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/ClusterSetup/Hadoop2.7/install/hadoop/namenode/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender {color:red}-javaagent:/opt/Jcarder/jcarder.jar=outputdir=/opt/Jcarder/Output/nn-jcarder{color} -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.hdfs.server.namenode.NameNode Setting outputdir to /opt/Jcarder/Output/nn-jcarder Starting JCarder (2.0.0/6) agent Opening for writing: /opt/Jcarder/Output/nn-jcarder/jcarder_events.db Opening for writing: /opt/Jcarder/Output/nn-jcarder/jcarder_contexts.db Not instrumenting standard library classes (AWT, Swing, etc.) JCarder agent initialized *{color:red}ERROR{color}* {noformat} Exception in thread main java.lang.VerifyError: Expecting a stackmap frame at branch target 21 Exception Details: Location: org/apache/hadoop/hdfs/server/namenode/NameNode.createHAState(Lorg/apache/hadoop/hdfs/server/common/HdfsServerConstants$StartupOption;)Lorg/apache/hadoop/hdfs/server/namenode/ha/HAState; @4: ifeq Reason: Expected stackmap frame at this location. Bytecode: 000: 2ab4 02d2 9900 112b b203 08a5 000a 2bb2 010: 030b a600 07b2 030d b0b2 030f b0 at java.lang.Class.getDeclaredMethods0(Native Method) at java.lang.Class.privateGetDeclaredMethods(Class.java:2615) at java.lang.Class.getMethod0(Class.java:2856) at java.lang.Class.getMethod(Class.java:1668) at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494) at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)