[jira] [Commented] (HDFS-7343) HDFS smart storage management
[ https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642239#comment-17642239 ]

Feilong He commented on HDFS-7343:
----------------------------------

[~brahmareddy], thanks for your comment.
i) No feature is pending. As you may know, we have built an independent project called SSM based on this Jira's design. It is basically production ready, except for some experimental features such as data sync and HA.
ii) No, Kafka and ZooKeeper (ZK) are not required. We recommend deploying SSM in the HDFS cluster. The only prerequisite is that the user deploys MySQL to maintain SSM metadata.
iii) This project is in its maintenance phase. We have no plan to move it into HDFS or elsewhere as a subproject, or to make it an Apache incubation project.

> HDFS smart storage management
> -----------------------------
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: Wei Zhou
> Priority: Major
> Attachments: HDFS-Smart-Storage-Management-update.pdf, HDFS-Smart-Storage-Management.pdf, HDFSSmartStorageManagement-General-20170315.pdf, HDFSSmartStorageManagement-Phase1-20170315.pdf, access_count_tables.jpg, move.jpg, tables_in_ssm.xlsx
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and flexible storage policy engine that considers file attributes, metadata, data temperature, storage type, EC codec, available hardware capabilities, user/application preference, and so on.
> Modified the title for re-purpose.
> We'd extend this effort a bit and aim to work on a comprehensive solution that provides a smart storage management service, for convenient, intelligent and effective use of erasure coding or replicas, the HDFS cache facility, the HSM offering, and all kinds of tools (balancer, mover, disk balancer and so on) in a large cluster.
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16014) Fix an issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-16014:
------------------------------
    Summary: Fix an issue in checking native pmdk lib by 'hadoop checknative' command (was: Issue in checking native pmdk lib by 'hadoop checknative' command)

> Fix an issue in checking native pmdk lib by 'hadoop checknative' command
> ------------------------------------------------------------------------
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: native
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-16014-01.patch, HDFS-16014-02.patch
>
> In HDFS-14818, we proposed a patch to support checking the native pmdk lib. The goal was to show the user whether the pmdk lib is loaded.
> Recently, we found a case where the pmdk lib was not actually loaded, yet the `hadoop checknative` command still told the user that it was. This issue can be reproduced by moving libpmem.so* from the specified installed path to another place, or by directly deleting these libs, after the project is built.
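The failure mode described in HDFS-16014 is a status report that does not reflect whether the library could really be loaded. The Java sketch below only illustrates the general principle of reporting "loaded" after an actual load attempt has succeeded. It is not the patch attached to the issue (the thread does not show it), and the class name `NativeLibProbe` and any path passed to it are hypothetical.

```java
/**
 * Hypothetical probe, not Hadoop code: a library counts as loaded
 * only if the JVM actually managed to load it.
 */
final class NativeLibProbe {
    private NativeLibProbe() {}

    /**
     * Tries to load the library at the given absolute path.
     * Returns true only when System.load succeeds.
     */
    static boolean tryLoad(String absolutePath) {
        try {
            System.load(absolutePath);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Missing, moved, or unloadable library: report "not loaded".
            return false;
        }
    }

    /** Formats one illustrative status line for the given library. */
    static String report(String name, String absolutePath) {
        boolean loaded = tryLoad(absolutePath);
        return name + ": " + loaded + (loaded ? " " + absolutePath : "");
    }
}
```

With this shape, moving or deleting the library file after the build makes the probe report `false`, because the status is derived from the load attempt itself rather than from build-time information.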
[jira] [Commented] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453751#comment-17453751 ]

Feilong He commented on HDFS-16014:
-----------------------------------

[~rakeshr], please check the latest QA report. It looks good.
[jira] [Commented] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452202#comment-17452202 ]

Feilong He commented on HDFS-16014:
-----------------------------------

[~rakeshr], thanks for your review! I just uploaded the same patch again to trigger a fresh QA check.
[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-16014:
------------------------------
    Attachment: HDFS-16014-02.patch
[jira] [Commented] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444301#comment-17444301 ]

Feilong He commented on HDFS-15788:
-----------------------------------

[~rakeshr], this patch only updates the documentation to align it with the implementation. If you have any comments, please let me know.

> Correct the statement for pmem cache to reflect cache persistence support
> -------------------------------------------------------------------------
>
> Key: HDFS-15788
> URL: https://issues.apache.org/jira/browse/HDFS-15788
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: documentation
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Minor
> Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch
>
> Correct the statement for pmem cache to reflect cache persistence support.
[jira] [Commented] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444297#comment-17444297 ]

Feilong He commented on HDFS-16014:
-----------------------------------

[~rakeshr], do you have any comments on this patch?
[jira] [Resolved] (HDFS-14480) Shut down DataNode gracefully when responding to stop-dfs.sh/stop-dfs.cmd
[ https://issues.apache.org/jira/browse/HDFS-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He resolved HDFS-14480.
-------------------------------
    Resolution: Won't Fix

> Shut down DataNode gracefully when responding to stop-dfs.sh/stop-dfs.cmd
> -------------------------------------------------------------------------
>
> Key: HDFS-14480
> URL: https://issues.apache.org/jira/browse/HDFS-14480
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
>
> Currently, DataNode has a #shutdown method to handle cleanup work before shutdown, but its shutdown hook doesn't call this method. In HDFS-14401, for the HDFS persistent memory cache optimization, we added cache cleanup logic to DN's #shutdown method. We expect DN to clean up the cache when it is shut down by stop-dfs.sh/stop-dfs.cmd, which depends on this Jira's patch.
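The gap described in HDFS-14480 (a cleanup method exists, but the shutdown hook never calls it) can be illustrated with the plain JDK shutdown-hook API. This is a minimal sketch under stated assumptions, not DataNode code: `CacheOwningService` and its methods are hypothetical, and the issue itself was resolved as Won't Fix.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a service that must clean up a cache at shutdown. */
final class CacheOwningService {
    private final AtomicBoolean cacheCleaned = new AtomicBoolean(false);

    /** Cleanup entry point, analogous in spirit to a #shutdown method. */
    void shutdown() {
        // compareAndSet keeps the cleanup idempotent if shutdown() runs twice.
        if (cacheCleaned.compareAndSet(false, true)) {
            // Real cleanup (for example, releasing cached data) would go here.
        }
    }

    boolean isCacheCleaned() {
        return cacheCleaned.get();
    }

    /** Builds the hook thread; kept separate so it can be run directly in tests. */
    Thread newShutdownHook() {
        return new Thread(this::shutdown, "cache-cleanup-hook");
    }

    /** Registers the hook so the JVM calls shutdown() when it terminates normally. */
    Thread registerShutdownHook() {
        Thread hook = newShutdownHook();
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

The point of the sketch is only that the hook delegates to the same cleanup method used elsewhere, so a stop script that ends the process still triggers the cache cleanup.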
[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444290#comment-17444290 ]

Feilong He commented on HDFS-15714:
-----------------------------------

Uploaded [^HDFS-15714-02.patch], which introduces two commits to fix the following issues:
1) Exclude provided storage when setting up the pipeline for an append operation.
2) Fix a sync failure for truncated data with a provided replica.

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> ---------------------------------------------------------
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Labels: pull-request-available
> Attachments: HDFS-15714-01.patch, HDFS-15714-02.patch, HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. In HDFS-9806, the PROVIDED storage type was introduced to HDFS. By configuring external storage with the PROVIDED tag for a DataNode, a user can enable applications to access externally stored data from the HDFS side. However, two issues need to be addressed. Firstly, mounting external storage on-the-fly, namely dynamic mount, is lacking. It is needed to flexibly combine HDFS with an external storage at runtime. Secondly, PS write is not supported by current HDFS, although in real applications it is common to transfer data bi-directionally for read/write between HDFS and external storage.
> Through this JIRA, we are presenting our work for PS write support and dynamic mount support for both read & write. Please note that several JIRAs have already been filed in the community for these topics. Our work builds on this previous community work, with a new design & implementation to support a so-called writeBack mount and to enable an admin to add any mount on-the-fly. We appreciate those folks in the community for their great contribution! See their pending JIRAs: HDFS-14805 & HDFS-12090.
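The first fix listed in the HDFS-15714 comment above (leaving provided storage out of the append pipeline) amounts to a filter over candidate replica locations. The sketch below shows that idea only. All names in it (`SketchStorageType`, `ReplicaLocation`, `AppendPipelineSketch`) are hypothetical simplifications and are not taken from the patch; HDFS has its own `StorageType` enum and pipeline code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified storage types; HDFS defines its own StorageType enum. */
enum SketchStorageType { DISK, SSD, ARCHIVE, PROVIDED }

/** Hypothetical description of one replica location. */
final class ReplicaLocation {
    final String datanode;
    final SketchStorageType type;

    ReplicaLocation(String datanode, SketchStorageType type) {
        this.datanode = datanode;
        this.type = type;
    }
}

final class AppendPipelineSketch {
    private AppendPipelineSketch() {}

    /** Keeps every location except those backed by PROVIDED storage. */
    static List<ReplicaLocation> pipelineForAppend(List<ReplicaLocation> locations) {
        List<ReplicaLocation> pipeline = new ArrayList<>();
        for (ReplicaLocation loc : locations) {
            if (loc.type != SketchStorageType.PROVIDED) {
                pipeline.add(loc);
            }
        }
        return pipeline;
    }
}
```

The filter preserves the original order of the remaining locations, so the rest of a pipeline setup could consume the result unchanged.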
[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15714:
------------------------------
    Attachment: HDFS-15714-02.patch
[jira] [Comment Edited] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17357118#comment-17357118 ]

Feilong He edited comment on HDFS-15714 at 6/4/21, 7:57 AM:
------------------------------------------------------------

Hi [~bpatel], sorry for the late reply. The relevant code path is shown below.
{code:java}
ReadMountManager: FSMountAttrOp.addRemotePaths -> FSMountAttrOp: w.addToEdits -> MountEditLogWriter: createFile{code}
In {{MountEditLogWriter#createFile}}, we can see that an {{HdfsFileStatus}} is created based on the {{remoteStatus}} obtained from remote storage, which is like creating a normal HDFS file except that the data is stored outside HDFS. *Actually, the remote file's own modification time is neither used nor kept in HDFS*. My previous reply may have been ambiguous. I just ran a simple test to verify this: compare a file (object)'s modification time in S3 with its modification time in HDFS after the S3 bucket containing that file is mounted to HDFS. The two times differ, which is consistent with the code analysis. The modification time of that file in HDFS is the time HDFS generates when responding to the user's mount request. In {{readOnly}} mount mode, mounted data cannot be changed from the HDFS side, so its modification time stays unchanged on HDFS. It is the same as the creation time.

I think that, generally, many upper-level HDFS applications don't care about data modification time, so the inconsistency in modification time may not cause issues. If you have any thoughts, or a case I have overlooked, please kindly point it out.

Thanks a lot for your comment! And as always, any discussion is welcome!

was (Author: philohe):
Hi [~bpatel], sorry for this late reply. The relevant code path is shown as below.
{code:java}
ReadMountManager: FSMountAttrOp.addRemotePaths -> FSMountAttrOp: w.addToEdits -> MountEditLogWriter: createFile{code}
{{In MountEditLogWriter#createFile}}, we can know a {{HdfsFileStatus}} will be created based on {{remoteStatus}} obtained from remote storage, which is like creating a normal HDFS file except that the data is stored outside HDFS. *Actually, modification time of remote file is not used and kept in HDFS*. My previous reply may be ambiguous. I just did a simple test to verify it: compare a file(object)'s modification time in S3 and that in HDFS after S3 bucket containing that file is mounted to HDFS. The phenomenon is they are different, which is consistent with the code analysis. The modification time of that file in HDFS is the time when the above {{#createFile}} is triggered to respond to user's mount request. For {{readOnly}} mount mode, mounted data cannot be changed from HDFS side. So its modification time keeps unchanged on HDFS.

I think, generally, upper HDFS applications don't care about data modification time. So the inconsistency of modification time may not cause issues. If you have any thought or case I ignored, please kindly point out it.

Thanks a lot for your comment! And as always, any discussion is welcome!
[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17357118#comment-17357118 ]

Feilong He commented on HDFS-15714:
-----------------------------------

Hi [~bpatel], sorry for the late reply. The relevant code path is shown below.
{code:java}
ReadMountManager: FSMountAttrOp.addRemotePaths -> FSMountAttrOp: w.addToEdits -> MountEditLogWriter: createFile{code}
In {{MountEditLogWriter#createFile}}, we can see that an {{HdfsFileStatus}} is created based on the {{remoteStatus}} obtained from remote storage, which is like creating a normal HDFS file except that the data is stored outside HDFS. *Actually, the modification time of the remote file is neither used nor kept in HDFS*. My previous reply may have been ambiguous. I just ran a simple test to verify this: compare a file (object)'s modification time in S3 with its modification time in HDFS after the S3 bucket containing that file is mounted to HDFS. The two times differ, which is consistent with the code analysis. The modification time of that file in HDFS is the time when the above {{#createFile}} is triggered in response to the user's mount request. In {{readOnly}} mount mode, mounted data cannot be changed from the HDFS side, so its modification time stays unchanged on HDFS.

I think that, generally, upper-level HDFS applications don't care about data modification time, so the inconsistency in modification time may not cause issues. If you have any thoughts, or a case I have overlooked, please kindly point it out.

Thanks a lot for your comment! And as always, any discussion is welcome!
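The behaviour described in the comment above on HDFS-15714 (the HDFS-side modification time is the mount time, not the remote file's own time) can be sketched as follows. This is an illustration under assumptions, not the code in `MountEditLogWriter#createFile`: `SketchFileStatus` and `MountTimeSketch` are hypothetical, and carrying over the path and length from the remote status is assumed for the example.

```java
/** Hypothetical, minimal file status; the real HdfsFileStatus has many more fields. */
final class SketchFileStatus {
    final String path;
    final long length;
    final long modificationTime;

    SketchFileStatus(String path, long length, long modificationTime) {
        this.path = path;
        this.length = length;
        this.modificationTime = modificationTime;
    }
}

final class MountTimeSketch {
    private MountTimeSketch() {}

    /**
     * Builds the HDFS-side status for a mounted remote file.
     * Path and length are derived from the remote status (an assumption of
     * this sketch); the remote modification time is deliberately ignored and
     * replaced by the time at which the mount request is handled.
     */
    static SketchFileStatus statusForMountedFile(
            SketchFileStatus remoteStatus, String mountPoint, long mountTimeMillis) {
        return new SketchFileStatus(
            mountPoint + remoteStatus.path,
            remoteStatus.length,
            mountTimeMillis);
    }
}
```

Under this model, comparing the remote object's modification time with the HDFS-side one after mounting shows two different values, which matches the S3 test reported in the comment.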
[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343783#comment-17343783 ]

Feilong He commented on HDFS-15714:
-----------------------------------

[~bpatel], I see. Regarding updating a mount, it would be useful to have an append or update mount operation. As you pointed out, removing a mount and then adding it again just to sync metadata is inefficient and infeasible. Yes, we can file another Jira to track this functionality in the future.

Thanks for your insightful comments!
[jira] [Comment Edited] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343783#comment-17343783 ]

Feilong He edited comment on HDFS-15714 at 5/13/21, 8:41 AM:
-------------------------------------------------------------

[~bpatel], I see. Regarding updating a mount, it would be useful to have an append or update mount operation. As you pointed out, removing a mount and then adding it again just to sync metadata is inefficient and infeasible. Yes, we can file another Jira to track this functionality in the future.

Thanks for your insightful comments!

was (Author: philohe):
[~bpatel], I see. In the matter of updating mount, it will be useful to have to append or update mount operation. As you pointed out, it is inefficient and infeasible to remove a mount, then add that mount again, for syncing metadata purpose. Yes, we can file another Jira to track this functionality in the future.

Thanks for your insightful comments!
[jira] [Comment Edited] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343743#comment-17343743 ]

Feilong He edited comment on HDFS-15714 at 5/13/21, 7:15 AM:
-------------------------------------------------------------

[~bpatel], thanks for your review. As you know, add/remove/list mount are the basic operations. We assumed that merge/append mount operations are not very commonly used. Is that right? Another consideration is that the initial implementation of a new feature should keep the patch concise and limited to the essential core functionality. Merge/append mount also raises many questions of its own. For example, the two mounts to be merged may both own some data with the same name. For these reasons, the current version doesn't cover merge/append mount operations. Any thoughts?

was (Author: philohe):
[~bpatel], thanks for your review. As you know, add/remove/list mount are basic operations. We thought merge/append mount operations are not very commonly used seemingly. Right? And another thought is it is better to make patch concise and less complex for new feature's initial implementation, except for implementing very necessary core functionalities. For merge/append mount, I think many factors need to be considered. E.g., consider case: two mounts to be merged own data with same name. So based on the above reasons, the current version doesn't cover merge/append mount operations. Any thought?
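The merge concern raised in the comment above (two mounts that both own data with the same name) comes down to detecting overlapping relative paths before a merge. The sketch below is a hedged illustration of that check only. `MountMergeSketch` is a hypothetical name, the thread does not propose this code, and a real design would also have to decide what to do with each conflict.

```java
import java.util.Set;
import java.util.TreeSet;

final class MountMergeSketch {
    private MountMergeSketch() {}

    /**
     * Returns the relative paths present in both mounts.
     * A non-empty result means a naive merge would hit name collisions.
     */
    static Set<String> conflictingPaths(Set<String> firstMount, Set<String> secondMount) {
        // Copy first so the caller's sets are left untouched; TreeSet gives sorted output.
        Set<String> conflicts = new TreeSet<>(firstMount);
        conflicts.retainAll(secondMount);
        return conflicts;
    }
}
```

For example, merging a mount containing `a.txt` and `dir/b.txt` with one containing `dir/b.txt` and `c.txt` would report `dir/b.txt` as the single conflict.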
[jira] [Comment Edited] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343743#comment-17343743 ] Feilong He edited comment on HDFS-15714 at 5/13/21, 7:14 AM: - [~bpatel], thanks for your review. As you know, add/remove/list mount are basic operations. We thought merge/append mount operations are not very commonly used seemingly. Right? And another thought is it is better to make patch concise and less complex for new feature's initial implementation, except for implementing very necessary core functionalities. For merge/append mount, I think many factors need to be considered. E.g., consider case: two mounts to be merged own data with same name. So based on the above reasons, the current version doesn't cover merge/append mount operations. Any thought? was (Author: philohe): [~bpatel], thanks for your review. As you know, add/remove/list mount are basic operations. We thought merge/append mount operations are not very commonly used seemingly. Right? And another thought is it is better to make patch concise and less complex for new feature's initial implementation, except for implementing very necessary core functionalities. For merge/append mount, I think many factors need to be considered. E.g., consider case: two mounts to be merged/appended own data with same name. So based on the above reasons, the current version doesn't cover merge/append mount operations. Any thought? 
> HDFS Provided Storage Read/Write Mount Support On-the-fly > - > > Key: HDFS-15714 > URL: https://issues.apache.org/jira/browse/HDFS-15714 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: pull-request-available > Attachments: HDFS-15714-01.patch, > HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf > > Time Spent: 2.5h > Remaining Estimate: 0h > > HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. > In HDFS-9806, PROVIDED storage type was introduced to HDFS. Through > configuring external storage with PROVIDED tag for DataNode, user can enable > application to access data stored externally from HDFS side. However, there > are two issues need to be addressed. Firstly, mounting external storage > on-the-fly, namely dynamic mount, is lacking. It is necessary to get it > supported to flexibly combine HDFS with an external storage at runtime. > Secondly, PS write is not supported by current HDFS. But in real > applications, it is common to transfer data bi-directionally for read/write > between HDFS and external storage. > Through this JIRA, we are presenting our work for PS write support and > dynamic mount support for both read & write. Please note in the community > several JIRAs have been filed for these topics. Our work is based on these > previous community work, with new design & implementation to support called > writeBack mount and enable admin to add any mount on-the-fly. We appreciate > those folks in the community for their great contribution! See their pending > JIRAs: HDFS-14805 & HDFS-12090. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343743#comment-17343743 ] Feilong He commented on HDFS-15714: --- [~bpatel], thanks for your review. As you know, add/remove/list mount are basic operations. We thought merge/append mount operations are not very commonly used seemingly. Right? And another thought is it is better to make patch concise and less complex for new feature's initial implementation, except for implementing very necessary core functionalities. For merge/append mount, I think many factors need to be considered. E.g., consider case: two mounts to be merged/appended own data with same name. So based on the above reasons, the current version doesn't cover merge/append mount operations. Any thought? > HDFS Provided Storage Read/Write Mount Support On-the-fly > - > > Key: HDFS-15714 > URL: https://issues.apache.org/jira/browse/HDFS-15714 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: pull-request-available > Attachments: HDFS-15714-01.patch, > HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf > > Time Spent: 2.5h > Remaining Estimate: 0h > > HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. > In HDFS-9806, PROVIDED storage type was introduced to HDFS. Through > configuring external storage with PROVIDED tag for DataNode, user can enable > application to access data stored externally from HDFS side. However, there > are two issues need to be addressed. Firstly, mounting external storage > on-the-fly, namely dynamic mount, is lacking. It is necessary to get it > supported to flexibly combine HDFS with an external storage at runtime. > Secondly, PS write is not supported by current HDFS. 
But in real > applications, it is common to transfer data bi-directionally for read/write > between HDFS and external storage. > Through this JIRA, we are presenting our work for PS write support and > dynamic mount support for both read & write. Please note in the community > several JIRAs have been filed for these topics. Our work is based on these > previous community work, with new design & implementation to support called > writeBack mount and enable admin to add any mount on-the-fly. We appreciate > those folks in the community for their great contribution! See their pending > JIRAs: HDFS-14805 & HDFS-12090. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342396#comment-17342396 ] Feilong He commented on HDFS-15714: --- [~bpatel] Thanks for your comments! In _ReadMountManager#prepareMount_, _FSMountAttrOp.getRemotePaths_ is used to get remote metadata for both the remote path being mounted and its children. _FSTreeWalk_ (see its constructor) instantiates a _FileSystem_ according to the remote mount path URL. For an S3 URL, it is _S3AFileSystem_. Digging further into _S3AFileSystem_, you can see that an S3 client is used to get file status in the mounting phase. With the metadata wrapped in the remote file status, such as _modification time, access time, permission, etc._, HDFS creates the corresponding INode file and sets the Provided Storage type. Any comment is welcome! > HDFS Provided Storage Read/Write Mount Support On-the-fly > - > > Key: HDFS-15714 > URL: https://issues.apache.org/jira/browse/HDFS-15714 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: pull-request-available > Attachments: HDFS-15714-01.patch, > HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf > > Time Spent: 2.5h > Remaining Estimate: 0h > > HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. > In HDFS-9806, PROVIDED storage type was introduced to HDFS. Through > configuring external storage with PROVIDED tag for DataNode, user can enable > application to access data stored externally from HDFS side. However, there > are two issues need to be addressed. Firstly, mounting external storage > on-the-fly, namely dynamic mount, is lacking. It is necessary to get it > supported to flexibly combine HDFS with an external storage at runtime. > Secondly, PS write is not supported by current HDFS. 
But in real > applications, it is common to transfer data bi-directionally for read/write > between HDFS and external storage. > Through this JIRA, we are presenting our work for PS write support and > dynamic mount support for both read & write. Please note in the community > several JIRAs have been filed for these topics. Our work is based on these > previous community work, with new design & implementation to support called > writeBack mount and enable admin to add any mount on-the-fly. We appreciate > those folks in the community for their great contribution! See their pending > JIRAs: HDFS-14805 & HDFS-12090. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
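The mounting phase described in the comment above (walk the remote tree, read only file status, create matching inodes tagged as Provided Storage) can be sketched in a language-neutral way. The sketch below is an illustration under stated assumptions, not the patch's code: a local directory stands in for the remote store, and `ProvidedINode` and `prepare_mount` are hypothetical names that do not correspond to Hadoop classes or methods.

```python
import os
import posixpath
import stat
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProvidedINode:
    """Hypothetical stand-in for an inode created from remote metadata."""
    path: str                 # path under the mount point
    is_dir: bool
    length: int
    modification_time: float
    access_time: float
    permission: int           # POSIX permission bits, e.g. 0o644
    storage_type: str = "PROVIDED"


def prepare_mount(remote_root: str, mount_point: str) -> List[ProvidedINode]:
    """Mirror the metadata of remote_root and its children under mount_point.

    Only status information is read; no file content is copied, which
    reflects the idea that mounting is a metadata-only step.
    """
    base = mount_point.rstrip("/") or "/"
    inodes = []
    for dirpath, dirnames, filenames in os.walk(remote_root):
        dirnames.sort()  # deterministic traversal order
        entries = [dirpath] + [os.path.join(dirpath, f) for f in sorted(filenames)]
        for src in entries:
            rel = os.path.relpath(src, remote_root).replace(os.sep, "/")
            st = os.stat(src)
            inodes.append(ProvidedINode(
                path=base if rel == "." else posixpath.join(base, rel),
                is_dir=stat.S_ISDIR(st.st_mode),
                length=0 if stat.S_ISDIR(st.st_mode) else st.st_size,
                modification_time=st.st_mtime,
                access_time=st.st_atime,
                permission=stat.S_IMODE(st.st_mode),
            ))
    return inodes
```

Calling `prepare_mount("/some/local/dir", "/mnt/remote")` returns one record for the mounted root and one for each descendant, all carrying the `PROVIDED` tag. In the real feature the status would come from a `FileSystem` implementation such as `S3AFileSystem` rather than from `os.stat`.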
[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-16014: -- Attachment: (was: HDFS-16014-01.patch) > Issue in checking native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > > In HDFS-14818, we proposed a patch to support checking native pmdk lib. The > expected target is to display hint to user regarding pmdk lib loaded state. > Recently, it was found that pmdk lib was not successfully loaded actually but > the `hadoop checknative` command still tells user that it was. This issue can > be reproduced by moving libpmem.so* from specified installed path to other > place, or directly deleting these libs, after the project is built. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
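The principle behind the fix requested above is that the reported state should come from an actual load attempt. The sketch below illustrates that principle only; it is not how `hadoop checknative` is implemented, `probe_native_library` is a hypothetical helper, and `libpmem.so.1` is an assumed soname for the PMDK library.

```python
import ctypes
from typing import Tuple


def probe_native_library(soname: str) -> Tuple[bool, str]:
    """Try to load a shared library and report what actually happened."""
    try:
        ctypes.CDLL(soname)
    except OSError as err:
        # The dynamic loader could not find or open the library.
        return False, f"{soname}: not loaded ({err})"
    return True, f"{soname}: loaded"


# "libpmem.so.1" is an assumed soname; adjust it for the local installation.
loaded, detail = probe_native_library("libpmem.so.1")
print(detail)
```

With a check of this shape, moving or deleting `libpmem.so*` after the build would flip the reported state to "not loaded", which is the behavior the issue description expects.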
[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-16014: -- Attachment: HDFS-16014-01.patch Status: Patch Available (was: In Progress) > Issue in checking native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > > In HDFS-14818, we proposed a patch to support checking native pmdk lib. The > expected target is to display hint to user regarding pmdk lib loaded state. > Recently, it was found that pmdk lib was not successfully loaded actually but > the `hadoop checknative` command still tells user that it was. This issue can > be reproduced by moving libpmem.so* from specified installed path to other > place, or directly deleting these libs, after the project is built. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-16014: -- Description: In HDFS-14818, we proposed a patch to support checking native pmdk lib. The expected target is to display hint to user regarding pmdk lib loaded state. Recently, it was found that pmdk lib was not successfully loaded but the `hadoop checknative` command still tells user that it was. This issue can be reproduced by moving libpmem.so* from specified installed path to other place, or directly deleting these libs, after the project is built. (was: In HDFS-14818, we proposed a patch to support checking native pmdk lib. The expected target is to display hint to user regarding pmdk loaded state. Recently, it was found that pmdk lib was not successfully loaded but the `hadoop checknative` command still tells user that it was. This issue can be reproduced by moving libpmem.so* from specified installed path to other place, or directly deleting these libs, after the project is built.) > Issue in checking native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > > In HDFS-14818, we proposed a patch to support checking native pmdk lib. The > expected target is to display hint to user regarding pmdk lib loaded state. > Recently, it was found that pmdk lib was not successfully loaded but the > `hadoop checknative` command still tells user that it was. This issue can be > reproduced by moving libpmem.so* from specified installed path to other > place, or directly deleting these libs, after the project is built. 
[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-16014: -- Description: In HDFS-14818, we proposed a patch to support checking native pmdk lib. The expected target is to display hint to user regarding pmdk lib loaded state. Recently, it was found that pmdk lib was not successfully loaded actually but the `hadoop checknative` command still tells user that it was. This issue can be reproduced by moving libpmem.so* from specified installed path to other place, or directly deleting these libs, after the project is built. (was: In HDFS-14818, we proposed a patch to support checking native pmdk lib. The expected target is to display hint to user regarding pmdk lib loaded state. Recently, it was found that pmdk lib was not successfully loaded but the `hadoop checknative` command still tells user that it was. This issue can be reproduced by moving libpmem.so* from specified installed path to other place, or directly deleting these libs, after the project is built.) > Issue in checking native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > > In HDFS-14818, we proposed a patch to support checking native pmdk lib. The > expected target is to display hint to user regarding pmdk lib loaded state. > Recently, it was found that pmdk lib was not successfully loaded actually but > the `hadoop checknative` command still tells user that it was. This issue can > be reproduced by moving libpmem.so* from specified installed path to other > place, or directly deleting these libs, after the project is built. 
[jira] [Work started] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-16014 started by Feilong He. - > Issue in checking native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > > In HDFS-14818, we proposed a patch to support checking native pmdk lib. The > expected target is to display hint to user regarding pmdk loaded state. > Recently, it was found that pmdk lib was not successfully loaded but the > `hadoop checknative` command still tells user that it was. This issue can be > reproduced by moving libpmem.so* from specified installed path to other > place, or directly deleting these libs, after the project is built. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-16014: -- Description: In HDFS-14818, we proposed a patch to support checking native pmdk lib. The expected target is to display hint to user regarding pmdk loaded state. Recently, it was found that pmdk lib was not successfully loaded but the `hadoop checknative` command still tells user that it was. This issue can be reproduced by moving libpmem.so* from specified installed path to other place, or directly deleting these libs, after the project is built. > Issue in checking native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > > In HDFS-14818, we proposed a patch to support checking native pmdk lib. The > expected target is to display hint to user regarding pmdk loaded state. > Recently, it was found that pmdk lib was not successfully loaded but the > `hadoop checknative` command still tells user that it was. This issue can be > reproduced by moving libpmem.so* from specified installed path to other > place, or directly deleting these libs, after the project is built. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-16014: -- Summary: Issue in checking native pmdk lib by 'hadoop checknative' command (was: HDFS check native pmdk lib issue) > Issue in checking native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-16014) HDFS check native pmdk lib issue
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He reassigned HDFS-16014: - Assignee: Feilong He > HDFS check native pmdk lib issue > > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16014) HDFS check native pmdk lib issue
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-16014: -- Attachment: HDFS-16014-01.patch > HDFS check native pmdk lib issue > > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16014) HDFS check native pmdk lib issue
Feilong He created HDFS-16014: - Summary: HDFS check native pmdk lib issue Key: HDFS-16014 URL: https://issues.apache.org/jira/browse/HDFS-16014 Project: Hadoop HDFS Issue Type: Bug Components: native Affects Versions: 3.4.0 Reporter: Feilong He -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340051#comment-17340051 ] Feilong He commented on HDFS-15788: --- This is not a critical issue, so I have changed the target version to 3.4.0 only. > Correct the statement for pmem cache to reflect cache persistence support > - > > Key: HDFS-15788 > URL: https://issues.apache.org/jira/browse/HDFS-15788 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch > > > Correct the statement for pmem cache to reflect cache persistence support. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-15788: -- Target Version/s: 3.4.0 (was: 3.3.1, 3.4.0) > Correct the statement for pmem cache to reflect cache persistence support > - > > Key: HDFS-15788 > URL: https://issues.apache.org/jira/browse/HDFS-15788 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch > > > Correct the statement for pmem cache to reflect cache persistence support. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17316901#comment-17316901 ] Feilong He commented on HDFS-15788: --- Hi [~ayushtkn], sorry for the late reply. This issue is related to HDFS-14740, which was already resolved in 3.3.0. We opened this Jira to update the documentation so that it aligns with the code changes we made. The target versions of this Jira are 3.3.1 and 3.4.0. > Correct the statement for pmem cache to reflect cache persistence support > - > > Key: HDFS-15788 > URL: https://issues.apache.org/jira/browse/HDFS-15788 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch > > > Correct the statement for pmem cache to reflect cache persistence support. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-15788: -- Target Version/s: 3.3.1, 3.4.0 (was: 3.3.1, 3.4.0, 3.1.5, 3.2.3) > Correct the statement for pmem cache to reflect cache persistence support > - > > Key: HDFS-15788 > URL: https://issues.apache.org/jira/browse/HDFS-15788 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch > > > Correct the statement for pmem cache to reflect cache persistence support. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-15788: -- Attachment: HDFS-15788-02.patch > Correct the statement for pmem cache to reflect cache persistence support > - > > Key: HDFS-15788 > URL: https://issues.apache.org/jira/browse/HDFS-15788 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch > > > Correct the statement for pmem cache to reflect cache persistence support. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-15714: -- Description: HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. In HDFS-9806, PROVIDED storage type was introduced to HDFS. Through configuring external storage with PROVIDED tag for DataNode, user can enable application to access data stored externally from HDFS side. However, there are two issues need to be addressed. Firstly, mounting external storage on-the-fly, namely dynamic mount, is lacking. It is necessary to get it supported to flexibly combine HDFS with an external storage at runtime. Secondly, PS write is not supported by current HDFS. But in real applications, it is common to transfer data bi-directionally for read/write between HDFS and external storage. Through this JIRA, we are presenting our work for PS write support and dynamic mount support for both read & write. Please note in the community several JIRAs have been filed for these topics. Our work is based on these previous community work, with new design & implementation to support called writeBack mount and enable admin to add any mount on-the-fly. We appreciate those folks in the community for their great contribution! See their pending JIRAs: HDFS-14805 & HDFS-12090. was: HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. In HDFS-9806, PROVIDED storage type was introduced to HDFS. Through configuring external storage with PROVIDED tag for DataNode, user can enable application to access data stored externally from HDFS side. However, there are two issues need to be addressed. Firstly, mounting external storage on-the-fly, namely dynamic mount, is lacked. It is necessary to get it supported to flexibly combine HDFS with an external storage at runtime. Secondly, PS write is not supported by current HDFS. 
But in real applications, it is common to transfer data bi-directionally for read/write between HDFS and external storage. Through this JIRA, we are presenting our work for PS write support and dynamic mount support for both read &write. Please note in the community several JIRAs have been filed for these topics. Our work is based on these previous community work, with new design & implementation to support called writeBack mount and enable admin to add any mount on-the-fly. We appreciate those folks in the community for their great contribution! See their pending JIRAs: HDFS-14805 & HDFS-12090. > HDFS Provided Storage Read/Write Mount Support On-the-fly > - > > Key: HDFS-15714 > URL: https://issues.apache.org/jira/browse/HDFS-15714 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: pull-request-available > Attachments: HDFS-15714-01.patch, > HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf > > Time Spent: 10m > Remaining Estimate: 0h > > HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. > In HDFS-9806, PROVIDED storage type was introduced to HDFS. Through > configuring external storage with PROVIDED tag for DataNode, user can enable > application to access data stored externally from HDFS side. However, there > are two issues need to be addressed. Firstly, mounting external storage > on-the-fly, namely dynamic mount, is lacking. It is necessary to get it > supported to flexibly combine HDFS with an external storage at runtime. > Secondly, PS write is not supported by current HDFS. But in real > applications, it is common to transfer data bi-directionally for read/write > between HDFS and external storage. > Through this JIRA, we are presenting our work for PS write support and > dynamic mount support for both read & write. 
Please note in the community > several JIRAs have been filed for these topics. Our work is based on these > previous community work, with new design & implementation to support called > writeBack mount and enable admin to add any mount on-the-fly. We appreciate > those folks in the community for their great contribution! See their pending > JIRAs: HDFS-14805 & HDFS-12090. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-15788: -- Target Version/s: 3.3.1, 3.4.0, 3.1.5, 3.2.3 (was: 3.4.0) > Correct the statement for pmem cache to reflect cache persistence support > - > > Key: HDFS-15788 > URL: https://issues.apache.org/jira/browse/HDFS-15788 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-15788-01.patch > > > Correct the statement for pmem cache to reflect cache persistence support. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15788:
------------------------------
    Attachment: HDFS-15788-01.patch

> Correct the statement for pmem cache to reflect cache persistence support
> -------------------------------------------------------------------------
>
> Key: HDFS-15788
> URL: https://issues.apache.org/jira/browse/HDFS-15788
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: documentation
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Minor
> Attachments: HDFS-15788-01.patch
>
> Correct the statement for pmem cache to reflect cache persistence support.
[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15788:
------------------------------
    Status: Patch Available (was: Open)

> Correct the statement for pmem cache to reflect cache persistence support
> -------------------------------------------------------------------------
>
> Key: HDFS-15788
> URL: https://issues.apache.org/jira/browse/HDFS-15788
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: documentation
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Minor
> Attachments: HDFS-15788-01.patch
>
> Correct the statement for pmem cache to reflect cache persistence support.
[jira] [Created] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
Feilong He created HDFS-15788:
---------------------------------

    Summary: Correct the statement for pmem cache to reflect cache persistence support
    Key: HDFS-15788
    URL: https://issues.apache.org/jira/browse/HDFS-15788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: documentation
    Affects Versions: 3.4.0
    Reporter: Feilong He
    Assignee: Feilong He

Correct the statement for pmem cache to reflect cache persistence support.
[jira] [Commented] (HDFS-12090) Handling writes from HDFS to Provided storages
[ https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245615#comment-17245615 ]

Feilong He commented on HDFS-12090:
-----------------------------------

We filed another Jira (HDFS-15714) that covers this topic. We developed the feature based on the community work from [~virajith], [~ehiggs], and others. The basic design remains unchanged. Thanks to these folks for their great contributions.

> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
> Key: HDFS-12090
> URL: https://issues.apache.org/jira/browse/HDFS-12090
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Virajith Jalaparti
> Priority: Major
> Labels: pull-request-available
> Attachments: External-SyncService-CreateFile.001.png,
> HDFS-12090-Functional-Specification.001.pdf,
> HDFS-12090-Functional-Specification.002.pdf,
> HDFS-12090-Functional-Specification.003.pdf, HDFS-12090-design.001.pdf,
> HDFS-12090..patch, HDFS-12090.0001.patch
> Time Spent: 40m
> Remaining Estimate: 0h
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in
> external storage systems accessible through HDFS. However, HDFS-9806 is
> limited to data being read through HDFS. This JIRA will deal with how data
> can be written to such {{PROVIDED}} storages from HDFS.
[jira] [Commented] (HDFS-14805) Mounting external stores in HDFS on-the-fly
[ https://issues.apache.org/jira/browse/HDFS-14805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245613#comment-17245613 ]

Feilong He commented on HDFS-14805:
-----------------------------------

To push this feature further, we filed another Jira (HDFS-15714) that incorporates this one. We developed the feature based on the pending patches in the community from [~virajith], [~ehiggs], and others. The basic design remains almost unchanged. Thanks to these folks for their great contributions.

> Mounting external stores in HDFS on-the-fly
> -------------------------------------------
>
> Key: HDFS-14805
> URL: https://issues.apache.org/jira/browse/HDFS-14805
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Virajith Jalaparti
> Priority: Major
> Attachments: dynamic-mounts-in-hdfs.pdf
>
> Provided storage (HDFS-9806) allows HDFS to address data in external storage
> systems, including cloud stores. Data mounted in this manner seamlessly
> appears to be part of HDFS for applications/clients. The external data can
> also be cached by HDFS on local disks and SSDs, accelerating remote data
> reads (HDFS-13069).
> However, Provided storage was originally targeted at ephemeral HDFS
> deployments in the cloud (e.g., Azure HDInsight). Long-running HDFS clusters
> are common in many other scenarios that can benefit from accessing data in
> remote stores. This JIRA targets such scenarios and aims to provide the
> ability to:
> (a) Dynamically mount external stores in an HDFS cluster while supporting
> high availability.
> (b) Mount multiple remote stores simultaneously.
> (c) Reduce deployment overheads and simplify usability of Provided storage.
[jira] [Comment Edited] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245117#comment-17245117 ]

Feilong He edited comment on HDFS-15714 at 12/7/20, 10:27 AM:
--------------------------------------------------------------

The whole patch has been uploaded. We can split it into several smaller patches in the future. The design doc and performance doc have also been uploaded. Please review them. Any comment is welcome!

was (Author: philohe):
The whole patch has been uploaded. We can divide it into several patches in the future. The design doc and performance doc have also been uploaded. Please have a review.

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> ---------------------------------------------------------
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-15714-01.patch, HDFS_Provided_Storage_Design-V1.pdf,
> HDFS_Provided_Storage_Performance-V1.pdf
>
> HDFS Provided Storage (PS) is a feature that tiers HDFS over other file
> systems. HDFS-9806 introduced the PROVIDED storage type. By configuring
> external storage with the PROVIDED tag on a DataNode, users can let
> applications access externally stored data through HDFS. However, two issues
> need to be addressed. First, mounting external storage on the fly (dynamic
> mount) is not supported; it is needed to flexibly combine HDFS with external
> storage at runtime. Second, PS write is not supported by current HDFS, yet in
> real applications it is common to transfer data in both directions (read and
> write) between HDFS and external storage.
>
> Through this JIRA, we present our work on PS write support and dynamic mount
> support for both read and write. Note that several JIRAs have already been
> filed in the community on these topics. Our work builds on that previous
> community work, with a new design and implementation that supports a
> so-called writeBack mount and lets an admin add any mount on the fly. We
> appreciate those folks in the community for their great contributions! See
> their pending JIRAs: HDFS-14805 and HDFS-12090.
[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15714:
------------------------------
    Description:
HDFS Provided Storage (PS) is a feature that tiers HDFS over other file systems. HDFS-9806 introduced the PROVIDED storage type. By configuring external storage with the PROVIDED tag on a DataNode, users can let applications access externally stored data through HDFS. However, two issues need to be addressed. First, mounting external storage on the fly (dynamic mount) is not supported; it is needed to flexibly combine HDFS with external storage at runtime. Second, PS write is not supported by current HDFS, yet in real applications it is common to transfer data in both directions (read and write) between HDFS and external storage.

Through this JIRA, we present our work on PS write support and dynamic mount support for both read and write. Note that several JIRAs have already been filed in the community on these topics. Our work builds on that previous community work, with a new design and implementation that supports a so-called writeBack mount and lets an admin add any mount on the fly. We appreciate those folks in the community for their great contributions! See their pending JIRAs: HDFS-14805 and HDFS-12090.

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> ---------------------------------------------------------
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-15714-01.patch, HDFS_Provided_Storage_Design-V1.pdf,
> HDFS_Provided_Storage_Performance-V1.pdf
[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245117#comment-17245117 ]

Feilong He commented on HDFS-15714:
-----------------------------------

The whole patch has been uploaded. We can split it into several patches in the future. The design doc and performance doc have also been uploaded. Please review them.

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> ---------------------------------------------------------
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-15714-01.patch, HDFS_Provided_Storage_Design-V1.pdf,
> HDFS_Provided_Storage_Performance-V1.pdf
[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15714:
------------------------------
    Attachment: HDFS-15714-01.patch

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> ---------------------------------------------------------
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-15714-01.patch, HDFS_Provided_Storage_Design-V1.pdf,
> HDFS_Provided_Storage_Performance-V1.pdf
[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15714:
------------------------------
    Attachment: HDFS_Provided_Storage_Design-V1.pdf

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> ---------------------------------------------------------
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS_Provided_Storage_Design-V1.pdf,
> HDFS_Provided_Storage_Performance-V1.pdf
[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15714:
------------------------------
    Attachment: HDFS_Provided_Storage_Performance-V1.pdf

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> ---------------------------------------------------------
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, namenode
> Affects Versions: 3.4.0
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS_Provided_Storage_Design-V1.pdf,
> HDFS_Provided_Storage_Performance-V1.pdf
[jira] [Created] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
Feilong He created HDFS-15714:
---------------------------------

    Summary: HDFS Provided Storage Read/Write Mount Support On-the-fly
    Key: HDFS-15714
    URL: https://issues.apache.org/jira/browse/HDFS-15714
    Project: Hadoop HDFS
    Issue Type: New Feature
    Components: datanode, namenode
    Affects Versions: 3.4.0
    Reporter: Feilong He
    Assignee: Feilong He
[jira] [Commented] (HDFS-7343) HDFS smart storage management
[ https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210639#comment-17210639 ]

Feilong He commented on HDFS-7343:
----------------------------------

Hi Brahma, we currently have no plan to merge this feature upstream. We maintain the project in a separate repo. See https://github.com/Intel-bigdata/SSM

> HDFS smart storage management
> -----------------------------
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: Wei Zhou
> Priority: Major
> Attachments: HDFS-Smart-Storage-Management-update.pdf,
> HDFS-Smart-Storage-Management.pdf,
> HDFSSmartStorageManagement-General-20170315.pdf,
> HDFSSmartStorageManagement-Phase1-20170315.pdf, access_count_tables.jpg,
> move.jpg, tables_in_ssm.xlsx
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and
> flexible storage policy engine considering file attributes, metadata, data
> temperature, storage type, EC codec, available hardware capabilities,
> user/application preference, etc.
> Modified the title for re-purpose.
> We'd extend this effort a bit and aim to work on a comprehensive solution
> that provides a smart storage management service for convenient, intelligent,
> and effective use of erasure coding or replicas, the HDFS cache facility, HSM
> offerings, and all kinds of tools (balancer, mover, disk balancer, and so on)
> in a large cluster.
[jira] [Created] (HDFS-15337) Support available space choosing policy in HDFS Persistent Memory Cache
Feilong He created HDFS-15337:
---------------------------------

    Summary: Support available space choosing policy in HDFS Persistent Memory Cache
    Key: HDFS-15337
    URL: https://issues.apache.org/jira/browse/HDFS-15337
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: caching, datanode
    Reporter: Feilong He
    Assignee: Feilong He

In HDFS-13762, we introduced the HDFS Persistent Memory Cache feature. In that implementation, if the user specifies more than one persistent memory volume, a simple round-robin policy picks the volume on which to cache data. When volume capacities differ widely, this can lead to imbalanced usage across volumes.
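
To make the imbalance concrete, here is a minimal sketch in Python (not the Hadoop code, which is Java). The `Volume` class, both chooser classes, the mount paths, and the "pick the volume with the most free space" rule are all assumptions made for illustration; the issue text does not say which policy the eventual patch uses.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical stand-in for one persistent memory cache volume."""
    path: str
    capacity: int  # total bytes
    used: int = 0  # bytes currently holding cached data

    @property
    def available(self) -> int:
        return self.capacity - self.used


class RoundRobinChooser:
    """Cycle through the volumes in order, skipping any that lack room."""

    def __init__(self, volumes):
        self.volumes = list(volumes)
        self._next = 0

    def choose(self, block_size: int) -> Volume:
        n = len(self.volumes)
        for i in range(n):
            vol = self.volumes[(self._next + i) % n]
            if vol.available >= block_size:
                self._next = (self._next + i + 1) % n
                return vol
        raise IOError("no volume has enough space")


class MostAvailableChooser:
    """Assumed alternative: always pick the volume with the most free space."""

    def __init__(self, volumes):
        self.volumes = list(volumes)

    def choose(self, block_size: int) -> Volume:
        candidates = [v for v in self.volumes if v.available >= block_size]
        if not candidates:
            raise IOError("no volume has enough space")
        return max(candidates, key=lambda v: v.available)


def cache_blocks(chooser, n_blocks: int, block_size: int) -> None:
    """Place n_blocks blocks of block_size bytes using the given chooser."""
    for _ in range(n_blocks):
        chooser.choose(block_size).used += block_size
```

With one volume of 100 bytes and one of 1000 bytes, caching 18 blocks of 10 bytes under round-robin leaves the small volume 90% full while the large one is only 9% full. The space-aware chooser sends all 18 blocks to the large volume in this scenario.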
[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset
[ https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15080:
------------------------------
    Description:
Some applications can read a segment of pmem cache with an offset specified. The previous implementation for pmem cache read with DirectByteBuffer didn't cover this situation.

Let me explain further. In our test, we used Spark SQL to run a TPC-DS workload that reads the cached data, and it hit a read exception. This was due to the missing seek offset argument, which Spark SQL uses to read data packet by packet.

was:
Some applications can read a segment of pmem cache with an offset specified. The previous implementation for pmem cache read with DirectByteBuffer didn't cover this situation.

> Fix the issue in reading persistent memory cache with an offset
> ---------------------------------------------------------------
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
> Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch,
> HDFS-15080-branch-3.2-000.patch
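
As an illustration of what an offset-aware read looks like, the following Python sketch maps a file (standing in for a cached block on a pmem volume) and returns a segment starting at a caller-supplied offset. It is only an analogy for the Java DirectByteBuffer path mentioned above; `read_segment` and `read_in_packets` are hypothetical names, not functions from the patch.

```python
import mmap
import os


def read_segment(cache_path: str, offset: int, length: int) -> bytes:
    """Return up to `length` bytes of the cached block, starting at `offset`."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    with open(cache_path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if offset > size:
            raise ValueError("offset is past the end of the cached block")
        if size == 0:
            return b""  # an empty file cannot be memory-mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Honor the offset; ignoring it would always return the block's head.
            return mapped[offset:offset + length]


def read_in_packets(cache_path: str, packet_size: int):
    """Yield the cached block packet by packet, advancing the offset each time."""
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    offset = 0
    while True:
        packet = read_segment(cache_path, offset, packet_size)
        if not packet:
            break
        yield packet
        offset += len(packet)
```

A reader that dropped the `offset` argument would return the first packet over and over, which matches the kind of failure a packet-by-packet consumer would run into.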
[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset
[ https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15080:
------------------------------
    Attachment: HDFS-15080-branch-3.2-000.patch

> Fix the issue in reading persistent memory cache with an offset
> ---------------------------------------------------------------
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
> Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch,
> HDFS-15080-branch-3.2-000.patch
>
> Some applications can read a segment of pmem cache with an offset specified.
> The previous implementation for pmem cache read with DirectByteBuffer didn't
> cover this situation.
[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset
[ https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15080:
------------------------------
    Attachment: HDFS-15080-branch-3.1-000.patch

> Fix the issue in reading persistent memory cache with an offset
> ---------------------------------------------------------------
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
> Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch
>
> Some applications can read a segment of pmem cache with an offset specified.
> The previous implementation for pmem cache read with DirectByteBuffer didn't
> cover this situation.
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003132#comment-17003132 ]

Feilong He commented on HDFS-14740:
-----------------------------------

[^HDFS-14740.009.patch], [^HDFS-14740-branch-3.1-001.patch], and [^HDFS-14740-branch-3.2-001.patch] were uploaded with some code refactoring. We will consider checking them in over the next few days. If you have any suggestions, please feel free to post them.

> Recover data blocks from persistent memory read cache during datanode restarts
> ------------------------------------------------------------------------------
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch,
> HDFS-14740-branch-3.1-001.patch, HDFS-14740-branch-3.2-000.patch,
> HDFS-14740-branch-3.2-001.patch, HDFS-14740.000.patch, HDFS-14740.001.patch,
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch,
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch,
> HDFS-14740.008.patch, HDFS-14740.009.patch,
> HDFS_Persistent_Read-Cache_Design-v1.pdf,
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf,
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
> HDFS-13762 enabled persistent memory (PM) in HDFS centralized cache
> management. Even though PM can persist cache data, to simplify the initial
> implementation, previously cached data is cleaned up when the DataNode
> restarts. Here, we propose to improve the HDFS PM cache by taking advantage
> of PM's data persistence, i.e., recovering the status of cached data, if any,
> when the DataNode restarts, so that users are spared the cache warm-up time.
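
The recovery idea in the quoted description can be sketched roughly as follows, in Python rather than the project's Java. Everything here is an assumption for illustration: the function name, the returned mapping, and in particular the `<blockPoolId>-<blockId>` cache file naming, which the issue text does not specify.

```python
import os


def recover_cached_blocks(volume_dirs):
    """Rebuild an in-memory cache index from files left on the cache volumes.

    Assumes (hypothetically) that each cached block is stored as a file named
    "<blockPoolId>-<blockId>". Returns a dict mapping (block_pool_id, block_id)
    to (path, size_in_bytes).
    """
    recovered = {}
    for volume in volume_dirs:
        for name in sorted(os.listdir(volume)):
            path = os.path.join(volume, name)
            # Block pool IDs may contain dashes, so split on the last one.
            pool_id, sep, block_id = name.rpartition("-")
            is_block_file = (
                bool(sep)
                and bool(pool_id)
                and block_id.isascii()
                and block_id.isdigit()
                and os.path.isfile(path)
            )
            if not is_block_file:
                continue  # ignore anything that does not look like a cached block
            recovered[(pool_id, int(block_id))] = (path, os.path.getsize(path))
    return recovered
```

Scanning at startup instead of deleting the files is what saves the warm-up time. A real implementation would also need to validate recovered files against current block metadata, which this sketch omits.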
[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset
[ https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15080:
------------------------------
    Fix Version/s: 3.2.2
                   3.1.4
                   3.3.0

> Fix the issue in reading persistent memory cache with an offset
> ---------------------------------------------------------------
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
> Attachments: HDFS-15080-000.patch
>
> Some applications can read a segment of pmem cache with an offset specified.
> The previous implementation for pmem cache read with DirectByteBuffer didn't
> cover this situation.
[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset
[ https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15080:
------------------------------
    Description: Some applications can read a segment of pmem cache with an offset specified. The previous implementation for pmem cache read with DirectByteBuffer didn't cover this situation. (was: Some applications can read a segment of pmem cache with an offset specified. The previous implementation didn't cover this situation.)

> Fix the issue in reading persistent memory cache with an offset
> ---------------------------------------------------------------
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-15080-000.patch
[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset
[ https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15080:
------------------------------
    Description: Some applications can read a segment of pmem cache with an offset specified. The previous implementation didn't cover this situation.

> Fix the issue in reading persistent memory cache with an offset
> ---------------------------------------------------------------
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-15080-000.patch
[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset
[ https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-15080:
------------------------------
    Attachment: HDFS-15080-000.patch
    Status: Patch Available (was: Open)

> Fix the issue in reading persistent memory cache with an offset
> ---------------------------------------------------------------
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-15080-000.patch
[jira] [Created] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset
Feilong He created HDFS-15080:
---------------------------------

    Summary: Fix the issue in reading persistent memory cache with an offset
    Key: HDFS-15080
    URL: https://issues.apache.org/jira/browse/HDFS-15080
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: caching, datanode
    Reporter: Feilong He
    Assignee: Feilong He
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-14740:
------------------------------
    Attachment: HDFS-14740-branch-3.1-001.patch

> Recover data blocks from persistent memory read cache during datanode restarts
> ------------------------------------------------------------------------------
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch,
> HDFS-14740-branch-3.1-001.patch, HDFS-14740-branch-3.2-000.patch,
> HDFS-14740-branch-3.2-001.patch, HDFS-14740.000.patch, HDFS-14740.001.patch,
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch,
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch,
> HDFS-14740.008.patch, HDFS-14740.009.patch,
> HDFS_Persistent_Read-Cache_Design-v1.pdf,
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf,
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
> HDFS-13762 enabled persistent memory (PM) in HDFS centralized cache
> management. Even though PM can persist cache data, to simplify the initial
> implementation, previously cached data is cleaned up when the DataNode
> restarts. Here, we propose to improve the HDFS PM cache by taking advantage
> of PM's data persistence, i.e., recovering the status of cached data, if any,
> when the DataNode restarts, so that users are spared the cache warm-up time.
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feilong He updated HDFS-14740:
------------------------------
    Attachment: HDFS-14740-branch-3.2-001.patch

> Recover data blocks from persistent memory read cache during datanode restarts
> ------------------------------------------------------------------------------
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch,
> HDFS-14740-branch-3.2-000.patch, HDFS-14740-branch-3.2-001.patch,
> HDFS-14740.000.patch, HDFS-14740.001.patch, HDFS-14740.002.patch,
> HDFS-14740.003.patch, HDFS-14740.004.patch, HDFS-14740.005.patch,
> HDFS-14740.006.patch, HDFS-14740.007.patch, HDFS-14740.008.patch,
> HDFS-14740.009.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf,
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf,
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
> HDFS-13762 enabled persistent memory (PM) in HDFS centralized cache
> management. Even though PM can persist cache data, to simplify the initial
> implementation, previously cached data is cleaned up when the DataNode
> restarts. Here, we propose to improve the HDFS PM cache by taking advantage
> of PM's data persistence, i.e., recovering the status of cached data, if any,
> when the DataNode restarts, so that users are spared the cache warm-up time.
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: HDFS-14740.009.patch > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740-branch-3.1-000.patch, > HDFS-14740-branch-3.2-000.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS-14740.009.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: HDFS-14740-branch-3.1-000.patch > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740-branch-3.1-000.patch, > HDFS-14740-branch-3.2-000.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: (was: HDFS-14740-branch-3.1-000.patch) > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740-branch-3.1-000.patch, > HDFS-14740-branch-3.2-000.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999833#comment-16999833 ] Feilong He commented on HDFS-14740: --- [^HDFS-14740-branch-3.1-000.patch] & [^HDFS-14740-branch-3.2-000.patch] have been uploaded, respectively for backporting the code to branch-3.1 and branch-3.2. > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740-branch-3.1-000.patch, > HDFS-14740-branch-3.2-000.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: HDFS-14740-branch-3.2-000.patch > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740-branch-3.1-000.patch, > HDFS-14740-branch-3.2-000.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: HDFS-14740-branch-3.1-000.patch > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740-branch-3.1-000.patch, HDFS-14740.000.patch, > HDFS-14740.001.patch, HDFS-14740.002.patch, HDFS-14740.003.patch, > HDFS-14740.004.patch, HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS-14740.007.patch, HDFS-14740.008.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997937#comment-16997937 ] Feilong He commented on HDFS-14740: --- Thanks [~rakeshr] for your suggestion. '{{dfs.datanode.pmem.cache.restore}}' and '{{dfs.datanode.pmem.cache.dirs}}' look good to me. [^HDFS-14740.008.patch] has some updates covering this. > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
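As a rough illustration of how the two property names agreed on above might appear in a site configuration, the sketch below renders them as a Hadoop-style configuration fragment. Only the property names come from this thread. The helper `build_hdfs_site_fragment`, the comma-separated value format, and the mount points are assumptions made for the example, and the names in a released Hadoop version may differ.

```python
import xml.etree.ElementTree as ET


def build_hdfs_site_fragment(properties: dict) -> str:
    """Render name/value pairs as a Hadoop-style <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Property names as discussed in this thread; the values are assumptions.
pmem_cache_settings = {
    "dfs.datanode.pmem.cache.restore": "true",
    # Hypothetical pmem mount points, assumed to be comma-separated.
    "dfs.datanode.pmem.cache.dirs": "/mnt/pmem0,/mnt/pmem1",
}

fragment = build_hdfs_site_fragment(pmem_cache_settings)
```

Printing `fragment` gives a single-line XML string that could be adapted into an hdfs-site.xml once the final property names are confirmed against the committed patch.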
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: HDFS-14740.008.patch > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989587#comment-16989587 ] Feilong He commented on HDFS-14740: --- [^HDFS-14740.007.patch] has been uploaded to change a property to 'dfs.datanode.cache.restore.enabled'. Comment is welcome! > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: HDFS-14740.007.patch > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989573#comment-16989573 ] Feilong He commented on HDFS-14740: ---
Thanks [~rakeshr] so much for your comments. Sorry for the late reply.
# Yes, 'dfs.datanode.cache.persistence.enabled' looks a bit ambiguous to users. This property controls whether the cache on pmem should be restored, to avoid unnecessarily pulling data to pmem again after the DataNode restarts. I prefer to use 'dfs.datanode.cache.restore.enabled'. If you have any other comments, please kindly let me know.
# I have conducted some tests on the case you mentioned.
1) In the first test, a file was cached to pmem by HDFS with the above flag set to true. Then I shut down the cluster and set the flag to false. After restarting the cluster, I noted that the previous cache was dropped from pmem and the DataNode had to re-cache the block data to pmem, as expected.
2) In the second test, a file was first cached to pmem by HDFS with the above flag set to false. Then I shut down the cluster and set the flag to true. While the DataNode was restarting, I could see that the previous cache was restored, as expected.
To sum up, the behavior in the two tests aligns with the purpose of this flag.
> Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
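The two restart tests described in the comment above can be modeled with a small sketch. This is a toy model, not DataNode code: `PmemCacheModel`, `cache_block`, `restart`, and `run_scenario` are hypothetical names. The only behavior taken from the thread is that the flag value in effect at restart decides whether previously cached blocks are restored or dropped.

```python
from dataclasses import dataclass, field


@dataclass
class PmemCacheModel:
    """Toy model of a DataNode's pmem read cache (hypothetical, for illustration)."""

    # Stands in for the proposed 'dfs.datanode.cache.restore.enabled' property.
    restore_enabled: bool = False
    # Block IDs whose data currently sits on the persistent memory device.
    on_pmem: set = field(default_factory=set)
    # Block IDs the running DataNode currently treats as cached.
    cached: set = field(default_factory=set)

    def cache_block(self, block_id: int) -> None:
        """Cache a block: its data is written to pmem and tracked as cached."""
        self.on_pmem.add(block_id)
        self.cached.add(block_id)

    def restart(self, restore_enabled: bool) -> None:
        """Simulate a restart with a possibly changed flag value."""
        self.restore_enabled = restore_enabled
        if restore_enabled:
            # Data survived on pmem, so the cache state is recovered as-is.
            self.cached = set(self.on_pmem)
        else:
            # Previous cache data is cleaned up; blocks must be re-cached.
            self.on_pmem.clear()
            self.cached = set()


def run_scenario(flag_before: bool, flag_after: bool) -> set:
    """Cache one block, restart with a new flag value, return cached block IDs."""
    node = PmemCacheModel(restore_enabled=flag_before)
    node.cache_block(1001)
    node.restart(restore_enabled=flag_after)
    return node.cached
```

Calling `run_scenario(True, False)` corresponds to test 1 (cache dropped after restart), and `run_scenario(False, True)` corresponds to test 2 (cache restored after restart). In this model, as in the tests, the flag value at caching time does not matter.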
[jira] [Assigned] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He reassigned HDFS-14740: - Assignee: Feilong He (was: Rui Mo) > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974816#comment-16974816 ] Feilong He commented on HDFS-14740: --- [^HDFS_Persistent_Read-Cache_Test-v2.pdf] has been uploaded for your reference. > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: HDFS_Persistent_Read-Cache_Test-v2.pdf > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) HDFS read cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958672#comment-16958672 ] Feilong He commented on HDFS-14740: --- [^HDFS_Persistent_Read-Cache_Design-v1.pdf] and [^HDFS_Persistent_Read-Cache_Test-v1.pdf] have been uploaded. Any comment is welcome! > HDFS read cache persistence support > --- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14740) HDFS read cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: HDFS_Persistent_Read-Cache_Test-v1.pdf HDFS_Persistent_Read-Cache_Design-v1.pdf > HDFS read cache persistence support > --- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) HDFS read cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950849#comment-16950849 ] Feilong He commented on HDFS-14740: --- [~Rui Mo], please prepare a design doc and a test doc, then upload them to this Jira. Thanks! > HDFS read cache persistence support > --- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14905) Backport HDFS persistent memory read cache support to branch-3.2
[ https://issues.apache.org/jira/browse/HDFS-14905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14905: -- Attachment: HDFS-14905-branch-3.2-000.patch Status: Patch Available (was: Open) > Backport HDFS persistent memory read cache support to branch-3.2 > > > Key: HDFS-14905 > URL: https://issues.apache.org/jira/browse/HDFS-14905 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14905-branch-3.2-000.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14905) Backport HDFS persistent memory read cache support to branch-3.2
Feilong He created HDFS-14905: - Summary: Backport HDFS persistent memory read cache support to branch-3.2 Key: HDFS-14905 URL: https://issues.apache.org/jira/browse/HDFS-14905 Project: Hadoop HDFS Issue Type: Improvement Components: caching, datanode Reporter: Feilong He Assignee: Feilong He Fix For: 3.3.0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1
[ https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14745: -- Attachment: HDFS-14745-branch-3.1-003.patch > Backport HDFS persistent memory read cache support to branch-3.1 > > > Key: HDFS-14745 > URL: https://issues.apache.org/jira/browse/HDFS-14745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: cache, datanode > Fix For: 3.3.0 > > Attachments: HDFS-14745-branch-3.1-000.patch, > HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch, > HDFS-14745-branch-3.1-003.patch > > > We are proposing to backport the patches for HDFS-13762, HDFS persistent > memory read cache support, to branch-3.1. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14518) Optimize HDFS cache checksum and make checksum enabling configurable
[ https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851456#comment-16851456 ] Feilong He edited comment on HDFS-14518 at 10/8/19 8:51 AM: Hi [~weichiu], this Jira is common to DRAM cache and Pmem cache. So strictly speaking, it is not only related to HDFS-13762, and the original DRAM cache will also be affected. was (Author: philohe): Hi [~weichiu], this Jira is common to DRAM cache and Pmem cache. So strictly speaking, it is not only related to HDFS-13762, but the original DRAM cache will also be affected. > Optimize HDFS cache checksum and make checksum enabling configurable > > > Key: HDFS-14518 > URL: https://issues.apache.org/jira/browse/HDFS-14518 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-14518-.patch > > > HDFS cache checksum can be operated on cached data for verification. And we > can also consider to make checksum configurable, thus user can shutdown > checksum operation when caching data. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14518) Optimize HDFS cache checksum and make checksum enabling configurable
[ https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14518: -- Summary: Optimize HDFS cache checksum and make checksum enabling configurable (was: Optimize HDFS cache checksum and make checksum configurable) > Optimize HDFS cache checksum and make checksum enabling configurable > > > Key: HDFS-14518 > URL: https://issues.apache.org/jira/browse/HDFS-14518 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-14518-.patch > > > HDFS cache checksum can be operated on cached data for verification. And we > can also consider to make checksum configurable, thus user can shutdown > checksum operation when caching data. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14518) Optimize HDFS cache checksum and make checksum configurable
[ https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939242#comment-16939242 ] Feilong He commented on HDFS-14518: --- [^HDFS-14518-.patch] is an initial patch; the native PMDK implementation for caching blocks to PMEM is not included yet. It looks like the size of the buffer used for the checksum can have an evident impact on performance, so it may also need to be optimized. Any comments are welcome! > Optimize HDFS cache checksum and make checksum configurable > --- > > Key: HDFS-14518 > URL: https://issues.apache.org/jira/browse/HDFS-14518 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-14518-.patch > > > HDFS cache checksum can be operated on cached data for verification. And we > can also consider to make checksum configurable, thus user can shutdown > checksum operation when caching data. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
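The comment above touches on two knobs: whether the checksum runs at all when a block is cached, and how large the buffer is that feeds the checksum. A minimal Python sketch of that idea follows. It is not the HDFS implementation: the function name `cache_block_checksum`, the `enabled` and `buffer_size` parameters, and the use of zlib's CRC32 are all assumptions made for illustration only.

```python
import zlib


def cache_block_checksum(data, buffer_size=64 * 1024, enabled=True):
    """Return a CRC32 over `data`, read `buffer_size` bytes at a time.

    Hypothetical helper, not HDFS code. Returns None when the checksum is
    switched off, mirroring the idea of a configurable checksum.
    """
    if not enabled:
        return None
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")

    crc = 0
    view = memoryview(data)  # slicing a memoryview avoids copying each chunk
    for offset in range(0, len(view), buffer_size):
        # Feed the running CRC back in so the chunks combine into one value.
        crc = zlib.crc32(view[offset:offset + buffer_size], crc)
    return crc
```

The buffer size changes only how many calls are made, not the resulting value, which is why it can be tuned for throughput without affecting verification.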
[jira] [Updated] (HDFS-14518) Optimize HDFS cache checksum and make checksum configurable
[ https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14518: -- Attachment: HDFS-14518-.patch > Optimize HDFS cache checksum and make checksum configurable > --- > > Key: HDFS-14518 > URL: https://issues.apache.org/jira/browse/HDFS-14518 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-14518-.patch > > > HDFS cache checksum can be operated on cached data for verification. And we > can also consider to make checksum configurable, thus user can shutdown > checksum operation when caching data. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14518) Optimize HDFS cache checksum and make checksum configurable
[ https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851456#comment-16851456 ] Feilong He edited comment on HDFS-14518 at 9/27/19 8:11 AM: Hi [~jojochuang], this Jira is common to DRAM cache and Pmem cache. So strictly speaking, it is not only related to HDFS-13762, but the original DRAM cache will also be affected. was (Author: philohe): Hi [~jojochuang], this Jira is common to DRAM cache and Pmem cache. So strictly speaking, it is not related to HDFS-13762. > Optimize HDFS cache checksum and make checksum configurable > --- > > Key: HDFS-14518 > URL: https://issues.apache.org/jira/browse/HDFS-14518 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > > HDFS cache checksum can be operated on cached data for verification. And we > can also consider to make checksum configurable, thus user can shutdown > checksum operation when caching data. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14518) Optimize HDFS cache checksum and make checksum configurable
[ https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14518: -- Description: HDFS cache checksum can be operated on cached data for verification. And we can also consider to make checksum configurable, thus user can shutdown checksum operation when caching data. (was: HDFS cache checksum can be operated on cached data for verification. And we can also consider to make checksum configurable, so user can shutdown checksum operation when caching data.) > Optimize HDFS cache checksum and make checksum configurable > --- > > Key: HDFS-14518 > URL: https://issues.apache.org/jira/browse/HDFS-14518 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > > HDFS cache checksum can be operated on cached data for verification. And we > can also consider to make checksum configurable, thus user can shutdown > checksum operation when caching data. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1
[ https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939199#comment-16939199 ] Feilong He edited comment on HDFS-14745 at 9/27/19 8:03 AM: [^HDFS-14745-branch-3.1-002.patch] has been uploaded to include the patch for HDFS-14818. was (Author: philohe): [^HDFS-14745-branch-3.1-002.patch] has been uploaded to include the patch HDFS-14818. > Backport HDFS persistent memory read cache support to branch-3.1 > > > Key: HDFS-14745 > URL: https://issues.apache.org/jira/browse/HDFS-14745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: cache, datanode > Fix For: 3.3.0 > > Attachments: HDFS-14745-branch-3.1-000.patch, > HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch > > > We are proposing to backport the patches for HDFS-13762, HDFS persistent > memory read cache support, to branch-3.1. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1
[ https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939199#comment-16939199 ] Feilong He commented on HDFS-14745: --- [^HDFS-14745-branch-3.1-002.patch] has been uploaded to include the patch HDFS-14818. > Backport HDFS persistent memory read cache support to branch-3.1 > > > Key: HDFS-14745 > URL: https://issues.apache.org/jira/browse/HDFS-14745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: cache, datanode > Fix For: 3.3.0 > > Attachments: HDFS-14745-branch-3.1-000.patch, > HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch > > > We are proposing to backport the patches for HDFS-13762, HDFS persistent > memory read cache support, to branch-3.1. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1
[ https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14745: -- Attachment: HDFS-14745-branch-3.1-002.patch > Backport HDFS persistent memory read cache support to branch-3.1 > > > Key: HDFS-14745 > URL: https://issues.apache.org/jira/browse/HDFS-14745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: cache, datanode > Fix For: 3.3.0 > > Attachments: HDFS-14745-branch-3.1-000.patch, > HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch > > > We are proposing to backport the patches for HDFS-13762, HDFS persistent > memory read cache support, to branch-3.1. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934140#comment-16934140 ] Feilong He commented on HDFS-14818: --- The uploaded [^HDFS-14818.004.patch] fixes checkstyle issues and checks the deferring conditions only in the DRAM cache case, since these original deferring conditions apply only to the DRAM cache. > Check native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-14818 > URL: https://issues.apache.org/jira/browse/HDFS-14818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: native >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, > HDFS-14818.002.patch, HDFS-14818.003.patch, HDFS-14818.004.patch, > check_native_after_building_with_PMDK.png, > check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, > check_native_after_building_without_PMDK.png > > > Currently, 'hadoop checknative' command supports checking native libs, such > as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in > the checking. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14818: -- Attachment: HDFS-14818.004.patch > Check native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-14818 > URL: https://issues.apache.org/jira/browse/HDFS-14818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: native >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, > HDFS-14818.002.patch, HDFS-14818.003.patch, HDFS-14818.004.patch, > check_native_after_building_with_PMDK.png, > check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, > check_native_after_building_without_PMDK.png > > > Currently, 'hadoop checknative' command supports checking native libs, such > as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in > the checking. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14818: -- Attachment: HDFS-14818.003.patch > Check native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-14818 > URL: https://issues.apache.org/jira/browse/HDFS-14818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: native >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, > HDFS-14818.002.patch, HDFS-14818.003.patch, > check_native_after_building_with_PMDK.png, > check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, > check_native_after_building_without_PMDK.png > > > Currently, 'hadoop checknative' command supports checking native libs, such > as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in > the checking. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14740) HDFS read cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Component/s: datanode caching > HDFS read cache persistence support > --- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14740) HDFS read cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Description: In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache management. Even though PM can persist cache data, for simplifying the initial implementation, the previous cache data will be cleaned up during DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking advantage of PM's data persistence characteristic, i.e., recovering the status for cached data, if any, when DataNode restarts, thus, cache warm up time can be saved for user. (was: In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache management. Even though PM can persist cache data, for simplifying the initial implementation, the previous cache data will be cleaned up during DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking advantage of PM's data persistence characteristic, i.e., recovering the cache status when DataNode restarts, thus, cache warm up time can be saved for user.) > HDFS read cache persistence support > --- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. 
[jira] [Updated] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14818: -- Attachment: HDFS-14818.002.patch > Check native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-14818 > URL: https://issues.apache.org/jira/browse/HDFS-14818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: native >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, > HDFS-14818.002.patch, check_native_after_building_with_PMDK.png, > check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, > check_native_after_building_without_PMDK.png > > > Currently, 'hadoop checknative' command supports checking native libs, such > as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in > the checking. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931007#comment-16931007 ] Feilong He commented on HDFS-14818: --- [^HDFS-14818.001.patch] has been uploaded, adding some comments for the PMDK support states. > Check native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-14818 > URL: https://issues.apache.org/jira/browse/HDFS-14818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: native >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, > check_native_after_building_with_PMDK.png, > check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, > check_native_after_building_without_PMDK.png > > > Currently, 'hadoop checknative' command supports checking native libs, such > as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in > the checking. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930216#comment-16930216 ] Feilong He edited comment on HDFS-14818 at 9/17/19 1:50 AM: Thanks [~rakeshr] for your comments. To make the effect of the code change clear to reviewers, I posted some screenshots. * The picture below shows the result of 'hadoop checknative' after building WITH PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'. !check_native_after_building_with_PMDK.png! * The picture below shows the result of 'hadoop checknative' after building WITHOUT PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar'. !check_native_after_building_without_PMDK.png! * The picture below shows the result of 'hadoop checknative' after building WITH PMDK, but without this patch's modification to CMakeLists.txt, i.e., still using 'NAME' instead of 'REALPATH'. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'. !check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png! {quote}{{SupportState.PMDK_LIB_NOT_FOUND}} - its unused now, can you remove it. {quote} In some environments, if the PMDK native lib is not found, this state and its message will help the user identify that fact. So I am leaning toward keeping this state. {quote}Any reason to change 'NAME' to 'REALPATH'. {quote} As the third picture above shows, if 'NAME' is used instead of 'REALPATH', only the lib name can be obtained and then printed by 'hadoop checknative'. In this patch, by using 'REALPATH', the real path of the target lib is kept, which I think is more useful to the user. Please refer to [https://cmake.org/cmake/help/v3.15/command/get_filename_component.html]. was (Author: philohe): Thanks [~rakeshr] for your comments. To make the code change effect clear to reviewers, I posted some screenshots. 
* The below picture shows the result of 'hadoop checknative' afer building WITH PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'. !check_native_after_building_with_PMDK.png! * The below picture shows the result of 'hadoop checknative' afer building WITHOUT PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar'. !check_native_after_building_without_PMDK.png! * The below picture shows the result of 'hadoop checknative' afer building WITH PMDK, but shading the modification brought by this patch for CMakeLists.txt, i.e., still use 'NAME' instead of 'REALPATH'. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'. !check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png! {quote}{{SupportState.PMDK_LIB_NOT_FOUND}} - its unused now, can you remove it. {quote} In some env, if the PMDK native lib is nout found, this state and its message will help user to identify the fact. So I am leaning to keep this state. {quote}Any reason to change 'NAME' to 'REALPATH'. {quote} As the above 3rd picture shows, if 'NAME' is used instead of ‘REALPATH', only the lib name can be obtained and then printed by 'hadoop checknative'. In this patch, by using 'REALPATH', the real path of the target lib will be kept, which is more useful to user, I think. Please refer to [https://cmake.org/cmake/help/v3.15/command/get_filename_component.html]. 
> Check native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-14818 > URL: https://issues.apache.org/jira/browse/HDFS-14818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: native >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, > check_native_after_building_with_PMDK.png, > check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, > check_native_after_building_without_PMDK.png > > > Currently, 'hadoop checknative' command supports checking native libs, such > as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in > the checking. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14818: -- Attachment: HDFS-14818.001.patch > Check native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-14818 > URL: https://issues.apache.org/jira/browse/HDFS-14818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: native >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, > check_native_after_building_with_PMDK.png, > check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, > check_native_after_building_without_PMDK.png > > > Currently, 'hadoop checknative' command supports checking native libs, such > as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in > the checking. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930216#comment-16930216 ] Feilong He edited comment on HDFS-14818 at 9/16/19 3:52 AM: Thanks [~rakeshr] for your comments. To make the effect of the code change clear to reviewers, I posted some screenshots. * The picture below shows the result of 'hadoop checknative' after building WITH PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'. !check_native_after_building_with_PMDK.png! * The picture below shows the result of 'hadoop checknative' after building WITHOUT PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar'. !check_native_after_building_without_PMDK.png! * The picture below shows the result of 'hadoop checknative' after building WITH PMDK, but without this patch's modification to CMakeLists.txt, i.e., still using 'NAME' instead of 'REALPATH'. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'. !check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png! {quote}{{SupportState.PMDK_LIB_NOT_FOUND}} - its unused now, can you remove it. {quote} In some environments, if the PMDK native lib is not found, this state and its message will help the user identify that fact. So I am leaning toward keeping this state. {quote}Any reason to change 'NAME' to 'REALPATH'. {quote} As the third picture above shows, if 'NAME' is used instead of 'REALPATH', only the lib name can be obtained and then printed by 'hadoop checknative'. In this patch, by using 'REALPATH', the real path of the target lib is kept, which I think is more useful to the user. Please refer to [https://cmake.org/cmake/help/v3.15/command/get_filename_component.html]. was (Author: philohe): Thanks [~rakeshr] for your comments. To make the code change effect clear to reviewers, I posted some screenshots. 
* The below picture shows the result of 'hadoop checknative' afer building WITH PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'. You could see the path !check_native_after_building_with_PMDK.png! * The below picture shows the result of 'hadoop checknative' afer building WITHOUT PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar'. !check_native_after_building_without_PMDK.png! * The below picture shows the result of 'hadoop checknative' afer building WITH PMDK, but shading the modification brought by this patch for CMakeLists.txt, i.e., still use 'NAME' instead of 'REALPATH'. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'. !check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png! {quote}{{SupportState.PMDK_LIB_NOT_FOUND}} - its unused now, can you remove it. {quote} In some env, if the PMDK native lib is nout found, this state and its message will help user to identify the fact. So I am leaning to keep this state. {quote}Any reason to change 'NAME' to 'REALPATH'. {quote} As the above 3rd picture shows, if 'NAME' is used instead of ‘REALPATH', only the lib name can be obtained and then printed by 'hadoop checknative'. In this patch, by using 'REALPATH', the real path of the target lib will be kept, which is more useful to user, I think. Please refer to [https://cmake.org/cmake/help/v3.15/command/get_filename_component.html]. 
> Check native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-14818 > URL: https://issues.apache.org/jira/browse/HDFS-14818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: native >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14818.000.patch, > check_native_after_building_with_PMDK.png, > check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, > check_native_after_building_without_PMDK.png > > > Currently, 'hadoop checknative' command supports checking native libs, such > as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in > the checking. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930216#comment-16930216 ] Feilong He commented on HDFS-14818: --- Thanks [~rakeshr] for your comments. To make the effect of the code change clear to reviewers, I posted some screenshots. * The picture below shows the result of 'hadoop checknative' after building WITH PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'. You could see the path !check_native_after_building_with_PMDK.png! * The picture below shows the result of 'hadoop checknative' after building WITHOUT PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar'. !check_native_after_building_without_PMDK.png! * The picture below shows the result of 'hadoop checknative' after building WITH PMDK, but without this patch's modification to CMakeLists.txt, i.e., still using 'NAME' instead of 'REALPATH'. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'. !check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png! {quote}{{SupportState.PMDK_LIB_NOT_FOUND}} - its unused now, can you remove it. {quote} In some environments, if the PMDK native lib is not found, this state and its message will help the user identify that fact. So I am leaning toward keeping this state. {quote}Any reason to change 'NAME' to 'REALPATH'. {quote} As the third picture above shows, if 'NAME' is used instead of 'REALPATH', only the lib name can be obtained and then printed by 'hadoop checknative'. In this patch, by using 'REALPATH', the real path of the target lib is kept, which I think is more useful to the user. Please refer to [https://cmake.org/cmake/help/v3.15/command/get_filename_component.html]. 
> Check native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-14818 > URL: https://issues.apache.org/jira/browse/HDFS-14818 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: native >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14818.000.patch, > check_native_after_building_with_PMDK.png, > check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, > check_native_after_building_without_PMDK.png > > > Currently, 'hadoop checknative' command supports checking native libs, such > as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in > the checking. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
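The 'NAME' versus 'REALPATH' distinction discussed in the comment above can be illustrated outside CMake. In CMake's `get_filename_component`, the `NAME` mode yields the file name without its directory, while `REALPATH` yields the full path with symlinks resolved. The Python sketch below is only an analogy using `os.path`, not the patch itself; the helper names and the library file names (`libpmem.so.1`, `libpmem.so.1.0.0`) are hypothetical.

```python
import os
import tempfile


def name_component(path):
    """Analogue of CMake's NAME mode: file name only, directory dropped."""
    return os.path.basename(path)


def realpath_component(path):
    """Analogue of CMake's REALPATH mode: full path, symlinks resolved."""
    return os.path.realpath(path)


def demo():
    """Create a versioned library file plus a symlink and compare both modes."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file names, chosen only to resemble a shared library.
        target = os.path.join(tmp, "libpmem.so.1.0.0")
        link = os.path.join(tmp, "libpmem.so.1")
        open(target, "wb").close()
        os.symlink(target, link)
        # Return the resolved target too, so callers can compare paths even
        # when the temporary directory itself sits behind a symlink.
        return (name_component(link),
                realpath_component(link),
                os.path.realpath(target))
```

With only the name, a user cannot tell which copy of the library was picked up; the resolved path makes that explicit, which is the usability argument made in the comment.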