from:"Feilong He \(Jira\)"

[jira] [Commented] (HDFS-7343) HDFS smart storage management

2022-12-01 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642239#comment-17642239
 ] 

Feilong He commented on HDFS-7343:
--

[~brahmareddy], thanks for your comment.

i) No feature is pending. As you may know, we have built an independent project 
called SSM based on this Jira's design. It is basically production-ready, except 
for some experimental features such as data sync and HA.

ii) No, Kafka and ZK are not required. It is recommended to deploy SSM in the 
HDFS cluster. The only prerequisite is that the user needs to deploy MySQL to 
maintain SSM metadata.

iii) This project is in the maintenance phase. We have no plan to move it into 
HDFS or elsewhere as a subproject, or to make it an Apache incubator project.

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
>Priority: Major
> Attachments: HDFS-Smart-Storage-Management-update.pdf, 
> HDFS-Smart-Storage-Management.pdf, 
> HDFSSmartStorageManagement-General-20170315.pdf, 
> HDFSSmartStorageManagement-Phase1-20170315.pdf, access_count_tables.jpg, 
> move.jpg, tables_in_ssm.xlsx
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine that considers file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preferences, etc.
> Modified the title for re-purpose.
> We'd extend this effort a bit and aim to work on a comprehensive solution 
> that provides a smart storage management service for convenient, 
> intelligent, and effective use of erasure coding or replicas, the HDFS cache 
> facility, HSM offerings, and all kinds of tools (balancer, mover, disk 
> balancer, and so on) in a large cluster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-16014) Fix an issue in checking native pmdk lib by 'hadoop checknative' command

2021-12-07 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-16014:
--
Summary: Fix an issue in checking native pmdk lib by 'hadoop checknative' 
command  (was: Issue in checking native pmdk lib by 'hadoop checknative' 
command)

> Fix an issue in checking native pmdk lib by 'hadoop checknative' command
> 
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch, HDFS-16014-02.patch
>
>
> In HDFS-14818, we proposed a patch to support checking the native pmdk lib. 
> The goal is to show the user whether the pmdk lib is loaded. Recently, it was 
> found that even when the pmdk lib was not actually loaded, the 
> `hadoop checknative` command still reported that it was. The issue can be 
> reproduced by moving libpmem.so* from the installed path to another location, 
> or by deleting these libs, after the project is built.




[jira] [Commented] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-12-05 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453751#comment-17453751
 ] 

Feilong He commented on HDFS-16014:
---

[~rakeshr], please check the latest QA report. It looks good.

> Issue in checking native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch, HDFS-16014-02.patch
>
>




[jira] [Commented] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-12-01 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452202#comment-17452202
 ] 

Feilong He commented on HDFS-16014:
---

[~rakeshr], thanks for your review!

I just uploaded the same patch again to trigger a fresh QA check.

> Issue in checking native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch, HDFS-16014-02.patch
>
>




[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-12-01 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-16014:
--
Attachment: HDFS-16014-02.patch

> Issue in checking native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch, HDFS-16014-02.patch
>
>




[jira] [Commented] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-11-15 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444301#comment-17444301
 ] 

Feilong He commented on HDFS-15788:
---

[~rakeshr], this patch just updates the documentation to align with the 
implementation. If you have any comments, please let me know.

> Correct the statement for pmem cache to reflect cache persistence support
> -
>
> Key: HDFS-15788
> URL: https://issues.apache.org/jira/browse/HDFS-15788
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch
>
>
> Correct the statement for pmem cache to reflect cache persistence support.




[jira] [Commented] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-11-15 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444297#comment-17444297
 ] 

Feilong He commented on HDFS-16014:
---

[~rakeshr], do you have any comments on this patch?

> Issue in checking native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch
>
>




[jira] [Resolved] (HDFS-14480) Shut down DataNode gracefully when responding to stop-dfs.sh/stop-dfs.cmd

2021-11-15 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He resolved HDFS-14480.
---
Resolution: Won't Fix

> Shut down DataNode gracefully when responding to stop-dfs.sh/stop-dfs.cmd
> -
>
> Key: HDFS-14480
> URL: https://issues.apache.org/jira/browse/HDFS-14480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>
> Currently, DataNode has a #shutdown method to handle cleanup before shutdown. 
> But its shutdown hook doesn't call this method. In HDFS-14401, for HDFS 
> persistent memory cache optimization, we added cache cleanup logic to the DN's 
> #shutdown method. We expect the DN to clean up the cache during shutdown via 
> stop-dfs.sh/stop-dfs.cmd, which depends on this Jira's patch.
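
To illustrate the gap described above, here is a minimal Java sketch of a daemon whose JVM shutdown hook delegates to its own shutdown method, so that cleanup also runs when the process exits normally or receives a termination signal. The class `SketchNode`, its fields, and `buildShutdownHook` are hypothetical names used only for illustration; this is not DataNode's actual code or the patch for this Jira.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a daemon that must clean up a cache on exit. */
public class SketchNode {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final AtomicBoolean cacheCleaned = new AtomicBoolean(false);

    /** Cleanup logic; idempotent, so an explicit call plus the hook is safe. */
    public void shutdown() {
        if (stopped.compareAndSet(false, true)) {
            cacheCleaned.set(true); // placeholder for real cache cleanup
        }
    }

    public boolean isCacheCleaned() {
        return cacheCleaned.get();
    }

    /** Builds (but does not register) a hook thread that calls shutdown(). */
    public Thread buildShutdownHook() {
        return new Thread(this::shutdown, "sketch-node-shutdown-hook");
    }

    public static void main(String[] args) {
        SketchNode node = new SketchNode();
        // Registering the hook makes the JVM run shutdown() while it exits.
        Runtime.getRuntime().addShutdownHook(node.buildShutdownHook());
        System.out.println("running; cleanup will happen at JVM exit");
    }
}
```

Keeping `shutdown()` idempotent matters in this kind of design, because the method may be reached both through an explicit call and through the hook.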




[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-11-15 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444290#comment-17444290
 ] 

Feilong He commented on HDFS-15714:
---

Uploaded [^HDFS-15714-02.patch] with two new commits that make the following 
fixes:
1) Exclude provided storage when setting up the pipeline for an append operation.
2) Fix the sync failure for truncated data with a provided replica.
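
As a rough illustration of fix 1), the following Java sketch filters PROVIDED locations out of the candidates for an append pipeline. `AppendPipelineSketch`, `StorageKind`, `Location`, and `pipelineForAppend` are hypothetical, simplified names, not the classes touched by the patch; the sketch shows only the filtering idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AppendPipelineSketch {
    /** Simplified storage kinds (hypothetical; not the real HDFS enum). */
    public enum StorageKind { DISK, SSD, ARCHIVE, PROVIDED }

    /** Hypothetical replica location: a datanode name plus its storage kind. */
    public static final class Location {
        public final String datanode;
        public final StorageKind kind;

        public Location(String datanode, StorageKind kind) {
            this.datanode = datanode;
            this.kind = kind;
        }
    }

    /** Returns the candidates in order, skipping any PROVIDED location. */
    public static List<Location> pipelineForAppend(List<Location> candidates) {
        List<Location> pipeline = new ArrayList<>();
        for (Location loc : candidates) {
            if (loc.kind != StorageKind.PROVIDED) {
                pipeline.add(loc);
            }
        }
        return pipeline;
    }

    public static void main(String[] args) {
        List<Location> candidates = Arrays.asList(
            new Location("dn1", StorageKind.DISK),
            new Location("dn2", StorageKind.PROVIDED),
            new Location("dn3", StorageKind.SSD));
        for (Location loc : pipelineForAppend(candidates)) {
            System.out.println(loc.datanode + " " + loc.kind);
        }
    }
}
```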

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15714-01.patch, HDFS-15714-02.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. 
> In HDFS-9806, the PROVIDED storage type was introduced to HDFS. By 
> configuring external storage with the PROVIDED tag for a DataNode, users can 
> enable applications to access externally stored data from the HDFS side. 
> However, two issues need to be addressed. First, mounting external storage 
> on-the-fly, namely dynamic mount, is lacking. It is needed to flexibly 
> combine HDFS with external storage at runtime. 
> Second, PS write is not supported by current HDFS, yet in real 
> applications it is common to transfer data bi-directionally for read/write 
> between HDFS and external storage.
> Through this JIRA, we are presenting our work on PS write support and 
> dynamic mount support for both read & write. Please note that several JIRAs 
> have been filed in the community for these topics. Our work is based on this 
> previous community work, with a new design & implementation to support the 
> so-called writeBack mount and to enable admins to add any mount on-the-fly. 
> We appreciate those folks in the community for their great contributions! 
> See their pending JIRAs: HDFS-14805 & HDFS-12090.




[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-11-15 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15714:
--
Attachment: HDFS-15714-02.patch

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15714-01.patch, HDFS-15714-02.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




[jira] [Comment Edited] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-06-04 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357118#comment-17357118
 ] 

Feilong He edited comment on HDFS-15714 at 6/4/21, 7:57 AM:


Hi [~bpatel], sorry for the late reply.

The relevant code path is shown below.
{code:java}
ReadMountManager: FSMountAttrOp.addRemotePaths -> FSMountAttrOp: w.addToEdits 
-> MountEditLogWriter: createFile{code}
In {{MountEditLogWriter#createFile}}, we can see that an {{HdfsFileStatus}} is 
created based on the {{remoteStatus}} obtained from remote storage, which is 
like creating a normal HDFS file except that the data is stored outside HDFS. 
*Actually, the remote file's own modification time is not used or kept in HDFS*. 
My previous reply may have been ambiguous.

I just did a simple test to verify it: compare a file (object)'s modification 
time in S3 with that in HDFS after the S3 bucket containing the file is mounted 
to HDFS. The result is that they differ, which is consistent with the code 
analysis. The modification time of the file in HDFS is the time HDFS generates 
when responding to the user's mount request.

For {{readOnly}} mount mode, mounted data cannot be changed from the HDFS side, 
so its modification time stays unchanged on HDFS. It is the same as the 
creation time.
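
The behavior described above can be sketched in Java as follows. This is a simplified illustration, not the patch's code: `MountTimeSketch`, `Status`, and `createMountedFile` are hypothetical names, and copying the length from the remote status is an assumption made only so the example has something to carry over. The point it shows is that the HDFS-side modification time is set to the time of the mount request, not to the remote file's own modification time.

```java
public class MountTimeSketch {
    /** Hypothetical, simplified file status (not HdfsFileStatus). */
    public static final class Status {
        public final String path;
        public final long length;
        public final long modificationTime; // epoch millis

        public Status(String path, long length, long modificationTime) {
            this.path = path;
            this.length = length;
            this.modificationTime = modificationTime;
        }
    }

    /**
     * Builds the HDFS-side status for a mounted remote file. The remote
     * modification time is deliberately ignored; the mount time is used.
     */
    public static Status createMountedFile(Status remote, String hdfsPath,
                                           long mountTimeMillis) {
        return new Status(hdfsPath, remote.length, mountTimeMillis);
    }

    public static void main(String[] args) {
        Status remote = new Status("s3a://bucket/file", 42L, 1_000L);
        Status mounted = createMountedFile(remote, "/mnt/file",
                                           System.currentTimeMillis());
        System.out.println("remote mtime = " + remote.modificationTime);
        System.out.println("hdfs mtime   = " + mounted.modificationTime);
    }
}
```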

I think that, generally, many upper-layer HDFS applications don't care about 
data modification time, so the inconsistency in modification time may not cause 
issues. If you have any thoughts or a case I overlooked, please kindly point it 
out.

Thanks a lot for your comment! And as always, any discussion is welcome! 



> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-06-04 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357118#comment-17357118
 ] 

Feilong He commented on HDFS-15714:
---

Hi [~bpatel], sorry for the late reply.

The relevant code path is shown below.
{code:java}
ReadMountManager: FSMountAttrOp.addRemotePaths -> FSMountAttrOp: w.addToEdits 
-> MountEditLogWriter: createFile{code}
In {{MountEditLogWriter#createFile}}, we can see that an {{HdfsFileStatus}} is 
created based on the {{remoteStatus}} obtained from remote storage, which is 
like creating a normal HDFS file except that the data is stored outside HDFS. 
*Actually, the modification time of the remote file is not used or kept in 
HDFS*. My previous reply may have been ambiguous.

I just did a simple test to verify it: compare a file (object)'s modification 
time in S3 with that in HDFS after the S3 bucket containing the file is mounted 
to HDFS. The result is that they differ, which is consistent with the code 
analysis. The modification time of the file in HDFS is the time when the above 
{{#createFile}} is triggered in response to the user's mount request.

For {{readOnly}} mount mode, mounted data cannot be changed from the HDFS side, 
so its modification time stays unchanged on HDFS.

I think that, generally, upper-layer HDFS applications don't care about data 
modification time, so the inconsistency in modification time may not cause 
issues. If you have any thoughts or a case I overlooked, please kindly point it 
out.

Thanks a lot for your comment! And as always, any discussion is welcome! 

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-05-13 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343783#comment-17343783
 ] 

Feilong He commented on HDFS-15714:
---

[~bpatel], I see. For updating a mount, it will be useful to have an append or 
update mount operation. As you pointed out, it is inefficient and impractical 
to remove a mount and then add it again just to sync metadata.

Yes, we can file another Jira to track this functionality in the future. Thanks 
for your insightful comments!

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




[jira] [Comment Edited] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-05-13 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343783#comment-17343783
 ] 

Feilong He edited comment on HDFS-15714 at 5/13/21, 8:41 AM:
-

[~bpatel], I see. For updating a mount, it will be useful to have an append or 
update mount operation. As you pointed out, it is inefficient and impractical 
to remove a mount and then add it again just to sync metadata.

Yes, we can file another Jira to track this functionality in the future. Thanks 
for your insightful comments!



> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




[jira] [Comment Edited] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-05-13 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343743#comment-17343743
 ] 

Feilong He edited comment on HDFS-15714 at 5/13/21, 7:15 AM:
-

[~bpatel], thanks for your review.

As you know, add/remove/list mount are the basic operations. We thought 
merge/append mount operations were not very commonly used. Is that right? 
Another thought is that a new feature's initial implementation should keep the 
patch concise and limited to the essential core functionality. For merge/append 
mount, I think many factors need to be considered. For example, consider the 
case where two mounts to be merged contain some data with the same name. For 
these reasons, the current version doesn't cover merge/append mount operations. 
Any thoughts?
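
To make the name-collision concern above concrete, here is a minimal Java sketch that finds the relative paths present in both mounts, which a merge would first have to resolve. `MountMergeSketch` and `conflictingNames` are hypothetical names for illustration only; nothing like this is part of the current patch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.SortedSet;
import java.util.TreeSet;

public class MountMergeSketch {
    /**
     * Returns the relative paths that appear in both mounts, sorted.
     * A non-empty result means a merge needs a conflict policy.
     */
    public static SortedSet<String> conflictingNames(Collection<String> mountA,
                                                     Collection<String> mountB) {
        SortedSet<String> common = new TreeSet<>(mountA);
        common.retainAll(new HashSet<>(mountB));
        return common;
    }

    public static void main(String[] args) {
        Collection<String> a = Arrays.asList("data/a.csv", "data/b.csv");
        Collection<String> b = Arrays.asList("data/b.csv", "logs/x.log");
        System.out.println("conflicts: " + conflictingNames(a, b));
    }
}
```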


was (Author: philohe):
[~bpatel], thanks for your review.

As you know, add/remove/list mount are the basic operations. We thought 
merge/append mount operations were not very commonly used. Is that right? 
Another thought is that a new feature's initial implementation should keep the 
patch concise and limited to the essential core functionality. For merge/append 
mount, I think many factors need to be considered. For example, consider the 
case where two mounts to be merged contain data with the same name. For these 
reasons, the current version doesn't cover merge/append mount operations. 
Any thoughts?

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. 
> In HDFS-9806, the PROVIDED storage type was introduced to HDFS. By 
> configuring external storage with the PROVIDED tag for a DataNode, a user can 
> enable applications to access externally stored data from the HDFS side. 
> However, two issues need to be addressed. Firstly, mounting external storage 
> on-the-fly, namely dynamic mount, is lacking. It needs to be supported so 
> that HDFS can be flexibly combined with an external storage at runtime. 
> Secondly, PS write is not supported by current HDFS. But in real 
> applications, it is common to transfer data bi-directionally for read/write 
> between HDFS and external storage.
> Through this JIRA, we are presenting our work for PS write support and 
> dynamic mount support for both read & write. Please note that several JIRAs 
> have already been filed in the community for these topics. Our work is based 
> on this previous community work, with a new design & implementation to 
> support the so-called writeBack mount and to enable admins to add any mount 
> on-the-fly. We appreciate those folks in the community for their great 
> contribution! See their pending JIRAs: HDFS-14805 & HDFS-12090.




[jira] [Comment Edited] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-05-13 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343743#comment-17343743
 ] 

Feilong He edited comment on HDFS-15714 at 5/13/21, 7:14 AM:
-

[~bpatel], thanks for your review.

As you know, add/remove/list mount are the basic operations. We assume that 
merge/append mount operations are not very commonly used. Is that right? 
Another thought is that it is better to keep the patch concise and less complex 
for a new feature's initial implementation, implementing only the truly 
necessary core functionality. For merge/append mount, I think many factors need 
to be considered. For example, consider the case where two mounts to be merged 
own data with the same name. For these reasons, the current version doesn't 
cover merge/append mount operations. Any thoughts?


was (Author: philohe):
[~bpatel], thanks for your review.

As you know, add/remove/list mount are basic operations. We thought 
merge/append mount operations are not very commonly used seemingly. Right? And 
another thought is it is better to make patch concise and less complex for new 
feature's initial implementation, except for implementing very necessary core 
functionalities. For merge/append mount, I think many factors need to be 
considered. E.g., consider case: two mounts to be merged/appended own data with 
same name. So based on the above reasons, the current version doesn't cover 
merge/append mount operations. Any thought?





[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-05-13 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343743#comment-17343743
 ] 

Feilong He commented on HDFS-15714:
---

[~bpatel], thanks for your review.

As you know, add/remove/list mount are the basic operations. We assume that 
merge/append mount operations are not very commonly used. Is that right? 
Another thought is that it is better to keep the patch concise and less complex 
for a new feature's initial implementation, implementing only the truly 
necessary core functionality. For merge/append mount, I think many factors need 
to be considered. For example, consider the case where two mounts to be 
merged/appended own data with the same name. For these reasons, the current 
version doesn't cover merge/append mount operations. Any thoughts?





[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-05-11 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342396#comment-17342396
 ] 

Feilong He commented on HDFS-15714:
---

[~bpatel] Thanks for your comments!
In _ReadMountManager#prepareMount_, _FSMountAttrOp.getRemotePaths_ is used to 
get remote metadata for both the remote path being mounted and its children.

_FSTreeWalk_ (see its constructor) instantiates a _FileSystem_ according to the 
remote mount path URL. For an S3 URL, it is _S3AFileSystem_. Digging further 
into _S3AFileSystem_, you can see that an S3 client is used to get the file 
status in the mounting phase. With the metadata wrapped in the remote file 
status, such as _modification time, access time, permission, etc._, HDFS 
creates its INode file accordingly and sets the Provided Storage type.

Any comment is welcome!
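
As a rough analogy for the tree walk described above, the following sketch collects the same kinds of attributes (modification time, access time, permission) for a root path and all of its children. It runs against the local file system with the Python standard library only; it is not the patch's _FSTreeWalk_ or _FSMountAttrOp_ code, and the function name `collect_remote_metadata` is a hypothetical name chosen for this example.

```python
import os
import stat
import tempfile
from typing import Dict, List


def collect_remote_metadata(root: str) -> List[Dict[str, object]]:
    """Return metadata for `root` itself and for every path below it.

    Hypothetical stand-in: a local directory plays the role of the
    remote path being mounted.
    """
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))

    entries = []
    for path in paths:
        st = os.stat(path)
        entries.append({
            "path": os.path.relpath(path, root),
            "is_dir": stat.S_ISDIR(st.st_mode),
            "modification_time": st.st_mtime,
            "access_time": st.st_atime,
            "permission": stat.S_IMODE(st.st_mode),
        })
    return entries


# Build a tiny tree and list what a mount step would have to record.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "dir"))
    with open(os.path.join(root, "dir", "file.txt"), "w") as f:
        f.write("hello")
    for entry in collect_remote_metadata(root):
        print(entry["path"], entry["is_dir"], oct(entry["permission"]))
```

In the mount flow described in this comment, each such record would correspond to an INode created on the HDFS side with the Provided Storage type set.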





[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-05-09 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-16014:
--
Attachment: (was: HDFS-16014-01.patch)

> Issue in checking native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch
>
>
> In HDFS-14818, we proposed a patch to support checking the native pmdk lib. 
> The goal is to show the user a hint about the pmdk lib loaded state. 
> Recently, it was found that the pmdk lib was not actually loaded successfully 
> but the `hadoop checknative` command still tells the user that it was. This 
> issue can be reproduced by moving libpmem.so* from the specified install path 
> to another place, or by directly deleting these libs, after the project is 
> built.




[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-05-09 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-16014:
--
Attachment: HDFS-16014-01.patch
Status: Patch Available  (was: In Progress)





[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-05-08 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-16014:
--
Description: In HDFS-14818, we proposed a patch to support checking the native 
pmdk lib. The goal is to show the user a hint about the pmdk lib loaded state. 
Recently, it was found that the pmdk lib was not successfully loaded but the 
`hadoop checknative` command still tells the user that it was. This issue can 
be reproduced by moving libpmem.so* from the specified install path to another 
place, or by directly deleting these libs, after the project is built.  (was: In 
HDFS-14818, we proposed a patch to support checking native pmdk lib. The 
expected target is to display hint to user regarding pmdk loaded state. 
Recently, it was found that pmdk lib was not successfully loaded but the 
`hadoop checknative` command still tells user that it was. This issue can be 
reproduced by moving libpmem.so* from specified installed path to other place, 
or directly deleting these libs, after the project is built.)

> Issue in checking native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch
>
>
> In HDFS-14818, we proposed a patch to support checking the native pmdk lib. 
> The goal is to show the user a hint about the pmdk lib loaded state. 
> Recently, it was found that the pmdk lib was not successfully loaded but the 
> `hadoop checknative` command still tells the user that it was. This issue can 
> be reproduced by moving libpmem.so* from the specified install path to 
> another place, or by directly deleting these libs, after the project is built.




[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-05-08 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-16014:
--
Description: In HDFS-14818, we proposed a patch to support checking the native 
pmdk lib. The goal is to show the user a hint about the pmdk lib loaded state. 
Recently, it was found that the pmdk lib was not actually loaded successfully 
but the `hadoop checknative` command still tells the user that it was. This 
issue can be reproduced by moving libpmem.so* from the specified install path 
to another place, or by directly deleting these libs, after the project is 
built.  (was: In HDFS-14818, we proposed a patch to support checking native 
pmdk lib. The expected target is to display hint to user regarding pmdk lib 
loaded state. Recently, it was found that pmdk lib was not successfully loaded 
but the `hadoop checknative` command still tells user that it was. This issue 
can be reproduced by moving libpmem.so* from specified installed path to other 
place, or directly deleting these libs, after the project is built.)





[jira] [Work started] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-05-08 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16014 started by Feilong He.
-
> Issue in checking native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch
>
>
> In HDFS-14818, we proposed a patch to support checking the native pmdk lib. 
> The goal is to show the user a hint about the pmdk loaded state. 
> Recently, it was found that the pmdk lib was not successfully loaded but the 
> `hadoop checknative` command still tells the user that it was. This issue can 
> be reproduced by moving libpmem.so* from the specified install path to 
> another place, or by directly deleting these libs, after the project is built.




[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-05-08 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-16014:
--
Description: In HDFS-14818, we proposed a patch to support checking the native 
pmdk lib. The goal is to show the user a hint about the pmdk loaded state. 
Recently, it was found that the pmdk lib was not successfully loaded but the 
`hadoop checknative` command still tells the user that it was. This issue can 
be reproduced by moving libpmem.so* from the specified install path to another 
place, or by directly deleting these libs, after the project is built.
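
The general idea behind a reliable check, reporting a library as loaded only after an actual load attempt succeeds, can be sketched outside Hadoop. The snippet below is not the HDFS-16014 patch and does not describe how `hadoop checknative` is implemented; the function name `is_native_lib_loadable` is hypothetical, and `libpmem.so.1` is an assumed soname that may differ on a given installation.

```python
import ctypes


def is_native_lib_loadable(lib_name: str) -> bool:
    """Return True only if the shared library can really be loaded right now.

    Hypothetical helper for illustration: the answer comes from an actual
    load attempt rather than from a build-time setting.
    """
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        return False
    return True


# "libpmem.so.1" is an assumed soname; adjust it for the local installation.
print("pmdk loadable:", is_native_lib_loadable("libpmem.so.1"))
```

With the reproduction steps above (moving or deleting libpmem.so* after the build), a check of this kind would be expected to report the library as not loadable.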





[jira] [Updated] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-05-08 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-16014:
--
Summary: Issue in checking native pmdk lib by 'hadoop checknative' command  
(was: HDFS check native pmdk lib issue)

> Issue in checking native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch
>
>





[jira] [Assigned] (HDFS-16014) HDFS check native pmdk lib issue

2021-05-08 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He reassigned HDFS-16014:
-

Assignee: Feilong He

> HDFS check native pmdk lib issue
> 
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch
>
>





[jira] [Updated] (HDFS-16014) HDFS check native pmdk lib issue

2021-05-08 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-16014:
--
Attachment: HDFS-16014-01.patch

> HDFS check native pmdk lib issue
> 
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch
>
>





[jira] [Created] (HDFS-16014) HDFS check native pmdk lib issue

2021-05-08 Thread Feilong He (Jira)

Feilong He created HDFS-16014:
-

 Summary: HDFS check native pmdk lib issue
 Key: HDFS-16014
 URL: https://issues.apache.org/jira/browse/HDFS-16014
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: native
Affects Versions: 3.4.0
Reporter: Feilong He







[jira] [Commented] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-05-06 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340051#comment-17340051
 ] 

Feilong He commented on HDFS-15788:
---

This is not a critical issue, so I have changed the target version to 3.4.0 only.

> Correct the statement for pmem cache to reflect cache persistence support
> -
>
> Key: HDFS-15788
> URL: https://issues.apache.org/jira/browse/HDFS-15788
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch
>
>
> Correct the statement for pmem cache to reflect cache persistence support.




[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-05-06 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15788:
--
Target Version/s: 3.4.0  (was: 3.3.1, 3.4.0)





[jira] [Commented] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-04-08 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316901#comment-17316901
 ] 

Feilong He commented on HDFS-15788:
---

Hi [~ayushtkn], sorry for the late reply. This issue is related to HDFS-14740, 
which has already been resolved in 3.3.0. We opened this Jira to update the 
documentation so that it aligns with the code changes we made. The target 
versions of this Jira are 3.3.1 & 3.4.0.





[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-04-08 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15788:
--
Target Version/s: 3.3.1, 3.4.0  (was: 3.3.1, 3.4.0, 3.1.5, 3.2.3)





[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-04-08 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15788:
--
Attachment: HDFS-15788-02.patch





[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-01-26 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15714:
--
Description: 
HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. 
In HDFS-9806, the PROVIDED storage type was introduced to HDFS. By configuring 
external storage with the PROVIDED tag for a DataNode, a user can enable 
applications to access externally stored data from the HDFS side. However, two 
issues need to be addressed. Firstly, mounting external storage on-the-fly, 
namely dynamic mount, is lacking. It needs to be supported so that HDFS can be 
flexibly combined with an external storage at runtime. Secondly, PS write is 
not supported by current HDFS. But in real applications, it is common to 
transfer data bi-directionally for read/write between HDFS and external storage.

Through this JIRA, we are presenting our work for PS write support and dynamic 
mount support for both read & write. Please note that several JIRAs have 
already been filed in the community for these topics. Our work is based on this 
previous community work, with a new design & implementation to support the 
so-called writeBack mount and to enable admins to add any mount on-the-fly. We 
appreciate those folks in the community for their great contribution! See their 
pending JIRAs: HDFS-14805 & HDFS-12090.

  was:
HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. 
In HDFS-9806, the PROVIDED storage type was introduced to HDFS. By configuring 
external storage with the PROVIDED tag for a DataNode, users can enable 
applications to access externally stored data from the HDFS side. However, two 
issues need to be addressed. Firstly, mounting external storage on-the-fly, 
namely dynamic mount, is lacked. It needs to be supported so that HDFS can be 
flexibly combined with external storage at runtime. Secondly, PS write is not 
supported by current HDFS. But in real applications, it is common to transfer 
data bi-directionally for read/write between HDFS and external storage.

Through this JIRA, we are presenting our work for PS write support and dynamic 
mount support for both read & write. Please note that several JIRAs have been 
filed in the community for these topics. Our work is based on this previous 
community work, with a new design & implementation to support a so-called 
writeBack mount and to enable admins to add any mount on-the-fly. We appreciate 
those folks in the community for their great contribution! See their pending 
JIRAs: HDFS-14805 & HDFS-12090.


> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. 
> In HDFS-9806, the PROVIDED storage type was introduced to HDFS. By 
> configuring external storage with the PROVIDED tag for a DataNode, users can 
> enable applications to access externally stored data from the HDFS side. 
> However, two issues need to be addressed. Firstly, mounting external storage 
> on-the-fly, namely dynamic mount, is lacking. It needs to be supported so 
> that HDFS can be flexibly combined with external storage at runtime. 
> Secondly, PS write is not supported by current HDFS. But in real 
> applications, it is common to transfer data bi-directionally for read/write 
> between HDFS and external storage.
> Through this JIRA, we are presenting our work for PS write support and 
> dynamic mount support for both read & write. Please note that several JIRAs 
> have been filed in the community for these topics. Our work is based on this 
> previous community work, with a new design & implementation to support a 
> so-called writeBack mount and to enable admins to add any mount on-the-fly. 
> We appreciate those folks in the community for their great contribution! See 
> their pending JIRAs: HDFS-14805 & HDFS-12090.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-01-24 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15788:
--
Target Version/s: 3.3.1, 3.4.0, 3.1.5, 3.2.3  (was: 3.4.0)

> Correct the statement for pmem cache to reflect cache persistence support
> -
>
> Key: HDFS-15788
> URL: https://issues.apache.org/jira/browse/HDFS-15788
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-15788-01.patch
>
>
> Correct the statement for pmem cache to reflect cache persistence support.




[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-01-24 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15788:
--
Attachment: HDFS-15788-01.patch

> Correct the statement for pmem cache to reflect cache persistence support
> -
>
> Key: HDFS-15788
> URL: https://issues.apache.org/jira/browse/HDFS-15788
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-15788-01.patch
>
>
> Correct the statement for pmem cache to reflect cache persistence support.




[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-01-24 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15788:
--
Status: Patch Available  (was: Open)

> Correct the statement for pmem cache to reflect cache persistence support
> -
>
> Key: HDFS-15788
> URL: https://issues.apache.org/jira/browse/HDFS-15788
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-15788-01.patch
>
>
> Correct the statement for pmem cache to reflect cache persistence support.




[jira] [Created] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-01-24 Thread Feilong He (Jira)

Feilong He created HDFS-15788:
-

 Summary: Correct the statement for pmem cache to reflect cache 
persistence support
 Key: HDFS-15788
 URL: https://issues.apache.org/jira/browse/HDFS-15788
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 3.4.0
Reporter: Feilong He
Assignee: Feilong He


Correct the statement for pmem cache to reflect cache persistence support.




[jira] [Commented] (HDFS-12090) Handling writes from HDFS to Provided storages

2020-12-07 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17245615#comment-17245615
 ] 

Feilong He commented on HDFS-12090:
---

We filed another Jira (HDFS-15714) that covers this topic. We developed the 
feature based on the community work from [~virajith], [~ehiggs], etc. The basic 
design remains unchanged. Thanks to these folks for their great contribution.

> Handling writes from HDFS to Provided storages
> --
>
> Key: HDFS-12090
> URL: https://issues.apache.org/jira/browse/HDFS-12090
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Virajith Jalaparti
>Priority: Major
>  Labels: pull-request-available
> Attachments: External-SyncService-CreateFile.001.png, 
> HDFS-12090-Functional-Specification.001.pdf, 
> HDFS-12090-Functional-Specification.002.pdf, 
> HDFS-12090-Functional-Specification.003.pdf, HDFS-12090-design.001.pdf, 
> HDFS-12090..patch, HDFS-12090.0001.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in 
> external storage systems accessible through HDFS. However, HDFS-9806 is 
> limited to data being read through HDFS. This JIRA will deal with how data 
> can be written to such {{PROVIDED}} storages from HDFS.




[jira] [Commented] (HDFS-14805) Mounting external stores in HDFS on-the-fly

2020-12-07 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17245613#comment-17245613
 ] 

Feilong He commented on HDFS-14805:
---

To push this feature further, we filed another Jira (HDFS-15714) that 
incorporates this Jira. We developed the feature based on the pending patches 
in the community from [~virajith], [~ehiggs], etc. The basic design remains 
almost unchanged. Thanks to these folks for their great contribution.

> Mounting external stores in HDFS on-the-fly
> ---
>
> Key: HDFS-14805
> URL: https://issues.apache.org/jira/browse/HDFS-14805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Virajith Jalaparti
>Priority: Major
> Attachments: dynamic-mounts-in-hdfs.pdf
>
>
> Provided storage (HDFS-9806) allows HDFS to address data in external storage 
> systems, including cloud stores. Data mounted in this manner, seamlessly, 
> appears to be part of HDFS for applications/clients. The external data can 
> also be cached by HDFS on local disks and SSDs, accelerating remote data 
> reads (HDFS-13069). 
> However, Provided storage was originally targeted at ephemeral HDFS 
> deployments in the cloud (e.g., Azure HDInsight). Long running HDFS clusters 
> are common in many other scenarios which can benefit from accessing data in 
> remote stores. This JIRA targets such scenarios and aims to provide the 
> ability to:
> (a) Dynamically mount external stores in a HDFS cluster while supporting high 
> availability.
> (b) Mount multiple remote stores simultaneously.
> (c) Reduce deployment overheads and simplify usability of Provided storage.




[jira] [Comment Edited] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2020-12-07 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17245117#comment-17245117
 ] 

Feilong He edited comment on HDFS-15714 at 12/7/20, 10:27 AM:
--

The whole patch has been uploaded. We can divide it into several ones in the 
future. The design doc and performance doc have also been uploaded. Please 
review them. Any comments are welcome!


was (Author: philohe):
The whole patch has been uploaded. We can divide it into several patches in the 
future. The design doc and performance doc have also been uploaded. Please 
review them.

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>
> HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. 
> In HDFS-9806, the PROVIDED storage type was introduced to HDFS. By 
> configuring external storage with the PROVIDED tag for a DataNode, users can 
> enable applications to access externally stored data from the HDFS side. 
> However, two issues need to be addressed. Firstly, mounting external storage 
> on-the-fly, namely dynamic mount, is lacked. It needs to be supported so 
> that HDFS can be flexibly combined with external storage at runtime. 
> Secondly, PS write is not supported by current HDFS. But in real 
> applications, it is common to transfer data bi-directionally for read/write 
> between HDFS and external storage.
> Through this JIRA, we are presenting our work for PS write support and 
> dynamic mount support for both read & write. Please note that several JIRAs 
> have been filed in the community for these topics. Our work is based on this 
> previous community work, with a new design & implementation to support a 
> so-called writeBack mount and to enable admins to add any mount on-the-fly. 
> We appreciate those folks in the community for their great contribution! See 
> their pending JIRAs: HDFS-14805 & HDFS-12090.




[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2020-12-07 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15714:
--
Description: 
HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. 
In HDFS-9806, the PROVIDED storage type was introduced to HDFS. By configuring 
external storage with the PROVIDED tag for a DataNode, users can enable 
applications to access externally stored data from the HDFS side. However, two 
issues need to be addressed. Firstly, mounting external storage on-the-fly, 
namely dynamic mount, is lacked. It needs to be supported so that HDFS can be 
flexibly combined with external storage at runtime. Secondly, PS write is not 
supported by current HDFS. But in real applications, it is common to transfer 
data bi-directionally for read/write between HDFS and external storage.

Through this JIRA, we are presenting our work for PS write support and dynamic 
mount support for both read & write. Please note that several JIRAs have been 
filed in the community for these topics. Our work is based on this previous 
community work, with a new design & implementation to support a so-called 
writeBack mount and to enable admins to add any mount on-the-fly. We appreciate 
those folks in the community for their great contribution! See their pending 
JIRAs: HDFS-14805 & HDFS-12090.

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>
> HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. 
> In HDFS-9806, the PROVIDED storage type was introduced to HDFS. By 
> configuring external storage with the PROVIDED tag for a DataNode, users can 
> enable applications to access externally stored data from the HDFS side. 
> However, two issues need to be addressed. Firstly, mounting external storage 
> on-the-fly, namely dynamic mount, is lacked. It needs to be supported so 
> that HDFS can be flexibly combined with external storage at runtime. 
> Secondly, PS write is not supported by current HDFS. But in real 
> applications, it is common to transfer data bi-directionally for read/write 
> between HDFS and external storage.
> Through this JIRA, we are presenting our work for PS write support and 
> dynamic mount support for both read & write. Please note that several JIRAs 
> have been filed in the community for these topics. Our work is based on this 
> previous community work, with a new design & implementation to support a 
> so-called writeBack mount and to enable admins to add any mount on-the-fly. 
> We appreciate those folks in the community for their great contribution! See 
> their pending JIRAs: HDFS-14805 & HDFS-12090.




[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2020-12-07 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17245117#comment-17245117
 ] 

Feilong He commented on HDFS-15714:
---

The whole patch has been uploaded. We can divide it into several patches in the 
future. The design doc and performance doc have also been uploaded. Please 
review them.

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>





[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2020-12-07 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15714:
--
Attachment: HDFS-15714-01.patch

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>





[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2020-12-07 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15714:
--
Attachment: HDFS_Provided_Storage_Design-V1.pdf

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS_Provided_Storage_Design-V1.pdf, 
> HDFS_Provided_Storage_Performance-V1.pdf
>
>





[jira] [Updated] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2020-12-07 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15714:
--
Attachment: HDFS_Provided_Storage_Performance-V1.pdf

> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS_Provided_Storage_Design-V1.pdf, 
> HDFS_Provided_Storage_Performance-V1.pdf
>
>





[jira] [Created] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2020-12-07 Thread Feilong He (Jira)

Feilong He created HDFS-15714:
-

 Summary: HDFS Provided Storage Read/Write Mount Support On-the-fly
 Key: HDFS-15714
 URL: https://issues.apache.org/jira/browse/HDFS-15714
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Affects Versions: 3.4.0
Reporter: Feilong He
Assignee: Feilong He







[jira] [Commented] (HDFS-7343) HDFS smart storage management

2020-10-09 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210639#comment-17210639
 ] 

Feilong He commented on HDFS-7343:
--

Hi Brahma, we currently have no plan to merge this feature upstream. We 
maintain this project in its own repo. See https://github.com/Intel-bigdata/SSM 

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
>Priority: Major
> Attachments: HDFS-Smart-Storage-Management-update.pdf, 
> HDFS-Smart-Storage-Management.pdf, 
> HDFSSmartStorageManagement-General-20170315.pdf, 
> HDFSSmartStorageManagement-Phase1-20170315.pdf, access_count_tables.jpg, 
> move.jpg, tables_in_ssm.xlsx
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine considering file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preference, etc.
> Modified the title for re-purpose.
> We'd extend this effort a bit and aim to work on a comprehensive solution 
> to provide a smart storage management service for convenient, 
> intelligent and effective use of erasure coding or replicas, HDFS cache 
> facility, HSM offering, and all kinds of tools (balancer, mover, disk 
> balancer and so on) in a large cluster.




[jira] [Created] (HDFS-15337) Support available space choosing policy in HDFS Persistent Memory Cache

2020-05-06 Thread Feilong He (Jira)

Feilong He created HDFS-15337:
-

 Summary: Support available space choosing policy in HDFS 
Persistent Memory Cache
 Key: HDFS-15337
 URL: https://issues.apache.org/jira/browse/HDFS-15337
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: caching, datanode
Reporter: Feilong He
Assignee: Feilong He


In HDFS-13762, we introduced the HDFS Persistent Memory Cache feature. In that 
implementation, if the user specifies more than one persistent memory volume, 
a simple round-robin policy is used to pick a volume to cache data. 
Evidently, a large difference in volume capacity can lead to an imbalance issue. 
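
The imbalance described above can be illustrated with a small sketch. It is written in Python for brevity and is not the Hadoop implementation: the `Volume`, `RoundRobinPolicy`, `AvailableSpacePolicy` and `fill` names are hypothetical, and the "pick the volume with the most free space" rule is only one simple, assumed form an available-space policy could take.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical stand-in for one persistent memory cache volume."""
    name: str
    capacity: int  # bytes
    used: int = 0

    @property
    def available(self) -> int:
        return self.capacity - self.used


class RoundRobinPolicy:
    """Cycle through the volumes, skipping any that cannot fit the block."""

    def __init__(self) -> None:
        self._next = 0

    def choose(self, volumes, block_size):
        for i in range(len(volumes)):
            index = (self._next + i) % len(volumes)
            if volumes[index].available >= block_size:
                self._next = (index + 1) % len(volumes)
                return volumes[index]
        raise RuntimeError("no volume has enough space")


class AvailableSpacePolicy:
    """Greedy assumption: always pick the volume with the most free space."""

    def choose(self, volumes, block_size):
        best = max(volumes, key=lambda v: v.available)
        if best.available < block_size:
            raise RuntimeError("no volume has enough space")
        return best


def fill(policy, volumes, block_size, n_blocks):
    """Cache n_blocks blocks and report the bytes used per volume."""
    for _ in range(n_blocks):
        policy.choose(volumes, block_size).used += block_size
    return {v.name: v.used for v in volumes}
```

With one 100-byte volume and one 1000-byte volume, caching twenty 10-byte blocks under round-robin fills the small volume completely while the large one is only 10% used. The greedy available-space sketch places all twenty blocks on the large volume instead.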




[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset

2020-01-07 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15080:
--
Description: 
Some applications can read a segment of pmem cache with an offset specified. 
The previous implementation for pmem cache read with DirectByteBuffer didn't 
cover this situation.

Let me explain further. In our test, we used Spark SQL to run some TPC-DS 
workloads that read the cached data, and hit a read exception. This was due to 
the missing seek offset argument, which Spark SQL uses to read data packet by 
packet.
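
As a rough illustration of reading a cached block packet by packet from a given offset, here is a minimal Python sketch. It is not the patch itself: the actual fix concerns Hadoop's Java DirectByteBuffer read path, whereas this sketch uses an anonymous `mmap` region as an assumed stand-in for a mapped pmem cache file, and `read_segment` and `read_in_packets` are hypothetical helper names.

```python
import mmap


def read_segment(cache: mmap.mmap, offset: int, length: int) -> bytes:
    """Return up to `length` bytes of the mapped block, starting at `offset`."""
    if offset < 0 or offset > len(cache):
        raise ValueError("offset out of range")
    end = min(offset + length, len(cache))
    # Slicing reads by absolute position, so the seek offset is always honored.
    return cache[offset:end]


def read_in_packets(cache: mmap.mmap, start_offset: int, packet_size: int):
    """Yield consecutive packets from start_offset to the end of the block."""
    pos = start_offset
    while pos < len(cache):
        packet = read_segment(cache, pos, packet_size)
        yield packet
        pos += len(packet)


# Assumed stand-in for a cached block: 1024 bytes in an anonymous mapping.
block = bytes(range(256)) * 4
cache = mmap.mmap(-1, len(block))
cache.write(block)

# A reader that seeks to byte 100 and then reads 300-byte packets.
packets = list(read_in_packets(cache, 100, 300))
```

A reader that ignored `start_offset` would return data from the beginning of the block instead, which is the kind of mismatch a packet-by-packet reader cannot tolerate.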

  was:Some applications can read a segment of pmem cache with an offset 
specified. The previous implementation for pmem cache read with 
DirectByteBuffer didn't cover this situation.


> Fix the issue in reading persistent memory cache with an offset
> ---
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch, 
> HDFS-15080-branch-3.2-000.patch
>
>
> Some applications can read a segment of pmem cache with an offset specified. 
> The previous implementation for pmem cache read with DirectByteBuffer didn't 
> cover this situation.
> Let me explain further. In our test, we used Spark SQL to run some TPC-DS 
> workloads that read the cached data, and hit a read exception. This was due 
> to the missing seek offset argument, which Spark SQL uses to read data packet 
> by packet.




[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset

2019-12-25 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15080:
--
Attachment: HDFS-15080-branch-3.2-000.patch

> Fix the issue in reading persistent memory cache with an offset
> ---
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch, 
> HDFS-15080-branch-3.2-000.patch
>
>
> Some applications can read a segment of pmem cache with an offset specified. 
> The previous implementation for pmem cache read with DirectByteBuffer didn't 
> cover this situation.




[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset

2019-12-25 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15080:
--
Attachment: HDFS-15080-branch-3.1-000.patch

> Fix the issue in reading persistent memory cache with an offset
> ---
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch
>
>
> Some applications can read a segment of pmem cache with an offset specified. 
> The previous implementation for pmem cache read with DirectByteBuffer didn't 
> cover this situation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-24 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17003132#comment-17003132
 ] 

Feilong He commented on HDFS-14740:
---

[^HDFS-14740.009.patch], [^HDFS-14740-branch-3.1-001.patch], 
[^HDFS-14740-branch-3.2-001.patch] were uploaded with some code refactoring. We 
will consider checking them in over the following days. If you have any 
suggestions, please feel free to post them.

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch, 
> HDFS-14740-branch-3.1-001.patch, HDFS-14740-branch-3.2-000.patch, 
> HDFS-14740-branch-3.2-001.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS-14740.009.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, to simplify the 
> initial implementation, the previous cache data is cleaned up during 
> DataNode restarts. Here, we propose to improve the HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status of cached data, if any, when the DataNode restarts, so that cache 
> warm-up time can be saved for the user.




[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset

2019-12-24 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15080:
--
Fix Version/s: 3.2.2
   3.1.4
   3.3.0

> Fix the issue in reading persistent memory cache with an offset
> ---
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15080-000.patch
>
>
> Some applications can read a segment of pmem cache with an offset specified. 
> The previous implementation for pmem cache read with DirectByteBuffer didn't 
> cover this situation.




[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset

2019-12-24 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15080:
--
Description: Some applications can read a segment of pmem cache with an 
offset specified. The previous implementation for pmem cache read with 
DirectByteBuffer didn't cover this situation.  (was: Some applications can read 
a segment of pmem cache with an offset specified. The previous implementation 
didn't cover this situation.)

> Fix the issue in reading persistent memory cache with an offset
> ---
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-15080-000.patch
>
>
> Some applications can read a segment of pmem cache with an offset specified. 
> The previous implementation for pmem cache read with DirectByteBuffer didn't 
> cover this situation.




[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset

2019-12-24 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15080:
--
Description: Some applications can read a segment of pmem cache with an 
offset specified. The previous implementation didn't cover this situation.

> Fix the issue in reading persistent memory cache with an offset
> ---
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-15080-000.patch
>
>
> Some applications can read a segment of pmem cache with an offset specified. 
> The previous implementation didn't cover this situation.




[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset

2019-12-24 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-15080:
--
Attachment: HDFS-15080-000.patch
Status: Patch Available  (was: Open)

> Fix the issue in reading persistent memory cache with an offset
> ---
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-15080-000.patch
>
>





[jira] [Created] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset

2019-12-24 Thread Feilong He (Jira)

Feilong He created HDFS-15080:
-

 Summary: Fix the issue in reading persistent memory cache with an 
offset
 Key: HDFS-15080
 URL: https://issues.apache.org/jira/browse/HDFS-15080
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: caching, datanode
Reporter: Feilong He
Assignee: Feilong He







[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-23 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS-14740-branch-3.1-001.patch

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch, 
> HDFS-14740-branch-3.1-001.patch, HDFS-14740-branch-3.2-000.patch, 
> HDFS-14740-branch-3.2-001.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS-14740.009.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-23 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS-14740-branch-3.2-001.patch

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch, 
> HDFS-14740-branch-3.2-000.patch, HDFS-14740-branch-3.2-001.patch, 
> HDFS-14740.000.patch, HDFS-14740.001.patch, HDFS-14740.002.patch, 
> HDFS-14740.003.patch, HDFS-14740.004.patch, HDFS-14740.005.patch, 
> HDFS-14740.006.patch, HDFS-14740.007.patch, HDFS-14740.008.patch, 
> HDFS-14740.009.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-23 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS-14740.009.patch

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch, 
> HDFS-14740-branch-3.2-000.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS-14740.009.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-19 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS-14740-branch-3.1-000.patch

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch, 
> HDFS-14740-branch-3.2-000.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-19 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: (was: HDFS-14740-branch-3.1-000.patch)

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch, 
> HDFS-14740-branch-3.2-000.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-19 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999833#comment-16999833
 ] 

Feilong He commented on HDFS-14740:
---

[^HDFS-14740-branch-3.1-000.patch] and [^HDFS-14740-branch-3.2-000.patch] have 
been uploaded to backport the code to branch-3.1 and branch-3.2, 
respectively.

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch, 
> HDFS-14740-branch-3.2-000.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-18 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS-14740-branch-3.2-000.patch

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch, 
> HDFS-14740-branch-3.2-000.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-18 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS-14740-branch-3.1-000.patch

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch, HDFS-14740.000.patch, 
> HDFS-14740.001.patch, HDFS-14740.002.patch, HDFS-14740.003.patch, 
> HDFS-14740.004.patch, HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS-14740.007.patch, HDFS-14740.008.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-16 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16997937#comment-16997937
 ] 

Feilong He commented on HDFS-14740:
---

Thanks [~rakeshr] for your suggestion. '{{dfs.datanode.pmem.cache.restore}}' 
and '{{dfs.datanode.pmem.cache.dirs}}' look good to me. 
[^HDFS-14740.008.patch] has some updates covering this.

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-16 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS-14740.008.patch

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989587#comment-16989587
 ] 

Feilong He commented on HDFS-14740:
---

[^HDFS-14740.007.patch] has been uploaded to rename a property to 
'dfs.datanode.cache.restore.enabled'. Comments are welcome!

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS-14740.007.patch

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989573#comment-16989573
 ] 

Feilong He commented on HDFS-14740:
---

Thanks [~rakeshr] so much for your comments. Sorry for the late reply.
 # Yes, 'dfs.datanode.cache.persistence.enabled' looks a bit ambiguous to users. 
This property controls whether the cache on pmem is restored after a DataNode 
restart, to avoid unnecessarily pulling data to pmem again. I prefer 
'dfs.datanode.cache.restore.enabled'. If you have other comments, please let 
me know.
 # I have run some tests on the case you mentioned. 1) In the first test, a 
file was cached to pmem by HDFS with the above flag set to true. I then shut 
down the cluster and set the flag to false. After restarting the cluster, I 
observed that the previous cache on pmem was dropped and the DataNode had to 
recache the block data to pmem, as expected. 2) In the second test, a file was 
cached to pmem by HDFS with the above flag set to false. I then shut down the 
cluster and set the flag to true. While the DataNode was restarting, I could 
see that the previous cache was restored, as expected. To sum up, the behavior 
in both tests matches the purpose of this flag.
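
The behavior observed in the two tests can be modeled with a small sketch. It is only an illustration of the decision made at DataNode startup, not the code in the patch. The class PmemCacheStartup and the method onStartup are hypothetical, java.util.Properties stands in for the Hadoop configuration, and the default of false is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class PmemCacheStartup {

    // Property name as proposed in this comment; the final name may differ.
    static final String RESTORE_KEY = "dfs.datanode.cache.restore.enabled";

    /**
     * Models DataNode startup: returns the block IDs that are still cached
     * afterwards. Only the flag value at restart time matters, not the value
     * that was in effect when the blocks were originally cached.
     * (Hypothetical helper; a default of "false" is assumed.)
     */
    public static List<String> onStartup(Properties conf, List<String> leftoverOnPmem) {
        boolean restore = Boolean.parseBoolean(conf.getProperty(RESTORE_KEY, "false"));
        if (restore) {
            // Test 2: leftover cache files are recovered, no recaching needed.
            return new ArrayList<>(leftoverOnPmem);
        }
        // Test 1: leftover cache files are dropped; blocks must be recached.
        leftoverOnPmem.clear();
        return new ArrayList<>();
    }
}
```

With the flag set to true at restart, every leftover block ID is returned as cached. With the flag set to false, the leftover list is emptied and nothing is reported as cached.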

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Assigned] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He reassigned HDFS-14740:
-

Assignee: Feilong He  (was: Rui Mo)

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-11-14 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974816#comment-16974816
 ] 

Feilong He commented on HDFS-14740:
---

[^HDFS_Persistent_Read-Cache_Test-v2.pdf] has been uploaded for your reference.

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Rui Mo
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-11-14 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS_Persistent_Read-Cache_Test-v2.pdf

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Rui Mo
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.
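
The recovery idea in the description above can be sketched roughly as follows. This is only an illustration in Python, not the code in the attached patches: it assumes cached blocks survive a restart as files in a cache directory on PM, and the "<pool_id>-<block_id>" file naming and the recover_cache helper are hypothetical.

```python
import os


def recover_cache(cache_dir):
    """Rebuild an in-memory map of cached blocks from files left on PM.

    Returns {(pool_id, block_id): (path, length)}.
    The "<pool_id>-<block_id>" file name layout is an assumption made
    for this sketch only.
    """
    recovered = {}
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        pool_id, sep, block_id = name.rpartition("-")
        # Skip anything that does not look like a cached block file.
        if not sep or not block_id.isdigit() or not os.path.isfile(path):
            continue
        recovered[(pool_id, int(block_id))] = (path, os.path.getsize(path))
    return recovered
```

Instead of cleaning the directory at startup, a DataNode following this approach would rebuild its cache state from what is already there, so the data does not need to be cached again.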




[jira] [Commented] (HDFS-14740) HDFS read cache persistence support

2019-10-24 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16958672#comment-16958672
 ] 

Feilong He commented on HDFS-14740:
---

[^HDFS_Persistent_Read-Cache_Design-v1.pdf] and 
[^HDFS_Persistent_Read-Cache_Test-v1.pdf] have been uploaded. Any comment is 
welcome!

> HDFS read cache persistence support
> ---
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Rui Mo
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) HDFS read cache persistence support

2019-10-24 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS_Persistent_Read-Cache_Test-v1.pdf
HDFS_Persistent_Read-Cache_Design-v1.pdf

> HDFS read cache persistence support
> ---
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Rui Mo
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Commented] (HDFS-14740) HDFS read cache persistence support

2019-10-14 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16950849#comment-16950849
 ] 

Feilong He commented on HDFS-14740:
---

[~Rui Mo], please prepare a design doc and a test doc, then upload them to this 
Jira. Thanks!

> HDFS read cache persistence support
> ---
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Rui Mo
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14905) Backport HDFS persistent memory read cache support to branch-3.2

2019-10-12 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14905:
--
Attachment: HDFS-14905-branch-3.2-000.patch
Status: Patch Available  (was: Open)

> Backport HDFS persistent memory read cache support to branch-3.2
> 
>
> Key: HDFS-14905
> URL: https://issues.apache.org/jira/browse/HDFS-14905
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14905-branch-3.2-000.patch
>
>





[jira] [Created] (HDFS-14905) Backport HDFS persistent memory read cache support to branch-3.2

2019-10-12 Thread Feilong He (Jira)

Feilong He created HDFS-14905:
-

 Summary: Backport HDFS persistent memory read cache support to 
branch-3.2
 Key: HDFS-14905
 URL: https://issues.apache.org/jira/browse/HDFS-14905
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: caching, datanode
Reporter: Feilong He
Assignee: Feilong He
 Fix For: 3.3.0







[jira] [Updated] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1

2019-10-09 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14745:
--
Attachment: HDFS-14745-branch-3.1-003.patch

> Backport HDFS persistent memory read cache support to branch-3.1
> 
>
> Key: HDFS-14745
> URL: https://issues.apache.org/jira/browse/HDFS-14745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: cache, datanode
> Fix For: 3.3.0
>
> Attachments: HDFS-14745-branch-3.1-000.patch, 
> HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch, 
> HDFS-14745-branch-3.1-003.patch
>
>
> We are proposing to backport the patches for HDFS-13762, HDFS persistent 
> memory read cache support, to branch-3.1.




[jira] [Comment Edited] (HDFS-14518) Optimize HDFS cache checksum and make checksum enabling configurable

2019-10-08 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851456#comment-16851456
 ] 

Feilong He edited comment on HDFS-14518 at 10/8/19 8:51 AM:


Hi [~weichiu], this Jira is common to DRAM cache and Pmem cache. So strictly 
speaking, it is not only related to HDFS-13762, and the original DRAM cache 
will also be affected.


was (Author: philohe):
Hi [~weichiu], this Jira is common to DRAM cache and Pmem cache. So strictly 
speaking, it is not only related to HDFS-13762, but the original DRAM cache 
will also be affected.

> Optimize HDFS cache checksum and make checksum enabling configurable
> 
>
> Key: HDFS-14518
> URL: https://issues.apache.org/jira/browse/HDFS-14518
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-14518-.patch
>
>
> HDFS cache checksum can be operated on cached data for verification. And we 
> can also consider to make checksum configurable, thus user can shutdown 
> checksum operation when caching data.




[jira] [Updated] (HDFS-14518) Optimize HDFS cache checksum and make checksum enabling configurable

2019-09-27 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14518:
--
Summary: Optimize HDFS cache checksum and make checksum enabling 
configurable  (was: Optimize HDFS cache checksum and make checksum configurable)

> Optimize HDFS cache checksum and make checksum enabling configurable
> 
>
> Key: HDFS-14518
> URL: https://issues.apache.org/jira/browse/HDFS-14518
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-14518-.patch
>
>
> HDFS cache checksum can be operated on cached data for verification. And we 
> can also consider to make checksum configurable, thus user can shutdown 
> checksum operation when caching data.




[jira] [Commented] (HDFS-14518) Optimize HDFS cache checksum and make checksum configurable

2019-09-27 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939242#comment-16939242
 ] 

Feilong He commented on HDFS-14518:
---

[^HDFS-14518-.patch] is an initial patch; the native PMDK implementation for 
caching blocks to PMEM has not been included. It appears that the size of the 
buffer used in the checksum can have an evident impact on performance, so it 
may also need to be optimized. Any comment is welcome!
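
To make the two knobs discussed here concrete (whether the checksum runs at all, and the buffer size it uses), here is a minimal Python sketch. It is not the patch's code: cache_block, verify_checksum and buffer_size are hypothetical names, and zlib's CRC32 only stands in for whatever checksum the DataNode actually applies.

```python
import io
import zlib


def cache_block(stream, verify_checksum=True, buffer_size=64 * 1024):
    """Read a block into memory, optionally checksumming it chunk by chunk.

    verify_checksum and buffer_size are illustrative parameters, not
    Hadoop configuration keys.
    Returns (data, crc), where crc is None if the checksum is disabled.
    """
    data = bytearray()
    crc = 0
    while True:
        chunk = stream.read(buffer_size)
        if not chunk:
            break
        data.extend(chunk)
        if verify_checksum:
            # Incremental CRC: feed the running value back in for each chunk.
            crc = zlib.crc32(chunk, crc)
    return bytes(data), (crc if verify_checksum else None)
```

The buffer size does not change the resulting checksum, only how many read and checksum calls are made, which is why it can be tuned purely for performance.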

> Optimize HDFS cache checksum and make checksum configurable
> ---
>
> Key: HDFS-14518
> URL: https://issues.apache.org/jira/browse/HDFS-14518
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-14518-.patch
>
>
> HDFS cache checksum can be operated on cached data for verification. And we 
> can also consider to make checksum configurable, thus user can shutdown 
> checksum operation when caching data.




[jira] [Updated] (HDFS-14518) Optimize HDFS cache checksum and make checksum configurable

2019-09-27 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14518:
--
Attachment: HDFS-14518-.patch

> Optimize HDFS cache checksum and make checksum configurable
> ---
>
> Key: HDFS-14518
> URL: https://issues.apache.org/jira/browse/HDFS-14518
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-14518-.patch
>
>
> HDFS cache checksum can be operated on cached data for verification. And we 
> can also consider to make checksum configurable, thus user can shutdown 
> checksum operation when caching data.




[jira] [Comment Edited] (HDFS-14518) Optimize HDFS cache checksum and make checksum configurable

2019-09-27 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851456#comment-16851456
 ] 

Feilong He edited comment on HDFS-14518 at 9/27/19 8:11 AM:


Hi [~jojochuang], this Jira is common to DRAM cache and Pmem cache. So strictly 
speaking, it is not only related to HDFS-13762, but the original DRAM cache 
will also be affected.


was (Author: philohe):
Hi [~jojochuang], this Jira is common to DRAM cache and Pmem cache. So strictly 
speaking, it is not related to HDFS-13762.

> Optimize HDFS cache checksum and make checksum configurable
> ---
>
> Key: HDFS-14518
> URL: https://issues.apache.org/jira/browse/HDFS-14518
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
>
> HDFS cache checksum can be operated on cached data for verification. And we 
> can also consider to make checksum configurable, thus user can shutdown 
> checksum operation when caching data.




[jira] [Updated] (HDFS-14518) Optimize HDFS cache checksum and make checksum configurable

2019-09-27 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14518:
--
Description: HDFS cache checksum can be operated on cached data for 
verification. And we can also consider to make checksum configurable, thus user 
can shutdown checksum operation when caching data.  (was: HDFS cache checksum 
can be operated on cached data for verification. And we can also consider to 
make checksum configurable, so user can shutdown checksum operation when 
caching data.)

> Optimize HDFS cache checksum and make checksum configurable
> ---
>
> Key: HDFS-14518
> URL: https://issues.apache.org/jira/browse/HDFS-14518
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
>
> HDFS cache checksum can be operated on cached data for verification. And we 
> can also consider to make checksum configurable, thus user can shutdown 
> checksum operation when caching data.




[jira] [Comment Edited] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1

2019-09-27 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939199#comment-16939199
 ] 

Feilong He edited comment on HDFS-14745 at 9/27/19 8:03 AM:


[^HDFS-14745-branch-3.1-002.patch] has been uploaded to include the patch for 
HDFS-14818. 


was (Author: philohe):
[^HDFS-14745-branch-3.1-002.patch] has been uploaded to include the patch 
HDFS-14818. 

> Backport HDFS persistent memory read cache support to branch-3.1
> 
>
> Key: HDFS-14745
> URL: https://issues.apache.org/jira/browse/HDFS-14745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: cache, datanode
> Fix For: 3.3.0
>
> Attachments: HDFS-14745-branch-3.1-000.patch, 
> HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch
>
>
> We are proposing to backport the patches for HDFS-13762, HDFS persistent 
> memory read cache support, to branch-3.1.




[jira] [Commented] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1

2019-09-27 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939199#comment-16939199
 ] 

Feilong He commented on HDFS-14745:
---

[^HDFS-14745-branch-3.1-002.patch] has been uploaded to include the patch 
HDFS-14818. 

> Backport HDFS persistent memory read cache support to branch-3.1
> 
>
> Key: HDFS-14745
> URL: https://issues.apache.org/jira/browse/HDFS-14745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: cache, datanode
> Fix For: 3.3.0
>
> Attachments: HDFS-14745-branch-3.1-000.patch, 
> HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch
>
>
> We are proposing to backport the patches for HDFS-13762, HDFS persistent 
> memory read cache support, to branch-3.1.




[jira] [Updated] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1

2019-09-27 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14745:
--
Attachment: HDFS-14745-branch-3.1-002.patch

> Backport HDFS persistent memory read cache support to branch-3.1
> 
>
> Key: HDFS-14745
> URL: https://issues.apache.org/jira/browse/HDFS-14745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: cache, datanode
> Fix For: 3.3.0
>
> Attachments: HDFS-14745-branch-3.1-000.patch, 
> HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch
>
>
> We are proposing to backport the patches for HDFS-13762, HDFS persistent 
> memory read cache support, to branch-3.1.




[jira] [Commented] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command

2019-09-20 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934140#comment-16934140
 ] 

Feilong He commented on HDFS-14818:
---

The uploaded [^HDFS-14818.004.patch] fixes checkstyle issues and checks the 
deferring conditions only in the DRAM cache case, since these original 
conditions are applicable only to the DRAM cache.

> Check native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-14818
> URL: https://issues.apache.org/jira/browse/HDFS-14818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: native
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, 
> HDFS-14818.002.patch, HDFS-14818.003.patch, HDFS-14818.004.patch, 
> check_native_after_building_with_PMDK.png, 
> check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, 
> check_native_after_building_without_PMDK.png
>
>
> Currently, 'hadoop checknative' command supports checking native libs, such 
> as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in 
> the checking.
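
As a rough illustration of the kind of check described above, the following Python sketch asks the system loader whether a few native libraries can be located. It is not how 'hadoop checknative' is implemented; check_native is a hypothetical helper, and "pmem" is used on the assumption that PMDK's libpmem is the library of interest.

```python
from ctypes.util import find_library


def check_native(lib_names):
    """Map each library name to what the loader finds, or None if not found."""
    return {name: find_library(name) for name in lib_names}


if __name__ == "__main__":
    for name, found in check_native(["z", "snappy", "crypto", "pmem"]).items():
        print(f"{name}: {found if found else 'not found'}")
```

The result depends entirely on what is installed on the machine where it runs.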




[jira] [Updated] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command

2019-09-20 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14818:
--
Attachment: HDFS-14818.004.patch

> Check native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-14818
> URL: https://issues.apache.org/jira/browse/HDFS-14818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: native
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, 
> HDFS-14818.002.patch, HDFS-14818.003.patch, HDFS-14818.004.patch, 
> check_native_after_building_with_PMDK.png, 
> check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, 
> check_native_after_building_without_PMDK.png
>
>
> Currently, 'hadoop checknative' command supports checking native libs, such 
> as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in 
> the checking.




[jira] [Updated] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command

2019-09-20 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14818:
--
Attachment: HDFS-14818.003.patch

> Check native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-14818
> URL: https://issues.apache.org/jira/browse/HDFS-14818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: native
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, 
> HDFS-14818.002.patch, HDFS-14818.003.patch, 
> check_native_after_building_with_PMDK.png, 
> check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, 
> check_native_after_building_without_PMDK.png
>
>
> Currently, 'hadoop checknative' command supports checking native libs, such 
> as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in 
> the checking.




[jira] [Updated] (HDFS-14740) HDFS read cache persistence support

2019-09-19 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Component/s: datanode
 caching

> HDFS read cache persistence support
> ---
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Rui Mo
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14740) HDFS read cache persistence support

2019-09-19 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Description: In HDFS-13762, persistent memory (PM) is enabled in HDFS 
centralized cache management. Even though PM can persist cache data, for 
simplifying the initial implementation, the previous cache data will be cleaned 
up during DataNode restarts. Here, we are proposing to improve HDFS PM cache by 
taking advantage of PM's data persistence characteristic, i.e., recovering the 
status for cached data, if any, when DataNode restarts, thus, cache warm up 
time can be saved for user.  (was: In HDFS-13762, persistent memory (PM) is 
enabled in HDFS centralized cache management. Even though PM can persist cache 
data, for simplifying the initial implementation, the previous cache data will 
be cleaned up during DataNode restarts. Here, we are proposing to improve HDFS 
PM cache by taking advantage of PM's data persistence characteristic, i.e., 
recovering the cache status when DataNode restarts, thus, cache warm up time 
can be saved for user.)

> HDFS read cache persistence support
> ---
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Feilong He
>Assignee: Rui Mo
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.




[jira] [Updated] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command

2019-09-19 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14818:
--
Attachment: HDFS-14818.002.patch

> Check native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-14818
> URL: https://issues.apache.org/jira/browse/HDFS-14818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: native
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, 
> HDFS-14818.002.patch, check_native_after_building_with_PMDK.png, 
> check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, 
> check_native_after_building_without_PMDK.png
>
>
> Currently, 'hadoop checknative' command supports checking native libs, such 
> as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in 
> the checking.




[jira] [Commented] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command

2019-09-16 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931007#comment-16931007
 ] 

Feilong He commented on HDFS-14818:
---

[^HDFS-14818.001.patch] has been uploaded; it adds some comments on the PMDK 
support states.

> Check native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-14818
> URL: https://issues.apache.org/jira/browse/HDFS-14818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: native
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, 
> check_native_after_building_with_PMDK.png, 
> check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, 
> check_native_after_building_without_PMDK.png
>
>
> Currently, 'hadoop checknative' command supports checking native libs, such 
> as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in 
> the checking.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command

2019-09-16 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930216#comment-16930216
 ] 

Feilong He edited comment on HDFS-14818 at 9/17/19 1:50 AM:


Thanks [~rakeshr] for your comments.

To make the effect of the code change clear to reviewers, I posted some 
screenshots.
 * The picture below shows the result of 'hadoop checknative' after building 
WITH PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar 
-Drequire.pmdk'.

!check_native_after_building_with_PMDK.png!
 * The picture below shows the result of 'hadoop checknative' after building 
WITHOUT PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests 
-Dtar'.

!check_native_after_building_without_PMDK.png!
 * The picture below shows the result of 'hadoop checknative' after building 
WITH PMDK, but with this patch's change to CMakeLists.txt left out, i.e., 
still using 'NAME' instead of 'REALPATH'. The build cmd is 
'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'.

!check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png!

 
{quote}{{SupportState.PMDK_LIB_NOT_FOUND}} - its unused now, can you remove it.
{quote}
In some environments, if the PMDK native lib is not found, this state and its 
message will help the user identify that fact. So I am inclined to keep this state.
{quote}Any reason to change 'NAME' to 'REALPATH'.
{quote}
As the third picture above shows, if 'NAME' is used instead of 'REALPATH', only 
the lib name is obtained and printed by 'hadoop checknative'. With 'REALPATH', 
as in this patch, the real path of the target lib is kept, which I think is 
more useful to the user.

Please refer to 
[https://cmake.org/cmake/help/v3.15/command/get_filename_component.html].
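
The difference between the two components can be illustrated outside CMake. The Python sketch below is only an analogy, not the CMakeLists.txt change itself: os.path.basename plays the role of 'NAME' (file name without directory) and os.path.realpath the role of 'REALPATH' (full path with symlinks resolved). The library file names are made up for the demo.

```python
import os
import tempfile


def name_component(path):
    """Analogue of the NAME component: file name without the directory."""
    return os.path.basename(path)


def realpath_component(path):
    """Analogue of the REALPATH component: full path, symlinks resolved."""
    return os.path.realpath(path)


def demo():
    """Create a versioned lib file plus a symlink and compare both views."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "libpmem.so.1.0.0")  # made-up file name
        open(target, "w").close()
        link = os.path.join(d, "libpmem.so")
        os.symlink(target, link)
        return (name_component(link),
                realpath_component(link),
                os.path.realpath(target))
```

Here name_component only yields "libpmem.so", while realpath_component yields the absolute location of the versioned file the symlink points to, which is the extra information the third screenshot lacks.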


was (Author: philohe):
Thanks [~rakeshr] for your comments.

To make the code change effect clear to reviewers, I posted some screenshots.
 * The below picture shows the result of 'hadoop checknative' afer building 
WITH PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar 
-Drequire.pmdk'.

!check_native_after_building_with_PMDK.png!
 * The below picture shows the result of 'hadoop checknative' afer building 
WITHOUT PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests 
-Dtar'.

!check_native_after_building_without_PMDK.png!
 * The below picture shows the result of 'hadoop checknative' afer building 
WITH PMDK, but shading the modification brought by this patch for 
CMakeLists.txt, i.e., still use 'NAME' instead of 'REALPATH'. The build cmd is 
'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'.

!check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png!

 
{quote}{{SupportState.PMDK_LIB_NOT_FOUND}} - it's unused now, can you remove it?
{quote}
In some environments, if the PMDK native lib is not found, this state and its 
message will help the user identify that. So I am inclined to keep this state.
{quote}Any reason to change 'NAME' to 'REALPATH'?
{quote}
As the third picture above shows, if 'NAME' is used instead of 'REALPATH', only 
the lib name is obtained and printed by 'hadoop checknative'. With 'REALPATH', 
as in this patch, the real path of the target lib is kept, which I think is 
more useful to the user.

Please refer to 
[https://cmake.org/cmake/help/v3.15/command/get_filename_component.html].

> Check native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-14818
> URL: https://issues.apache.org/jira/browse/HDFS-14818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: native
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, 
> check_native_after_building_with_PMDK.png, 
> check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, 
> check_native_after_building_without_PMDK.png
>
>
> Currently, 'hadoop checknative' command supports checking native libs, such 
> as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in 
> the checking.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command

2019-09-16 Thread Feilong He (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14818:
--
Attachment: HDFS-14818.001.patch

> Check native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-14818
> URL: https://issues.apache.org/jira/browse/HDFS-14818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: native
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14818.000.patch, HDFS-14818.001.patch, 
> check_native_after_building_with_PMDK.png, 
> check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, 
> check_native_after_building_without_PMDK.png
>
>
> Currently, 'hadoop checknative' command supports checking native libs, such 
> as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in 
> the checking.




[jira] [Comment Edited] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command

2019-09-15 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930216#comment-16930216
 ] 

Feilong He edited comment on HDFS-14818 at 9/16/19 3:52 AM:


Thanks [~rakeshr] for your comments.

To make the code change effect clear to reviewers, I posted some screenshots.
 * The picture below shows the result of 'hadoop checknative' after building 
WITH PMDK. The build command is 'mvn clean package -Pdist,native -DskipTests 
-Dtar -Drequire.pmdk'.

!check_native_after_building_with_PMDK.png!
 * The picture below shows the result of 'hadoop checknative' after building 
WITHOUT PMDK. The build command is 'mvn clean package -Pdist,native 
-DskipTests -Dtar'.

!check_native_after_building_without_PMDK.png!
 * The picture below shows the result of 'hadoop checknative' after building 
WITH PMDK, but with this patch's modification to CMakeLists.txt reverted, 
i.e., still using 'NAME' instead of 'REALPATH'. The build command is 
'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'.

!check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png!

 
{quote}{{SupportState.PMDK_LIB_NOT_FOUND}} - it's unused now, can you remove it?
{quote}
In some environments, if the PMDK native lib is not found, this state and its 
message will help the user identify that. So I am inclined to keep this state.
{quote}Any reason to change 'NAME' to 'REALPATH'?
{quote}
As the third picture above shows, if 'NAME' is used instead of 'REALPATH', only 
the lib name is obtained and printed by 'hadoop checknative'. With 'REALPATH', 
as in this patch, the real path of the target lib is kept, which I think is 
more useful to the user.

Please refer to 
[https://cmake.org/cmake/help/v3.15/command/get_filename_component.html].
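
The difference between the two modes can be sketched as follows. This is a minimal illustration, not the patch itself: the variable names and the library name passed to find_library are assumptions, and only get_filename_component with its NAME and REALPATH modes comes from the CMake documentation linked above.

```cmake
# Illustrative only: PMDK_LIBRARY, PMDK_LIB_NAME and PMDK_LIB_REALPATH are
# hypothetical names, not necessarily those used in the patch.
find_library(PMDK_LIBRARY NAMES pmem)

# NAME keeps only the file name, dropping the directory.
get_filename_component(PMDK_LIB_NAME "${PMDK_LIBRARY}" NAME)

# REALPATH yields the full path with symlinks resolved.
get_filename_component(PMDK_LIB_REALPATH "${PMDK_LIBRARY}" REALPATH)

message(STATUS "NAME:     ${PMDK_LIB_NAME}")
message(STATUS "REALPATH: ${PMDK_LIB_REALPATH}")
```

When the library is installed through symlinks, the NAME value carries no directory information, while the REALPATH value points at the actual file on disk, which is the value a user would want to see reported.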


was (Author: philohe):
Thanks [~rakeshr] for your comments.

To make the code change effect clear to reviewers, I posted some screenshots.
 * The picture below shows the result of 'hadoop checknative' after building 
WITH PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar 
-Drequire.pmdk'. You could see the path 

!check_native_after_building_with_PMDK.png!
 * The picture below shows the result of 'hadoop checknative' after building 
WITHOUT PMDK. The build command is 'mvn clean package -Pdist,native 
-DskipTests -Dtar'.

!check_native_after_building_without_PMDK.png!
 * The picture below shows the result of 'hadoop checknative' after building 
WITH PMDK, but with this patch's modification to CMakeLists.txt reverted, 
i.e., still using 'NAME' instead of 'REALPATH'. The build command is 
'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'.

!check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png!

 
{quote}{{SupportState.PMDK_LIB_NOT_FOUND}} - it's unused now, can you remove it?
{quote}
In some environments, if the PMDK native lib is not found, this state and its 
message will help the user identify that. So I am inclined to keep this state.
{quote}Any reason to change 'NAME' to 'REALPATH'?
{quote}
As the third picture above shows, if 'NAME' is used instead of 'REALPATH', only 
the lib name is obtained and printed by 'hadoop checknative'. With 'REALPATH', 
as in this patch, the real path of the target lib is kept, which I think is 
more useful to the user.

Please refer to 
[https://cmake.org/cmake/help/v3.15/command/get_filename_component.html].

> Check native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-14818
> URL: https://issues.apache.org/jira/browse/HDFS-14818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: native
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14818.000.patch, 
> check_native_after_building_with_PMDK.png, 
> check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, 
> check_native_after_building_without_PMDK.png
>
>
> Currently, 'hadoop checknative' command supports checking native libs, such 
> as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in 
> the checking.




[jira] [Commented] (HDFS-14818) Check native pmdk lib by 'hadoop checknative' command

2019-09-15 Thread Feilong He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-14818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930216#comment-16930216
 ] 

Feilong He commented on HDFS-14818:
---

Thanks [~rakeshr] for your comments.

To make the code change effect clear to reviewers, I posted some screenshots.
 * The picture below shows the result of 'hadoop checknative' after building 
WITH PMDK. The build cmd is 'mvn clean package -Pdist,native -DskipTests -Dtar 
-Drequire.pmdk'. You could see the path 

!check_native_after_building_with_PMDK.png!
 * The picture below shows the result of 'hadoop checknative' after building 
WITHOUT PMDK. The build command is 'mvn clean package -Pdist,native 
-DskipTests -Dtar'.

!check_native_after_building_without_PMDK.png!
 * The picture below shows the result of 'hadoop checknative' after building 
WITH PMDK, but with this patch's modification to CMakeLists.txt reverted, 
i.e., still using 'NAME' instead of 'REALPATH'. The build command is 
'mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk'.

!check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png!

 
{quote}{{SupportState.PMDK_LIB_NOT_FOUND}} - it's unused now, can you remove it?
{quote}
In some environments, if the PMDK native lib is not found, this state and its 
message will help the user identify that. So I am inclined to keep this state.
{quote}Any reason to change 'NAME' to 'REALPATH'?
{quote}
As the third picture above shows, if 'NAME' is used instead of 'REALPATH', only 
the lib name is obtained and printed by 'hadoop checknative'. With 'REALPATH', 
as in this patch, the real path of the target lib is kept, which I think is 
more useful to the user.

Please refer to 
[https://cmake.org/cmake/help/v3.15/command/get_filename_component.html].

> Check native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-14818
> URL: https://issues.apache.org/jira/browse/HDFS-14818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: native
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14818.000.patch, 
> check_native_after_building_with_PMDK.png, 
> check_native_after_building_with_PMDK_using_NAME_instead_of_REALPATH.png, 
> check_native_after_building_without_PMDK.png
>
>
> Currently, 'hadoop checknative' command supports checking native libs, such 
> as zlib, snappy, openssl and ISA-L etc. It's necessary to include pmdk lib in 
> the checking.



