[jira] [Updated] (HDFS-7343) HDFS smart storage management

2017-04-01 Thread Wei Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhou updated HDFS-7343:
---
Attachment: access_count_tables.jpg
tables_in_ssm.xlsx

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
> Attachments: access_count_tables.jpg, 
> HDFSSmartStorageManagement-General-20170315.pdf, 
> HDFS-Smart-Storage-Management.pdf, 
> HDFSSmartStorageManagement-Phase1-20170315.pdf, 
> HDFS-Smart-Storage-Management-update.pdf, move.jpg, tables_in_ssm.xlsx
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine that considers file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preferences, etc.
> The title was modified to re-purpose the issue.
> We'd extend this effort somewhat and aim to work on a comprehensive solution 
> that provides a smart storage management service for convenient, 
> intelligent and effective use of erasure coding or replicas, the HDFS cache 
> facility, HSM offerings, and all kinds of tools (balancer, mover, disk 
> balancer and so on) in a large cluster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7343) HDFS smart storage management

2017-03-15 Thread Wei Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhou updated HDFS-7343:
---
Attachment: HDFSSmartStorageManagement-Phase1-20170315.pdf
HDFSSmartStorageManagement-General-20170315.pdf

Based on the discussion and the feedback collected, we updated the design. The 
design is now split into two documents: one for the general SSM discussion and 
the other focusing on Phase 1 only.
In Phase 1, we focus on 3 use cases and go further into the details:
# Defines the metrics to collect. No modification to the DN or the client is 
needed; all metrics are obtained through the NN. The modification and overhead 
for the NN are kept under control.
# Defines the objects/attributes/events/commands to support in rules
# Defines the execution flow of rules
# Defines the data to be persisted for SSM and the NN
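
To make the idea of a rule built from objects, attributes and commands more concrete, here is a minimal Python sketch of how a rule could be evaluated against metrics gathered through the NN. It is not taken from the attached design documents: the names `FileMetrics`, `Rule`, `evaluate`, the metric fields, the thresholds and the command names `cache` and `archive` are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class FileMetrics:
    """Hypothetical per-file metrics, assumed to be collected through the NN."""
    path: str
    access_count: int  # accesses within some recent window (assumed metric)
    age_days: int      # days since last modification (assumed metric)


@dataclass
class Rule:
    """Hypothetical rule: a condition over file attributes plus a command."""
    name: str
    condition: Callable[[FileMetrics], bool]
    command: str       # illustrative command name, not a real HDFS command


def evaluate(rules: List[Rule], files: List[FileMetrics]) -> List[Tuple[str, str]]:
    """Return (command, path) pairs, using the first matching rule per file."""
    actions = []
    for f in files:
        for rule in rules:
            if rule.condition(f):
                actions.append((rule.command, f.path))
                break  # at most one command per file in this sketch
    return actions


# Illustrative rules with made-up thresholds.
RULES = [
    Rule("hot-data", lambda f: f.access_count >= 10, "cache"),
    Rule("cold-data", lambda f: f.age_days > 90 and f.access_count == 0, "archive"),
]

# Illustrative sample data.
FILES = [
    FileMetrics("/logs/a", access_count=0, age_days=120),
    FileMetrics("/tables/b", access_count=25, age_days=3),
    FileMetrics("/tmp/c", access_count=2, age_days=1),
]

if __name__ == "__main__":
    for command, path in evaluate(RULES, FILES):
        print(command, path)
```

In this sketch the execution flow is simply "collect metrics, test each rule in order, emit a command"; a real engine would also need scheduling and persistence, which the sketch leaves out.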

Thanks [~andrew.wang], [~anu], [~umamaheswararao], [~rakeshr], [~eddyxu] for 
reviewing the design and the great comments. Hopefully this update has 
incorporated your thoughts and addressed your questions well. Thanks!

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
> Attachments: HDFSSmartStorageManagement-General-20170315.pdf, 
> HDFS-Smart-Storage-Management.pdf, 
> HDFSSmartStorageManagement-Phase1-20170315.pdf, 
> HDFS-Smart-Storage-Management-update.pdf, move.jpg
>
>






[jira] [Updated] (HDFS-7343) HDFS smart storage management

2017-01-09 Thread Wei Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhou updated HDFS-7343:
---
Attachment: move.jpg

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
> Attachments: HDFS-Smart-Storage-Management-update.pdf, 
> HDFS-Smart-Storage-Management.pdf, move.jpg
>
>






[jira] [Updated] (HDFS-7343) HDFS smart storage management

2017-01-05 Thread Wei Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhou updated HDFS-7343:
---
Attachment: HDFS-Smart-Storage-Management-update.pdf

Based on the discussion and the feedback collected, we updated the design 
document. There are many changes compared with the previous version:
# The ultimate target is split into two phases. In phase 1, we focus on 
implementing a rule-based automation engine that integrates the facilities in 
HDFS. We will make it an end-to-end intelligent solution in phase 2.
# The Kafka service dependency is removed; SSM gets info directly from the NN.
# DNs report metrics and events to the NN instead of being polled by SSM 
directly.
# Metrics, events and some other info are maintained in NN off-heap memory and 
stored in the NN.
# SSM is stateless and polls info from the NN when it starts up.
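
The last three points can be pictured with a small Python sketch in which the NN holds the metrics and events, and a stateless SSM rebuilds its view by polling the NN at startup. Everything in it is hypothetical: `FakeNameNode`, `StatelessSSM` and their methods are stand-ins invented for illustration and do not correspond to any real HDFS or SSM API.

```python
from typing import Dict, List


class FakeNameNode:
    """Stand-in for the NN, which in the updated design keeps metrics and events."""

    def __init__(self) -> None:
        self._access_counts: Dict[str, int] = {}
        self._events: List[str] = []

    def record_access(self, path: str) -> None:
        # In the design, DNs report to the NN; here we just record directly.
        self._access_counts[path] = self._access_counts.get(path, 0) + 1
        self._events.append(f"access:{path}")

    def get_access_counts(self) -> Dict[str, int]:
        return dict(self._access_counts)

    def get_events_since(self, index: int) -> List[str]:
        return self._events[index:]


class StatelessSSM:
    """Hypothetical SSM that persists nothing and reloads its view from the NN."""

    def __init__(self, namenode: FakeNameNode) -> None:
        self._nn = namenode
        self.access_counts: Dict[str, int] = {}
        self.next_event = 0

    def start(self) -> None:
        # On startup, rebuild all in-memory state by polling the NN.
        self.access_counts = self._nn.get_access_counts()
        self.next_event = len(self._nn.get_events_since(0))

    def poll(self) -> List[str]:
        # Fetch only events that arrived since the last poll.
        new_events = self._nn.get_events_since(self.next_event)
        self.next_event += len(new_events)
        self.access_counts = self._nn.get_access_counts()
        return new_events


if __name__ == "__main__":
    nn = FakeNameNode()
    nn.record_access("/a")
    ssm = StatelessSSM(nn)
    ssm.start()
    nn.record_access("/b")
    print(ssm.poll(), ssm.access_counts)
```

Because the sketch keeps no state outside the NN stand-in, a restarted SSM instance recovers the same view simply by calling `start()` again.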

[~andrew.wang], [~anu], [~xiaochen], [~eddyxu] and anybody else, please help 
review it. Any suggestions or comments are appreciated. Thanks!

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
> Attachments: HDFS-Smart-Storage-Management-update.pdf, 
> HDFS-Smart-Storage-Management.pdf
>
>






[jira] [Updated] (HDFS-7343) HDFS smart storage management

2016-10-14 Thread Wei Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhou updated HDFS-7343:
---
Attachment: HDFS-Smart-Storage-Management.pdf

The initial design document is attached. Please help review it; any 
suggestions or comments are appreciated! Thanks!

Many thanks to [~drankye], [~rakeshr], [~HuafengWang], [~jingzhao], 
[~umamaheswararao], [~Sammi], [~jingcheng...@intel.com] and [~yuanbo] for their 
help with this design.

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
> Attachments: HDFS-Smart-Storage-Management.pdf
>
>






[jira] [Updated] (HDFS-7343) HDFS smart storage management

2016-09-16 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-7343:

   Assignee: Wei Zhou  (was: Kai Zheng)
Description: 
As discussed in HDFS-7285, it would be better to have a comprehensive and 
flexible storage policy engine that considers file attributes, metadata, data 
temperature, storage type, EC codec, available hardware capabilities, 
user/application preferences, etc.

The title was modified to re-purpose the issue.
We'd extend this effort somewhat and aim to work on a comprehensive solution 
that provides a smart storage management service for convenient, intelligent 
and effective use of erasure coding or replicas, the HDFS cache facility, HSM 
offerings, and all kinds of tools (balancer, mover, disk balancer and so on) in 
a large cluster.


  was:
As discussed in HDFS-7285, it would be better to have a comprehensive and 
flexible storage policy engine considering file attributes, metadata, data 
temperature, storage type, EC codec, available hardware capabilities, 
user/application preferences, etc.


Component/s: (was: namenode)
Summary: HDFS smart storage management  (was: A comprehensive and 
flexible storage policy engine)

The title was modified to re-purpose the issue.

We'd extend this effort somewhat and aim to work on a comprehensive solution 
that provides a smart storage management service for convenient, intelligent 
and effective use of erasure coding or replicas, the HDFS cache facility, HSM 
offerings, and all kinds of tools (balancer, mover, disk balancer and so on) in 
a large cluster.

Doing this as a standalone service to avoid big impact to existing NNs was 
inspired by [~jingzhao] quite some time ago, along with many other valuable 
insights. [~zhouwei] will work on this and let's wait another week for the 
delayed design.

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
>


