[
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583292#comment-15583292
]
Xiao Chen commented on HDFS-7343:
---------------------------------
Thanks all for the great documentation and discussions. It will be an
interesting undertaking. :)
It may be too early to ask, but in order to do HDFS management work, the SSM
has to run as the hdfs superuser, right?
Related to Andrew's question on performance-based decisions: is this manual,
automatic, or both?
The doc says {{SSM can make prediction on a file’s read based on read
historical information and cache the file automatically before the read
operation happens}}, and later gives an example of a similar rule ({{every 1d
at 0:00 | age lt 30d | cache}}). I read that as both: the description covers
the automatic part, and the rule shows the same kind of action under manual
control. Is that right?
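To make the manual-rule side concrete, here is a minimal Python sketch of how a
rule such as the one quoted above might be split into its three parts. The
grammar is an assumption inferred from that single example (a trigger, a
condition and an action separated by `|`); the names `Rule`, `parse_rule` and
`matches` are hypothetical and are not taken from the design doc or from any
SSM code.

```python
import re
from dataclasses import dataclass

# Assumed duration suffixes; the doc example only shows "d" (days).
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


@dataclass(frozen=True)
class Rule:
    interval_seconds: int    # how often the rule is evaluated
    at_time: str             # wall-clock time of evaluation, e.g. "0:00"
    attribute: str           # file attribute being tested, e.g. "age"
    operator: str            # comparison operator, e.g. "lt"
    threshold_seconds: int   # right-hand side of the comparison
    action: str              # what to do with matching files, e.g. "cache"


def _duration(text):
    """Convert a string such as '30d' into seconds."""
    m = re.fullmatch(r"(\d+)([smhd])", text)
    if m is None:
        raise ValueError(f"bad duration: {text!r}")
    return int(m.group(1)) * _UNITS[m.group(2)]


def parse_rule(text):
    """Parse 'every <dur> at <H:MM> | <attr> <op> <dur> | <action>'."""
    parts = [p.strip() for p in text.split("|")]
    if len(parts) != 3:
        raise ValueError("expected 'trigger | condition | action'")
    trigger, condition, action = parts

    t = re.fullmatch(r"every (\S+) at (\d{1,2}:\d{2})", trigger)
    if t is None:
        raise ValueError(f"bad trigger: {trigger!r}")

    fields = condition.split()
    if len(fields) != 3:
        raise ValueError(f"bad condition: {condition!r}")
    attribute, operator, threshold = fields

    return Rule(_duration(t.group(1)), t.group(2),
                attribute, operator, _duration(threshold), action)


def matches(rule, file_age_seconds):
    """Evaluate the condition for one file; only 'age lt' is sketched."""
    if rule.attribute != "age" or rule.operator != "lt":
        raise ValueError("only 'age lt <duration>' is sketched here")
    return file_age_seconds < rule.threshold_seconds


rule = parse_rule("every 1d at 0:00 | age lt 30d | cache")
```

With this reading, the rule is evaluated once a day at 0:00 and selects files
younger than 30 days for the `cache` action. Whether the automatic,
prediction-based path would go through the same rule machinery is exactly what
the question above is asking.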
If the queries are not latency-sensitive, the 'automatic' caching and
uncaching may be unnecessary. Is it possible to turn off the automatic
behavior for some workloads? I can think of similar cases where converting
between EC <-> replica may not be necessary.
> HDFS smart storage management
> -----------------------------
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: Wei Zhou
> Attachments: HDFS-Smart-Storage-Management.pdf
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and
> flexible storage policy engine considering file attributes, metadata, data
> temperature, storage type, EC codec, available hardware capabilities,
> user/application preference, etc.
> Modified the title to repurpose the issue.
> We'd extend this effort a bit and aim to work on a comprehensive solution
> to provide a smart storage management service that makes convenient,
> intelligent and effective use of erasure coding or replicas, the HDFS cache
> facility, the HSM offering, and all kinds of tools (balancer, mover, disk
> balancer and so on) in a large cluster.