[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-14 Thread Yu Li (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934814#comment-13934814
]

Yu Li commented on HDFS-6009:
-

{quote}
In particular, what caused the failure in your case? Is it a disk error,
network failure, or an application is buggy?
{quote}
In our production environment we have encountered almost all of the cases
listed above, and had a hard time placating angry users. The buggy application
case is the worst: the other affected users get very frustrated at being
punished for someone else's fault. So in our case isolation is necessary.

To be more specific, our service is based on HBase, so the tools supplied here
are used along with the HBase regionserver group feature (HBASE-6721). If you're
interested in our use case, I've given a more detailed introduction
[here|https://issues.apache.org/jira/browse/HDFS-6010?focusedCommentId=13932891&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13932891]
in HDFS-6010 (just allow me to save some copy-paste effort :-))

Another thing to clarify here is that this suite of tools won't persist any
datanode group information into HDFS. All 3 tools accept a -servers option, so
the admin needs to keep track of the group information and pass it to the
tools, or, as in our use case, persist the group information in an upper-level
component like HBase.
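
As a rough illustration of keeping the group information outside HDFS, the
sketch below holds a group-to-datanode mapping in the caller and renders it
into a -servers argument. The group names, host names, and the comma-separated
value format are assumptions made for illustration; the comment above only
states that the tools accept a -servers option.

```python
# Hypothetical group mapping kept by the admin or by an upper-level
# component such as HBase; HDFS itself stores none of this.
GROUPS = {
    "group_a": ["dn1.example.com", "dn2.example.com"],
    "group_b": ["dn3.example.com"],
}


def servers_args(groups, group_name):
    """Return the argument list that passes one group's datanodes to a tool.

    The comma-separated format after -servers is an assumption.
    """
    if group_name not in groups:
        raise KeyError(f"unknown datanode group: {group_name}")
    servers = groups[group_name]
    if not servers:
        raise ValueError(f"datanode group is empty: {group_name}")
    return ["-servers", ",".join(servers)]


print(servers_args(GROUPS, "group_a"))
```

Because the mapping lives only in the caller, every invocation has to supply
it again, which is the trade-off described above.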

[~thanhdo], I hope this answers your question; just let me know if you have any
further comments.

Tools based on favored node feature for isolation
-

Key: HDFS-6009
URL: https://issues.apache.org/jira/browse/HDFS-6009
Project: Hadoop HDFS
Issue Type: Task
Affects Versions: 2.3.0
Reporter: Yu Li
Assignee: Yu Li
Priority: Minor

As mentioned in HBASE-6721 and HBASE-4210, in multi-tenant deployments of
HBase we prefer to assign several groups of regionservers to different
applications, to achieve some degree of isolation or resource allocation.
However, although the regionservers are grouped, the datanodes which store the
data are not, so a single datanode failure can affect multiple applications,
as we have already observed in our production environment.
To mitigate this, we could make use of the favored node feature (HDFS-2576) to
let a regionserver place its data within its own group, in other words make the
datanodes grouped as well (passively), to form some level of isolation.
In this case, or any other case that needs datanodes to be grouped, we would
need a set of tools to maintain the groups, including:
1. Making the balancer able to balance data among specified servers, rather
than the whole set
2. Setting the balance bandwidth for specified servers, rather than the whole
set
3. A tool to check whether a block is placed across groups, and move it back
if so
This JIRA is an umbrella for the above tools.
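
The cross-group check in item 3 can be sketched roughly as follows. This is
not the tool contributed in this JIRA; it is a minimal model in plain Python,
and the block IDs, datanode names, and the three input mappings are all
hypothetical.

```python
def cross_group_replicas(block_locations, block_group, groups):
    """Map each misplaced block to the replicas stored outside its group."""
    misplaced = {}
    for block_id, datanodes in block_locations.items():
        allowed = set(groups[block_group[block_id]])
        outside = sorted(dn for dn in datanodes if dn not in allowed)
        if outside:
            misplaced[block_id] = outside
    return misplaced


# Hypothetical inputs: group membership, the group each block belongs to,
# and the datanodes currently holding each block's replicas.
groups = {"group_a": ["dn1", "dn2", "dn3"], "group_b": ["dn4", "dn5", "dn6"]}
block_group = {"blk_1": "group_a", "blk_2": "group_b"}
block_locations = {
    "blk_1": ["dn1", "dn2", "dn3"],  # fully inside group_a
    "blk_2": ["dn4", "dn5", "dn2"],  # dn2 belongs to group_a
}

print(cross_group_replicas(block_locations, block_group, groups))
```

A real tool would then move each reported replica back to a datanode inside
the block's own group; that step is left out of the sketch.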

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-14 Thread Thanh Do (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935194#comment-13935194
]

Thanh Do commented on HDFS-6009:

Yu Li, thanks for your detailed comment! Your use case is a great example of
isolation. We are currently working on some similar problems but at a lower
level of the software stack, so your use case is great motivation.


[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-13 Thread Thanh Do (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933475#comment-13933475
]

Thanh Do commented on HDFS-6009:

Hi Yu Li,

I want to follow up on this issue. Could you please elaborate on the datanode
failure? In particular, what caused the failure in your case? Is it a disk
error, a network failure, or a buggy application?

If it is a disk error or a network failure, I think isolation using datanode
groups is reasonable.


[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-12 Thread Yu Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931955#comment-13931955
 ] 

Yu Li commented on HDFS-6009:
-

Hi [~thanhdo],

Yes, the data are replicated, so there won't be data loss. However, since one
datanode might carry data of multiple applications, a datanode failure will
cause read requests from *several* applications to retry until they time out
and switch to another datanode, whereas we'd like to reduce the range of impact.
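
To make the difference in impact range concrete, here is a small sketch that
lists which applications have data on a failed datanode, with and without
grouping. The application names, datanode names, and placements are made up
for illustration and are not taken from our cluster.

```python
def affected_apps(failed_dn, app_datanodes):
    """Return the applications that have data on the failed datanode."""
    return sorted(app for app, dns in app_datanodes.items() if failed_dn in dns)


# Ungrouped: blocks of both applications are spread over all datanodes.
ungrouped = {
    "app_a": {"dn1", "dn2", "dn3", "dn4"},
    "app_b": {"dn1", "dn2", "dn3", "dn4"},
}
# Grouped: each application only writes to the datanodes of its own group.
grouped = {"app_a": {"dn1", "dn2"}, "app_b": {"dn3", "dn4"}}

print(affected_apps("dn1", ungrouped))  # both applications see retries
print(affected_apps("dn1", grouped))    # only app_a sees retries
```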

Another scenario we experienced here is application A reading data from one DN
so heavily that it occupied almost all of the network bandwidth, while
application B was trying to write data to this DN and was blocked for a long
time.

As I mentioned in HDFS-6010, people might ask why we don't use physically
separate clusters in this case. The answer is that managing one big cluster is
more convenient and takes less staff effort than managing several small ones.

There are also other solutions like HDFS-5776 to reduce the impact of a bad
datanode, but I believe there are still scenarios which need stricter IO
isolation, so I think it's still valuable to contribute our tools.

Hope this answers your question. :-)


[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-12 Thread Sirianni, Eric (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932042#comment-13932042
]

Sirianni, Eric commented on HDFS-6009:
--

Thanks for emailing NetApp. The email inbox you have attempted to reach has
been deactivated.


[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-12 Thread Thanh Do (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932037#comment-13932037
 ] 

Thanh Do commented on HDFS-6009:


Thank you!


[jira] [Commented] (HDFS-6009) Tools based on favored node feature for isolation

2014-03-11 Thread Thanh Do (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931338#comment-13931338
 ] 

Thanh Do commented on HDFS-6009:


Hi Yu, 

You mentioned that although the regionservers are grouped, the datanodes which
store the data are not, which leads to the case that one datanode failure
affects multiple applications, as you have already observed in your production
environment.

Can you elaborate on that scenario? I thought a datanode failure would be OK,
as the data are replicated.

Best,

