subject:"\[jira\] \[Commented\] \(HDFS\-16115\) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error."

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

2021-07-06 Thread Daniel Ma (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375401#comment-17375401
 ] 

Daniel Ma commented on HDFS-16115:
--

[~brahmareddy]

[~ayush]

Pls help to review this patch.

> Asynchronously handle BPServiceActor command mechanism may result in 
> BPServiceActor never fails even CommandProcessingThread is closed with fatal 
> error.
> 
>
> Key: HDFS-16115
> URL: https://issues.apache.org/jira/browse/HDFS-16115
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.1
>Reporter: Daniel Ma
>Priority: Critical
> Fix For: 3.3.1
>
> Attachments: 0001-HDFS-16115.patch
>
>
> It is an improvement issue. Actually the issue has two sub issues:
> 1- BPServerActor thread handle commands from NameNode in aysnchronous way ( 
> CommandProcessThread handle commands ), so if there are any exception or 
> errors happens in thread CommandProcessthread resulting the thread fails and 
> stop, of which BPServiceActor cannot aware and still keep put commands from 
> namenode into queues waiting to be handled by CommandProcessThread, actually 
> CommandProcessThread was dead already.
> 2-the second sub issue is based on the first one, if CommandProcessThread 
> fails owing to some non-fatal error like "can not create native thread" which 
> is caused by too many threads existed on the node, this kind of problem 
> should be given much torlerance instead of simply shudown the thread and 
> never recover automatically, because the non-fatal eror mention above may 
> recover soon by itself,
> currently, Datanode BPServiceActor cannot turn to normal even when the 
> non-fatal error was eliminated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

2021-07-06 Thread Xiaoqiao He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375741#comment-17375741
 ] 

Xiaoqiao He commented on HDFS-16115:


[~Daniel Ma] Thanks for your report. I am not sure if HDFS-15651 could solve 
this issue. IMO If this thread meet some error DataNode process will exit too. 
Do you mind to offer some more information or log for this issue? Thanks.

> Asynchronously handle BPServiceActor command mechanism may result in 
> BPServiceActor never fails even CommandProcessingThread is closed with fatal 
> error.
> 
>
> Key: HDFS-16115
> URL: https://issues.apache.org/jira/browse/HDFS-16115
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.1
>Reporter: Daniel Ma
>Priority: Critical
> Fix For: 3.3.1
>
> Attachments: 0001-HDFS-16115.patch
>
>
> It is an improvement issue. Actually the issue has two sub issues:
> 1- BPServerActor thread handle commands from NameNode in aysnchronous way ( 
> CommandProcessThread handle commands ), so if there are any exceptions or 
> errors happen in thread CommandProcessthread resulting the thread fails and 
> stop, of which BPServiceActor cannot aware and still keep putting commands 
> from namenode into queues waiting to be handled by CommandProcessThread, 
> actually CommandProcessThread was dead already.
> 2-the second sub issue is based on the first one, if CommandProcessThread was 
> dead owing to some non-fatal errors like "can not create native thread" which 
> is caused by too many threads existed in OS, this kind of problem should be 
> given much more torlerance instead of simply shudown the thread and never 
> recover automatically, because the non-fatal errors mentioned above probably 
> can be recovered soon by itself,
> currently, Datanode BPServiceActor cannot turn to normal even when the 
> non-fatal error was eliminated.
> Therefore, in this patch, two things will be done:
> 1-Add retry mechanism in BPServiceActor thread and CommandProcessThread 
> thread which is 5 by default and configurable;
> 2-Add a monitor periodical thread in BPOfferService, if a BPServiceActor 
> thread is dead owing to  too many times non-fatal error, it should not be 
> simply removed from BPServviceActor lists stored in BPOfferService, instead, 
> the monitor thread will periodically try to start these special dead 
> BPServiceActor thread. the interval is also configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

2021-07-06 Thread Daniel Ma (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376146#comment-17376146
 ] 

Daniel Ma commented on HDFS-16115:
--

[~hexiaoqiao] Thanks for your review and tips. Actually the issue that my patch 
attempt to solve is totally different from the one HDFS-15651 mentioned.

I have noticed this Jira perviously, but it can not solve my issue perfectly. 

What I try to solve in this patch is :

1-Once CommandProcess thread is dead, we need to stop the corresponding 
BPServiceActor thread as well. 

2-In my patch, for the non-fatal error, BPOfferService thread always running a 
periodical thread to try to recover the BPServiceActor thread that is dead 
owing to non-fatal error, which is the essential difference between our patch 
and HDFS-15651

 

> Asynchronously handle BPServiceActor command mechanism may result in 
> BPServiceActor never fails even CommandProcessingThread is closed with fatal 
> error.
> 
>
> Key: HDFS-16115
> URL: https://issues.apache.org/jira/browse/HDFS-16115
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.1
>Reporter: Daniel Ma
>Priority: Critical
> Fix For: 3.3.1
>
> Attachments: 0001-HDFS-16115.patch
>
>
> It is an improvement issue. Actually the issue has two sub issues:
> 1- BPServerActor thread handle commands from NameNode in aysnchronous way ( 
> CommandProcessThread handle commands ), so if there are any exceptions or 
> errors happen in thread CommandProcessthread resulting the thread fails and 
> stop, of which BPServiceActor cannot aware and still keep putting commands 
> from namenode into queues waiting to be handled by CommandProcessThread, 
> actually CommandProcessThread was dead already.
> 2-the second sub issue is based on the first one, if CommandProcessThread was 
> dead owing to some non-fatal errors like "can not create native thread" which 
> is caused by too many threads existed in OS, this kind of problem should be 
> given much more torlerance instead of simply shudown the thread and never 
> recover automatically, because the non-fatal errors mentioned above probably 
> can be recovered soon by itself,
> {code:java}
> //代码占位符
> 2021-07-02 16:26:02,315 | WARN  | Command processor | Exception happened when 
> process queue BPServiceActor.java:1393
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:717)
> at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
> at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.execute(FsDatasetAsyncDiskService.java:180)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.deleteAsync(FsDatasetAsyncDiskService.java:229)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2315)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2237)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:752)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:698)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1417)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.lambda$enqueue$2(BPServiceActor.java:1463)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1382)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.run(BPServiceActor.java:1365)
> {code}
> currently, Datanode BPServiceActor cannot turn to normal even when the 
> non-fatal error was eliminated.
> Therefore, in this patch, two things will be done:
> 1-Add retry mechanism in BPServiceActor thread and CommandProcessThread 
> thread which is 5 by default and configurable;
> 2-Add a monitor periodical thread in BPOfferService, if a BPServiceActor 
> thread is dead owing to  too many times non-fatal error, it should not be 
> simply removed from BPServviceActor lists stored in BPOfferService, instead, 
> the monitor thread will periodically try to start these special dead 
> BPServiceActor thread. the interval is also configurable.



--
This message w

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

2021-07-07 Thread Xiaoqiao He (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376556#comment-17376556
 ] 

Xiaoqiao He commented on HDFS-16115:


I agree with you that HDFS-15651 is rough solution for this case, but it is 
safe totally.
About your proposal, I am concerned that
A. how to define 'a non-fatal error'? which type exception/error should we 
retry or for every exception/error? 
B. when we meet 'a non-fatal error' which should retry, is this action should 
fall back to command queue? if yes, how to keep every command action is 
idempotent, if not any command could miss?
NOTE, sorry I don't review patch carefully. Please correct me f something I 
missed. Thanks.

> Asynchronously handle BPServiceActor command mechanism may result in 
> BPServiceActor never fails even CommandProcessingThread is closed with fatal 
> error.
> 
>
> Key: HDFS-16115
> URL: https://issues.apache.org/jira/browse/HDFS-16115
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.1
>Reporter: Daniel Ma
>Priority: Critical
> Fix For: 3.3.1
>
> Attachments: 0001-HDFS-16115.patch
>
>
> It is an improvement issue. Actually the issue has two sub issues:
> 1- BPServerActor thread handle commands from NameNode in aysnchronous way ( 
> CommandProcessThread handle commands ), so if there are any exceptions or 
> errors happen in thread CommandProcessthread resulting the thread fails and 
> stop, of which BPServiceActor cannot aware and still keep putting commands 
> from namenode into queues waiting to be handled by CommandProcessThread, 
> actually CommandProcessThread was dead already.
> 2-the second sub issue is based on the first one, if CommandProcessThread was 
> dead owing to some non-fatal errors like "can not create native thread" which 
> is caused by too many threads existed in OS, this kind of problem should be 
> given much more torlerance instead of simply shudown the thread and never 
> recover automatically, because the non-fatal errors mentioned above probably 
> can be recovered soon by itself,
> {code:java}
> //代码占位符
> 2021-07-02 16:26:02,315 | WARN  | Command processor | Exception happened when 
> process queue BPServiceActor.java:1393
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:717)
> at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
> at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.execute(FsDatasetAsyncDiskService.java:180)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.deleteAsync(FsDatasetAsyncDiskService.java:229)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2315)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2237)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:752)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:698)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1417)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.lambda$enqueue$2(BPServiceActor.java:1463)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1382)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.run(BPServiceActor.java:1365)
> {code}
> currently, Datanode BPServiceActor cannot turn to normal even when the 
> non-fatal error was eliminated.
> Therefore, in this patch, two things will be done:
> 1-Add retry mechanism in BPServiceActor thread and CommandProcessThread 
> thread which is 5 by default and configurable;
> 2-Add a monitor periodical thread in BPOfferService, if a BPServiceActor 
> thread is dead owing to  too many times non-fatal error, it should not be 
> simply removed from BPServviceActor lists stored in BPOfferService, instead, 
> the monitor thread will periodically try to start these special dead 
> BPServiceActor thread. the interval is also configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

--

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

2021-07-07 Thread Daniel Ma (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376571#comment-17376571
 ] 

Daniel Ma commented on HDFS-16115:
--

hello [~hexiaoqiao].

Thanks for review, 

1-For non-fatal error, I define two kinds at present:

These two errors are caused by too many threads in OS and too many open files 
in OS, which probably can recover soon.

Even if the OS limit cannot be recovered proactively, users expect that the 
HDFS can recover automatically after manual access.

 
{code:java}
//代码占位符
enum NON_FATAL_TYPES {
  THREAD_EXCEED("unable to create new native thread"),
  FILE_EXCEED("Too many open files");

  private final String errorMsg;

  NON_FATAL_TYPES(String errorMsg){
this.errorMsg = errorMsg;
  }

  public String getErrorMsg() {
return errorMsg;
  }
}
{code}
2-the main defect of HDFS-15651 is that BPServiceActor thread will never back 
to normal unless restart DataNode which is always unacceptable in production 
environment reported by real users. That is why I develop this feature. 

 

> Asynchronously handle BPServiceActor command mechanism may result in 
> BPServiceActor never fails even CommandProcessingThread is closed with fatal 
> error.
> 
>
> Key: HDFS-16115
> URL: https://issues.apache.org/jira/browse/HDFS-16115
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.1
>Reporter: Daniel Ma
>Priority: Critical
> Fix For: 3.3.1
>
> Attachments: 0001-HDFS-16115.patch
>
>
> It is an improvement issue. Actually the issue has two sub issues:
> 1- BPServerActor thread handle commands from NameNode in aysnchronous way ( 
> CommandProcessThread handle commands ), so if there are any exceptions or 
> errors happen in thread CommandProcessthread resulting the thread fails and 
> stop, of which BPServiceActor cannot aware and still keep putting commands 
> from namenode into queues waiting to be handled by CommandProcessThread, 
> actually CommandProcessThread was dead already.
> 2-the second sub issue is based on the first one, if CommandProcessThread was 
> dead owing to some non-fatal errors like "can not create native thread" which 
> is caused by too many threads existed in OS, this kind of problem should be 
> given much more torlerance instead of simply shudown the thread and never 
> recover automatically, because the non-fatal errors mentioned above probably 
> can be recovered soon by itself,
> {code:java}
> //代码占位符
> 2021-07-02 16:26:02,315 | WARN  | Command processor | Exception happened when 
> process queue BPServiceActor.java:1393
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:717)
> at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
> at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.execute(FsDatasetAsyncDiskService.java:180)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.deleteAsync(FsDatasetAsyncDiskService.java:229)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2315)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2237)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:752)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:698)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1417)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.lambda$enqueue$2(BPServiceActor.java:1463)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1382)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.run(BPServiceActor.java:1365)
> {code}
> currently, Datanode BPServiceActor cannot turn to normal even when the 
> non-fatal error was eliminated.
> Therefore, in this patch, two things will be done:
> 1-Add retry mechanism in BPServiceActor thread and CommandProcessThread 
> thread which is 5 by default and configurable;
> 2-Add a monitor periodical thread in BPOfferService, if a BPServiceActor 
> thread is dead owing to  too many times non-fatal

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

2022-12-22 Thread Daniel Ma (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651129#comment-17651129
 ] 

Daniel Ma commented on HDFS-16115:
--

[~hemanthboyina] [~brahma]
Could you pls help to review

> Asynchronously handle BPServiceActor command mechanism may result in 
> BPServiceActor never fails even CommandProcessingThread is closed with fatal 
> error.
> 
>
> Key: HDFS-16115
> URL: https://issues.apache.org/jira/browse/HDFS-16115
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.1
>Reporter: Daniel Ma
>Assignee: Daniel Ma
>Priority: Critical
> Fix For: 3.3.1
>
> Attachments: 0001-HDFS-16115.patch
>
>
> It is an improvement issue. Actually the issue has two sub issues:
> 1- BPServerActor thread handle commands from NameNode in aysnchronous way ( 
> CommandProcessThread handle commands ), so if there are any exceptions or 
> errors happen in thread CommandProcessthread resulting the thread fails and 
> stop, of which BPServiceActor cannot aware and still keep putting commands 
> from namenode into queues waiting to be handled by CommandProcessThread, 
> actually CommandProcessThread was dead already.
> 2-the second sub issue is based on the first one, if CommandProcessThread was 
> dead owing to some non-fatal errors like "can not create native thread" which 
> is caused by too many threads existed in OS, this kind of problem should be 
> given much more torlerance instead of simply shudown the thread and never 
> recover automatically, because the non-fatal errors mentioned above probably 
> can be recovered soon by itself,
> {code:java}
> //代码占位符
> 2021-07-02 16:26:02,315 | WARN  | Command processor | Exception happened when 
> process queue BPServiceActor.java:1393
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:717)
> at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
> at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.execute(FsDatasetAsyncDiskService.java:180)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.deleteAsync(FsDatasetAsyncDiskService.java:229)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2315)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2237)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:752)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:698)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1417)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.lambda$enqueue$2(BPServiceActor.java:1463)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1382)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.run(BPServiceActor.java:1365)
> {code}
> currently, Datanode BPServiceActor cannot turn to normal even when the 
> non-fatal error was eliminated.
> Therefore, in this patch, two things will be done:
> 1-Add retry mechanism in BPServiceActor thread and CommandProcessThread 
> thread which is 5 by default and configurable;
> 2-Add a monitor periodical thread in BPOfferService, if a BPServiceActor 
> thread is dead owing to  too many times non-fatal error, it should not be 
> simply removed from BPServviceActor lists stored in BPOfferService, instead, 
> the monitor thread will periodically try to start these special dead 
> BPServiceActor thread. the interval is also configurable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

[jira] [Commented] (HDFS-16115) Asynchronously handle BPServiceActor command mechanism may result in BPServiceActor never fails even CommandProcessingThread is closed with fatal error.

6 matches

Site Navigation

Mail list logo

Footer information