[ https://issues.apache.org/jira/browse/HDFS-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tanping Wang updated HDFS-1758:
-------------------------------

    Description:
The JSP pages used by the web UI are not thread safe. When requesting the Live/Dead/Decommissioning pages from the web UI, we have observed that the wrong page is sometimes displayed. Specifically, a request for the Dead node list page sometimes returns the Live node page, and a request for the Decommissioning page sometimes returns the Dead node page.

The root cause of this problem is that a JSP page is not thread safe by default. When multiple requests come in, each request is assigned a different thread, and all of these threads access the same instance of the servlet class generated from the JSP page. A class member variable is therefore shared by multiple threads. In the 20 branch code, for example, dfsnodelist.jsp has
{code}
int rowNum = 0;
int colNum = 0;
String sorterField = null;
String sorterOrder = null;
String whatNodes = "LIVE";
{code}
declared as class-level variables. (These variables are declared within <%! code %> declarations, which makes them class members.) Because multiple threads share the same set of class member variables, one request can step on another's toes.

However, due to the JSP code refactoring in HADOOP-5857, all of these class member variables have become function-local variables, so this bug does not appear in Apache trunk. We propose a simple fix for this bug in the 20 branch alone, specifically branch-0.20-security. The simple fix is to add the JSP page directive isThreadSafe="false" to the affected JSP pages to make them thread safe, i.e. only one request is processed at a time.
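The shared-field problem described above can be illustrated with a small, self-contained Java sketch. It is not the dfsnodelist.jsp code: the NodeListPage class, its method names, and the pause hook are hypothetical and exist only to force a second request to run between the write and the read. One page object stands in for the single servlet instance generated from a JSP page.

```java
public class SharedFieldDemo {

    /** Hypothetical stand-in for the single servlet instance of a JSP page. */
    static class NodeListPage {
        // Equivalent of a <%! ... %> declaration: one copy shared by all requests.
        private String whatNodes = "LIVE";

        /** Unsafe variant: per-request state is kept in the shared field. */
        String renderWithField(String requested, Runnable pause) {
            whatNodes = requested;
            pause.run(); // another request may run at this point
            return whatNodes + " node list";
        }

        /** Safe variant: per-request state is kept in a local variable. */
        String renderWithLocal(String requested, Runnable pause) {
            String nodes = requested;
            pause.run(); // another request may run at this point
            return nodes + " node list";
        }
    }

    /** Runs a second request on its own thread and waits for it to finish. */
    static void runConcurrentRequest(Runnable request) {
        Thread t = new Thread(request);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Requests the DEAD page while a LIVE request interleaves (shared field). */
    static String deadPageUsingField() {
        NodeListPage page = new NodeListPage();
        return page.renderWithField("DEAD",
            () -> runConcurrentRequest(() -> page.renderWithField("LIVE", () -> { })));
    }

    /** Same interleaving, but the request state lives in a local variable. */
    static String deadPageUsingLocal() {
        NodeListPage page = new NodeListPage();
        return page.renderWithLocal("DEAD",
            () -> runConcurrentRequest(() -> page.renderWithLocal("LIVE", () -> { })));
    }

    public static void main(String[] args) {
        System.out.println("field: " + deadPageUsingField());
        System.out.println("local: " + deadPageUsingLocal());
    }
}
```

With the shared field, the request for the DEAD page comes back as the LIVE page, which matches the symptom reported here. With a local variable, as in trunk after HADOOP-5857, each request sees only its own value.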

We also evaluated the thread safety of the other JSP pages on trunk and noticed a potential problem when retrieving statistics from the namenode. For example, dfshealth.jsp makes the call
NamenodeJspHelper.getInodeLimitText(fsn);
which eventually is
{code}
static String getInodeLimitText(FSNamesystem fsn) {
  long inodes = fsn.dir.totalInodes();
  long blocks = fsn.getBlocksTotal();
  long maxobjects = fsn.getMaxObjects();
  ....
{code}
Some of these function calls are already guarded by the read-write lock, e.g. dir.totalInodes, but others are not. As a result, the web UI results are not 100% thread safe. However, after evaluating the pros and cons of adding a giant lock to the JSP pages, we decided not to take FSNamesystem read-write locks in the JSPs.

  was:
The JSP pages used by the web UI are not thread safe. When requesting the Live/Dead/Decommissioning pages from the web UI, we have observed that the wrong page is sometimes displayed. Specifically, a request for the Dead node list page sometimes returns the Live node page, and a request for the Decommissioning page sometimes returns the Dead node page. The root cause of this problem is that a JSP page is not thread safe by default. When multiple requests come in, each request is assigned a different thread, and all of these threads access the same instance of the servlet class generated from the JSP page. A class member variable is therefore shared by multiple threads. In the 20 branch code, for example, dfsnodelist.jsp has
{code}
int rowNum = 0;
int colNum = 0;
String sorterField = null;
String sorterOrder = null;
String whatNodes = "LIVE";
{code}
declared as class-level variables. (These variables are declared within <%! code %> declarations, which makes them class members.) Because multiple threads share the same set of class member variables, one request can step on another's toes. However, due to the JSP code refactoring in HADOOP-5857, all of these class member variables have become function-local variables.
So this bug does not appear in Apache trunk. We propose a simple fix for this bug in the 20 branch alone, specifically branch-0.20-security. The simple fix is to add the JSP page directive isThreadSafe="false" to the affected JSP pages to make them thread safe, i.e. only one request is processed at a time. We also evaluated the thread safety of the other JSP pages on trunk and noticed a potential problem when retrieving statistics from the namenode. For example, dfshealth.jsp makes the call NamenodeJspHelper.getInodeLimitText(fsn); which eventually is
{code}
static String getInodeLimitText(FSNamesystem fsn) {
  long inodes = fsn.dir.totalInodes();
  long blocks = fsn.getBlocksTotal();
  long maxobjects = fsn.getMaxObjects();
  ....
{code}
Some of these function calls are already guarded by the read-write lock, e.g. dir.totalInodes, but others are not. As a result, the web UI results are not 100% thread safe. However, after evaluating the pros and cons of adding a giant lock to the JSP pages, we decided not to take FSNamesystem read-write locks in the JSPs.


> Web UI JSP pages thread safety issue
> ------------------------------------
>
>                 Key: HDFS-1758
>                 URL: https://issues.apache.org/jira/browse/HDFS-1758
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>         Environment: branch-20-security
>            Reporter: Tanping Wang
>            Assignee: Tanping Wang
>            Priority: Minor
>
> The JSP pages used by the web UI are not thread safe. When requesting the Live/Dead/Decommissioning pages from the web UI, we have observed that the wrong page is sometimes displayed. Specifically, a request for the Dead node list page sometimes returns the Live node page, and a request for the Decommissioning page sometimes returns the Dead node page.
> The root cause of this problem is that a JSP page is not thread safe by default.
> When multiple requests come in, each request is assigned a different thread, and all of these threads access the same instance of the servlet class generated from the JSP page. A class member variable is therefore shared by multiple threads. In the 20 branch code, for example, dfsnodelist.jsp has
> {code}
> int rowNum = 0;
> int colNum = 0;
> String sorterField = null;
> String sorterOrder = null;
> String whatNodes = "LIVE";
> {code}
> declared as class-level variables. (These variables are declared within <%! code %> declarations, which makes them class members.) Because multiple threads share the same set of class member variables, one request can step on another's toes.
> However, due to the JSP code refactoring in HADOOP-5857, all of these class member variables have become function-local variables, so this bug does not appear in Apache trunk. We propose a simple fix for this bug in the 20 branch alone, specifically branch-0.20-security.
> The simple fix is to add the JSP page directive isThreadSafe="false" to the affected JSP pages to make them thread safe, i.e. only one request is processed at a time.
> We also evaluated the thread safety of the other JSP pages on trunk and noticed a potential problem when retrieving statistics from the namenode. For example, dfshealth.jsp makes the call
> NamenodeJspHelper.getInodeLimitText(fsn);
> which eventually is
> {code}
> static String getInodeLimitText(FSNamesystem fsn) {
>   long inodes = fsn.dir.totalInodes();
>   long blocks = fsn.getBlocksTotal();
>   long maxobjects = fsn.getMaxObjects();
>   ....
> {code}
> Some of these function calls are already guarded by the read-write lock, e.g. dir.totalInodes, but others are not. As a result, the web UI results are not 100% thread safe. However, after evaluating the pros and cons of adding a giant lock to the JSP pages, we decided not to take FSNamesystem read-write locks in the JSPs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira