[ 
https://issues.apache.org/jira/browse/HBASE-22200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16816994#comment-16816994
 ] 

Wei-Chiu Chuang edited comment on HBASE-22200 at 4/13/19 4:04 PM:
------------------------------------------------------------------

Probably unrelated, but worth noting for s3 root project.
If WAL and root are on different file system, FileSystem cache could cause out 
of memory for the WAL file system.  (HADOOP-13971) 
(fs.SCHEME.impl.disable.cache, SCHEME=s3,hdfs) Many downstreamers disable the 
cache as a workaround.

Currently it seems only the file system cache of hfile is disabled, which is 
fine as long as the HFile and WAL use the same fs.


was (Author: jojochuang):
If WAL and root are on different file system, FileSystem cache could cause out 
of memory for the WAL file system.  (HADOOP-13971) 
(fs.SCHEME.impl.disable.cache, SCHEME=s3,hdfs) Many downstreamers disable the 
cache as a workaround.

Currently it seems only the file system cache of hfile is disabled, which is 
fine as long as the HFile and WAL use the same fs.

> WALSplitter.hasRecoveredEdits should use same FS instance from WAL region dir
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-22200
>                 URL: https://issues.apache.org/jira/browse/HBASE-22200
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>              Labels: S3, WAL
>         Attachments: HBASE-22200-branch-2.1-001.patch, 
> HBASE-22200-branch-2.1-002.patch, HBASE-22200-branch-2.1-003.patch, 
> HBASE-22200-master-001.patch, HBASE-22200-master-002.patch
>
>
> *WALSplitter.hasRecoveredEdits* should use same FS instance from WAL region 
> dir when checking for recovered.edits files, instead of taking FS instance as 
> additional method parameter. When specifying different file systems for *wal 
> dir* and *root dir*,  *WALSplitter.hasRecoveredEdits* current implementation 
> will crash or give wrong results. As of now, it's being used indirectly by 
> *SplitTableRegionProcedure*. When running tests with *WAL dir* on HDFS and 
> *root dir* on S3, for example, noticed region split failing with below error:
> {noformat}
> 2019-04-08 13:53:58,064 ERROR 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: CODE-BUG: Uncaught 
> runtime exception: pid=98, 
> state=RUNNABLE:SPLIT_TABLE_REGIONS_CHECK_CLOSED_REGIONS, locked=true; 
> SplitTableRegionProcedure table=test-tbl, 
> parent=4c5db01611e97e3abbe02e781e867212, 
> daughterA=28a0a5e4ef7618899f6bd6dfb5335fe7, 
> daughterB=05fa26feaf03ebf9e87e099cbd1eabac
> java.lang.IllegalArgumentException: Path 
> hdfs://host-1.example.com:8020/wal_dir/default/test-tbl/4c5db01611e97e3abbe02e781e867212/recovered.edits
>  scheme must be s3a
>         at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
>         at 
> org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.checkPath(DynamoDBMetadataStore.java:1127)
>         at 
> org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.get(DynamoDBMetadataStore.java:437)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2110)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
>         at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1668)
>         at 
> org.apache.hadoop.hbase.wal.WALSplitter.getSplitEditFilesSorted(WALSplitter.java:576)
>         at 
> org.apache.hadoop.hbase.wal.WALSplitter.hasRecoveredEdits(WALSplitter.java:558)
>         at 
> org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.hasRecoveredEdits(SplitTableRegionProcedure.java:148)
>         at 
> org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:255)
> {noformat}
> Since *WALSplitter.hasRecoveredEdits* already resolves the proper WAL dir for 
> the region, we can simply re-use FS instance from the path instance for the 
> WAL dir region, when searching for recovered.edits.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to