[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-05 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16038047#comment-16038047
 ] 

Anoop Sam John commented on HBASE-18166:


bq. when it goes to get a listing of the files to split, it can pick up files 
that are for archiving but that have not been archived yet.
So this is just an FS-based listing, not asking the RS which files are active 
right now?
Yes, that will mess things up.

> [AMv2] We are splitting already-split files
> ---
>
> Key: HBASE-18166
> URL: https://issues.apache.org/jira/browse/HBASE-18166
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Affects Versions: 2.0.0
> Reporter: stack
> Assignee: stack
> Fix For: 2.0.0
>
>
> Interesting issue. The below adds a lag cleaning up files after a compaction 
> in case of on-going Scanners (for read replicas/offheap).
> HBASE-14970 Backport HBASE-13082 and its sub-jira to branch-1 - recommit (Ram)
> What the lag means is that now that split is run from the HMaster in master 
> branch, when it goes to get a listing of the files to split, it can pick up 
> files that are for archiving but that have not been archived yet.  When it 
> does, it goes ahead and splits them... making references of references.
> It's a mess.
> I added asking the Region if it is splittable a while back. The Master calls 
> this from SplitTableRegionProcedure during preparation. If the RegionServer 
> asked for the split, it is sort of redundant work given the RS asks itself 
> whether any references remain; if any do, it'll wait before asking for a 
> split. But if a user/client asks, then this isSplittable over RPC comes in handy.
> I was thinking that isSplittable could return the list of files.
> Or, easier, given we know a region is Splittable by the time we go to split 
> the files, then I think master-side we can just skip any references found, 
> presuming they are ready-for-archive.
> Will be back with a patch. Want to test on cluster first (Side-effect is 
> regions are offline because file at end of the reference to a reference is 
> removed ... and so the open fails).
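
The master-side idea in the description (skip any references found when listing the files to split) could be sketched roughly as below. This is only an illustration, not the attached patch: the class, the method names, and the naming rule (a reference looks like "<hfile>.<parent region>", a plain store file has no dot) are assumptions made for the example. HBase has its own reference check, so the regex is just a stand-in.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class SkipReferences {
    // Assumption for illustration only: a reference file is named
    // "<hex hfile name>.<hex parent region name>"; a plain store file has no dot.
    private static final Pattern REF_NAME = Pattern.compile("^[0-9a-f]+\\.[0-9a-f]+$");

    /** Returns true when the file name matches the assumed reference pattern. */
    static boolean looksLikeReference(String fileName) {
        return REF_NAME.matcher(fileName).matches();
    }

    /** Keeps only the files that do not look like references. */
    static List<String> filesToSplit(List<String> listedFromFs) {
        return listedFromFs.stream()
            .filter(name -> !looksLikeReference(name))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listing = List.of("0a1b2c3d", "4e5f6071.9f8e7d6c", "deadbeef");
        // Prints [0a1b2c3d, deadbeef]: the reference in the middle is skipped.
        System.out.println(filesToSplit(listing));
    }
}
```

Skipping by name is cheap because it needs no RPC to the RegionServer, but it is only safe if the region has already been confirmed splittable, as the description says.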



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16038065#comment-16038065
 ] 

stack commented on HBASE-18166:
---

[~anoop.hbase]  Yes sir. I think patch is easy enough on Master side... will 
put it up in a sec.



[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-05 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16038191#comment-16038191
 ] 

Anoop Sam John commented on HBASE-18166:


Some of the logging changes look like they were for debugging; some might not 
be needed, right?
Just trying to understand more. So we have isSplittable being used, which says 
NO to a split if there are ref files. Now the change is that when we list files 
for a split, we exclude the ref-type files. So what if a split happened and the 
ref files are split-file refs? Will we exclude those from the split too and go 
ahead with it? What we had to exclude were the to-be-archived files that are 
already compacted away. Is this new check more generic than it has to be?

> Attachments: HBASE-18166.master.001.patch


[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16038201#comment-16038201
 ] 

stack commented on HBASE-18166:
---

v2 addresses feedback by [~syuanjiang] up on rb.

> Attachments: HBASE-18166.master.001.patch, 
> HBASE-18166.master.002.patch


[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-05 Thread Stephen Yuan Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16038258#comment-16038258
 ] 

Stephen Yuan Jiang commented on HBASE-18166:


[~stack], when I implemented the SplitTableRegionProcedure, I copied the logic 
from SplitTransactionImpl.java:
{code}
  /**
   * Creates reference files for top and bottom half of the
   * @param hstoreFilesToSplit map of store files to create half file references for.
   * @return the number of reference files that were created.
   * @throws IOException
   */
  private Pair<Integer, Integer> splitStoreFiles(
      final Map<byte[], List<StoreFile>> hstoreFilesToSplit)
      throws IOException {
    if (hstoreFilesToSplit == null) {
      // Could be null because close didn't succeed -- for now consider it fatal
      throw new IOException("Close returned empty list of StoreFiles");
    }
    // The following code sets up a thread pool executor with as many slots as
    // there's files to split. It then fires up everything, waits for
    // completion and finally checks for any exception
    int nbFiles = 0;
    for (Map.Entry<byte[], List<StoreFile>> entry : hstoreFilesToSplit.entrySet()) {
      nbFiles += entry.getValue().size();  // ===> possible to have reference files
    }
{code}

I just wonder whether we should change the logic in SplitTransactionImpl in 
branch-1 to skip splitting reference files (I checked HRegion#doClose() and did 
not see any logic to skip reference files on the region server side).



[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-06 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16038376#comment-16038376
 ] 

ramkrishna.s.vasudevan commented on HBASE-18166:


So we may have the same problem in branch-1 also? Even without AMv2?



[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-07 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041288#comment-16041288
 ] 

ramkrishna.s.vasudevan commented on HBASE-18166:


bq. When it does, it goes ahead and splits them... making references of 
references.
I was checking this issue alongside HBASE-17406. Previously there was a check 
that did not allow splits to happen when references were present, so the split 
itself would fail. So are you seeing that a region is actually getting split, 
or is it only the StoreFileSplitter that is creating references again?
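
For readers following along, the kind of check described here (refuse the split while any store still has reference files) could be sketched as below. The class and the FileEntry type are hypothetical stand-ins for illustration and are not HBase's own types or its actual isSplittable implementation.

```java
import java.util.List;
import java.util.Map;

final class SplittableCheck {
    /** Hypothetical stand-in for a store file; it only records whether it is a reference. */
    static final class FileEntry {
        final String name;
        final boolean reference;

        FileEntry(String name, boolean reference) {
            this.name = name;
            this.reference = reference;
        }
    }

    /** A region is treated as splittable only when no column family holds a reference file. */
    static boolean isSplittable(Map<String, List<FileEntry>> filesByFamily) {
        return filesByFamily.values().stream()
            .flatMap(List::stream)
            .noneMatch(f -> f.reference);
    }

    public static void main(String[] args) {
        Map<String, List<FileEntry>> afterSplit =
            Map.of("cf", List.of(new FileEntry("0a1b.9f8e", true)));
        Map<String, List<FileEntry>> afterCompaction =
            Map.of("cf", List.of(new FileEntry("0a1b", false)));
        System.out.println(isSplittable(afterSplit));      // false: a reference remains
        System.out.println(isSplittable(afterCompaction)); // true: no references left
    }
}
```

A gate like this only protects the split if the file list it inspects is the same one the splitter later works from, which is the question raised above.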



[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042005#comment-16042005
 ] 

stack commented on HBASE-18166:
---

Missed your comment [~syuanjiang]

You think? It depends on how we get the list of hstoreFilesToSplit that we pass 
in. Are we doing a raw read of the fs, or are we getting it from the region's 
in-memory notion of live files? If the former, then yes, branch-1 could have the 
same issue. (Does that make sense, [~ram_krish]?)
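
To make the distinction concrete, here is a small sketch of how a raw filesystem listing and an in-memory view of live files can disagree after a compaction. All names are made up for the example; it only models the two views as plain collections and does not touch HDFS or any HBase class.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class LiveFilesVsListing {
    /**
     * Returns the files that are still on disk but no longer live, i.e. the
     * compacted-away files that are waiting to be archived.
     */
    static Set<String> notYetArchived(List<String> fsListing, Set<String> liveFiles) {
        Set<String> stale = new LinkedHashSet<>(fsListing);
        stale.removeAll(liveFiles);
        return stale;
    }

    public static void main(String[] args) {
        // f1 and f2 were compacted into f3, but archiving is lagging behind.
        List<String> fsListing = List.of("f1", "f2", "f3");
        Set<String> liveFiles = Set.of("f3");
        // Prints [f1, f2]: a raw listing would hand these to the splitter too.
        System.out.println(notYetArchived(fsListing, liveFiles));
    }
}
```

Only the side that tracks the live set can tell f1 and f2 apart from f3, which is why a raw read of the fs is the risky case.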






[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-07 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042182#comment-16042182
 ] 

ramkrishna.s.vasudevan commented on HBASE-18166:


I am seeing HBASE-18186 also. The stack trace seems to be doing listFiles from 
FS
{code}
ERROR regionserver.CompactSplitThread: Compaction selection failed Store = 
, pri = 289
java.io.FileNotFoundException: File does not exist: 
hdfs:
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at org.apache.hadoop.hbase.regionserver.StoreFileInfo.getReferencedFileStatus(StoreFileInfo.java:342)
{code}
Yes, this could be a problem, because only the RS's in-memory notion knows which 
are the actual store files and which are the compacted files. Will dig into this 
issue and HBASE-18186 and see if it can help.



[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16050833#comment-16050833
 ] 

Hudson commented on HBASE-18166:


FAILURE: Integrated in Jenkins build HBase-2.0 #48 (See 
[https://builds.apache.org/job/HBase-2.0/48/])
HBASE-18166 [AMv2] We are splitting already-split files v2 Address (stack: rev 
c02a1421437b65c127ae1e985edbd507b0d1696b)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/IncreasingToUpperBoundRegionSplitPolicy.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileInfo.java


--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16051141#comment-16051141
 ] 

Hudson commented on HBASE-18166:


FAILURE: Integrated in Jenkins build HBase-2.0 #49 (See 
[https://builds.apache.org/job/HBase-2.0/49/])
HBASE-18166 [AMv2] We are splitting already-split files v2 Address (stack: rev 
8c7bf7b0a92beac1dcb1f6a59d1057f7838bdc91)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java




[jira] [Commented] (HBASE-18166) [AMv2] We are splitting already-split files

2017-06-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16051225#comment-16051225
 ] 

Hudson commented on HBASE-18166:


SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #3201 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/3201/])
HBASE-18166 [AMv2] We are splitting already-split files v2 Address (stack: rev 
f64512bee2454fc3728fe5d344a838781e26)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/IncreasingToUpperBoundRegionSplitPolicy.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileInfo.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
HBASE-18166 [AMv2] We are splitting already-split files v2 Address (stack: rev 
c2eebfdb613427fa3314b7ee13c3b9f34ce4c120)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java

