[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-28 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1504#comment-1504
 ] 

binlijin commented on HBASE-16393:
--

Actually, for the use case in HBase the first RPC call is not needed; it is only 
required for resolving symlinks, but there is no way to bypass it through 
DistributedFileSystem. If we want to make just one RPC, we would have to call 
DFSClient directly, but it is not public.
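
For what it's worth, a minimal sketch of what the single-RPC listing would look like if the client were reachable. This assumes the HDFS-private DFSClient#listPaths and DFSUtil#locatedBlocks2Locations, so it is illustration only, not something HBase can call through public interfaces:

{code}
// Illustration only: DFSClient and DFSUtil are HDFS-private APIs, which is
// exactly why HBase cannot take this path through the public FileSystem API.
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.hdfs.DFSClient;
import org.apache.hadoop.hdfs.DFSUtil;
import org.apache.hadoop.hdfs.protocol.DirectoryListing;
import org.apache.hadoop.hdfs.protocol.HdfsFileStatus;
import org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus;

public class SingleRpcListingSketch {
  static BlockLocation[] blockLocations(DFSClient dfs, String storefilePath)
      throws IOException {
    // One namenode RPC: listPaths with needLocation=true returns the file
    // status together with its block locations, with no preceding
    // getFileInfo round trip for symlink resolution.
    DirectoryListing listing =
        dfs.listPaths(storefilePath, HdfsFileStatus.EMPTY_NAME, true);
    if (listing == null) {
      throw new FileNotFoundException("File " + storefilePath + " does not exist.");
    }
    HdfsFileStatus[] entries = listing.getPartialListing();
    // A storefile path names a single file, so we expect exactly one entry.
    HdfsLocatedFileStatus located = (HdfsLocatedFileStatus) entries[0];
    return DFSUtil.locatedBlocks2Locations(located.getBlockLocations());
  }
}
{code}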


> Improve computeHDFSBlocksDistribution
> -------------------------------------
>
>                 Key: HBASE-16393
>                 URL: https://issues.apache.org/jira/browse/HBASE-16393
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: HBASE-16393.patch
>
>
> Since our cluster is big, I can see that the balancer is slow from time to time. 
> The balancer is also called on master startup, so startup is slow as well. 
> The first idea is to compute different regions' HDFSBlocksDistribution in 
> parallel. 
> The second is to improve computing a single region's HDFSBlocksDistribution.
> To compute a storefile's HDFSBlocksDistribution we first call 
> FileSystem#getFileStatus(path) and then 
> FileSystem#getFileBlockLocations(status, start, length), so there are two 
> namenode RPC calls for every storefile. Instead we can use 
> FileSystem#listLocatedStatus to get a LocatedFileStatus carrying the 
> information we need, reducing the namenode RPC calls to one. This speeds up 
> computeHDFSBlocksDistribution and also sends fewer RPC calls to the namenode.
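
For illustration, a minimal sketch of the two patterns the description contrasts: the current getFileStatus + getFileBlockLocations pair versus a single listLocatedStatus call. This is not the attached patch; it assumes HBase's HDFSBlocksDistribution#addHostsAndBlockWeight accumulator as used by FSUtils, and as the comment above notes, DistributedFileSystem's listLocatedStatus implementation still issues a getFileInfo call first to resolve symlinks:

{code}
// Minimal sketch, not the attached patch: per-storefile RPC patterns.
import java.io.IOException;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hbase.HDFSBlocksDistribution;

public class BlocksDistributionSketch {

  // Current pattern: two namenode RPCs per storefile.
  static HDFSBlocksDistribution twoCalls(FileSystem fs, Path storefile)
      throws IOException {
    HDFSBlocksDistribution dist = new HDFSBlocksDistribution();
    FileStatus status = fs.getFileStatus(storefile);              // RPC #1
    BlockLocation[] locations =
        fs.getFileBlockLocations(status, 0, status.getLen());     // RPC #2
    for (BlockLocation loc : locations) {
      dist.addHostsAndBlockWeight(loc.getHosts(), loc.getLength());
    }
    return dist;
  }

  // Proposed pattern: a LocatedFileStatus already carries the block
  // locations, so the status and the locations come back from one listing.
  static HDFSBlocksDistribution oneCall(FileSystem fs, Path storefile)
      throws IOException {
    HDFSBlocksDistribution dist = new HDFSBlocksDistribution();
    RemoteIterator<LocatedFileStatus> it = fs.listLocatedStatus(storefile);
    while (it.hasNext()) {
      LocatedFileStatus status = it.next();
      for (BlockLocation loc : status.getBlockLocations()) {
        dist.addHostsAndBlockWeight(loc.getHosts(), loc.getLength());
      }
    }
    return dist;
  }
}
{code}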





[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-28 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15444387#comment-15444387
 ] 

binlijin commented on HBASE-16393:
--

After digging into it, I find that for a storefile, 
DistributedFileSystem#listLocatedStatus(final Path p, final PathFilter filter) 
actually makes two RPC calls.
{code}
  @Override
  protected RemoteIterator<LocatedFileStatus> listLocatedStatus(final Path p,
      final PathFilter filter)
      throws IOException {
    { // initializer
      // Fully resolve symlinks in path first to avoid additional resolution
      // round-trips as we fetch more batches of listings.
      // resolvePath() -> getFileStatus() -> dfs.getFileInfo(): first namenode RPC
      src = getPathName(resolvePath(absF));
      // fetch the first batch of entries in the directory: second namenode RPC
      thisListing = dfs.listPaths(src, HdfsFileStatus.EMPTY_NAME, true);
      statistics.incrementReadOps(1);
      if (thisListing == null) { // the directory does not exist
        throw new FileNotFoundException("File " + p + " does not exist.");
      }
    }
  }

  /**
   * Return the fully-qualified path of path f resolving the path
   * through any symlinks or mount point
   * @param p path to be resolved
   * @return fully qualified path
   * @throws FileNotFoundException
   */
  public Path resolvePath(final Path p) throws IOException {
    checkPath(p);
    return getFileStatus(p).getPath();
  }

  /**
   * Returns the stat information about the file.
   * @throws FileNotFoundException if the file does not exist.
   */
  @Override
  public FileStatus getFileStatus(Path f) throws IOException {
    statistics.incrementReadOps(1);
    Path absF = fixRelativePart(f);
    return new FileSystemLinkResolver<FileStatus>() {
      @Override
      public FileStatus doCall(final Path p) throws IOException,
          UnresolvedLinkException {
        HdfsFileStatus fi = dfs.getFileInfo(getPathName(p)); // namenode RPC
        if (fi != null) {
          return fi.makeQualified(getUri(), p);
        } else {
          throw new FileNotFoundException("File does not exist: " + p);
        }
      }
      @Override
      public FileStatus next(final FileSystem fs, final Path p)
          throws IOException {
        return fs.getFileStatus(p);
      }
    }.resolve(this, absF);
  }
{code}



[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-11 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418224#comment-15418224
 ] 

binlijin commented on HBASE-16393:
--

Yes, and it is still needed when this is a fresh cluster startup, or when many 
regionservers are down.



[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-11 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417630#comment-15417630
 ] 

Elliott Clark commented on HBASE-16393:
---

So that one computes the calls after the first one in parallel. There is still 
a lot of work needed to make the first invocation happen in parallel.
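
For context, a rough sketch (not HBASE-14473 or a proposed patch) of what fanning that first invocation out across regions could look like, assuming the static HRegion#computeHDFSBlocksDistribution(conf, tableDescriptor, regionInfo) helper that RegionLocationFinder calls:

{code}
// Rough sketch only: submit the per-region HDFSBlocksDistribution
// computations to a thread pool so the namenode round trips overlap
// instead of running one region at a time.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HDFSBlocksDistribution;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.regionserver.HRegion;

public class ParallelLocalitySketch {
  static List<HDFSBlocksDistribution> computeAll(final Configuration conf,
      final HTableDescriptor tableDescriptor, List<HRegionInfo> regions,
      int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<HDFSBlocksDistribution>> futures =
          new ArrayList<Future<HDFSBlocksDistribution>>();
      for (final HRegionInfo region : regions) {
        futures.add(pool.submit(new Callable<HDFSBlocksDistribution>() {
          @Override
          public HDFSBlocksDistribution call() throws Exception {
            return HRegion.computeHDFSBlocksDistribution(conf,
                tableDescriptor, region);
          }
        }));
      }
      List<HDFSBlocksDistribution> results =
          new ArrayList<HDFSBlocksDistribution>();
      for (Future<HDFSBlocksDistribution> future : futures) {
        results.add(future.get());
      }
      return results;
    } finally {
      pool.shutdown();
    }
  }
}
{code}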



[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-11 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416970#comment-15416970
 ] 

Yu Li commented on HBASE-16393:
---

Ok, two tasks to complete then. :-)



[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-11 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416960#comment-15416960
 ] 

binlijin commented on HBASE-16393:
--

Looks like HBASE-14473 already computes region locality in parallel.




[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-11 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416942#comment-15416942
 ] 

Yu Li commented on HBASE-16393:
---

{quote}
The first idea is to compute different regions' HDFSBlocksDistribution in 
parallel. 
The second is to improve computing a single region's HDFSBlocksDistribution.

I know there are other places that can be improved in the same way; first 
improve StoreFileInfo#computeHDFSBlocksDistribution. I would like to improve 
the other places in subtasks.
{quote}
Please open the subtasks and attach patches there for easier review fella, I 
guess there should be at least three? [~aoxiang]



[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416520#comment-15416520
 ] 

Hadoop QA commented on HBASE-16393:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 
51s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} master passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} master passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
4s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
30s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
57s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s 
{color} | {color:green} master passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} master passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
29m 0s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 113m 51s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 162m 41s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
| Timed out junit tests | org.apache.hadoop.hbase.wal.TestWALSplitCompressed |
|   | org.apache.hadoop.hbase.io.encoding.TestChangingEncoding |
|   | org.apache.hadoop.hbase.zookeeper.lock.TestZKInterProcessReadWriteLock |
|   | org.apache.hadoop.hbase.TestIOFencing |
|   | org.apache.hadoop.hbase.wal.TestWALSplit |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.0 Server=1.12.0 Image:yetus/hbase:date2016-08-11 |
| JIRA Patch URL | 

[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-10 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416327#comment-15416327
 ] 

Guanghao Zhang commented on HBASE-16393:


+1 on this idea. We found this in our production cluster, too. The balancer is 
too slow when there are a lot of regions. And some of the default balancer 
configs are too small for a big cluster. Maybe we can make the default config 
values relative to the number of regions.



[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-10 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416311#comment-15416311
 ] 

binlijin commented on HBASE-16393:
--

I do not have the numbers yet; I will get them.



[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416309#comment-15416309
 ] 

Ted Yu commented on HBASE-16393:


Do you have a performance comparison with vs. without the patch?

Thanks



[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-10 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416302#comment-15416302
 ] 

binlijin commented on HBASE-16393:
--

I know there are other places that can be improved in the same way; first 
improve StoreFileInfo#computeHDFSBlocksDistribution.



[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-10 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416295#comment-15416295
 ] 

binlijin commented on HBASE-16393:
--

Master balancer's jstack 
{code}
hbase(main):002:0> balancer

ERROR: Call id=3, waitTime=180001, operationTimeout=18 expired.
{code}

{code}
"B.defaultRpcServer.handler=31,queue=5,port=60100" daemon prio=10 
tid=0x7f3e2aec1800 nid=0x369b2 in Object.wait() [0x7f3e1affd000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at org.apache.hadoop.ipc.Client.call(Client.java:1484)
- locked <0x000603eb5738> (a org.apache.hadoop.ipc.Client$Call)
at org.apache.hadoop.ipc.Client.call(Client.java:1429)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:254)
at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.getBlockLocations(Unknown Source)
at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:330)
at com.sun.proxy.$Proxy17.getBlockLocations(Unknown Source)
at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:330)
at com.sun.proxy.$Proxy17.getBlockLocations(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1205)
at 
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1195)
at 
org.apache.hadoop.hdfs.DFSClient.getBlockLocations(DFSClient.java:1245)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$1.doCall(DistributedFileSystem.java:220)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$1.doCall(DistributedFileSystem.java:216)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:216)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:208)
at 
org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:1042)
at 
org.apache.hadoop.hbase.regionserver.StoreFileInfo.computeHDFSBlocksDistributionInternal(StoreFileInfo.java:294)
at 
org.apache.hadoop.hbase.regionserver.StoreFileInfo.computeHDFSBlocksDistribution(StoreFileInfo.java:284)
at 
org.apache.hadoop.hbase.regionserver.HRegion.computeHDFSBlocksDistribution(HRegion.java:1083)
at 
org.apache.hadoop.hbase.regionserver.HRegion.computeHDFSBlocksDistribution(HRegion.java:1058)
at 
org.apache.hadoop.hbase.master.balancer.RegionLocationFinder.internalGetTopBlockLocation(RegionLocationFinder.java:127)
at 
org.apache.hadoop.hbase.master.balancer.RegionLocationFinder$1.load(RegionLocationFinder.java:65)
at 
org.apache.hadoop.hbase.master.balancer.RegionLocationFinder$1.load(RegionLocationFinder.java:61)
at 
com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3584)
at 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2372)
at 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2335)
- locked <0x000603eabc40> (a 
com.google.common.cache.LocalCache$StrongAccessEntry)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2250)
at com.google.common.cache.LocalCache.get(LocalCache.java:3985)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3989)
at 
com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4873)
at 
org.apache.hadoop.hbase.master.balancer.RegionLocationFinder.getTopBlockLocations(RegionLocationFinder.java:105)
at 

[jira] [Commented] (HBASE-16393) Improve computeHDFSBlocksDistribution

2016-08-10 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416293#comment-15416293
 ] 

binlijin commented on HBASE-16393:
--

{code}
"hostmaster:60100.activeMasterManager" daemon prio=10 tid=0x7f3df900a000 
nid=0x36a51 in Object.wait() [0x7f3df86fc000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at org.apache.hadoop.ipc.Client.call(Client.java:1484)
- locked <0x0005fdaea6c8> (a org.apache.hadoop.ipc.Client$Call)
at org.apache.hadoop.ipc.Client.call(Client.java:1429)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:330)
at com.sun.proxy.$Proxy17.getFileInfo(Unknown Source)
at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:330)
at com.sun.proxy.$Proxy17.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1974)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
at 
org.apache.hadoop.hbase.regionserver.StoreFileInfo.getReferencedFileStatus(StoreFileInfo.java:337)
at 
org.apache.hadoop.hbase.regionserver.StoreFileInfo.computeHDFSBlocksDistributionInternal(StoreFileInfo.java:290)
at 
org.apache.hadoop.hbase.regionserver.StoreFileInfo.computeHDFSBlocksDistribution(StoreFileInfo.java:284)
at 
org.apache.hadoop.hbase.regionserver.HRegion.computeHDFSBlocksDistribution(HRegion.java:1083)
at 
org.apache.hadoop.hbase.regionserver.HRegion.computeHDFSBlocksDistribution(HRegion.java:1058)
at 
org.apache.hadoop.hbase.master.balancer.RegionLocationFinder.internalGetTopBlockLocation(RegionLocationFinder.java:127)
at 
org.apache.hadoop.hbase.master.balancer.RegionLocationFinder$1.load(RegionLocationFinder.java:65)
at 
org.apache.hadoop.hbase.master.balancer.RegionLocationFinder$1.load(RegionLocationFinder.java:61)
at 
com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3584)
at 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2372)
at 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2335)
- locked <0x0005fda4e288> (a 
com.google.common.cache.LocalCache$StrongAccessEntry)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2250)
at com.google.common.cache.LocalCache.get(LocalCache.java:3985)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3989)
at 
com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4873)
at 
org.apache.hadoop.hbase.master.balancer.RegionLocationFinder.getTopBlockLocations(RegionLocationFinder.java:105)
at 
org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.registerRegion(BaseLoadBalancer.java:433)
at 
org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.<init>(BaseLoadBalancer.java:281)
at 
org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:1098)
at 
org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.retainAssignment(BaseLoadBalancer.java:1235)
at