[jira] [Created] (HBASE-12760) Return NullAction when region is located in highest locality server
cuijianwei created HBASE-12760:

Summary: Return NullAction when region is located in highest locality server
Key: HBASE-12760
URL: https://issues.apache.org/jira/browse/HBASE-12760
Project: HBase
Issue Type: Improvement
Components: Balancer
Affects Versions: 0.99.2
Reporter: cuijianwei
Priority: Minor

StochasticLoadBalancer#LocalityBasedCandidateGenerator tries to move a region to the server with the highest locality. The target server is selected by LocalityBasedCandidateGenerator.pickHighestLocalityServer:
{code}
private int pickHighestLocalityServer(Cluster cluster, int thisServer, int thisRegion) {
  ...
  for (int loc : regionLocations) {
    if (loc >= 0 && loc != thisServer) { // find the first suitable server
      return loc;
    }
  }
  ...
}
{code}
If the region is already located on the best server, the current logic picks the server with the second-highest locality and generates an action that tries to move the region to a server with lower locality. Would it be better to return a NullAction in this situation (as below), so that the generated action does not cause further computation?
{code}
private int pickHighestLocalityServer(Cluster cluster, int thisServer, int thisRegion) {
  ...
  for (int loc : regionLocations) {
    if (loc == thisServer) {
      return -1; // return NullAction when the region is already located on the best server
    }
    if (loc >= 0) { // find the first suitable server
      return loc;
    }
  }
  ...
}
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
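For readers who want to try the HBASE-12760 proposal outside the balancer, here is a minimal standalone sketch. It assumes, as the loop in the report suggests, that the region's candidate servers are ordered from highest to lowest locality and that negative entries mean "no server". The class name, the NO_MOVE constant, and the fall-through result are hypothetical; the real method continues with logic that the report elides.

```java
/** Standalone sketch of the proposed selection rule; not the HBase implementation. */
final class LocalityPickSketch {
    /** Hypothetical sentinel standing in for "generate a NullAction". */
    static final int NO_MOVE = -1;

    /**
     * @param regionLocations server indexes, assumed sorted by locality (best first);
     *                        negative entries are treated as unknown servers
     * @param thisServer      index of the server currently hosting the region
     * @return the best server to move to, or NO_MOVE if the region is already on it
     */
    static int pickHighestLocalityServer(int[] regionLocations, int thisServer) {
        for (int loc : regionLocations) {
            if (loc == thisServer) {
                return NO_MOVE; // already on the best known server: nothing to do
            }
            if (loc >= 0) {
                return loc; // first valid server has the highest locality
            }
        }
        // Assumption: with no valid candidate at all, this sketch also declines to move.
        return NO_MOVE;
    }
}
```

With locations {3, 1, 2}, a region hosted on server 3 yields NO_MOVE, while a region hosted on server 1 yields 3.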
[jira] [Created] (HBASE-12761) On region jump ClientScanners should get next row start key instead of a skip.
Jurriaan Mous created HBASE-12761:

Summary: On region jump ClientScanners should get next row start key instead of a skip.
Key: HBASE-12761
URL: https://issues.apache.org/jira/browse/HBASE-12761
Project: HBase
Issue Type: Improvement
Reporter: Jurriaan Mous
Assignee: Jurriaan Mous

While working on the async scanner I had some trouble with the extra RPC calls that are made to let the Scanner advance one row so that it skips the last row it has already seen. This RPC call can be avoided by using the last row with an appended 0 as the start key. This removes quite a bit of logic from the scanners and improves performance by saving the extra RPC calls.
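The "last row with an appended 0" idea from HBASE-12761 can be illustrated with plain byte arrays. This is only a sketch under the assumption that row keys are compared byte by byte, so that appending a single zero byte gives the smallest key that sorts strictly after the last row. The class and method names are hypothetical and not part of the HBase client.

```java
import java.util.Arrays;

/** Sketch: derive the next scan start key from the last row already returned. */
final class NextRowKeySketch {
    /**
     * Returns a copy of lastRow with one trailing 0x00 byte appended.
     * Arrays.copyOf pads the extra position with zero.
     */
    static byte[] startKeyAfter(byte[] lastRow) {
        return Arrays.copyOf(lastRow, lastRow.length + 1);
    }
}
```

A scanner that resumes from startKeyAfter(lastRow) with an inclusive start key would not see lastRow again, which is what makes the extra "skip one row" call unnecessary.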
[jira] [Created] (HBASE-12762) Region with no hfiles will have the highest locality cost in LocalityCostFunction
cuijianwei created HBASE-12762:

Summary: Region with no hfiles will have the highest locality cost in LocalityCostFunction
Key: HBASE-12762
URL: https://issues.apache.org/jira/browse/HBASE-12762
Project: HBase
Issue Type: Improvement
Components: Balancer
Affects Versions: 0.99.2
Reporter: cuijianwei
Priority: Minor

The locality cost of a region is computed in LocalityCostFunction.cost as:
{code}
double cost() {
  ...
  int index = -1;
  for (int j = 0; j < regionLocations.length; j++) {
    if (regionLocations[j] >= 0 && regionLocations[j] == serverIndex) {
      index = j;
      break;
    }
  }
  if (index < 0) {
    cost += 1; // <== region with no hfiles will have the highest cost
  } else {
    cost += (double) index / (double) regionLocations.length;
  }
  ...
}
{code}
A region with no hfiles (such as an empty region) gets the highest cost, which represents the worst case: a region located on a server with no locality for its hfiles. However, this might actually be the best case, because there are no hlogs for the region. Although the absolute cost value won't affect the balancing process, would it be more reasonable to have zero cost for such regions? For example:
{code}
...
if (index < 0) {
  if (regionLocations.length > 0) { // <== only consider regions with hfiles
    cost += 1;
  }
} else {
  cost += (double) index / (double) regionLocations.length;
}
...
{code}
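The per-region part of the HBASE-12762 proposal can be checked in isolation. The sketch below assumes that an empty regionLocations array is how a region without hfiles shows up, as the proposed length check implies. The class and method names are hypothetical, and the surrounding accumulation over all regions that the report elides is left out.

```java
/** Sketch of the proposed per-region locality cost; not the HBase cost function itself. */
final class LocalityCostSketch {
    /**
     * @param regionLocations servers holding the region's data, best locality first
     * @param serverIndex     server currently hosting the region
     * @return cost in [0, 1]; 0 for a region with no located hfiles
     */
    static double regionCost(int[] regionLocations, int serverIndex) {
        int index = -1;
        for (int j = 0; j < regionLocations.length; j++) {
            if (regionLocations[j] >= 0 && regionLocations[j] == serverIndex) {
                index = j;
                break;
            }
        }
        if (index < 0) {
            // Proposed change: only charge the full cost when the region has hfiles.
            return regionLocations.length > 0 ? 1.0 : 0.0;
        }
        return (double) index / (double) regionLocations.length;
    }
}
```

For example, with locations {2, 1} the cost is 0.0 on server 2, 0.5 on server 1, and 1.0 on any other server, while an empty array gives 0.0.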
Re: Toward the Incomplete 1K
Hi Andrew,

this filter says: "The requested filter doesn't exist or is private." Could you check if it's shared?

Thanks!
Mikhail

On Wed, Dec 24, 2014 at 12:17 PM, Andrew Purtell apurt...@apache.org wrote:

> Happy holidays.
>
> We currently have 2,117 incomplete JIRA issues. When you've returned from festivities, please consider joining me in the Incomplete 1K initiative! Let's go through these as time and bandwidth permit and prune this down to 1,000.
>
> This JIRA filter will order these incomplete issues by age, least recently updated first:
> https://issues.apache.org/jira/issues/?filter=12327837
> (JQL: project = HBASE AND statusCategory not in (Complete) ORDER BY updated ASC)
>
> --
> Best regards,
> - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

--
Thanks,
Michael Antonov
[jira] [Created] (HBASE-12763) Make it so there must be WALs for a server to be marked dead
stack created HBASE-12763:

Summary: Make it so there must be WALs for a server to be marked dead
Key: HBASE-12763
URL: https://issues.apache.org/jira/browse/HBASE-12763
Project: HBase
Issue Type: Sub-task
Components: wal
Reporter: stack
Attachments: 12746-v2-master-and-098.patch

The patch for this issue is a subset of the patch attached to the parent. The parent solves a 1.0.0-specific issue, but part of the patch needs applying to 0.98 and to master. It fixes an issue where the Master on startup would think it was joining a cluster rather than undergoing a fresh start, just because it came across a directory named for a server that was once running. The patch checks whether the directory has WALs and, if there are none, does not treat the server as a dead server.
Re: Toward the Incomplete 1K
Sure, in the meantime the JQL will work.

On Dec 27, 2014, at 8:49 AM, Mikhail Antonov olorinb...@gmail.com wrote:

> Hi Andrew,
>
> this filter says: "The requested filter doesn't exist or is private." Could you check if it's shared?
>
> Thanks!
> Mikhail
>
> On Wed, Dec 24, 2014 at 12:17 PM, Andrew Purtell apurt...@apache.org wrote:
>
>> Happy holidays.
>>
>> We currently have 2,117 incomplete JIRA issues. When you've returned from festivities, please consider joining me in the Incomplete 1K initiative! Let's go through these as time and bandwidth permit and prune this down to 1,000.
>>
>> This JIRA filter will order these incomplete issues by age, least recently updated first:
>> https://issues.apache.org/jira/issues/?filter=12327837
>> (JQL: project = HBASE AND statusCategory not in (Complete) ORDER BY updated ASC)
>>
>> --
>> Best regards,
>> - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>
> --
> Thanks,
> Michael Antonov
[jira] [Resolved] (HBASE-12608) region_mover.rb does not log moving region count correctly when loading regions
[ https://issues.apache.org/jira/browse/HBASE-12608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-12608.

Resolution: Fixed
Fix Version/s: 1.1.0, 0.98.10, 2.0.0, 1.0.0
Assignee: cuijianwei
Hadoop Flags: Reviewed

Pushed to branch-0.98+. Thanks for the patch [~cuijianwei]

region_mover.rb does not log moving region count correctly when loading regions

Key: HBASE-12608
URL: https://issues.apache.org/jira/browse/HBASE-12608
Project: HBase
Issue Type: Improvement
Components: shell
Affects Versions: 0.98.8
Reporter: cuijianwei
Assignee: cuijianwei
Priority: Minor
Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
Attachments: HBASE-12608-trunk.patch

region_mover.rb does not seem to log the moving region count correctly in the following code:
{code}
...
if currentServer and currentServer == servername
  $LOG.info("Region " + r.getRegionNameAsString() + " (" + count.to_s +
    " of " + regions.length.to_s + ") already on target server=" + servername)
  counter = counter + 1
  next
end
pool.launch(r,currentServer,count) do |_r,_currentServer,_count|
  $LOG.info("Moving region " + _r.getRegionNameAsString() + " (" + (_count + 1).to_s +
    " of " + regions.length.to_s + ") from " + _currentServer.to_s + " to server=" +
    servername);
  move(admin, _r, servername, _currentServer)
end
counter = counter + 1
{code}
It seems we should use 'counter' when logging and remove the duplicated variable 'count'.
Moving HBASE-11125 forward with a proposal
See http://s.apache.org/tIN I had a crazy thought that started making more sense the more I thought about it. -- Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
Re: Toward the Incomplete 1K
Sorry, the filter is private. I clicked the share button and copied the URL from the popup to the list, but apparently this isn't sufficient to make a filter public. I need a global perm for the JIRA instance to create a shared filter. Anyway, there is no need for that. Search, or create your own filter, with the JQL:

project = HBASE AND statusCategory not in (Complete) ORDER BY updated ASC

On Sat, Dec 27, 2014 at 9:53 AM, Andrew Purtell andrew.purt...@gmail.com wrote:

> Sure, in the meantime the JQL will work.
>
> On Dec 27, 2014, at 8:49 AM, Mikhail Antonov olorinb...@gmail.com wrote:
>
>> Hi Andrew,
>>
>> this filter says: "The requested filter doesn't exist or is private." Could you check if it's shared?
>>
>> Thanks!
>> Mikhail
>>
>> On Wed, Dec 24, 2014 at 12:17 PM, Andrew Purtell apurt...@apache.org wrote:
>>
>>> Happy holidays.
>>>
>>> We currently have 2,117 incomplete JIRA issues. When you've returned from festivities, please consider joining me in the Incomplete 1K initiative! Let's go through these as time and bandwidth permit and prune this down to 1,000.
>>>
>>> This JIRA filter will order these incomplete issues by age, least recently updated first:
>>> https://issues.apache.org/jira/issues/?filter=12327837
>>> (JQL: project = HBASE AND statusCategory not in (Complete) ORDER BY updated ASC)
[jira] [Resolved] (HBASE-8310) HBase snapshot timeout default values and TableLockManger timeout
[ https://issues.apache.org/jira/browse/HBASE-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerry He resolved HBASE-8310.

Resolution: Won't Fix

Clean up JIRAs. Close this as Won't Fix.

HBase snapshot timeout default values and TableLockManger timeout

Key: HBASE-8310
URL: https://issues.apache.org/jira/browse/HBASE-8310
Project: HBase
Issue Type: Bug
Components: snapshots
Affects Versions: 0.95.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
Attachments: trunk.patch

There are a few timeout values and defaults being used by HBase snapshot:
DEFAULT_MAX_WAIT_TIME (60000 milli sec, 1 min) for client response
TIMEOUT_MILLIS_DEFAULT (60000 milli sec, 1 min) for Procedure timeout
SNAPSHOT_TIMEOUT_MILLIS_DEFAULT (60000 milli sec, 1 min) for region server subprocedure

There are also other timeouts involved, for example DEFAULT_TABLE_WRITE_LOCK_TIMEOUT_MS (10 mins) for TakeSnapshotHandler#prepare().

We could have this case: the user issues a sync snapshot request, waits for 1 min, and gets an exception. In the meantime the snapshot handler is blocked on the table lock, and the snapshot may still run to completion after up to 10 mins. But the user will probably re-issue the snapshot request during those 10 mins. This is confusing and messy when it happens. To be more reasonable, we should either increase DEFAULT_MAX_WAIT_TIME or decrease the table lock waiting time.
[jira] [Created] (HBASE-12764) TestPerColumnFamilyFlush#testCompareStoreFileCount may fail due to new table not available
Ted Yu created HBASE-12764:

Summary: TestPerColumnFamilyFlush#testCompareStoreFileCount may fail due to new table not available
Key: HBASE-12764
URL: https://issues.apache.org/jira/browse/HBASE-12764
Project: HBase
Issue Type: Test
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor

From https://builds.apache.org/job/HBase-1.1/27/testReport/org.apache.hadoop.hbase.regionserver/TestPerColumnFamilyFlush/testCompareStoreFileCount/ :
{code}
java.lang.NullPointerException: null
	at org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush.testCompareStoreFileCount(TestPerColumnFamilyFlush.java:542)
{code}
The exception was due to getRegionWithName() returning null:
{code}
getRegionWithName(TABLENAME).getFirst();
{code}
The new table was not available yet.
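One generic way to avoid this kind of NullPointerException in a test is to poll until the lookup returns a non-null value before dereferencing it. The helper below is a hypothetical sketch, not the fix that was applied for HBASE-12764 and not an HBase test utility; the name waitForNonNull and its parameters are assumptions made for illustration.

```java
import java.util.function.Supplier;

/** Sketch: poll a lookup until it yields a value, or fail after a timeout. */
final class WaitForSketch {
    /**
     * Calls probe repeatedly until it returns non-null.
     *
     * @throws IllegalStateException if the timeout elapses or the thread is interrupted
     */
    static <T> T waitForNonNull(Supplier<T> probe, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            T value = probe.get();
            if (value != null) {
                return value;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("value not available after " + timeoutMs + " ms");
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }
}
```

In the failing test, the probe would wrap the getRegionWithName(TABLENAME) call, so that getFirst() is only invoked once the lookup has returned something.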
[ANNOUNCE] HBase 0.94.26 is available for download
The HBase Team is pleased to announce the immediate release of HBase 0.94.26.
Download it from your favorite Apache mirror [1]. This release has also been pushed to Apache's maven repository.

All previous 0.92.x and 0.94.x releases can be upgraded to 0.94.26 via a rolling upgrade without downtime; intermediate versions can be skipped.

HBase 0.94.26 is a bug fix release with 5 fixes:
[HBASE-12279] - Generated thrift files were generated with the wrong parameters
[HBASE-12491] - TableMapReduceUtil.findContainingJar() NPE
[HBASE-12635] - Delete acl notify znode of table after the table is deleted
[HBASE-12657] - The Region is not being split and far exceeds the desired maximum size.
[HBASE-12692] - NPE from SnapshotManager#stop

See also the full release notes [2].

Thanks to everybody who contributed to this release!

Yours,
The HBase Team

1. http://www.apache.org/dyn/closer.cgi/hbase/
2. https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12328781
[jira] [Resolved] (HBASE-12531) bug in cachedataonwrite
[ https://issues.apache.org/jira/browse/HBASE-12531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-12531.

Resolution: Duplicate
Release Note: It is a dup.

Let's close this one because the other is further along and has a patch.

bug in cachedataonwrite

Key: HBASE-12531
URL: https://issues.apache.org/jira/browse/HBASE-12531
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.98.6.1
Reporter: dennislee

When hbase.rs.cacheblocksonwrite is set to true on a region server, or cacheDataOnWrite is set to true on a column family of a table, we flush the header bytes, the on-disk byte buffer and the checksum bytes to disk, but only store the header and uncompressedBytesWithoutHeader in the block cache. So if we read a block from the block cache that was cached on write, the method getBufferWithoutHeader of org.apache.hadoop.hbase.io.hfile.HFileBlock cuts off the header and the checksum bytes, even if the checksum was never written. We then get an IllegalArgumentException thrown by ByteBuffer, because there are not enough bytes to read or skip at the end of the ByteBuffer.

I fixed this problem but I don't know how to commit a patch, so I paste my code here:
{code:title=org.apache.hadoop.hbase.io.hfile.HFileBlock.java|borderStyle=solid}
public ByteBuffer getBufferWithoutHeader() {
  int length;
  int lengthWithoutHeader = buf.limit() - headerSize();
  int lengthWithoutHeaderAndCheckSum = lengthWithoutHeader - totalChecksumBytes();
  if (lengthWithoutHeader == uncompressedSizeWithoutHeader) {
    // NO check sum tail
    length = lengthWithoutHeader;
  } else if (lengthWithoutHeaderAndCheckSum == uncompressedSizeWithoutHeader) {
    // has check sum tail
    length = lengthWithoutHeaderAndCheckSum;
  } else {
    throw new IllegalArgumentException(this.toString() + ",this block may be crashed");
  }
  // Wrap only the data bytes: skip the header and, if present, the checksum tail.
  ByteBuffer buffer = ByteBuffer.wrap(buf.array(), buf.arrayOffset() + headerSize(), length)
      .slice();
  return buffer;
}
{code}
[jira] [Created] (HBASE-12765) SplitTransaction creates too many threads (potentially)
Lars Hofhansl created HBASE-12765:

Summary: SplitTransaction creates too many threads (potentially)
Key: HBASE-12765
URL: https://issues.apache.org/jira/browse/HBASE-12765
Project: HBase
Issue Type: Bug
Reporter: Lars Hofhansl

In splitStoreFiles(...) we create a new thread pool with as many threads as there are files to split. We should be able to do better. During times of very heavy write loads there might be a lot of files to split and multiple splits might be going on at the same time on the same region server.
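One possible direction for HBASE-12765 is to cap the pool size instead of sizing it by the number of files. The sketch below uses only java.util.concurrent and is not the SplitTransaction code. The cap parameter maxThreads, the class name, and the placeholder per-file task are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch: split N files with a thread pool bounded by a configurable cap. */
final class BoundedSplitPoolSketch {
    /** Pool size is the file count, capped at maxThreads and never below 1. */
    static int poolSize(int filesToSplit, int maxThreads) {
        return Math.max(1, Math.min(filesToSplit, maxThreads));
    }

    /** Runs one placeholder task per file and returns how many completed. */
    static int splitAll(List<String> files, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize(files.size(), maxThreads));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String file : files) {
                // Placeholder for the real per-file split work.
                futures.add(pool.submit(() -> file + ".split"));
            }
            int done = 0;
            for (Future<String> future : futures) {
                future.get(); // wait for each task; surfaces task failures
                done++;
            }
            return done;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while splitting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("split task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With a cap of 8, splitting 100 files uses 8 threads rather than 100, and the remaining tasks queue inside the executor. A cap per pool still multiplies when several splits run at once on one region server, so a shared pool may be the better choice there.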