[jira] [Commented] (HBASE-6142) Javadoc in some Filters ambiguous
[ https://issues.apache.org/jira/browse/HBASE-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287862#comment-13287862 ]

Joep Rottinghuis commented on HBASE-6142:
-----------------------------------------

Ah, I will look up my (signed) copy of said book, write a test, and take a shot at the javadoc. Cannot promise any angel qualities though...

> Javadoc in some Filters ambiguous
> ---------------------------------
>
>                 Key: HBASE-6142
>                 URL: https://issues.apache.org/jira/browse/HBASE-6142
>             Project: HBase
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 0.92.2, 0.96.0, 0.94.1
>            Reporter: Joep Rottinghuis
>            Priority: Minor
>              Labels: noob
>
> The javadoc on some of the filters is somewhat confusing.
> The main Filter interface has methods that behave like a sieve: when filterRowKey returns true, the row is filtered _out_ (not included).
> Many of the Filter implementations work the other way around: when the condition is met, the value passes (i.e., the row is returned).
> Most Filters make it clear when a value passes (passing through the filter meaning the values are returned from the scan).
> Some are less clear in light of how the Filter interface works: WhileMatchFilter and SingleColumnValueFilter are examples.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
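The inverted semantics Joep describes can be sketched with a toy interface. This is an illustrative sketch only (these are not HBase's actual Filter classes): a sieve-style filterRowKey returns true to *exclude* a row, so a "condition met means keep" filter has to negate its predicate to fit the contract — exactly the inversion that makes the javadoc ambiguous.

```java
// Illustrative sketch only: these types mimic the sieve contract described in
// the issue; they are NOT HBase's real Filter classes.
public class FilterSemantics {

    /** Sieve-style contract: returning true EXCLUDES the row from results. */
    interface RowFilter {
        boolean filterRowKey(String rowKey);
    }

    /**
     * A "match means keep" filter (hypothetical helper) must invert its
     * user-facing condition: the key matching the prefix means "keep",
     * so filterRowKey returns the NEGATION of the match.
     */
    static RowFilter keepIfPrefix(String prefix) {
        return rowKey -> !rowKey.startsWith(prefix);
    }

    public static void main(String[] args) {
        RowFilter f = keepIfPrefix("user_");
        System.out.println(f.filterRowKey("user_123")); // false -> row is kept
        System.out.println(f.filterRowKey("admin_1"));  // true  -> row is filtered out
    }
}
```

A unit test that pins down which way each filter "blows" — assert on the boolean for a matching and a non-matching row — would settle the ambiguity per implementation.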
[jira] [Updated] (HBASE-6142) Javadoc in some Filters ambiguous
[ https://issues.apache.org/jira/browse/HBASE-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6142:
-------------------------
      Tags: noob  (was: noo)
    Labels: noob  (was: )

Tagging as noob. Would be grand if an angel from heaven would write little unit tests to verify which way the filter blows and fix the javadoc accordingly. (Lars George, if we read that book of yours, would it answer the questions Joep raises? IIRC, it's good on filters?)

> Javadoc in some Filters ambiguous
> ---------------------------------
>
>                 Key: HBASE-6142
>                 URL: https://issues.apache.org/jira/browse/HBASE-6142
>             Project: HBase
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 0.92.2, 0.96.0, 0.94.1
>            Reporter: Joep Rottinghuis
>            Priority: Minor
>              Labels: noob
>
> The javadoc on some of the filters is somewhat confusing.
> The main Filter interface has methods that behave like a sieve: when filterRowKey returns true, the row is filtered _out_ (not included).
> Many of the Filter implementations work the other way around: when the condition is met, the value passes (i.e., the row is returned).
> Most Filters make it clear when a value passes (passing through the filter meaning the values are returned from the scan).
> Some are less clear in light of how the Filter interface works: WhileMatchFilter and SingleColumnValueFilter are examples.
[jira] [Commented] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287859#comment-13287859 ]

Hudson commented on HBASE-5936:
-------------------------------

Integrated in HBase-TRUNK #2974 (See [https://builds.apache.org/job/HBase-TRUNK/2974/])
HBASE-5936 Addendum adds changes for TestHMasterRPCException that were missed in previous checkin (Revision 1345441)

Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestHMasterRPCException.java

> Add Column-level PB-based calls to HMasterInterface
> ---------------------------------------------------
>
>                 Key: HBASE-5936
>                 URL: https://issues.apache.org/jira/browse/HBASE-5936
>             Project: HBase
>          Issue Type: Task
>          Components: ipc, master, migration
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 0.96.0
>
>         Attachments: 5936-addendum-v2.txt, HBASE-5936-v3.patch, HBASE-5936-v4.patch, HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, HBASE-5936.patch
>
> This should be a subtask of HBASE-5445, but since that is a subtask, I can't also make this a subtask (apparently).
> This is for converting the column-level calls, i.e.:
> addColumn
> deleteColumn
> modifyColumn
[jira] [Updated] (HBASE-6142) Javadoc in some Filters ambiguous
[ https://issues.apache.org/jira/browse/HBASE-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6142:
-------------------------
    Tags: noo

> Javadoc in some Filters ambiguous
> ---------------------------------
>
>                 Key: HBASE-6142
>                 URL: https://issues.apache.org/jira/browse/HBASE-6142
>             Project: HBase
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 0.92.2, 0.96.0, 0.94.1
>            Reporter: Joep Rottinghuis
>            Priority: Minor
>
> The javadoc on some of the filters is somewhat confusing.
> The main Filter interface has methods that behave like a sieve: when filterRowKey returns true, the row is filtered _out_ (not included).
> Many of the Filter implementations work the other way around: when the condition is met, the value passes (i.e., the row is returned).
> Most Filters make it clear when a value passes (passing through the filter meaning the values are returned from the scan).
> Some are less clear in light of how the Filter interface works: WhileMatchFilter and SingleColumnValueFilter are examples.
[jira] [Commented] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287857#comment-13287857 ]

stack commented on HBASE-6145:
------------------------------

The above failure seems unrelated (and fixed since by Ted's addendum, I think). Jesse, what do you think of this patch? I just messed more with it and I think it's good enough to commit. Adds site and assembly (if I untar an assembly tgz, from inside it I can build another assembly that runs, etc., so all source and docs are present). What do you think? You ok with commit?

> Fix site target post modularization
> -----------------------------------
>
>                 Key: HBASE-6145
>                 URL: https://issues.apache.org/jira/browse/HBASE-6145
>             Project: HBase
>          Issue Type: Task
>            Reporter: stack
>            Assignee: stack
>         Attachments: site.txt, site2.txt, sitev3.txt
[jira] [Commented] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287856#comment-13287856 ]

stack commented on HBASE-5936:
------------------------------

Thanks Ted.

> Add Column-level PB-based calls to HMasterInterface
> ---------------------------------------------------
>
>                 Key: HBASE-5936
>                 URL: https://issues.apache.org/jira/browse/HBASE-5936
>             Project: HBase
>          Issue Type: Task
>          Components: ipc, master, migration
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 0.96.0
>
>         Attachments: 5936-addendum-v2.txt, HBASE-5936-v3.patch, HBASE-5936-v4.patch, HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, HBASE-5936.patch
>
> This should be a subtask of HBASE-5445, but since that is a subtask, I can't also make this a subtask (apparently).
> This is for converting the column-level calls, i.e.:
> addColumn
> deleteColumn
> modifyColumn
[jira] [Commented] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287855#comment-13287855 ]

Zhihong Yu commented on HBASE-5936:
-----------------------------------

Addendum v2 integrated to trunk.

> Add Column-level PB-based calls to HMasterInterface
> ---------------------------------------------------
>
>                 Key: HBASE-5936
>                 URL: https://issues.apache.org/jira/browse/HBASE-5936
>             Project: HBase
>          Issue Type: Task
>          Components: ipc, master, migration
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 0.96.0
>
>         Attachments: 5936-addendum-v2.txt, HBASE-5936-v3.patch, HBASE-5936-v4.patch, HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, HBASE-5936.patch
>
> This should be a subtask of HBASE-5445, but since that is a subtask, I can't also make this a subtask (apparently).
> This is for converting the column-level calls, i.e.:
> addColumn
> deleteColumn
> modifyColumn
[jira] [Updated] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5936:
------------------------------
    Attachment: 5936-addendum-v2.txt

The exception came out of the HBaseRPC.getProxy() call.
Addendum v2 passes TestHMasterRPCException.

> Add Column-level PB-based calls to HMasterInterface
> ---------------------------------------------------
>
>                 Key: HBASE-5936
>                 URL: https://issues.apache.org/jira/browse/HBASE-5936
>             Project: HBase
>          Issue Type: Task
>          Components: ipc, master, migration
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 0.96.0
>
>         Attachments: 5936-addendum-v2.txt, HBASE-5936-v3.patch, HBASE-5936-v4.patch, HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, HBASE-5936.patch
>
> This should be a subtask of HBASE-5445, but since that is a subtask, I can't also make this a subtask (apparently).
> This is for converting the column-level calls, i.e.:
> addColumn
> deleteColumn
> modifyColumn
[jira] [Updated] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5936:
------------------------------
    Attachment: (was: 5936-addendum.txt)

> Add Column-level PB-based calls to HMasterInterface
> ---------------------------------------------------
>
>                 Key: HBASE-5936
>                 URL: https://issues.apache.org/jira/browse/HBASE-5936
>             Project: HBase
>          Issue Type: Task
>          Components: ipc, master, migration
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 0.96.0
>
>         Attachments: HBASE-5936-v3.patch, HBASE-5936-v4.patch, HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, HBASE-5936.patch
>
> This should be a subtask of HBASE-5445, but since that is a subtask, I can't also make this a subtask (apparently).
> This is for converting the column-level calls, i.e.:
> addColumn
> deleteColumn
> modifyColumn
[jira] [Updated] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5936:
------------------------------
    Attachment: 5936-addendum.txt

Looks like the snippet from patch v6 for TestHMasterRPCException wasn't applied to trunk.
Addendum attached.

> Add Column-level PB-based calls to HMasterInterface
> ---------------------------------------------------
>
>                 Key: HBASE-5936
>                 URL: https://issues.apache.org/jira/browse/HBASE-5936
>             Project: HBase
>          Issue Type: Task
>          Components: ipc, master, migration
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 0.96.0
>
>         Attachments: 5936-addendum.txt, HBASE-5936-v3.patch, HBASE-5936-v4.patch, HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, HBASE-5936.patch
>
> This should be a subtask of HBASE-5445, but since that is a subtask, I can't also make this a subtask (apparently).
> This is for converting the column-level calls, i.e.:
> addColumn
> deleteColumn
> modifyColumn
[jira] [Commented] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287852#comment-13287852 ]

stack commented on HBASE-5936:
------------------------------

That seems easy enough to work around. Any chance of you taking a look, Gregory? Separate issue? Thanks boss.

> Add Column-level PB-based calls to HMasterInterface
> ---------------------------------------------------
>
>                 Key: HBASE-5936
>                 URL: https://issues.apache.org/jira/browse/HBASE-5936
>             Project: HBase
>          Issue Type: Task
>          Components: ipc, master, migration
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 0.96.0
>
>         Attachments: HBASE-5936-v3.patch, HBASE-5936-v4.patch, HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, HBASE-5936.patch
>
> This should be a subtask of HBASE-5445, but since that is a subtask, I can't also make this a subtask (apparently).
> This is for converting the column-level calls, i.e.:
> addColumn
> deleteColumn
> modifyColumn
[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-6060:
------------------------------
    Attachment: 6060-94-v4.patch

Patch v4 addresses Rajesh's comment and some of my own comments.
TestAssignmentManager passes. Running test suite.

> Regions's in OPENING state from failed regionservers takes a long time to recover
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-6060
>                 URL: https://issues.apache.org/jira/browse/HBASE-6060
>             Project: HBase
>          Issue Type: Bug
>          Components: master, regionserver
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: 6060-94-v3.patch, 6060-94-v4.patch, HBASE-6060-94.patch
>
> We have seen a pattern in tests: regions are stuck in the OPENING state for a very long time when the region server that is opening the region fails.
> My understanding of the process:
>
> - The master calls the RS to open the region. If the RS is offline, a new plan is generated (a new RS is chosen). RegionState is set to PENDING_OPEN (only in master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign().
> - The RegionServer starts opening the region and changes the state in the znode. But that znode is not ephemeral. (See ZkAssign.)
> - The RS transitions the zk node from OFFLINE to OPENING. See OpenRegionHandler.process().
> - The RS then opens the region and changes the znode from OPENING to OPENED.
> - When the RS is killed between the OPENING and OPENED states, zk shows the OPENING state, and the master just waits for the RS to change the region state; but since the RS is down, that won't happen.
> - There is AssignmentManager.TimeoutMonitor, which guards exactly against these kinds of conditions. It periodically checks (every 10 sec by default) the regions in transition to see whether they timed out (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 min, which explains what you and I are seeing.
> - ServerShutdownHandler in Master does not reassign regions in the OPENING state, although it handles other states.
> Lowering that threshold from the configuration is one option, but still I think we can do better.
> Will investigate more.
[jira] [Commented] (HBASE-6152) Split abort is not handled properly
[ https://issues.apache.org/jira/browse/HBASE-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287838#comment-13287838 ]

Enis Soztutar commented on HBASE-6152:
--------------------------------------

I think the problem is that the master offlines the region at step 3; however, the parent region is recovered and onlined by the RS, so all subsequent region transitions fail on the master side.

> Split abort is not handled properly
> -----------------------------------
>
>                 Key: HBASE-6152
>                 URL: https://issues.apache.org/jira/browse/HBASE-6152
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>
> I ran into this:
> 1. RegionServer started to split a region (R), but the split was taking a long time, and hence the split was aborted
> 2. As part of cleanup, the RS deleted the ZK node that it created initially for R
> 3. The master (AssignmentManager) noticed the node deletion, and made R offline
> 4. The RS recovered from the failure, and at some point of time, tried to do the split again.
> 5. The master got an event RS_ZK_REGION_SPLIT but the server gave an error like - "Received SPLIT for region R from server RS but it doesn't exist anymore,.."
> 6. The RS apparently did the split successfully this time, but is stuck on the master to delete the znode for the region. It kept on saying - "org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for R" and it was stuck there forever.
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287831#comment-13287831 ]

rajeshbabu commented on HBASE-6060:
-----------------------------------

@Ted
In the v3 patch, one small change:
{code}
RegionPlan plan = getRegionPlan(state, forceNewPlan);
if (plan == null) {
  LOG.debug("Unable to determine a plan to assign " + state);
  this.timeoutMonitor.setAllRegionServersOffline(true);
  return; // Should get reassigned later when RIT times out.
}
{code}
In this place also, instead of the null check, we need:
{code}
plan == RegionPlan.NO_SERVERS_TO_ASSIGN
{code}
Thanks.

> Regions's in OPENING state from failed regionservers takes a long time to recover
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-6060
>                 URL: https://issues.apache.org/jira/browse/HBASE-6060
>             Project: HBase
>          Issue Type: Bug
>          Components: master, regionserver
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: 6060-94-v3.patch, HBASE-6060-94.patch
>
> We have seen a pattern in tests: regions are stuck in the OPENING state for a very long time when the region server that is opening the region fails.
> My understanding of the process:
>
> - The master calls the RS to open the region. If the RS is offline, a new plan is generated (a new RS is chosen). RegionState is set to PENDING_OPEN (only in master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign().
> - The RegionServer starts opening the region and changes the state in the znode. But that znode is not ephemeral. (See ZkAssign.)
> - The RS transitions the zk node from OFFLINE to OPENING. See OpenRegionHandler.process().
> - The RS then opens the region and changes the znode from OPENING to OPENED.
> - When the RS is killed between the OPENING and OPENED states, zk shows the OPENING state, and the master just waits for the RS to change the region state; but since the RS is down, that won't happen.
> - There is AssignmentManager.TimeoutMonitor, which guards exactly against these kinds of conditions. It periodically checks (every 10 sec by default) the regions in transition to see whether they timed out (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 min, which explains what you and I are seeing.
> - ServerShutdownHandler in Master does not reassign regions in the OPENING state, although it handles other states.
> Lowering that threshold from the configuration is one option, but still I think we can do better.
> Will investigate more.
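The comparison rajeshbabu asks for relies on a sentinel object: a distinguished RegionPlan instance that means "no servers to assign to", which is different from a plain null ("no plan computed"). The sketch below is a self-contained illustration of that pattern; `Plan`, `NO_SERVERS_TO_ASSIGN`, `getPlan`, and `decide` are stand-ins, not the real HBase types.

```java
// Hypothetical sketch of the sentinel-object pattern the review comment
// relies on. RegionPlan.NO_SERVERS_TO_ASSIGN in HBase plays the role of
// NO_SERVERS_TO_ASSIGN here; everything else is illustrative.
public class SentinelPlan {

    static final class Plan {
        final String server;
        Plan(String server) { this.server = server; }
    }

    /** Sentinel: distinguishes "no servers available" from null ("no plan"). */
    static final Plan NO_SERVERS_TO_ASSIGN = new Plan(null);

    static Plan getPlan(int liveServers) {
        if (liveServers == 0) {
            return NO_SERVERS_TO_ASSIGN; // all region servers offline
        }
        return new Plan("server-" + (liveServers - 1));
    }

    /** Mirrors the branch in the patch: identity comparison against the sentinel. */
    static String decide(Plan plan) {
        if (plan == NO_SERVERS_TO_ASSIGN) {
            // Flag the timeout monitor so the region gets reassigned when RIT times out.
            return "mark-all-offline-and-wait";
        }
        return "assign-to-" + plan.server;
    }

    public static void main(String[] args) {
        System.out.println(decide(getPlan(0))); // sentinel path
        System.out.println(decide(getPlan(3))); // normal assignment path
    }
}
```

The identity check (`==` against the sentinel) is the point: a null check would miss the "all servers offline" case and skip setting the timeout monitor flag.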
[jira] [Commented] (HBASE-4676) Prefix Compression - Trie data block encoding
[ https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287829#comment-13287829 ]

Matt Corgan commented on HBASE-4676:
------------------------------------

A little more detail... it would help where you have long qualifiers but only a few of them, like if you migrated a narrow relational db table over. If you have wide rows with long qualifiers, then you would want to take advantage of the qualifier trie. Not sure which case you have, but migrating relational-style tables over will be pretty common, so I wanted to handle that common case well.

If you get a chance to do a random read benchmark on it, I'd love to hear the results. I've only done a few small benchmarks at the Store level and haven't benchmarked a whole cluster.

> Prefix Compression - Trie data block encoding
> ---------------------------------------------
>
>                 Key: HBASE-4676
>                 URL: https://issues.apache.org/jira/browse/HBASE-4676
>             Project: HBase
>          Issue Type: New Feature
>          Components: io, performance, regionserver
>    Affects Versions: 0.90.6
>            Reporter: Matt Corgan
>            Assignee: Matt Corgan
>         Attachments: HBASE-4676-0.94-v1.patch, PrefixTrie_Format_v1.pdf, PrefixTrie_Performance_v1.pdf, SeeksPerSec by blockSize.png, hbase-prefix-trie-0.1.jar
>
> The HBase data block format has room for 2 significant improvements for applications that have high block cache hit ratios.
> First, there is no prefix compression, and the current KeyValue format is somewhat metadata heavy, so there can be tremendous memory bloat for many common data layouts, specifically those with long keys and short values.
> Second, there is no random access to KeyValues inside data blocks. This means that every time you double the datablock size, average seek time (or average cpu consumption) goes up by a factor of 2. The standard 64KB block size is ~10x slower for random seeks than a 4KB block size, but block sizes as small as 4KB cause problems elsewhere. Using block sizes of 256KB or 1MB or more may be more efficient from a disk access and block-cache perspective in many big-data applications, but doing so is infeasible from a random seek perspective.
> The PrefixTrie block encoding format attempts to solve both of these problems. Some features:
> * trie format for row key encoding completely eliminates duplicate row keys and encodes similar row keys into a standard trie structure which also saves a lot of space
> * the column family is currently stored once at the beginning of each block. this could easily be modified to allow multiple family names per block
> * all qualifiers in the block are stored in their own trie format which caters nicely to wide rows. duplicate qualifiers between rows are eliminated. the size of this trie determines the width of the block's qualifier fixed-width-int
> * the minimum timestamp is stored at the beginning of the block, and deltas are calculated from that. the maximum delta determines the width of the block's timestamp fixed-width-int
> The block is structured with metadata at the beginning, then a section for the row trie, then the column trie, then the timestamp deltas, and then all the values. Most work is done in the row trie, where every leaf node (corresponding to a row) contains a list of offsets/references corresponding to the cells in that row. Each cell is fixed-width to enable binary searching and is represented by [1 byte operationType, X bytes qualifier offset, X bytes timestamp delta offset].
> If all operation types are the same for a block, there will be zero per-cell overhead. Same for timestamps. Same for qualifiers when I get a chance.
> So, the compression aspect is very strong, but makes a few small sacrifices on VarInt size to enable faster binary searches in trie fan-out nodes.
> A more compressed but slower version might build on this by also applying further (suffix, etc.) compression on the trie nodes at the cost of slower write speed. Even further compression could be obtained by using all VInts instead of FInts with a sacrifice on random seek speed (though not huge).
> One current drawback is the current write speed. While programmed with good constructs like TreeMaps, ByteBuffers, binary searches, etc., it's not programmed with the same level of optimization as the read path. Work will need to be done to optimize the data structures used for encoding and could probably show a 10x increase. It will still be slower than delta encoding, but with a much higher decode speed. I have not yet created a thorough benchmark for write speed nor sequential read speed.
> Though the trie is reaching a point where it is internally very efficient (probably within half or a quart
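The core idea the issue builds on — sorted row keys share long common prefixes, so each key can be stored as a (shared-prefix length, suffix) pair — can be shown in a few lines. This is a sketch of the principle only, not the PrefixTrie encoder itself; the `"len|suffix"` string output is purely illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of prefix compression over sorted row keys: each key is
// stored as the length of the prefix it shares with its predecessor plus the
// remaining suffix. A trie encoding generalizes this by sharing prefixes
// across ALL keys, not just adjacent ones.
public class PrefixCompressSketch {

    static List<String> encode(List<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            // Count leading characters shared with the previous key.
            int shared = 0;
            int max = Math.min(prev.length(), key.length());
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            // Store "sharedPrefixLength|suffix" instead of the full key.
            out.add(shared + "|" + key.substring(shared));
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        // "row0002" shares 6 chars with "row0001", so only "2" is stored.
        System.out.println(encode(List.of("row0001", "row0002", "row0100")));
        // → [0|row0001, 6|2, 4|100]
    }
}
```

With long keys and short values (the "metadata heavy" case the description mentions), the suffixes are tiny relative to the full keys, which is where the memory savings come from.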
[jira] [Created] (HBASE-6153) RS aborted due to rename problem (maybe a race)
Devaraj Das created HBASE-6153:
----------------------------------

             Summary: RS aborted due to rename problem (maybe a race)
                 Key: HBASE-6153
                 URL: https://issues.apache.org/jira/browse/HBASE-6153
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.92.0
            Reporter: Devaraj Das
            Assignee: Devaraj Das

I had a RS crash with the following:

2012-05-31 18:34:42,534 DEBUG org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://ip-10-140-14-134.ec2.internal:8020/apps/hbase/data/TestLoadAndVerify_1338488017181/8974506aa04c5a04e5cc23c11de0039d/.tmp/294a7a31f04949b8bf07682a43157b35 to hdfs://ip-10-140-14-134.ec2.internal:8020/apps/hbase/data/TestLoadAndVerify_1338488017181/8974506aa04c5a04e5cc23c11de0039d/f1/294a7a31f04949b8bf07682a43157b35
2012-05-31 18:34:42,536 WARN org.apache.hadoop.hbase.regionserver.Store: Unable to rename hdfs://ip-10-140-14-134.ec2.internal:8020/apps/hbase/data/TestLoadAndVerify_1338488017181/8974506aa04c5a04e5cc23c11de0039d/.tmp/294a7a31f04949b8bf07682a43157b35 to hdfs://ip-10-140-14-134.ec2.internal:8020/apps/hbase/data/TestLoadAndVerify_1338488017181/8974506aa04c5a04e5cc23c11de0039d/f1/294a7a31f04949b8bf07682a43157b35
2012-05-31 18:34:42,541 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server ip-10-68-7-146.ec2.internal,60020,1338343120038: Replay of HLog required. Forcing server shutdown
org.apache.hadoop.hbase.DroppedSnapshotException: region: TestLoadAndVerify_1338488017181,\x15\xD9\x01\x00\x00\x00\x00\x00/87_0,1338491364569.8974506aa04c5a04e5cc23c11de0039d.
	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1288)
	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1172)
	at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1114)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:400)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:374)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:243)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: File does not exist: /apps/hbase/data/TestLoadAndVerify_1338488017181/8974506aa04c5a04e5cc23c11de0039d/f1/294a7a31f04949b8bf07682a43157b35
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1901)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1892)
	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:636)
	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
	at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:387)
	at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1008)
	at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:470)
	at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548)
	at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:595)

On the NameNode logs:

2012-05-31 18:34:42,588 WARN org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.unprotectedRenameTo: failed to rename /apps/hbase/data/TestLoadAndVerify_1338488017181/8974506aa04c5a04e5cc23c11de0039d/.tmp/294a7a31f04949b8bf07682a43157b35 to /apps/hbase/data/TestLoadAndVerify_1338488017181/8974506aa04c5a04e5cc23c11de0039d/f1/294a7a31f04949b8bf07682a43157b35 because destination's parent does not exist

I haven't looked deeply yet but I guess it is a race of some sort.
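The failure mode above — a rename silently fails because the destination's parent directory vanished, and the error only surfaces later as a FileNotFoundException when the file is re-opened — can be sketched with local files. This is an illustration of the fail-fast alternative, not HBase's code: HDFS renames go through Hadoop's FileSystem API rather than java.nio, and `commitFlushedFile` is a hypothetical name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the fragile pattern behind this crash: if the move fails (e.g.
// the destination parent is missing), raising an error AT THE RENAME is far
// easier to diagnose than the later FileNotFoundException on open.
public class RenameCheckSketch {

    /** Move a flushed file from .tmp to its final location, failing fast. */
    static Path commitFlushedFile(Path src, Path dst) throws IOException {
        if (dst.getParent() == null || !Files.isDirectory(dst.getParent())) {
            // The race in this report: "destination's parent does not exist".
            throw new IOException("Destination parent missing: " + dst.getParent());
        }
        // Files.move throws on failure, unlike a boolean-returning rename
        // whose false result is easy to log and then ignore.
        return Files.move(src, dst);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("flush");
        Path tmp = Files.createFile(dir.resolve("294a.tmp"));
        Path dst = commitFlushedFile(tmp, dir.resolve("294a"));
        System.out.println(Files.exists(dst) && !Files.exists(tmp)); // true
    }
}
```

The WARN at 18:34:42,536 shows the rename failure was observed but the flush proceeded anyway; treating the failed rename as fatal at that point would have produced a much clearer abort reason.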
[jira] [Updated] (HBASE-6152) Split abort is not handled properly
[ https://issues.apache.org/jira/browse/HBASE-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HBASE-6152:
-------------------------------
    Description:
I ran into this:
1. RegionServer started to split a region (R), but the split was taking a long time, and hence the split was aborted
2. As part of cleanup, the RS deleted the ZK node that it created initially for R
3. The master (AssignmentManager) noticed the node deletion, and made R offline
4. The RS recovered from the failure, and at some point of time, tried to do the split again.
5. The master got an event RS_ZK_REGION_SPLIT but the server gave an error like - "Received SPLIT for region R from server RS but it doesn't exist anymore,.."
6. The RS apparently did the split successfully this time, but is stuck on the master to delete the znode for the region. It kept on saying - "org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for R" and it was stuck there forever.

    was:
The same description, except step 5 read "The master got an event RS_ZK_REGION_SPLITTING but the server gave an error like - ..."

> Split abort is not handled properly
> -----------------------------------
>
>                 Key: HBASE-6152
>                 URL: https://issues.apache.org/jira/browse/HBASE-6152
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>
> I ran into this:
> 1. RegionServer started to split a region (R), but the split was taking a long time, and hence the split was aborted
> 2. As part of cleanup, the RS deleted the ZK node that it created initially for R
> 3. The master (AssignmentManager) noticed the node deletion, and made R offline
> 4. The RS recovered from the failure, and at some point of time, tried to do the split again.
> 5. The master got an event RS_ZK_REGION_SPLIT but the server gave an error like - "Received SPLIT for region R from server RS but it doesn't exist anymore,.."
> 6. The RS apparently did the split successfully this time, but is stuck on the master to delete the znode for the region. It kept on saying - "org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for R" and it was stuck there forever.
[jira] [Commented] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287815#comment-13287815 ] Hadoop QA commented on HBASE-6145: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530620/sitev3.txt against trunk revision . -1 @author. The patch appears to contain 2 @author tags which the Hadoop community has agreed to not allow in code contributions. +1 tests included. The patch appears to include 8 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestHMasterRPCException Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2087//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2087//console This message is automatically generated. > Fix site target post modularization > --- > > Key: HBASE-6145 > URL: https://issues.apache.org/jira/browse/HBASE-6145 > Project: HBase > Issue Type: Task >Reporter: stack >Assignee: stack > Attachments: site.txt, site2.txt, sitev3.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6152) Split abort is not handled properly
Devaraj Das created HBASE-6152: -- Summary: Split abort is not handled properly Key: HBASE-6152 URL: https://issues.apache.org/jira/browse/HBASE-6152 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Devaraj Das Assignee: Devaraj Das I ran into this: 1. RegionServer started to split a region(R), but the split was taking a long time, and hence the split was aborted 2. As part of cleanup, the RS deleted the ZK node that it created initially for R 3. The master (AssignmentManager) noticed the node deletion, and made R offline 4. The RS recovered from the failure, and at some point of time, tried to do the split again. 5. The master got an event RS_ZK_REGION_SPLITTING but the server gave an error like - "Received SPLIT for region R from server RS but it doesn't exist anymore,.." 6. The RS apparently did the split successfully this time, but is stuck on the master to delete the znode for the region. It kept on saying - "org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for R" and it was stuck there forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
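Step 6 above is the crux of the bug: the RegionServer blocks forever on a condition the master will never satisfy. As a rough illustrative sketch (not HBase code; the class and method names below are hypothetical), the difference between "stuck there forever" and a recoverable failure is a deadline on the wait loop:

```java
import java.util.function.BooleanSupplier;

public class BoundedWait {
    // Poll 'done' until it is true or 'timeoutMs' elapses.
    // Returns true if the condition was met, false on timeout or interrupt,
    // letting the caller abort/cleanup instead of spinning forever.
    static boolean waitFor(BooleanSupplier done, long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!done.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // e.g. the master never deleted the znode
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

With a bound like this, the "Still waiting on the master to process the split" loop would eventually give up and surface an error rather than hang.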
[jira] [Updated] (HBASE-6151) Master can die if RegionServer throws ServerNotRunningYet
[ https://issues.apache.org/jira/browse/HBASE-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated HBASE-6151: -- Description: See, for example: {noformat} 2012-05-23 16:49:22,745 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. org.apache.hadoop.hbase.ipc.ServerNotRunningException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1240) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:444) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:343) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:540) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:474) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:412) {noformat} The HRegionServer calls HBaseServer: {code} public void start() { startThreads(); openServer(); } {code} but the server can start accepting RPCs once the threads have been started, but if they do, they throw ServerNotRunningException until openServer runs. We should probably 1) Catch the remote exception and retry on the master 2) Look into whether the start() behavior of HBaseServer makes any sense. 
Why would you start accepting RPCs only to throw back ServerNotRunningException? was: See, for example: {noformat} 2012-05-23 16:49:22,745 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. org.apache.hadoop.hbase.ipc.ServerNotRunningException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1240) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:444) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:343) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:540) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:474) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:412) {noformat} {code} The HRegionServer calls HBaseServer: public void start() { startThreads(); openServer(); } {code} but the server can start accepting RPCs once the threads have been started, but if they do, they throw ServerNotRunningException until openServer runs. We should probably 1) Catch the remote exception and retry on the master 2) Look into whether the start() behavior of HBaseServer makes any sense. Why would you start accepting RPCs only to throw back ServerNotRunningException? 
> Master can die if RegionServer throws ServerNotRunningYet > - > > Key: HBASE-6151 > URL: https://issues.apache.org/jira/browse/HBASE-6151 > Project: HBase > Issue Type: Bug > Components: ipc >Affects Versions: 0.90.7, 0.92.2, 0.96.0, 0.94.1 >Reporter: Gregory Chanan >Assignee: Gregory Chanan > > See, for example: > {noformat} > 2012-05-23 16:49:22,745 FATAL org.apache.hadoop.hbase.master.HMaster: > Unhandled exception. Starting shutdown. > org.apache.hadoop.hbase.ipc.ServerNotRunningException: > org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running > yet > at > org.apa
[jira] [Created] (HBASE-6151) Master can die if RegionServer throws ServerNotRunningYet
Gregory Chanan created HBASE-6151: - Summary: Master can die if RegionServer throws ServerNotRunningYet Key: HBASE-6151 URL: https://issues.apache.org/jira/browse/HBASE-6151 Project: HBase Issue Type: Bug Components: ipc Affects Versions: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Reporter: Gregory Chanan Assignee: Gregory Chanan See, for example: {noformat} 2012-05-23 16:49:22,745 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. org.apache.hadoop.hbase.ipc.ServerNotRunningException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1240) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:444) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:343) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:540) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:474) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:412) {noformat} The HRegionServer calls HBaseServer: public void start() { startThreads(); openServer(); } but the server can start accepting RPCs once the threads have been started, but if they do, they throw ServerNotRunningException until openServer runs. 
We should probably 1) Catch the remote exception and retry on the master 2) Look into whether the start() behavior of HBaseServer makes any sense. Why would you start accepting RPCs only to throw back ServerNotRunningException? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
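Option (1) above amounts to wrapping the master's call in a retry loop that treats ServerNotRunningYet as transient. A minimal sketch of that idea, assuming a generic call wrapper (the exception class here is a local stand-in, not the real org.apache.hadoop.hbase.ipc class, and the helper names are hypothetical):

```java
import java.util.function.Supplier;

public class RetrySketch {
    // Local stand-in for the server-side "not running yet" exception.
    static class ServerNotRunningYetException extends RuntimeException {}

    // Retry the call while the remote server is still between
    // startThreads() and openServer(); rethrow once attempts run out,
    // rather than letting the first failure kill the caller (the master).
    static <T> T callWithRetries(Supplier<T> call, int maxAttempts, long sleepMs) {
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (ServerNotRunningYetException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of retries: give up for real
                }
                try {
                    Thread.sleep(sleepMs); // back off while the RS finishes starting
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

The same structure works whether the retry lives in CatalogTracker or in the master's RPC layer; the key point is that the exception is transient by construction, so it should be retried, not treated as fatal.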
[jira] [Updated] (HBASE-6151) Master can die if RegionServer throws ServerNotRunningYet
[ https://issues.apache.org/jira/browse/HBASE-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated HBASE-6151: -- Description: See, for example: {noformat} 2012-05-23 16:49:22,745 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. org.apache.hadoop.hbase.ipc.ServerNotRunningException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1240) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:444) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:343) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:540) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:474) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:412) {noformat} {code} The HRegionServer calls HBaseServer: public void start() { startThreads(); openServer(); } {code} but the server can start accepting RPCs once the threads have been started, but if they do, they throw ServerNotRunningException until openServer runs. We should probably 1) Catch the remote exception and retry on the master 2) Look into whether the start() behavior of HBaseServer makes any sense. 
Why would you start accepting RPCs only to throw back ServerNotRunningException? was: See, for example: {noformat} 2012-05-23 16:49:22,745 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. org.apache.hadoop.hbase.ipc.ServerNotRunningException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1240) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:444) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:343) at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:540) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:474) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:412) {noformat} The HRegionServer calls HBaseServer: public void start() { startThreads(); openServer(); } but the server can start accepting RPCs once the threads have been started, but if they do, they throw ServerNotRunningException until openServer runs. We should probably 1) Catch the remote exception and retry on the master 2) Look into whether the start() behavior of HBaseServer makes any sense. Why would you start accepting RPCs only to throw back ServerNotRunningException? 
> Master can die if RegionServer throws ServerNotRunningYet > - > > Key: HBASE-6151 > URL: https://issues.apache.org/jira/browse/HBASE-6151 > Project: HBase > Issue Type: Bug > Components: ipc >Affects Versions: 0.90.7, 0.92.2, 0.96.0, 0.94.1 >Reporter: Gregory Chanan >Assignee: Gregory Chanan > > See, for example: > {noformat} > 2012-05-23 16:49:22,745 FATAL org.apache.hadoop.hbase.master.HMaster: > Unhandled exception. Starting shutdown. > org.apache.hadoop.hbase.ipc.ServerNotRunningException: > org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running > yet > at > org.apache.hadoop.hbas
[jira] [Commented] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287813#comment-13287813 ] Zhihong Yu commented on HBASE-5936: --- I can easily reproduce one of the test failures seen on Jenkins (https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2972/testReport/org.apache.hadoop.hbase.master/TestHMasterRPCException/testRPCException/): {code} Failed tests: testRPCException(org.apache.hadoop.hbase.master.TestHMasterRPCException): Unexpected throwable: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet {code} > Add Column-level PB-based calls to HMasterInterface > --- > > Key: HBASE-5936 > URL: https://issues.apache.org/jira/browse/HBASE-5936 > Project: HBase > Issue Type: Task > Components: ipc, master, migration >Reporter: Gregory Chanan >Assignee: Gregory Chanan > Fix For: 0.96.0 > > Attachments: HBASE-5936-v3.patch, HBASE-5936-v4.patch, > HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, > HBASE-5936.patch > > > This should be a subtask of HBASE-5445, but since that is a subtask, I can't > also make this a subtask (apparently). > This is for converting the column-level calls, i.e.: > addColumn > deleteColumn > modifyColumn -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287792#comment-13287792 ] Hudson commented on HBASE-5936: --- Integrated in HBase-TRUNK #2972 (See [https://builds.apache.org/job/HBase-TRUNK/2972/]) HBASE-5936 Add Column-level PB-based calls to HMasterInterface (Revision 1345390) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/MasterProtos.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java * /hbase/trunk/hbase-server/src/main/protobuf/Master.proto * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java > Add Column-level PB-based calls to HMasterInterface > --- > > Key: HBASE-5936 > URL: https://issues.apache.org/jira/browse/HBASE-5936 > Project: HBase > Issue Type: Task > Components: ipc, master, migration >Reporter: Gregory Chanan >Assignee: Gregory Chanan > Fix For: 0.96.0 > > Attachments: HBASE-5936-v3.patch, HBASE-5936-v4.patch, > 
HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, > HBASE-5936.patch > > > This should be a subtask of HBASE-5445, but since that is a subtask, I can't > also make this a subtask (apparently). > This is for converting the column-level calls, i.e.: > addColumn > deleteColumn > modifyColumn -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6138) HadoopQA not running findbugs [Trunk]
[ https://issues.apache.org/jira/browse/HBASE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287793#comment-13287793 ] Hudson commented on HBASE-6138: --- Integrated in HBase-TRUNK #2972 (See [https://builds.apache.org/job/HBase-TRUNK/2972/]) HBASE-6138 HadoopQA not running findbugs [Trunk] (Anoop Sam John) (Revision 1345391) Result = FAILURE tedyu : Files : * /hbase/trunk/pom.xml > HadoopQA not running findbugs [Trunk] > - > > Key: HBASE-6138 > URL: https://issues.apache.org/jira/browse/HBASE-6138 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.96.0 >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 0.96.0 > > Attachments: 6138.txt > > > HadoopQA shows like > -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. > But not able to see any reports link > When I checked the console output for the build I can see > {code} > [INFO] --- findbugs-maven-plugin:2.4.0:findbugs (default-cli) @ hbase-common > --- > [INFO] Fork Value is true > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] HBase . SUCCESS [1.890s] > [INFO] HBase - Common FAILURE [2.238s] > [INFO] HBase - Server SKIPPED > [INFO] HBase - Assembly .. SKIPPED > [INFO] HBase - Site .. SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 4.856s > [INFO] Finished at: Thu May 31 03:35:35 UTC 2012 > [INFO] Final Memory: 23M/154M > [INFO] > > [ERROR] Could not find resource > '${parent.basedir}/dev-support/findbugs-exclude.xml'. -> [Help 1] > [ERROR] > {code} > Because of this error Findbugs is not getting run! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6150) Remove empty files causing rat check fail
[ https://issues.apache.org/jira/browse/HBASE-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287781#comment-13287781 ] Hudson commented on HBASE-6150: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #36 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/36/]) HBASE-6150 Remove empty files causing rat check fail (Revision 1345369) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/ExponentiallyDecayingSample.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Sample.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Snapshot.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/UniformSample.java > Remove empty files causing rat check fail > - > > Key: HBASE-6150 > URL: https://issues.apache.org/jira/browse/HBASE-6150 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Fix For: 0.96.0 > > > Set of empty files found by Jesse. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287779#comment-13287779 ] Hudson commented on HBASE-5936: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #36 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/36/]) HBASE-5936 Add Column-level PB-based calls to HMasterInterface (Revision 1345390) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/MasterProtos.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java * /hbase/trunk/hbase-server/src/main/protobuf/Master.proto * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java > Add Column-level PB-based calls to HMasterInterface > --- > > Key: HBASE-5936 > URL: https://issues.apache.org/jira/browse/HBASE-5936 > Project: HBase > Issue Type: Task > Components: ipc, master, migration >Reporter: Gregory Chanan >Assignee: Gregory Chanan > Fix For: 0.96.0 > > Attachments: HBASE-5936-v3.patch, 
HBASE-5936-v4.patch, > HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, > HBASE-5936.patch > > > This should be a subtask of HBASE-5445, but since that is a subtask, I can't > also make this a subtask (apparently). > This is for converting the column-level calls, i.e.: > addColumn > deleteColumn > modifyColumn -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6149) Fix TestFSUtils creating dirs under top level dir
[ https://issues.apache.org/jira/browse/HBASE-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287780#comment-13287780 ] Hudson commented on HBASE-6149: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #36 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/36/]) HBASE-6149 Fix TestFSUtils creating dirs under top level dir (Revision 1345343) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSUtils.java > Fix TestFSUtils creating dirs under top level dir > - > > Key: HBASE-6149 > URL: https://issues.apache.org/jira/browse/HBASE-6149 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Fix For: 0.96.0 > > Attachments: fixtestdir.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6145: - Attachment: sitev3.txt I went back to trying to make build work w/ hbase-assembly. It looked attractive because it almost does the right thing. The big problem w/ this route that maven wants you to take is that it's bound to the package phase. That means whenever you do a mvn package, it'll take forever as maven builds the world then copies it all over the place including jars to make your .tar.gz. Most of the time when folks do package, they just want jars made and installed is my thinking so this would just be a fat annoyance. I went back to removing hbase-assembly and having assembly done by the parent. You invoke an assembly by doing assembly:assembly after doing a package and site (not assembly:single -- that's something else). The attached patch is pretty much there. I need to do a bit more polishing. All src is included, it's buildable and it runs. Let me do some more testing before committing. > Fix site target post modularization > --- > > Key: HBASE-6145 > URL: https://issues.apache.org/jira/browse/HBASE-6145 > Project: HBase > Issue Type: Task >Reporter: stack >Assignee: stack > Attachments: site.txt, site2.txt, sitev3.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6138) HadoopQA not running findbugs [Trunk]
[ https://issues.apache.org/jira/browse/HBASE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287771#comment-13287771 ] Zhihong Yu commented on HBASE-6138: --- Integrated to trunk. Thanks for the patch Anoop. > HadoopQA not running findbugs [Trunk] > - > > Key: HBASE-6138 > URL: https://issues.apache.org/jira/browse/HBASE-6138 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.96.0 >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 0.96.0 > > Attachments: 6138.txt > > > HadoopQA shows like > -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. > But not able to see any reports link > When I checked the console output for the build I can see > {code} > [INFO] --- findbugs-maven-plugin:2.4.0:findbugs (default-cli) @ hbase-common > --- > [INFO] Fork Value is true > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] HBase . SUCCESS [1.890s] > [INFO] HBase - Common FAILURE [2.238s] > [INFO] HBase - Server SKIPPED > [INFO] HBase - Assembly .. SKIPPED > [INFO] HBase - Site .. SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 4.856s > [INFO] Finished at: Thu May 31 03:35:35 UTC 2012 > [INFO] Final Memory: 23M/154M > [INFO] > > [ERROR] Could not find resource > '${parent.basedir}/dev-support/findbugs-exclude.xml'. -> [Help 1] > [ERROR] > {code} > Because of this error Findbugs is not getting run! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-6138) HadoopQA not running findbugs [Trunk]
[ https://issues.apache.org/jira/browse/HBASE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reassigned HBASE-6138: - Assignee: Anoop Sam John > HadoopQA not running findbugs [Trunk] > - > > Key: HBASE-6138 > URL: https://issues.apache.org/jira/browse/HBASE-6138 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.96.0 >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 0.96.0 > > Attachments: 6138.txt > > > HadoopQA shows like > -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. > But not able to see any reports link > When I checked the console output for the build I can see > {code} > [INFO] --- findbugs-maven-plugin:2.4.0:findbugs (default-cli) @ hbase-common > --- > [INFO] Fork Value is true > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] HBase . SUCCESS [1.890s] > [INFO] HBase - Common FAILURE [2.238s] > [INFO] HBase - Server SKIPPED > [INFO] HBase - Assembly .. SKIPPED > [INFO] HBase - Site .. SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 4.856s > [INFO] Finished at: Thu May 31 03:35:35 UTC 2012 > [INFO] Final Memory: 23M/154M > [INFO] > > [ERROR] Could not find resource > '${parent.basedir}/dev-support/findbugs-exclude.xml'. -> [Help 1] > [ERROR] > {code} > Because of this error Findbugs is not getting run! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5936: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. I ran the three failing tests locally and they passed for me w/ this patch applied. Thanks Gregory for your doggedness getting this in. > Add Column-level PB-based calls to HMasterInterface > --- > > Key: HBASE-5936 > URL: https://issues.apache.org/jira/browse/HBASE-5936 > Project: HBase > Issue Type: Task > Components: ipc, master, migration >Reporter: Gregory Chanan >Assignee: Gregory Chanan > Fix For: 0.96.0 > > Attachments: HBASE-5936-v3.patch, HBASE-5936-v4.patch, > HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, > HBASE-5936.patch > > > This should be a subtask of HBASE-5445, but since that is a subtask, I can't > also make this a subtask (apparently). > This is for converting the column-level calls, i.e.: > addColumn > deleteColumn > modifyColumn -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6138) HadoopQA not running findbugs [Trunk]
[ https://issues.apache.org/jira/browse/HBASE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-6138: -- Attachment: 6138.txt Patch that I am going to apply.
[jira] [Commented] (HBASE-4676) Prefix Compression - Trie data block encoding
[ https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287766#comment-13287766 ] James Taylor commented on HBASE-4676: - Our qualifiers tend to be long, so trie-encoding them helps quite a bit. We could optimize this ourselves by managing our own mapping, but it's great that the trie-encoding does it for us. We haven't seen this impact our scan times. > Prefix Compression - Trie data block encoding > - > > Key: HBASE-4676 > URL: https://issues.apache.org/jira/browse/HBASE-4676 > Project: HBase > Issue Type: New Feature > Components: io, performance, regionserver >Affects Versions: 0.90.6 >Reporter: Matt Corgan >Assignee: Matt Corgan > Attachments: HBASE-4676-0.94-v1.patch, PrefixTrie_Format_v1.pdf, > PrefixTrie_Performance_v1.pdf, SeeksPerSec by blockSize.png, > hbase-prefix-trie-0.1.jar > > > The HBase data block format has room for 2 significant improvements for > applications that have high block cache hit ratios. > First, there is no prefix compression, and the current KeyValue format is > somewhat metadata heavy, so there can be tremendous memory bloat for many > common data layouts, specifically those with long keys and short values. > Second, there is no random access to KeyValues inside data blocks. This > means that every time you double the datablock size, average seek time (or > average cpu consumption) goes up by a factor of 2. The standard 64KB block > size is ~10x slower for random seeks than a 4KB block size, but block sizes > as small as 4KB cause problems elsewhere. Using block sizes of 256KB or 1MB > or more may be more efficient from a disk access and block-cache perspective > in many big-data applications, but doing so is infeasible from a random seek > perspective. > The PrefixTrie block encoding format attempts to solve both of these > problems. 
Some features: > * trie format for row key encoding completely eliminates duplicate row keys > and encodes similar row keys into a standard trie structure which also saves > a lot of space > * the column family is currently stored once at the beginning of each block. > this could easily be modified to allow multiple family names per block > * all qualifiers in the block are stored in their own trie format which > caters nicely to wide rows. duplicate qualifiers between rows are eliminated. > the size of this trie determines the width of the block's qualifier > fixed-width-int > * the minimum timestamp is stored at the beginning of the block, and deltas > are calculated from that. the maximum delta determines the width of the > block's timestamp fixed-width-int > The block is structured with metadata at the beginning, then a section for > the row trie, then the column trie, then the timestamp deltas, and then > all the values. Most work is done in the row trie, where every leaf node > (corresponding to a row) contains a list of offsets/references corresponding > to the cells in that row. Each cell is fixed-width to enable binary > searching and is represented by [1 byte operationType, X bytes qualifier > offset, X bytes timestamp delta offset]. > If all operation types are the same for a block, there will be zero per-cell > overhead. Same for timestamps. Same for qualifiers when I get a chance. > So, the compression aspect is very strong, but makes a few small sacrifices > on VarInt size to enable faster binary searches in trie fan-out nodes. > A more compressed but slower version might build on this by also applying > further (suffix, etc) compression on the trie nodes at the cost of slower > write speed. Even further compression could be obtained by using all VInts > instead of FInts with a sacrifice on random seek speed (though not huge). > One current drawback is the current write speed. 
While programmed with good > constructs like TreeMaps, ByteBuffers, binary searches, etc, it's not > programmed with the same level of optimization as the read path. Work will > need to be done to optimize the data structures used for encoding and could > probably show a 10x increase. It will still be slower than delta encoding, > but with a much higher decode speed. I have not yet created a thorough > benchmark for write speed nor sequential read speed. > Though the trie is reaching a point where it is internally very efficient > (probably within half or a quarter of its max read speed) the way that hbase > currently uses it is far from optimal. The KeyValueScanner and related > classes that iterate through the trie will eventually need to be smarter and > have methods to do things like skipping to the next row of results without > scanning every cell in between. When that is accomplished it will also
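The space win from prefix sharing that the description above relies on can be illustrated with a toy sketch. This is not the HBASE-4676 PrefixTrie format itself (which builds real trie fan-out nodes with binary-searchable fixed-width cell references); it only shows the first-order idea that each sorted row key can be stored as (shared-prefix length, suffix) relative to the previous key. The class and method names below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration of prefix sharing on sorted row keys: each key is stored
// as the length of the prefix it shares with the previous key, plus its
// suffix. The real PrefixTrie encoding goes further, but the space saving
// on long, similar keys comes from the same idea.
public class PrefixShareSketch {
    static List<String> encode(List<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int shared = 0;
            int max = Math.min(prev.length(), key.length());
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            out.add(shared + ":" + key.substring(shared));
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        // Sorted row keys with heavy shared prefixes, as in a typical scan.
        List<String> keys = Arrays.asList("user123", "user124", "user2", "video1");
        System.out.println(encode(keys)); // [0:user123, 6:4, 4:2, 0:video1]
    }
}
```

Note how "user124" shrinks to a single suffix character; with long keys and short values (the case called out above) almost all of the key bytes disappear.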
[jira] [Comment Edited] (HBASE-6055) Snapshots in HBase 0.96
[ https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286862#comment-13286862 ] Jonathan Hsieh edited comment on HBASE-6055 at 6/1/12 11:02 PM: _(jon: I made a minor formatting tweak to make this easier to read the dir structure)_ But before a detailed description of how timestamp-based snapshots work internally, lets answer some comments! @Jon: I'll add more info to the document to cover this stuff, but for the moment, lets just get it out there. {quote} What is the read mechanism for snapshots like? Does the snapshot act like a read-only table or is there some special external mechanism needed to read the data from a snapshot? You mention having to rebuild in-memory state by replaying wals – is this a recovery situation or needed in normal reads? {quote} Its almost, but not quite like a table. Read of a snapshot is going to require an external tool but after hooking up the snapshot via the external tool, it should act just like a real table. Snapshots are intended to happen as fast as possible, to minimize downtime for the table. To enable that, we are just creating reference files in the snapshot directory. My vision is that once you take a snapshot, at some point (maybe weekly), you export the snapshot to a backup area. In the export you actually do the copy of the referenced files - you do a direct scan of the HFile (avoiding the top-level interface and going right to HDFS) and the WAL files. Then when you want to read the snapshot, you can just bulk-import the HFIles and replay the WAL files (with the WALPlayer this is relatively easy) to rebuild the state of the table at the time of the snapshot. Its not an exact copy (META isn't preserved), but all the actual data is there. The caveat here is since everything is references, one of the WAL files you reference may not actually have been closed (and therefore not readable). 
In the common case this won't happen, but if you snap and immediately export, its possible. In that case, you need to roll the WAL for the RS that haven't rolled them yet. However, this is in the export process, so a little latency there is tolerable, whereas avoiding this means adding latency to taking a snapshot - bad news bears. Keep in mind that the log files and hfiles will get regularly cleaned up. The former will be moved to the .oldlogs directory and periodically cleaned up and the latter get moved to the .archive directory (again with a parallel file hierarchy, as per HBASE-5547). If the snapshot goes to read the reference file, which tracks down to the original file and it doesn't find it, then it will need to lookup the same file in its respective archive directory. If its not there, then you are really hosed (except for the case mentioned in the doc about the WALs getting cleaned up by an aggressive log cleaner, which it is shown, is not a problem). Haven't gotten around to implementing this yet, but it seems reasonable to finish up (and I think Matteo was interested in working on that part). {quote} What is a representation of a snapshot look like in terms of META and file system contents? {quote} The way I see the implementation in the end is just a bunch of files in the /hbase/.snapshot directory. Like I mentioned above, the layout is very similar to the layout of a table. Lets look at an example of a table named "stuff" (snapshot names need to be valid directory names - same as a table or CF) and has column "column" which is hosted on servers rs-1 and rs-2. 
Originally, the file system will look something like (with license taken on file names - its not exact, I know, this is just an example) : {code} /hbase/ .logs/ rs-1/ WAL-rs1-1 WAL-rs1-2 rs-2/ WAL-rs2-1 WAL-rs2-2 stuff/ .tableinfo region1 column region1-hfile-1 region2 column region2-hfile-1 {code} The snapshot named "tuesday-at-nine", when completed, then just adds the following to the directory structure (or close enough): {code} .snapshot/ tuesday-at-nine/ .tableinfo .snapshotinfo .logs rs-1/ WAL-rs1-1.reference WAL-rs1-2.reference rs-2/ WAL-rs2-1.reference WAL-rs2-2.reference stuff/ .tableinfo region1
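The lookup fallback described in the comment above (read the referenced original file first; if a cleaner has moved it, try the same relative path under the parallel archive hierarchy) can be sketched roughly as follows. The Set stands in for a filesystem listing, and all paths and names here are illustrative rather than the real snapshot code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch of snapshot reference resolution: try the referenced original
// path; if the hfile/log cleaner has moved the file, try the same relative
// path under the .archive directory (parallel hierarchy); otherwise the
// snapshot data is gone.
public class ReferenceResolver {
    static String resolve(Set<String> existingFiles, String referencedPath) {
        if (existingFiles.contains(referencedPath)) {
            return referencedPath; // original hfile/WAL still in place
        }
        String archived = ".archive/" + referencedPath;
        if (existingFiles.contains(archived)) {
            return archived; // moved by a cleaner into the archive hierarchy
        }
        throw new IllegalStateException("missing from original and archive: " + referencedPath);
    }

    public static void main(String[] args) {
        Set<String> fs = new HashSet<>(Arrays.asList(
                "stuff/region1/column/region1-hfile-1",
                ".archive/stuff/region2/column/region2-hfile-1"));
        System.out.println(resolve(fs, "stuff/region1/column/region1-hfile-1"));
        System.out.println(resolve(fs, "stuff/region2/column/region2-hfile-1"));
    }
}
```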
[jira] [Commented] (HBASE-6150) Remove empty files causing rat check fail
[ https://issues.apache.org/jira/browse/HBASE-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287758#comment-13287758 ] Hudson commented on HBASE-6150: --- Integrated in HBase-TRUNK #2971 (See [https://builds.apache.org/job/HBase-TRUNK/2971/]) HBASE-6150 Remove empty files causing rat check fail (Revision 1345369) Result = SUCCESS stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/ExponentiallyDecayingSample.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Sample.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Snapshot.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/UniformSample.java > Remove empty files causing rat check fail > - > > Key: HBASE-6150 > URL: https://issues.apache.org/jira/browse/HBASE-6150 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Fix For: 0.96.0 > > > Set of empty files found by Jesse. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6138) HadoopQA not running findbugs [Trunk]
[ https://issues.apache.org/jira/browse/HBASE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287753#comment-13287753 ] Zhihong Yu commented on HBASE-6138: --- Will integrate the suggested fix if there is no objection.
[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.
[ https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287748#comment-13287748 ] Andrew Wang commented on HBASE-5892: I don't know why Findbugs is erroring. Maybe the modularization change? {code} [ERROR] Could not find resource '${parent.basedir}/dev-support/findbugs-exclude.xml'. -> [Help 1] {code} No tests because there's no functionality change, it's a refactor. > [hbck] Refactor parallel WorkItem* to Futures. > -- > > Key: HBASE-5892 > URL: https://issues.apache.org/jira/browse/HBASE-5892 > Project: HBase > Issue Type: Improvement >Reporter: Jonathan Hsieh >Assignee: Andrew Wang > Labels: noob > Attachments: hbase-5892-1.patch, hbase-5892-2-0.90.patch, > hbase-5892-2.patch, hbase-5892-3.patch, hbase-5892-4-0.90.patch, > hbase-5892-4.patch, hbase-5892.patch > > > This would convert WorkItem* logic (with low level notifies, and rough > exception handling) into a more canonical Futures pattern. > Currently there are two instances of this pattern (for loading hdfs dirs, for > contacting regionservers for assignments, and soon -- for loading hdfs > .regioninfo files). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
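A minimal sketch of the refactor direction this ticket describes: replacing hand-rolled worker threads with low-level notifies by an ExecutorService whose Futures return results in order and surface worker exceptions in one place. This is an illustration of the canonical pattern, not the actual hbck WorkItem* code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Canonical Futures pattern: submit every work item to a pool and collect
// results in submission order. Worker failures are rethrown uniformly from
// Future.get() instead of being tracked with per-thread notifies.
public class FuturesRefactorSketch {
    static List<Integer> runAll(List<Callable<Integer>> workItems) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : pool.invokeAll(workItems)) {
                results.add(f.get()); // blocks until done; rethrows worker failures
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException("work item failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<Integer>> items = Arrays.asList(() -> 1, () -> 2 * 2, () -> 3 * 3);
        System.out.println(runAll(items)); // [1, 4, 9]
    }
}
```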
[jira] [Commented] (HBASE-6150) Remove empty files causing rat check fail
[ https://issues.apache.org/jira/browse/HBASE-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287724#comment-13287724 ] stack commented on HBASE-6150: -- Removed these four files: {code} D hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Snapshot.java D hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/ExponentiallyDecayingSample.java D hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Sample.java D hbase-server/src/main/java/org/apache/hadoop/hbase/metrics/histogram/UniformSample.java {code}
[jira] [Resolved] (HBASE-6150) Remove empty files causing rat check fail
[ https://issues.apache.org/jira/browse/HBASE-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-6150. -- Resolution: Fixed Fix Version/s: 0.96.0 Assignee: stack Committed to trunk.
[jira] [Created] (HBASE-6150) Remove empty files causing rat check fail
stack created HBASE-6150: Summary: Remove empty files causing rat check fail Key: HBASE-6150 URL: https://issues.apache.org/jira/browse/HBASE-6150 Project: HBase Issue Type: Bug Reporter: stack Set of empty files found by Jesse. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6149) Fix TestFSUtils creating dirs under top level dir
[ https://issues.apache.org/jira/browse/HBASE-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287705#comment-13287705 ] Hudson commented on HBASE-6149: --- Integrated in HBase-TRUNK #2970 (See [https://builds.apache.org/job/HBase-TRUNK/2970/]) HBASE-6149 Fix TestFSUtils creating dirs under top level dir (Revision 1345343) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestFSUtils.java > Fix TestFSUtils creating dirs under top level dir > - > > Key: HBASE-6149 > URL: https://issues.apache.org/jira/browse/HBASE-6149 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Fix For: 0.96.0 > > Attachments: fixtestdir.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-6149) Fix TestFSUtils creating dirs under top level dir
[ https://issues.apache.org/jira/browse/HBASE-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-6149. -- Resolution: Fixed Fix Version/s: 0.96.0 Assignee: stack Committed to trunk
[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-6060: -- Attachment: 6060-94-v3.patch Patch v3 illustrates my proposal. I also created a singleton for the null RegionPlan that signifies there is no server to assign region. TestAssignmentManager passes. > Regions's in OPENING state from failed regionservers takes a long time to > recover > - > > Key: HBASE-6060 > URL: https://issues.apache.org/jira/browse/HBASE-6060 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Attachments: 6060-94-v3.patch, HBASE-6060-94.patch > > > we have seen a pattern in tests, that the regions are stuck in OPENING state > for a very long time when the region server who is opening the region fails. > My understanding of the process: > > - master calls rs to open the region. If rs is offline, a new plan is > generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in > master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), > HMaster.assign() > - RegionServer, starts opening a region, changes the state in znode. But > that znode is not ephemeral. (see ZkAssign) > - Rs transitions zk node from OFFLINE to OPENING. See > OpenRegionHandler.process() > - rs then opens the region, and changes znode from OPENING to OPENED > - when rs is killed between OPENING and OPENED states, then zk shows OPENING > state, and the master just waits for rs to change the region state, but since > rs is down, that wont happen. > - There is a AssignmentManager.TimeoutMonitor, which does exactly guard > against these kind of conditions. It periodically checks (every 10 sec by > default) the regions in transition to see whether they timedout > (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, > which explains what you and I are seeing. 
> - ServerShutdownHandler in Master does not reassign regions in OPENING > state, although it handles other states. > Lowering that threshold from the configuration is one option, but still I > think we can do better. > Will investigate more. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
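The TimeoutMonitor check walked through above reduces to a simple predicate: a region still in a transition state past the configured timeout (30 minutes by default, per hbase.master.assignment.timeoutmonitor.timeout) is a candidate for reassignment. The sketch below is a toy model under those assumptions; names are illustrative, not the real AssignmentManager API.

```java
// Toy model of the TimeoutMonitor: any region sitting in a transition state
// longer than the configured timeout is flagged so the master can re-trigger
// assignment, e.g. an OPENING region whose regionserver died before OPENED.
public class TimeoutMonitorSketch {
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

    // Default for hbase.master.assignment.timeoutmonitor.timeout: 30 minutes.
    static final long TIMEOUT_MS = 30L * 60 * 1000;

    static boolean timedOut(State state, long stateEnteredMs, long nowMs) {
        // OPENED regions are done; anything still in transition can time out.
        return state != State.OPENED && (nowMs - stateEnteredMs) > TIMEOUT_MS;
    }

    public static void main(String[] args) {
        long now = 31L * 60 * 1000;
        System.out.println(timedOut(State.OPENING, 0L, now)); // true: stuck past 30 min
        System.out.println(timedOut(State.OPENED, 0L, now));  // false: open regions never time out
    }
}
```

This also makes the issue's point concrete: with a 30-minute default and a monitor period of 10 seconds, a region orphaned in OPENING waits up to half an hour before anyone reassigns it.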
[jira] [Updated] (HBASE-6149) Fix TestFSUtils creating dirs under top level dir
[ https://issues.apache.org/jira/browse/HBASE-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6149: - Attachment: fixtestdir.txt Minor fix
[jira] [Created] (HBASE-6149) Fix TestFSUtils creating dirs under top level dir
stack created HBASE-6149: Summary: Fix TestFSUtils creating dirs under top level dir Key: HBASE-6149 URL: https://issues.apache.org/jira/browse/HBASE-6149 Project: HBase Issue Type: Bug Reporter: stack -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287648#comment-13287648 ] Hadoop QA commented on HBASE-6145: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530584/site2.txt against trunk revision . -1 @author. The patch appears to contain 2 @author tags which the Hadoop community has agreed to not allow in code contributions. +1 tests included. The patch appears to include 8 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSide Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2085//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2085//console This message is automatically generated. > Fix site target post modularization > --- > > Key: HBASE-6145 > URL: https://issues.apache.org/jira/browse/HBASE-6145 > Project: HBase > Issue Type: Task >Reporter: stack >Assignee: stack > Attachments: site.txt, site2.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5609) Add the ability to pass additional information for slow query logging
[ https://issues.apache.org/jira/browse/HBASE-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287636#comment-13287636 ] Jesse Yates commented on HBASE-5609: I'm +1 on Drz's latest patch on RB. Would love to see another patch to abstract out the commonality in toMap(), but that can go in another ticket. > Add the ability to pass additional information for slow query logging > - > > Key: HBASE-5609 > URL: https://issues.apache.org/jira/browse/HBASE-5609 > Project: HBase > Issue Type: New Feature > Components: client, ipc >Reporter: Michael Drzal >Assignee: Michael Drzal >Priority: Minor > Attachments: HBASE-5609-v2.patch, HBASE-5609.patch > > > HBase-4117 added the ability to log information about queries that returned > too much data or ran for too long. There is some information written as a > fingerprint that can be used to tell what table/column families/... are > affected. I would like to extend this functionality to allow the client to > insert an identifier into the operation that gets output in the log. The > idea behind this would be that if there were N places in the client > application that touched a given table in a certain way, you could quickly > narrow things down by inserting a className:functionName or similar > identifier. I'm fully willing to go back on this if people think that it > isn't a problem in real life and it would just add complexity to the code. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4676) Prefix Compression - Trie data block encoding
[ https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287637#comment-13287637 ] Matt Corgan commented on HBASE-4676: Great to hear James. I'm working on migrating it to trunk, but wrestling with git and maven, my two worst friends. The last change I've been thinking about making to it before finalizing this version is to add an option controlling whether to trie-encode the qualifiers vs merely de-duping them. There's some read-time expense to decoding the trie-encoded qualifiers, and if you have a narrow table then you may not be saving much memory anyway. So it would be an option to trade a little memory for faster scans/decoding. > Prefix Compression - Trie data block encoding > - > > Key: HBASE-4676 > URL: https://issues.apache.org/jira/browse/HBASE-4676 > Project: HBase > Issue Type: New Feature > Components: io, performance, regionserver >Affects Versions: 0.90.6 >Reporter: Matt Corgan >Assignee: Matt Corgan > Attachments: HBASE-4676-0.94-v1.patch, PrefixTrie_Format_v1.pdf, > PrefixTrie_Performance_v1.pdf, SeeksPerSec by blockSize.png, > hbase-prefix-trie-0.1.jar > > > The HBase data block format has room for 2 significant improvements for > applications that have high block cache hit ratios. > First, there is no prefix compression, and the current KeyValue format is > somewhat metadata heavy, so there can be tremendous memory bloat for many > common data layouts, specifically those with long keys and short values. > Second, there is no random access to KeyValues inside data blocks. This > means that every time you double the datablock size, average seek time (or > average cpu consumption) goes up by a factor of 2. The standard 64KB block > size is ~10x slower for random seeks than a 4KB block size, but block sizes > as small as 4KB cause problems elsewhere. 
Using block sizes of 256KB or 1MB > or more may be more efficient from a disk access and block-cache perspective > in many big-data applications, but doing so is infeasible from a random seek > perspective. > The PrefixTrie block encoding format attempts to solve both of these > problems. Some features: > * trie format for row key encoding completely eliminates duplicate row keys > and encodes similar row keys into a standard trie structure which also saves > a lot of space > * the column family is currently stored once at the beginning of each block. > this could easily be modified to allow multiple family names per block > * all qualifiers in the block are stored in their own trie format which > caters nicely to wide rows. duplicate qualifers between rows are eliminated. > the size of this trie determines the width of the block's qualifier > fixed-width-int > * the minimum timestamp is stored at the beginning of the block, and deltas > are calculated from that. the maximum delta determines the width of the > block's timestamp fixed-width-int > The block is structured with metadata at the beginning, then a section for > the row trie, then the column trie, then the timestamp deltas, and then then > all the values. Most work is done in the row trie, where every leaf node > (corresponding to a row) contains a list of offsets/references corresponding > to the cells in that row. Each cell is fixed-width to enable binary > searching and is represented by [1 byte operationType, X bytes qualifier > offset, X bytes timestamp delta offset]. > If all operation types are the same for a block, there will be zero per-cell > overhead. Same for timestamps. Same for qualifiers when i get a chance. > So, the compression aspect is very strong, but makes a few small sacrifices > on VarInt size to enable faster binary searches in trie fan-out nodes. 
> A more compressed but slower version might build on this by also applying > further (suffix, etc) compression on the trie nodes at the cost of slower > write speed. Even further compression could be obtained by using all VInts > instead of FInts with a sacrifice on random seek speed (though not huge). > One current drawback is the current write speed. While programmed with good > constructs like TreeMaps, ByteBuffers, binary searches, etc, it's not > programmed with the same level of optimization as the read path. Work will > need to be done to optimize the data structures used for encoding and could > probably show a 10x increase. It will still be slower than delta encoding, > but with a much higher decode speed. I have not yet created a thorough > benchmark for write speed nor sequential read speed. > Though the trie is reaching a point where it is internally very efficient > (probably within half or a quarter of its max read speed) the way that hbase > currentl
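The fixed-width-int idea in the description above — store one minimum timestamp per block, then per-cell deltas whose byte width is set by the largest delta — can be sketched as follows. This is an illustration with hypothetical names, not code from the attached patch:

```java
import java.util.Arrays;

public final class TimestampDeltaSketch {
    // Bytes needed to hold v as an unsigned fixed-width int (at least 1).
    static int fixedWidthBytes(long v) {
        int bytes = 1;
        while ((v >>>= 8) != 0) {
            bytes++;
        }
        return bytes;
    }

    // Per block: store the minimum once, then deltas from it.
    static long[] toDeltas(long[] timestamps) {
        long min = Arrays.stream(timestamps).min().orElse(0L);
        return Arrays.stream(timestamps).map(t -> t - min).toArray();
    }

    // Width of the block's timestamp column = width of the largest delta.
    static int deltaColumnWidth(long[] timestamps) {
        long maxDelta = Arrays.stream(toDeltas(timestamps)).max().orElse(0L);
        return fixedWidthBytes(maxDelta);
    }
}
```

If every cell in a block shares the same timestamp, the max delta is 0 and the column still costs one byte per cell in this toy model; the description's "zero per-cell overhead" case would drop the column entirely.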
[jira] [Commented] (HBASE-5251) Some commands return "0 rows" when > 0 rows were processed successfully
[ https://issues.apache.org/jira/browse/HBASE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287631#comment-13287631 ] s...@hotmail.com commented on HBASE-5251: - I am working on fixing this issue. Will have a patch soon. > Some commands return "0 rows" when > 0 rows were processed successfully > --- > > Key: HBASE-5251 > URL: https://issues.apache.org/jira/browse/HBASE-5251 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 0.90.5 >Reporter: David S. Wang >Assignee: Himanshu Vashishtha >Priority: Minor > Labels: noob > > From the hbase shell, I see this: > hbase(main):049:0> scan 't1' > ROW COLUMN+CELL > > r1 column=f1:c1, timestamp=1327104295560, value=value > > r1 column=f1:c2, timestamp=1327104330625, value=value > > 1 row(s) in 0.0300 seconds > hbase(main):050:0> deleteall 't1', 'r1' > 0 row(s) in 0.0080 seconds <== I expected this to read > "2 row(s)" > hbase(main):051:0> scan 't1' > ROW COLUMN+CELL > > 0 row(s) in 0.0090 seconds > I expected the deleteall command to return "1 row(s)" instead of 0, because 1 > row was deleted. Similar behavior for delete and some other commands. Some > commands such as "put" work fine. > Looking at the ruby shell code, it seems that formatter.footer() is called > even for commands that will not actually increment the number of rows > reported, such as deletes. Perhaps there should be another similar function > to formatter.footer(), but that will not print out @row_count. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287632#comment-13287632 ] Zhihong Yu commented on HBASE-6046: --- Patch v2 looks good. Minor comment: {code} - ".splitLogManagerTimeoutMonitor"); + public void finishInitialization(boolean masterRecovery) { {code} Add javadoc for the method and masterRecovery parameter. > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch, HBASE_6046_0.94_1.patch, > HBASE_6046_0.94_2.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restarts (or > a backup comes up), all the nodes created in ZK will be tried for reassignment > to the new RSs. This leads to double assignment. > we had 2800 regions; among these, 1900 regions got double assignment, taking the > region count to 4700.
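One possible shape for the requested javadoc — the wording below is a guess at the method's intent from this issue's discussion, not taken from the patch, and the surrounding class is a hypothetical stub:

```java
/** Hypothetical stub showing only the javadoc shape requested above. */
public class MasterInitSketch {
    private boolean initialized;

    /**
     * Complete the second, post-construction phase of master initialization.
     *
     * @param masterRecovery true when this master is recovering its own
     *        earlier session (e.g. after a ZK session expiry) rather than
     *        starting fresh; recovery skips steps the previous incarnation
     *        already performed, avoiding duplicate region assignment.
     */
    public void finishInitialization(boolean masterRecovery) {
        this.initialized = true;
    }

    public boolean isInitialized() {
        return initialized;
    }
}
```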
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287628#comment-13287628 ] Hadoop QA commented on HBASE-6046: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530583/HBASE_6046_0.94_2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2086//console This message is automatically generated.
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287599#comment-13287599 ] Zhihong Yu commented on HBASE-6060: --- Thinking more about the usable RegionPlan flag, we don't really need it. We can introduce an 'unusable' RegionPlan singleton which signifies the fact that it is not to be used. > Regions's in OPENING state from failed regionservers takes a long time to > recover > - > > Key: HBASE-6060 > URL: https://issues.apache.org/jira/browse/HBASE-6060 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Attachments: HBASE-6060-94.patch > > > we have seen a pattern in tests, that the regions are stuck in OPENING state > for a very long time when the region server who is opening the region fails. > My understanding of the process: > > - master calls rs to open the region. If rs is offline, a new plan is > generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in > master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), > HMaster.assign() > - RegionServer, starts opening a region, changes the state in znode. But > that znode is not ephemeral. (see ZkAssign) > - Rs transitions zk node from OFFLINE to OPENING. See > OpenRegionHandler.process() > - rs then opens the region, and changes znode from OPENING to OPENED > - when rs is killed between OPENING and OPENED states, then zk shows OPENING > state, and the master just waits for rs to change the region state, but since > rs is down, that wont happen. > - There is a AssignmentManager.TimeoutMonitor, which does exactly guard > against these kind of conditions. It periodically checks (every 10 sec by > default) the regions in transition to see whether they timedout > (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, > which explains what you and I are seeing. 
> - ServerShutdownHandler in Master does not reassign regions in OPENING > state, although it handles other states. > Lowering that threshold from the configuration is one option, but still I > think we can do better. > Will investigate more.
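The description names hbase.master.assignment.timeoutmonitor.timeout with a 30-minute default; lowering it as the stopgap mentioned above would look like the snippet below. The 3-minute value is purely illustrative, not a recommendation:

```xml
<!-- hbase-site.xml: shrink the 30-minute regions-in-transition timeout -->
<property>
  <name>hbase.master.assignment.timeoutmonitor.timeout</name>
  <!-- milliseconds; 180000 = 3 minutes (illustrative only) -->
  <value>180000</value>
</property>
```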
[jira] [Updated] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6145: - Attachment: site2.txt v2 removes hbase-assembly and assembly in general -- for now. At the moment, after looking at assembly and trying to make assembly:assembly work -- i.e. doing assembly up in the parent rather than down in the hbase-assembly module -- it seems like it could be made to work (you just do assembly:assembly after doing package), but it's totally a manual affair shaping the end product while in the meantime maven throws cryptic exceptions. It will take hours. Trying to read up on how others have done this is a trip through other people's misery. I'm tempted to just write a shell script to do the packaging. > Fix site target post modularization > --- > > Key: HBASE-6145 > URL: https://issues.apache.org/jira/browse/HBASE-6145 > Project: HBase > Issue Type: Task >Reporter: stack >Assignee: stack > Attachments: site.txt, site2.txt > >
[jira] [Created] (HBASE-6148) [89-fb] Avoid allocating large objects when reading corrupted RPC
Liyin Tang created HBASE-6148: - Summary: [89-fb] Avoid allocating large objects when reading corrupted RPC Key: HBASE-6148 URL: https://issues.apache.org/jira/browse/HBASE-6148 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Recently the RegionServer has been allocating very large objects when reading some corrupted RPC calls, which may be caused by client-server version incompatibility. We need to add a protection before allocating the objects. Apache trunk won't suffer from this problem since it has moved to versioned invocation.
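The protection described above usually amounts to validating the length prefix before allocating a buffer sized from untrusted input. A hedged sketch in plain Java — not the 89-fb code; the class, method, and the maxSize cap are all hypothetical:

```java
import java.nio.ByteBuffer;

public final class RpcFrameGuard {
    /** Read a length-prefixed frame, rejecting absurd lengths before allocating. */
    static byte[] readFrame(ByteBuffer in, int maxSize) {
        int len = in.getInt();
        // A corrupted or version-mismatched stream can yield a huge or negative
        // length; check it before new byte[len] can blow up the heap.
        if (len < 0 || len > maxSize || len > in.remaining()) {
            throw new IllegalArgumentException(
                "Corrupt or incompatible RPC: frame length " + len);
        }
        byte[] buf = new byte[len]; // safe to allocate now
        in.get(buf);
        return buf;
    }
}
```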
[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.
[ https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287595#comment-13287595 ] Zhihong Yu commented on HBASE-5924: --- For #2 above, I think we can remove the callback in 0.96 > In the client code, don't wait for all the requests to be executed before > resubmitting a request in error. > -- > > Key: HBASE-5924 > URL: https://issues.apache.org/jira/browse/HBASE-5924 > Project: HBase > Issue Type: Improvement > Components: client >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > > The client (in the function HConnectionManager#processBatchCallback) works in > two steps: > - make the requests > - collect the failures and successes and prepare for retry > It means that when there is an immediate error (region moved, split, dead > server, ...) we still wait for all the initial requests to be executed before > submitting again the failed request. If we have a scenario with all the > requests taking 5 seconds we have a final execution time of: 5 (initial > requests) + 1 (wait time) + 5 (final request) = 11s. > We could improve this by analyzing immediately the results. This would lead > us, for the scenario mentioned above, to 6 seconds. > So we could have a performance improvement of nearly 50% in many cases, and > much more than 50% if the request execution time is different.
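The 11s-vs-6s arithmetic in the description can be captured in a toy timing model (a hypothetical helper, not client code): with wait-all, the retry starts only after the whole batch plus the pause; with immediate resubmission, the retry overlaps the requests that are still running.

```java
public final class BatchRetryTiming {
    // All durations in seconds.

    // Wait for the whole batch, pause, then run the retry serially.
    static int waitAllThenRetry(int batchTime, int pauseTime, int retryTime) {
        return batchTime + pauseTime + retryTime;
    }

    // Resubmit as soon as the error is observed; the retry overlaps the batch.
    static int resubmitImmediately(int batchTime, int errorObservedAt, int retryTime) {
        return Math.max(batchTime, errorObservedAt + retryTime);
    }
}
```

With 5-second requests and a 1-second pause/detection time, this reproduces the description's 11s vs 6s figures; when the failing request errors out faster than the rest of the batch runs, the retry can hide entirely inside the batch time, which is the "much more than 50%" case.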
[jira] [Updated] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6046: -- Attachment: HBASE_6046_0.94_2.patch
[jira] [Commented] (HBASE-4676) Prefix Compression - Trie data block encoding
[ https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287587#comment-13287587 ] James Taylor commented on HBASE-4676: - This is fantastic, Matt. We're testing your patch on 0.94 in our dev cluster here at Salesforce and are seeing 5-15x compression with no degradation on scan performance. > Though the trie is reaching a point where it is internally very efficient > (probably within half or a quarter of its max read speed) the way that hbase > currently uses it is far from optimal. The KeyValueScanner and related > classes that iterate through the trie will eventually need to be smarter and > have methods to do things like skipping to the next row of results without > scanning every cell in between. When that is accomplished it will also allow > much faster compactions because the full row key will
[jira] [Commented] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287586#comment-13287586 ] Gregory Chanan commented on HBASE-5936: --- I ran these failed tests multiple times locally and they passed. > Add Column-level PB-based calls to HMasterInterface > --- > > Key: HBASE-5936 > URL: https://issues.apache.org/jira/browse/HBASE-5936 > Project: HBase > Issue Type: Task > Components: ipc, master, migration >Reporter: Gregory Chanan >Assignee: Gregory Chanan > Fix For: 0.96.0 > > Attachments: HBASE-5936-v3.patch, HBASE-5936-v4.patch, > HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, > HBASE-5936.patch > > > This should be a subtask of HBASE-5445, but since that is a subtask, I can't > also make this a subtask (apparently). > This is for converting the column-level calls, i.e.: > addColumn > deleteColumn > modifyColumn
[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.
[ https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287582#comment-13287582 ] nkeywal commented on HBASE-5924: This leads to a complete rewriting of the processBatchCallback function. 3 comments: 1) I don't see how this piece of code can be reached, and I ran the complete test suite without getting into this part. Am I missing anything? {noformat} for (Pair regionResult : regionResults) { if (regionResult == null) { // if the first/only record is 'null' the entire region failed. LOG.debug("Failures for region: " + Bytes.toStringBinary(regionName) + ", removing from cache"); } else { {noformat} 2) The callback is never used internally. Is this something we should keep for customer code? 3) Do I move it to HTable? There is a comment saying that it does not belong to Connection, and it's true. But it's public, so...
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287579#comment-13287579 ] Zhihong Yu commented on HBASE-6060: --- Thanks for working on this issue. I will review the next version in more detail :-)
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287568#comment-13287568 ] ramkrishna.s.vasudevan commented on HBASE-6060: --- @Ted bq.why cannot we return null above so that we don't need to add the boolean member to RegionPlan ? The reason is that in the case of a null region plan we have a different behaviour: we consider it a case where we don't have any live RS, and hence set a flag so that the timeoutmonitor can be skipped. Hence we need to differentiate the null behaviour from this one. {code} if (plan == null) { LOG.debug("Unable to determine a plan to assign " + state); this.timeoutMonitor.setAllRegionServersOffline(true); return; // Should get reassigned later when RIT times out. } {code} I was not sure which name to give the usePlan flag. Maybe 'usable' is better. {code} Before making the deadServerRegionsFromRegionPlan.put() call {code} I think since SSH calls this per server it should be ok. I will check the other comments before changing it. Thanks for your detailed review.
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287561#comment-13287561 ] Zhihong Yu commented on HBASE-6060: --- I ran the tests in TestAssignmentManager and they passed. {code} synchronized (this.regionPlans) { + regionsOnDeadServer = new RegionsOnDeadServer(); + regionsFromRegionPlansForServer = new ConcurrentSkipListSet(); + this.deadServerRegionsFromRegionPlan.put(sn, regionsOnDeadServer); {code} Can the first two assignments be placed outside the synchronized block? Before making the deadServerRegionsFromRegionPlan.put() call, I think we should check that sn isn't currently in deadServerRegionsFromRegionPlan. For isRegionOnline(HRegionInfo hri): {code} +return true; + } else { +// Remove the assignment mapping for sn. +Set hriSet = this.servers.get(sn); +if (hriSet != null) { + hriSet.remove(hri); +} {code} The else keyword isn't needed. What if hriSet contains other regions apart from hri — should they be removed as well?
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287547#comment-13287547 ] Zhihong Yu commented on HBASE-6060: --- {code} + if(!plan.canUsePlan()){ +return; {code} Please insert a space after the if. It would be helpful to add a LOG.debug() before returning. {code} + public void usePlan(boolean usePlan) { +this.usePlan = usePlan; + } {code} I would name the boolean 'usable'. The setter can be named setUsable(). A bigger question is: {code} + if (newPlan) { +randomPlan.usePlan(false); +this.regionPlans.remove(randomPlan.getRegionName()); + } else { +existingPlan.usePlan(false); +this.regionPlans.remove(existingPlan.getRegionName()); + } {code} Why can't we return null here, so that we don't need to add the boolean member to RegionPlan? At the least, we shouldn't return an unusable randomPlan. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
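The alternative raised in the comment above (return null instead of flagging the plan unusable) could look roughly like this. Names and the method signature are assumptions for illustration, not the real AssignmentManager API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the suggested alternative: when the plan's target
// server is unusable (e.g. dead), remove the plan and return null instead of
// handing callers a RegionPlan carrying a "not usable" boolean member.
public class PlanChooser {
  static class RegionPlan {
    final String regionName;
    RegionPlan(String regionName) { this.regionName = regionName; }
    String getRegionName() { return regionName; }
  }

  private final Map<String, RegionPlan> regionPlans = new ConcurrentHashMap<>();

  /** Returns a usable plan, or null if the only candidate had to be discarded. */
  RegionPlan getRegionPlan(String regionName, boolean targetServerIsDead) {
    RegionPlan plan = regionPlans.computeIfAbsent(regionName, RegionPlan::new);
    if (targetServerIsDead) {
      // Discard the stale plan and signal "no plan" to the caller,
      // so an unusable plan never escapes this method.
      regionPlans.remove(plan.getRegionName());
      return null;
    }
    return plan;
  }
}
```

The design point is that a null return forces every caller to handle the "no plan" case explicitly, whereas a boolean member on RegionPlan is easy to forget to check.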
[jira] [Updated] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5179: - Fix Version/s: (was: 0.92.2) 0.92.3 > Concurrent processing of processFaileOver and ServerShutdownHandler may cause > region to be assigned before log splitting is completed, causing data loss > > > Key: HBASE-5179 > URL: https://issues.apache.org/jira/browse/HBASE-5179 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.90.2 >Reporter: chunhui shen >Assignee: chunhui shen >Priority: Critical > Fix For: 0.92.3 > > Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, > 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, > 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, > 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, > 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, > 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, > Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, > hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, > hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch > > > If master's processing its failover and ServerShutdownHandler's processing > happen concurrently, it may appear following case. > 1.master completed splitLogAfterStartup() > 2.RegionserverA restarts, and ServerShutdownHandler is processing. > 3.master starts to rebuildUserRegions, and RegionserverA is considered as > dead server. > 4.master starts to assign regions of RegionserverA because it is a dead > server by step3. > However, when doing step4(assigning region), ServerShutdownHandler may be > doing split log, Therefore, it may cause data loss. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3680) Publish more metrics about mslab
[ https://issues.apache.org/jira/browse/HBASE-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3680: - Fix Version/s: (was: 0.92.2) 0.92.3 > Publish more metrics about mslab > > > Key: HBASE-3680 > URL: https://issues.apache.org/jira/browse/HBASE-3680 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.90.1 >Reporter: Jean-Daniel Cryans >Assignee: Todd Lipcon > Fix For: 0.92.3 > > Attachments: hbase-3680.txt, hbase-3680.txt > > > We have been using mslab on all our clusters for a while now and it seems it > tends to OOME or send us into GC loops of death a lot more than it used to. > For example, one RS with mslab enabled and 7GB of heap died out of OOME this > afternoon; it had .55GB in the block cache and 2.03GB in the memstores which > doesn't account for much... but it could be that because of mslab a lot of > space was lost in those incomplete 2MB blocks and without metrics we can't > really tell. Compactions were running at the time of the OOME and I see block > cache activity. The average load on that cluster is 531. > We should at least publish the total size of all those blocks and maybe even > take actions based on that (like force flushing). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5492) Caching StartKeys and EndKeys of Regions
[ https://issues.apache.org/jira/browse/HBASE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5492: - Fix Version/s: (was: 0.92.2) 0.92.3 > Caching StartKeys and EndKeys of Regions > > > Key: HBASE-5492 > URL: https://issues.apache.org/jira/browse/HBASE-5492 > Project: HBase > Issue Type: Improvement > Components: client >Affects Versions: 0.92.0 > Environment: all >Reporter: honghua zhu > Fix For: 0.92.3 > > Attachments: HBASE-5492.patch > > > Each call for HTable.getStartEndKeys will read meta table. > In particular, > in the case of client side multi-threaded concurrency statistics, > we must call HTable.coprocessorExec== > getStartKeysInRange ==> > getStartEndKeys, > resulting in the need to always scan the meta table. > This is not necessary, > we can implement the > HConnectionManager.HConnectionImplementation.locateRegions(byte[] tableName) > method, > then, get the StartKeys and EndKeys from the cachedRegionLocations of > HConnectionImplementation. > Combined with https://issues.apache.org/jira/browse/HBASE-5491, can improve > the performance of statistical -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
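The caching idea in HBASE-5492 amounts to memoizing the expensive meta scan per table. A minimal illustrative sketch follows; the cache shape, names, and the metaScans counter are assumptions for illustration, not the HConnectionImplementation API:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative sketch only: memoize per-table start keys so repeated
// getStartEndKeys()-style calls don't rescan the meta table every time.
public class StartKeyCache {
  private final Map<String, List<String>> cachedRegionKeys = new ConcurrentHashMap<>();
  private final Function<String, List<String>> metaScan;  // the expensive meta scan
  int metaScans = 0;  // exposed only to show the scan count in this sketch

  StartKeyCache(Function<String, List<String>> metaScan) {
    this.metaScan = metaScan;
  }

  List<String> getStartKeys(String tableName) {
    return cachedRegionKeys.computeIfAbsent(tableName, t -> {
      metaScans++;  // only the first call per table pays this cost
      return metaScan.apply(t);
    });
  }

  /** A region split or move would require dropping the stale entry. */
  void invalidate(String tableName) {
    cachedRegionKeys.remove(tableName);
  }
}
```

As with any region-location cache, the hard part is invalidation when regions split or move; the issue pairs this change with HBASE-5491 for that reason.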
[jira] [Updated] (HBASE-5821) Incorrect handling of null value in Coprocessor aggregation function min()
[ https://issues.apache.org/jira/browse/HBASE-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5821: - Fix Version/s: (was: 0.94.1) (was: 0.96.0) (was: 0.92.2) 0.92.3 > Incorrect handling of null value in Coprocessor aggregation function min() > -- > > Key: HBASE-5821 > URL: https://issues.apache.org/jira/browse/HBASE-5821 > Project: HBase > Issue Type: Bug > Components: coprocessors >Affects Versions: 0.92.1 >Reporter: Maryann Xue >Assignee: Maryann Xue > Fix For: 0.92.3 > > Attachments: HBASE-5821.patch > > > Both in AggregateImplementation and AggregationClient, the evaluation of the > current minimum value is like: > min = (min == null || ci.compare(result, min) < 0) ? result : min; > The LongColumnInterpreter takes null value is treated as the least value, > while the above expression takes min as the greater value when it is null. > Thus, the real minimum value gets discarded if a null value comes later. > max() could also be wrong if a different ColumnInterpreter other than > LongColumnInterpreter treats null value differently (as the greatest). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
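The comparison bug described in HBASE-5821 can be reproduced with a small model. This is an illustration of the described behavior, not the actual AggregationClient code; NULL_LEAST stands in for a ColumnInterpreter that treats null as the least value:

```java
import java.util.Comparator;

// Model of the null-handling bug described in HBASE-5821.
public class MinFold {
  // Mimics a ColumnInterpreter that treats null as less than any value.
  static final Comparator<Long> NULL_LEAST =
      Comparator.nullsFirst(Comparator.<Long>naturalOrder());

  // The fold from the issue: when min is null, result always replaces it,
  // so a real minimum seen before a null gets discarded later.
  static Long buggyMin(Long[] results) {
    Long min = null;
    for (Long result : results) {
      min = (min == null || NULL_LEAST.compare(result, min) < 0) ? result : min;
    }
    return min;
  }

  // One possible fix: skip null cell values entirely instead of
  // letting them reset the running minimum.
  static Long fixedMin(Long[] results) {
    Long min = null;
    for (Long result : results) {
      if (result == null) continue;
      min = (min == null || result < min) ? result : min;
    }
    return min;
  }
}
```

With the values {3, null, 5} the buggy fold first picks 3, then the null (being "least") replaces it, and finally the null min is treated as greater so 5 wins: the true minimum 3 is discarded, exactly the symptom the issue describes.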
[jira] [Updated] (HBASE-4415) Add configuration script for setup HBase (hbase-setup-conf.sh)
[ https://issues.apache.org/jira/browse/HBASE-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4415: - Fix Version/s: (was: 0.94.1) (was: 0.92.2) 0.92.3 > Add configuration script for setup HBase (hbase-setup-conf.sh) > -- > > Key: HBASE-4415 > URL: https://issues.apache.org/jira/browse/HBASE-4415 > Project: HBase > Issue Type: New Feature > Components: scripts >Affects Versions: 0.90.4, 0.92.0 > Environment: Java 6, Linux >Reporter: Eric Yang >Assignee: Eric Yang > Fix For: 0.92.3 > > Attachments: HBASE-4415-1.patch, HBASE-4415-2.patch, > HBASE-4415-3.patch, HBASE-4415-4.patch, HBASE-4415-5.patch, > HBASE-4415-6.patch, HBASE-4415-7.patch, HBASE-4415-8.patch, > HBASE-4415-9.patch, HBASE-4415.patch > > > The goal of this jura is to provide a installation script for configuring > HBase environment and configuration. By using the same pattern of > *-setup-conf.sh for all Hadoop related projects. For HBase, the usage of the > script looks like this: > {noformat} > usage: ./hbase-setup-conf.sh > Optional parameters: > --hadoop-conf=/etc/hadoopSet Hadoop configuration directory > location > --hadoop-home=/usr Set Hadoop directory location > --hadoop-namenode=localhost Set Hadoop namenode hostname > --hadoop-replication=3 Set HDFS replication > --hbase-home=/usrSet HBase directory location > --hbase-conf=/etc/hbase Set HBase configuration > directory location > --hbase-log=/var/log/hbase Set HBase log directory location > --hbase-pid=/var/run/hbase Set HBase pid directory location > --hbase-user=hbase Set HBase user > --java-home=/usr/java/defaultSet JAVA_HOME directory location > --kerberos-realm=KERBEROS.EXAMPLE.COMSet Kerberos realm > --kerberos-principal-id=_HOSTSet Kerberos principal ID > --keytab-dir=/etc/security/keytabs Set keytab directory > --regionservers=localhostSet regionservers hostnames > --zookeeper-home=/usrSet ZooKeeper directory location > --zookeeper-quorum=localhost Set ZooKeeper Quorum > 
--zookeeper-snapshot=/var/lib/zookeeper Set ZooKeeper snapshot location > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5578) NPE when regionserver reported server load, caused rs stop.
[ https://issues.apache.org/jira/browse/HBASE-5578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5578: - Fix Version/s: (was: 0.92.2) 0.92.3 > NPE when regionserver reported server load, caused rs stop. > --- > > Key: HBASE-5578 > URL: https://issues.apache.org/jira/browse/HBASE-5578 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 0.92.0 > Environment: centos6.2 hadoop-1.0.0 hbase-0.92.0 >Reporter: Storm Lee >Priority: Critical > Fix For: 0.92.3 > > Attachments: 5589.txt > > > The regeionserver log: > 2012-03-11 11:55:37,808 FATAL > org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server > data3,60020,1331286604591: Unhandled exception: null > java.lang.NullPointerException > at > org.apache.hadoop.hbase.regionserver.Store.getTotalStaticIndexSize(Store.java:1788) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionLoad(HRegionServer.java:994) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.buildServerLoad(HRegionServer.java:800) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:776) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:678) > at java.lang.Thread.run(Thread.java:662) > 2012-03-11 11:55:37,808 FATAL > org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort: > loaded coprocessors are: [] > 2012-03-11 11:55:37,808 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: > requestsPerSecond=1687, numberOfOnlineRegions=37, numberOfStores=37, > numberOfStorefiles=144, storefileIndexSizeMB=2, rootIndexSizeKB=2362, > totalStaticIndexSizeKB=229808, totalStaticBloomSizeKB=2166296, > memstoreSizeMB=2854, readRequestsCount=1352673, writeRequestsCount=113137586, > compactionQueueSize=8, flushQueueSize=3, usedHeapMB=7359, maxHeapMB=12999, > blockCacheSizeMB=32.31, blockCacheFreeMB=3867.52, blockCacheCount=38, > 
blockCacheHitCount=87713, blockCacheMissCount=22144560, > blockCacheEvictedCount=122, blockCacheHitRatio=0%, > blockCacheHitCachingRatio=99%, hdfsBlocksLocalityIndex=100 > 2012-03-11 11:55:37,992 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled > exception: null -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4523) dfs.support.append config should be present in the hadoop configs, we should remove them from hbase so the user is not confused when they see the config in 2 places
[ https://issues.apache.org/jira/browse/HBASE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4523: - Fix Version/s: (was: 0.92.2) 0.92.3 > dfs.support.append config should be present in the hadoop configs, we should > remove them from hbase so the user is not confused when they see the config > in 2 places > > > Key: HBASE-4523 > URL: https://issues.apache.org/jira/browse/HBASE-4523 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.4, 0.92.0 >Reporter: Arpit Gupta >Assignee: Eric Yang > Fix For: 0.92.3 > > Attachments: HBASE-4523.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4467) Handle inconsistencies in Hadoop libraries naming in hbase script
[ https://issues.apache.org/jira/browse/HBASE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4467: - Fix Version/s: (was: 0.92.2) 0.92.3 > Handle inconsistencies in Hadoop libraries naming in hbase script > - > > Key: HBASE-4467 > URL: https://issues.apache.org/jira/browse/HBASE-4467 > Project: HBase > Issue Type: Bug > Components: scripts >Affects Versions: 0.92.0, 0.94.0 >Reporter: Lars George >Assignee: Lars George >Priority: Trivial > Fix For: 0.92.3 > > Attachments: HBASE-4467.patch > > > When using an Hadoop tarball that has a library naming of "hadoop-x.y.z-core" > as opposed to "hadoop-core-x.y.z" then the hbase script throws errors. > {noformat} > $ bin/start-hbase.sh > ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file > or directory > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/hadoop/util/PlatformName > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.util.PlatformName > at java.net.URLClassLoader$1.run(URLClassLoader.java:202) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:190) > at java.lang.ClassLoader.loadClass(ClassLoader.java:306) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) > at java.lang.ClassLoader.loadClass(ClassLoader.java:247) > ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file > or directory > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/hadoop/util/PlatformName > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.util.PlatformName > at java.net.URLClassLoader$1.run(URLClassLoader.java:202) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:190) > at java.lang.ClassLoader.loadClass(ClassLoader.java:306) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) > at 
java.lang.ClassLoader.loadClass(ClassLoader.java:247) > localhost: starting zookeeper, logging to > /projects/opensource/hbase-trunk-rw//logs/hbase-larsgeorge-zookeeper-de1-app-mbp-2.out > localhost: /projects/opensource/hadoop-0.20.2-append > localhost: ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No > such file or directory > localhost: Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/hadoop/util/PlatformName > localhost: Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.util.PlatformName > localhost:at java.net.URLClassLoader$1.run(URLClassLoader.java:202) > localhost:at java.security.AccessController.doPrivileged(Native Method) > localhost:at java.net.URLClassLoader.findClass(URLClassLoader.java:190) > localhost:at java.lang.ClassLoader.loadClass(ClassLoader.java:306) > localhost:at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) > localhost:at java.lang.ClassLoader.loadClass(ClassLoader.java:247) > starting master, logging to > /projects/opensource/hbase-trunk-rw/bin/../logs/hbase-larsgeorge-master-de1-app-mbp-2.out > /projects/opensource/hadoop-0.20.2-append > ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file > or directory > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/hadoop/util/PlatformName > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.util.PlatformName > at java.net.URLClassLoader$1.run(URLClassLoader.java:202) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:190) > at java.lang.ClassLoader.loadClass(ClassLoader.java:306) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) > at java.lang.ClassLoader.loadClass(ClassLoader.java:247) > localhost: starting regionserver, logging to > /projects/opensource/hbase-trunk-rw//logs/hbase-larsgeorge-regionserver-de1-app-mbp-2.out > localhost: /projects/opensource/hadoop-0.20.2-append > 
localhost: ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No > such file or directory > localhost: Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/hadoop/util/PlatformName > localhost: Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.util.PlatformName > localhost:at java.net.URLClassLoader$1.run(URLClassLoader.java:202) > localhost:at java.security.AccessController.doPrivileged(Native Method) > localhost:at java.net.URLClassLoader.findClass(URLClassLoader.java:190) > localhost:at java.lang.ClassLoader.loadClass(ClassLoader.java:306) > lo
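A single pattern can accept both naming schemes the report describes ("hadoop-core-x.y.z" and "hadoop-x.y.z-core"). The hbase script itself is shell, so this Java sketch only illustrates the matching logic; the version grammar (digits and dots, optional "-append" suffix) is an assumption:

```java
import java.util.regex.Pattern;

// Illustration of matching both Hadoop core jar naming conventions
// described above ("hadoop-core-x.y.z.jar" and "hadoop-x.y.z-core.jar").
public class HadoopCoreJarMatcher {
  // Assumed version grammar: digits/dots with an optional "-append" suffix.
  private static final Pattern CORE_JAR = Pattern.compile(
      "hadoop-(?:core-[0-9.]+(?:-append)?|[0-9.]+(?:-append)?-core)\\.jar");

  static boolean isCoreJar(String fileName) {
    return CORE_JAR.matcher(fileName).matches();
  }
}
```

The equivalent shell fix would widen the `hadoop-core*.jar` glob so either ordering resolves, which is what the error output above shows failing.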
[jira] [Updated] (HBASE-4457) Starting region server on non-default info port is resulting in broken URL's in master UI
[ https://issues.apache.org/jira/browse/HBASE-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4457: - Fix Version/s: (was: 0.94.1) (was: 0.92.2) 0.92.3 > Starting region server on non-default info port is resulting in broken URL's > in master UI > - > > Key: HBASE-4457 > URL: https://issues.apache.org/jira/browse/HBASE-4457 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.0 >Reporter: Praveen Patibandla > Labels: newbie > Fix For: 0.92.3 > > Attachments: 4457-V1.patch, 4457.patch > > > When "hbase.regionserver.info.port" is set to non-default port, Master UI > has broken URL's in the region server table because it's hard coded to > default port. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287489#comment-13287489 ] Zhihong Yu edited comment on HBASE-6060 at 6/1/12 4:47 PM: --- This patch is also a backport of HBASE-5396. But this is more exhaustive and also tries to address HBASE-5816. HBASE-6147 has been raised to solve other assign related issues that comes from SSH and joincluster. Pls review and provide your comments. was (Author: rajesh23): This patch is also a backport of HBASe-5396. But this is more exhaustive and also tries to address HBASE-5816. HBASE-6147 has been raised to solve other assign related issues that comes from SSH and joincluster. Pls review and provide your comments. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287526#comment-13287526 ] Zhihong Yu commented on HBASE-6046: --- I ran the new test and it passed. {code} } + public void finishInitialization() { +finishInitialization(false); {code} Please add javadoc for the above method. Leave one empty line between the previous method and finishInitialization(). In test code: {code} + public static class MockLoadBalancer extends DefaultLoadBalancer { {code} The above class can be private. > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch, HBASE_6046_0.94_1.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-5360) [uberhbck] Add options for how to handle offline split parents.
[ https://issues.apache.org/jira/browse/HBASE-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang reassigned HBASE-5360: -- Assignee: Jimmy Xiang > [uberhbck] Add options for how to handle offline split parents. > > > Key: HBASE-5360 > URL: https://issues.apache.org/jira/browse/HBASE-5360 > Project: HBase > Issue Type: Improvement > Components: hbck >Affects Versions: 0.90.7, 0.92.1, 0.94.0 >Reporter: Jonathan Hsieh >Assignee: Jimmy Xiang > > In a recent case, we attempted to repair a cluster that suffered from > HBASE-4238 that had about 6-7 generations of "leftover" split data. The hbck > repair options in an development version of HBASE-5128 treat HDFS as ground > truth but didn't check SPLIT and OFFLINE flags only found in meta. The net > effect was that it essentially attempted to merge many regions back into its > eldest geneneration's parent's range. > More safe guards to prevent "mega-merges" are being added on HBASE-5128. > This issue would automate the handling of the "mega-merge" avoiding cases > such as "lingering grandparents". The strategy here would be to add more > checks against .META., and perform part of the catalog janitor's > responsibilities for lingering grandparents. This would potentially include > options to sideline regions, deleting grandparent regions, min size for > sidelining, and mechanisms for cleaning .META.. > Note: There already exists an mechanism to reload these regions -- the bulk > loaded mechanisms in LoadIncrementalHFiles can be used to re-add grandparents > (automatically splitting them if necessary) to HBase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287489#comment-13287489 ] rajeshbabu commented on HBASE-6060: --- This patch is also a backport of HBASe-5396. But this is more exhaustive and also tries to address HBASE-5816. HBASE-6147 has been raised to solve other assign related issues that comes from SSH and joincluster. Pls review and provide your comments. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu updated HBASE-6060: -- Attachment: HBASE-6060-94.patch > Regions's in OPENING state from failed regionservers takes a long time to > recover > - > > Key: HBASE-6060 > URL: https://issues.apache.org/jira/browse/HBASE-6060 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Attachments: HBASE-6060-94.patch > > > we have seen a pattern in tests, that the regions are stuck in OPENING state > for a very long time when the region server who is opening the region fails. > My understanding of the process: > > - master calls rs to open the region. If rs is offline, a new plan is > generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in > master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), > HMaster.assign() > - RegionServer, starts opening a region, changes the state in znode. But > that znode is not ephemeral. (see ZkAssign) > - Rs transitions zk node from OFFLINE to OPENING. See > OpenRegionHandler.process() > - rs then opens the region, and changes znode from OPENING to OPENED > - when rs is killed between OPENING and OPENED states, then zk shows OPENING > state, and the master just waits for rs to change the region state, but since > rs is down, that wont happen. > - There is a AssignmentManager.TimeoutMonitor, which does exactly guard > against these kind of conditions. It periodically checks (every 10 sec by > default) the regions in transition to see whether they timedout > (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, > which explains what you and I are seeing. > - ServerShutdownHandler in Master does not reassign regions in OPENING > state, although it handles other states. > Lowering that threshold from the configuration is one option, but still I > think we can do better. > Will investigate more. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
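The OPENING-state failure mode Enis describes above boils down to a periodic scan of the regions-in-transition map. The following is an illustrative sketch only, not HBase code: the class and method names (TimeoutMonitorSketch, checkStuckRegions) are invented, and the only value taken from the description is the 30-minute default of hbase.master.assignment.timeoutmonitor.timeout.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only -- class and method names are invented, not HBase's.
public class TimeoutMonitorSketch {
  enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

  static class RegionState {
    final State state;
    final long stampMillis; // time of the last observed state transition
    RegionState(State state, long stampMillis) {
      this.state = state;
      this.stampMillis = stampMillis;
    }
  }

  // hbase.master.assignment.timeoutmonitor.timeout defaults to 30 minutes,
  // which is why a stuck region can sit in OPENING for so long.
  static final long TIMEOUT_MILLIS = 30L * 60 * 1000;

  // A region stuck in OPENING past the timeout means the opening RS likely
  // died between OPENING and OPENED; it needs to be reassigned.
  static List<String> checkStuckRegions(Map<String, RegionState> inTransition,
                                        long nowMillis) {
    List<String> toReassign = new ArrayList<>();
    for (Map.Entry<String, RegionState> e : inTransition.entrySet()) {
      RegionState rs = e.getValue();
      if (rs.state == State.OPENING && nowMillis - rs.stampMillis > TIMEOUT_MILLIS) {
        toReassign.add(e.getKey());
      }
    }
    return toReassign;
  }

  public static void main(String[] args) {
    Map<String, RegionState> rit = new HashMap<>();
    long now = 100L * 60 * 1000; // arbitrary clock value
    rit.put("regionA", new RegionState(State.OPENING, now - 31 * 60 * 1000));
    rit.put("regionB", new RegionState(State.OPENING, now - 60 * 1000));
    System.out.println("stuck: " + checkStuckRegions(rit, now)); // stuck: [regionA]
  }
}
```

Lowering the timeout (as the description suggests) only shrinks the window in this check; it does not remove the race, which is why handling OPENING regions in ServerShutdownHandler is the better fix.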
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287476#comment-13287476 ] Hadoop QA commented on HBASE-6046: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530559/HBASE_6046_0.94_1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2084//console This message is automatically generated. > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch, HBASE_6046_0.94_1.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287475#comment-13287475 ] Ashutosh Jindal commented on HBASE-6046: @Anoop Thanks for the review. Uploaded patch addressing the comments. > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch, HBASE_6046_0.94_1.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal updated HBASE-6046: --- Hadoop Flags: (was: Reviewed) Status: Patch Available (was: Open) > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.0, 0.92.1 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch, HBASE_6046_0.94_1.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal updated HBASE-6046: --- Status: Open (was: Patch Available) > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.0, 0.92.1 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch, HBASE_6046_0.94_1.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal updated HBASE-6046: --- Attachment: HBASE_6046_0.94_1.patch > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch, HBASE_6046_0.94_1.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287467#comment-13287467 ] Ashutosh Jindal commented on HBASE-6046: Please check the second testcase added testLogSplittingAfterMasterRecoveryDueToZKExpiry() .If the testcase is run without the patch , stackOverFlow exception is thrown. {code} java.lang.StackOverflowError at java.lang.System.getProperty(System.java:647) at sun.security.action.GetPropertyAction.run(GetPropertyAction.java:67) at sun.security.action.GetPropertyAction.run(GetPropertyAction.java:32) at java.security.AccessController.doPrivileged(Native Method) at java.io.PrintWriter.(PrintWriter.java:78) at java.io.PrintWriter.(PrintWriter.java:62) at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:58) at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87) at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413) at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313) at org.apache.log4j.WriterAppender.append(WriterAppender.java:162) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.log(Category.java:856) at org.slf4j.impl.Log4jLoggerAdapter.error(Log4jLoggerAdapter.java:485) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:623) at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477) at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640) at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658) at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1274) at 
org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975) at org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:626) at org.apache.hadoop.hbase.master.SplitLogManager.access$17(SplitLogManager.java:620) at org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1104) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619) at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477) at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640) at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658) at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1274) at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975) at org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:626) at org.apache.hadoop.hbase.master.SplitLogManager.access$17(SplitLogManager.java:620) at org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1104) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619) at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477) {code} This is coming because the listener for splitLogManager is not registered after the master recovers from expired zk session. > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. 
This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
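The StackOverflowError trace above shows SplitLogManager.deleteNode and its DeleteAsyncCallback calling each other synchronously once the ZK connection is lost. A stripped-down sketch of that recursion (invented names, not the SplitLogManager code) and of how a retry bound breaks it:

```java
// Minimal sketch (not HBase code) of the failure mode in the stack trace above:
// with the ZK connection down, the async delete fails immediately and its
// callback retries synchronously, so deleteNode -> callback -> deleteNode
// recurses on one thread until the stack overflows.
public class RetryRecursionSketch {
  interface Callback { void processResult(int rc); }

  static final int CONNECTION_LOSS = -4; // stand-in for KeeperException.Code
  static int attempts = 0;

  // Simulates ZooKeeper.delete() on a lost connection: the callback fires
  // immediately on the calling thread with CONNECTIONLOSS.
  static void asyncDelete(String path, Callback cb) {
    attempts++;
    cb.processResult(CONNECTION_LOSS);
  }

  // Unbounded retry: the recursive pattern that overflows the stack.
  static void deleteNodeUnbounded(String path) {
    asyncDelete(path, rc -> {
      if (rc == CONNECTION_LOSS) deleteNodeUnbounded(path); // recurses forever
    });
  }

  // Bounded retry: giving up after N attempts breaks the recursion.
  static boolean deleteNodeBounded(String path, int retriesLeft) {
    final boolean[] ok = { false };
    asyncDelete(path, rc -> {
      if (rc == CONNECTION_LOSS && retriesLeft > 0) {
        ok[0] = deleteNodeBounded(path, retriesLeft - 1);
      } else {
        ok[0] = (rc == 0);
      }
    });
    return ok[0];
  }

  public static void main(String[] args) {
    deleteNodeBounded("/hbase/splitlog/task", 3);
    System.out.println("attempts with bounded retry: " + attempts); // 4
    try {
      attempts = 0;
      deleteNodeUnbounded("/hbase/splitlog/task");
    } catch (StackOverflowError e) {
      System.out.println("unbounded retry overflowed after " + attempts + " attempts");
    }
  }
}
```

The patch's actual fix, per the comment, is to re-register the SplitLogManager listener after session recovery so the delete does not keep failing in the first place; the bound above only illustrates why the failure surfaces as a StackOverflowError rather than a clean error.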
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287460#comment-13287460 ] Anoop Sam John commented on HBASE-6046: --- One immediate comment after seeing the patch {code} +this.fileSystemManager = new MasterFileSystem(this, this, metrics, masterRecovery ? true +: false); {code} Can pass the boolean variable masterRecovery directly. > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287453#comment-13287453 ] Hadoop QA commented on HBASE-6046: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530554/HBASE_6046_0.94.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2083//console This message is automatically generated. > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287451#comment-13287451 ] Ashutosh Jindal commented on HBASE-6046: Attached patch for 0.94 version. Please review and provide your suggestion/comments. > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal updated HBASE-6046: --- Fix Version/s: (was: 0.94.1) (was: 0.92.2) Hadoop Flags: Reviewed Status: Patch Available (was: Open) > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.0, 0.92.1 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal updated HBASE-6046: --- Attachment: HBASE_6046_0.94.patch > Master retry on ZK session expiry causes inconsistent region assignments. > - > > Key: HBASE-6046 > URL: https://issues.apache.org/jira/browse/HBASE-6046 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE_6046_0.94.patch > > > 1> ZK Session timeout in the hmaster leads to bulk assignment though all the > RSs are online. > 2> While doing bulk assignment, if the master again goes down & restart(or > backup comes up) all the node created in the ZK will now be tried to reassign > to the new RSs. This is leading to double assignment. > we had 2800 regions, among this 1900 region got double assignment, taking the > region count to 4700. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6138) HadoopQA not running findbugs [Trunk]
[ https://issues.apache.org/jira/browse/HBASE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-6138: -- Description: HadoopQA shows like -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. But not able to see any reports link When I checked the console output for the build I can see {code} [INFO] --- findbugs-maven-plugin:2.4.0:findbugs (default-cli) @ hbase-common --- [INFO] Fork Value is true [INFO] [INFO] Reactor Summary: [INFO] [INFO] HBase . SUCCESS [1.890s] [INFO] HBase - Common FAILURE [2.238s] [INFO] HBase - Server SKIPPED [INFO] HBase - Assembly .. SKIPPED [INFO] HBase - Site .. SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 4.856s [INFO] Finished at: Thu May 31 03:35:35 UTC 2012 [INFO] Final Memory: 23M/154M [INFO] [ERROR] Could not find resource '${parent.basedir}/dev-support/findbugs-exclude.xml'. -> [Help 1] [ERROR] {code} Because of this error Findbugs is not getting run! was: HadoopQA shows like -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. But not able to see any reports link When I checked the console output for the build I can see {code} [INFO] --- findbugs-maven-plugin:2.4.0:findbugs (default-cli) @ hbase-common --- [INFO] Fork Value is true [INFO] [INFO] Reactor Summary: [INFO] [INFO] HBase . SUCCESS [1.890s] [INFO] HBase - Common FAILURE [2.238s] [INFO] HBase - Server SKIPPED [INFO] HBase - Assembly .. SKIPPED [INFO] HBase - Site .. SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 4.856s [INFO] Finished at: Thu May 31 03:35:35 UTC 2012 [INFO] Final Memory: 23M/154M [INFO] [ERROR] Could not find resource '${parent.basedir}/dev-support/findbugs-exclude.xml'. 
-> [Help 1] [ERROR] {code} Summary: HadoopQA not running findbugs [Trunk] (was: HadoopQA not showing the findbugs report[Trunk]) > HadoopQA not running findbugs [Trunk] > - > > Key: HBASE-6138 > URL: https://issues.apache.org/jira/browse/HBASE-6138 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.96.0 >Reporter: Anoop Sam John > Fix For: 0.96.0 > > > HadoopQA shows like > -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. > But not able to see any reports link > When I checked the console output for the build I can see > {code} > [INFO] --- findbugs-maven-plugin:2.4.0:findbugs (default-cli) @ hbase-common > --- > [INFO] Fork Value is true > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] HBase . SUCCESS [1.890s] > [INFO] HBase - Common FAILURE [2.238s] > [INFO] HBase - Server SKIPPED > [INFO] HBase - Assembly .. SKIPPED > [INFO] HBase - Site .. SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 4.856s > [INFO] Finished at: Thu May 31 03:35:35 UTC 2012 > [INFO] Final Memory: 23M/154M > [INFO] > > [ERROR] Could not find resource > '${parent.basedir}/dev-support/findbugs-exclude.xml'. -> [Help 1] > [ERROR] > {code} > Because of this error Findbugs is not getting run! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287427#comment-13287427 ] stack commented on HBASE-6145: -- Test failures probably not related. I can't commit this yet until I fix assembly. Thinking on this more, the way I've done site up in parent might not jibe well doing assembly down in an assembly module (which I believe you cannot avoid). I love maven! > Fix site target post modularization > --- > > Key: HBASE-6145 > URL: https://issues.apache.org/jira/browse/HBASE-6145 > Project: HBase > Issue Type: Task >Reporter: stack >Assignee: stack > Attachments: site.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6012) AssignmentManager#asyncSetOfflineInZooKeeper wouldn't force node offline
[ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287416#comment-13287416 ] ramkrishna.s.vasudevan commented on HBASE-6012: --- @Chunhui Please check the issue HBASE-6147. Along with the scenario mentioned there, if we take the current patch here I feel the following problem can come -> Because as per the current patch attached here, we force the znode to OFFLINE state, if there was any current assignment going on due to SSH in one of the RS, that will get stopped because of znode version mismatch. Internally that failure to OPEN the region will move the znode to FAILED_OPEN. Now based on this the master will again start a new assignment. So this along with the issue in HBASE-6147 may lead to double assignment. So what we felt here is this patch should go along with some changes in HBASE-6147. Pls feel free to correct me if I am wrong. > AssignmentManager#asyncSetOfflineInZooKeeper wouldn't force node offline > > > Key: HBASE-6012 > URL: https://issues.apache.org/jira/browse/HBASE-6012 > Project: HBase > Issue Type: Bug >Affects Versions: 0.96.0 >Reporter: chunhui shen >Assignee: chunhui shen > Fix For: 0.96.0 > > Attachments: HBASE-6012.patch > > > As the javadoc of method and the log message > {code} > /** >* Set region as OFFLINED up in zookeeper asynchronously. >*/ > boolean asyncSetOfflineInZooKeeper( > ... > master.abort("Unexpected ZK exception creating/setting node OFFLINE", e); > ... > } > {code} > I think AssignmentManager#asyncSetOfflineInZooKeeper should also force node > offline, just like AssignmentManager#setOfflineInZooKeeper do. Otherwise, it > may cause bulk assign failed which called this method. > Error log on the master caused by the issue > 2012-05-12 01:40:09,437 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; > was=writetest,1YTQDPGLXBTICHOPQ6IL,1336590857771.674da422fc7cb9a7d42c74499ace1d93. 
> state=PENDING_CLOSE, ts=1336757876856 > 2012-05-12 01:40:09,437 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: > master:6-0x23736bf74780082 Async create of unassigned node for > 674da422fc7cb9a7d42c74499ace1d93 with OFFLINE state > 2012-05-12 01:40:09,446 WARN > org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback: > rc != 0 for /hbase-func1/unassigned/674da422fc7cb9a7d42c74499ace1d93 -- > retryable connectionloss -- FIX see > http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A2 > 2012-05-12 01:40:09,447 FATAL org.apache.hadoop.hbase.master.HMaster: > Connectionloss writing unassigned at > /hbase-func1/unassigned/674da422fc7cb9a7d42c74499ace1d93, rc=-110 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6147) SSH and AM.joinCluster leads to region assignment inconsistency in many cases.
[ https://issues.apache.org/jira/browse/HBASE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287410#comment-13287410 ] ramkrishna.s.vasudevan commented on HBASE-6147: --- We got the following case -> Initially we had 2 RS and 1 Master with a few regions -> Stopped the cluster and restarted the master and 2 RS. -> One of the RS znodes was not yet deleted but the master started coming up. -> Here we will now see that there is a server which is dead and not yet expired, so we will call expireServer which in turn calls SSH. -> After this the master sees this as a clean cluster startup. -> Now SSH triggers one assignment and master startup starts bulk assignment. -> Now when the znode is already present the bulk assignment will make the master go down. So we need to handle such cases. Solving this should help us to solve most of the double assignment cases. There can be more such scenarios. > SSH and AM.joinCluster leads to region assignment inconsistency in many cases. > -- > > Key: HBASE-6147 > URL: https://issues.apache.org/jira/browse/HBASE-6147 > Project: HBase > Issue Type: Bug >Affects Versions: 0.92.1, 0.94.0 >Reporter: ramkrishna.s.vasudevan > Fix For: 0.92.2, 0.96.0, 0.94.1 > > > We are facing few issues in the master restart and SSH going in parallel. > Chunhui also suggested that we need to rework on this part. This JIRA is > aimed at solving all such possibilities of region assignment inconsistency -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6147) SSH and AM.joinCluster leads to region assignment inconsistency in many cases.
ramkrishna.s.vasudevan created HBASE-6147: - Summary: SSH and AM.joinCluster leads to region assignment inconsistency in many cases. Key: HBASE-6147 URL: https://issues.apache.org/jira/browse/HBASE-6147 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.92.1 Reporter: ramkrishna.s.vasudevan Fix For: 0.92.2, 0.96.0, 0.94.1 We are facing few issues in the master restart and SSH going in parallel. Chunhui also suggested that we need to rework on this part. This JIRA is aimed at solving all such possibilities of region assignment inconsistency -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6138) HadoopQA not showing the findbugs report[Trunk]
[ https://issues.apache.org/jira/browse/HBASE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287375#comment-13287375 ] Anoop Sam John commented on HBASE-6138: --- I think below change can fix the issue In pom.xml {code} - ${parent.basedir}/dev-support/findbugs-exclude.xml + ${parent.basedir}/../dev-support/findbugs-exclude.xml {code} > HadoopQA not showing the findbugs report[Trunk] > --- > > Key: HBASE-6138 > URL: https://issues.apache.org/jira/browse/HBASE-6138 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.96.0 >Reporter: Anoop Sam John > Fix For: 0.96.0 > > > HadoopQA shows like > -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. > But not able to see any reports link > When I checked the console output for the build I can see > {code} > [INFO] --- findbugs-maven-plugin:2.4.0:findbugs (default-cli) @ hbase-common > --- > [INFO] Fork Value is true > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] HBase . SUCCESS [1.890s] > [INFO] HBase - Common FAILURE [2.238s] > [INFO] HBase - Server SKIPPED > [INFO] HBase - Assembly .. SKIPPED > [INFO] HBase - Site .. SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 4.856s > [INFO] Finished at: Thu May 31 03:35:35 UTC 2012 > [INFO] Final Memory: 23M/154M > [INFO] > > [ERROR] Could not find resource > '${parent.basedir}/dev-support/findbugs-exclude.xml'. -> [Help 1] > [ERROR] > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
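Anoop's one-line path fix above lands in the findbugs-maven-plugin configuration. As a hypothetical illustration only (the surrounding plugin block is reconstructed from the plugin's documented excludeFilterFile parameter, not copied from the HBase pom; only the corrected path comes from the comment), the fragment would look roughly like:

```xml
<!-- Hypothetical surrounding pom fragment; only the corrected path below is
     taken from the comment above. In a submodule such as hbase-common the
     filter file lives one directory above the module, hence the extra ".." -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>findbugs-maven-plugin</artifactId>
  <version>2.4.0</version>
  <configuration>
    <excludeFilterFile>${parent.basedir}/../dev-support/findbugs-exclude.xml</excludeFilterFile>
  </configuration>
</plugin>
```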
[jira] [Commented] (HBASE-6146) Disabling of Catalog tables should not be allowed
[ https://issues.apache.org/jira/browse/HBASE-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287373#comment-13287373 ] Anoop Sam John commented on HBASE-6146: --- -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplication This seems not at all related to this patch -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. This is coming because of HBASE-6138 > Disabling of Catalog tables should not be allowed > - > > Key: HBASE-6146 > URL: https://issues.apache.org/jira/browse/HBASE-6146 > Project: HBase > Issue Type: Bug > Components: client >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 0.96.0, 0.94.1 > > Attachments: HBASE-6146_94.patch, HBASE-6146_Trunk.patch > > > HBaseAdmin#disableTable() when called with META or ROOT table, it will pass > the disable instruction to Master and table is actually getting disabled. > Later this API call will fail as there is a call to > HBaseAdmin#isTableDisabled() which is having a check like > isLegalTableName(tableName).So this call makes the catalog table to be in > disabled state. > We can have same kind of isLegalTableName(tableName) checks in disableTable() > and enableTable() APIs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
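The fix proposed in the description amounts to rejecting catalog table names before the disable instruction ever reaches the master. A minimal sketch under assumed names (isCatalogTable and this class are invented for illustration; the real client-side check is HBaseAdmin's isLegalTableName):

```java
import java.util.Arrays;

// Illustrative sketch of the guard proposed in HBASE-6146: reject the catalog
// tables (-ROOT-, .META.) up front in disableTable/enableTable instead of
// only inside isTableDisabled. Not the actual HBaseAdmin implementation.
public class CatalogTableGuardSketch {
  static final byte[] ROOT = "-ROOT-".getBytes();
  static final byte[] META = ".META.".getBytes();

  static boolean isCatalogTable(byte[] tableName) {
    return Arrays.equals(tableName, ROOT) || Arrays.equals(tableName, META);
  }

  static void disableTable(byte[] tableName) {
    // Check BEFORE sending the disable instruction to the master, so a
    // catalog table can never end up in the disabled state.
    if (isCatalogTable(tableName)) {
      throw new IllegalArgumentException("Cannot disable catalog table: "
          + new String(tableName));
    }
    // ... send the disable request to the master ...
  }

  public static void main(String[] args) {
    try {
      disableTable(".META.".getBytes());
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
    disableTable("usertable".getBytes()); // passes the guard
    System.out.println("usertable disable request sent");
  }
}
```

Placing the check at the start of disableTable()/enableTable() mirrors the existing isTableDisabled() behavior, so the failure happens before any master-side state changes.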
[jira] [Updated] (HBASE-6116) Allow parallel HDFS writes for HLogs.
[ https://issues.apache.org/jira/browse/HBASE-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6116:
---------------------------------

Attachment: 6116-v1.txt

Initial patch, which includes HBASE-5954. This also fixes building HBase trunk against Hadoop trunk (3.0.0-SNAPSHOT).

In order to test, HDFS-1783 needs to be applied to Hadoop (trunk) first. Then build Hadoop with:
mvn -Pnative -Pdist -Dtar -DskipTests install
And then HBase with:
mvn -DskipTests -Dhadoop.profile=3.0 ...

Parallel writes can be enabled in hbase-site.xml with:
hbase.regionserver.wal.parallel.writes
Since this patch includes HBASE-5954, durable sync can also be enabled:
hbase.regionserver.wal.durable.sync
hbase.regionserver.hfile.durable.sync
(all options can be set to "true")

@Andy: If your offer to do a quick test in EC2 still stands, that'd be awesome!

> Allow parallel HDFS writes for HLogs.
> -------------------------------------
>
>                 Key: HBASE-6116
>                 URL: https://issues.apache.org/jira/browse/HBASE-6116
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>         Attachments: 6116-v1.txt
>
> In HDFS-1783 I adapted Dhruba's changes to be used in Hadoop trunk.
> This issue will include the necessary reflection changes to optionally
> enable this for the WALs in HBase.
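The three property names above come straight from the update; a sketch of how they would be set in hbase-site.xml, using the standard Hadoop-style configuration wrapper (whether these are the final property names depends on the committed patch):

```xml
<!-- hbase-site.xml sketch; property names are copied from the comment
     above, the surrounding <configuration> element is the standard
     Hadoop/HBase configuration format. -->
<configuration>
  <property>
    <name>hbase.regionserver.wal.parallel.writes</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.regionserver.wal.durable.sync</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.regionserver.hfile.durable.sync</name>
    <value>true</value>
  </property>
</configuration>
```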
[jira] [Commented] (HBASE-6146) Disabling of Catalog tables should not be allowed
[ https://issues.apache.org/jira/browse/HBASE-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287350#comment-13287350 ]

Hadoop QA commented on HBASE-6146:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12530537/HBASE-6146_Trunk.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.replication.TestReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2082//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2082//console

This message is automatically generated.
[jira] [Updated] (HBASE-6146) Disabling of Catalog tables should not be allowed
[ https://issues.apache.org/jira/browse/HBASE-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-6146:
----------------------------------

Attachment: HBASE-6146_Trunk.patch
            HBASE-6146_94.patch

Patches for 0.94 and Trunk
[jira] [Updated] (HBASE-6146) Disabling of Catalog tables should not be allowed
[ https://issues.apache.org/jira/browse/HBASE-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-6146:
----------------------------------

Fix Version/s: 0.94.1
               0.96.0
       Status: Patch Available  (was: Open)