[jira] [Commented] (HBASE-6142) Javadoc in some Filters ambiguous
[ https://issues.apache.org/jira/browse/HBASE-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287862#comment-13287862 ]

Joep Rottinghuis commented on HBASE-6142:
-----------------------------------------

Ah, I will look up my (signed) copy of said book, write a test, and take a shot at the javadoc. Cannot promise any angel qualities though...

Javadoc in some Filters ambiguous
---------------------------------

Key: HBASE-6142
URL: https://issues.apache.org/jira/browse/HBASE-6142
Project: HBase
Issue Type: Bug
Components: documentation
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Joep Rottinghuis
Priority: Minor
Labels: noob

The javadoc on some of the filters is somewhat confusing. The main Filter interface has methods that behave like a sieve; when filterRowKey returns true, that means that the row is filtered _out_ (not included). Many of the Filter implementations work the other way around: when the condition is met, the value passes (i.e., the row is returned).
Most Filters make it clear when a value passes (passing through the filter meaning the values are returned from the scan). Some are less clear in light of how the Filter interface works: WhileMatchFilter and SingleColumnValueFilter are examples.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
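To make the two conventions in the report above concrete, here is a minimal sketch. It does not use the HBase classes: RowSieve, includeWhen and scan are hypothetical names made up for illustration, and only the "true means filtered out" rule is taken from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class FilterSemanticsSketch {
    /** Hypothetical stand-in for the sieve-style contract: true means "drop this row". */
    interface RowSieve {
        boolean filterRowKey(String rowKey);
    }

    /** Wraps an "include when matched" condition by negating it into sieve form. */
    static RowSieve includeWhen(Predicate<String> condition) {
        return rowKey -> !condition.test(rowKey);
    }

    /** Returns the rows that survive the sieve, i.e. those NOT filtered out. */
    static List<String> scan(List<String> rowKeys, RowSieve sieve) {
        List<String> returned = new ArrayList<>();
        for (String rowKey : rowKeys) {
            if (!sieve.filterRowKey(rowKey)) {
                returned.add(rowKey);
            }
        }
        return returned;
    }

    public static void main(String[] args) {
        // The condition reads "rows starting with 'a' pass"; the sieve answers the opposite question.
        RowSieve sieve = includeWhen(key -> key.startsWith("a"));
        System.out.println(scan(Arrays.asList("a1", "b1", "a2"), sieve));
    }
}
```

The negation in includeWhen is the step that javadoc tends to leave implicit, which is probably where the ambiguity comes from.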
[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect
[ https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287878#comment-13287878 ]

Anoop Sam John commented on HBASE-5974:
---------------------------------------

If the solution is fine with everyone, I can make patches for the other versions also.
@Ted
Regarding the new Exception extending DoNotRetryIOException: I was following NSRE. I can make this change. I think it should be ok.

Scanner retry behavior with RPC timeout on next() seems incorrect
-----------------------------------------------------------------

Key: HBASE-5974
URL: https://issues.apache.org/jira/browse/HBASE-5974
Project: HBase
Issue Type: Bug
Components: client, regionserver
Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
Reporter: Todd Lipcon
Assignee: Anoop Sam John
Priority: Critical
Fix For: 0.94.1
Attachments: HBASE-5974_0.94.patch, HBASE-5974_94-V2.patch

I'm seeing the following behavior:
- set RPC timeout to a short value
- call next() for some batch of rows, big enough so the client times out before the result is returned
- the HConnectionManager stuff will retry the next() call to the same server. At this point, one of two things can happen: 1) the previous next() call will still be processing, in which case you get a LeaseException, because it was removed from the map during the processing, or 2) the next() call will succeed but skip the prior batch of rows.
[jira] [Commented] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface
[ https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287896#comment-13287896 ]

Hudson commented on HBASE-5936:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #37 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/37/])
HBASE-5936 Addendum adds changes for TestHMasterRPCException that were missed in previous checkin (Revision 1345441)

Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestHMasterRPCException.java

Add Column-level PB-based calls to HMasterInterface
---------------------------------------------------

Key: HBASE-5936
URL: https://issues.apache.org/jira/browse/HBASE-5936
Project: HBase
Issue Type: Task
Components: ipc, master, migration
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Fix For: 0.96.0
Attachments: 5936-addendum-v2.txt, HBASE-5936-v3.patch, HBASE-5936-v4.patch, HBASE-5936-v4.patch, HBASE-5936-v5.patch, HBASE-5936-v6.patch, HBASE-5936.patch

This should be a subtask of HBASE-5445, but since that is a subtask, I can't also make this a subtask (apparently).
This is for converting the column-level calls, i.e.:
addColumn
deleteColumn
modifyColumn
[jira] [Commented] (HBASE-6138) HadoopQA not running findbugs [Trunk]
[ https://issues.apache.org/jira/browse/HBASE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287897#comment-13287897 ]

Hudson commented on HBASE-6138:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #37 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/37/])
HBASE-6138 HadoopQA not running findbugs [Trunk] (Anoop Sam John) (Revision 1345391)

Result = FAILURE
tedyu :
Files :
* /hbase/trunk/pom.xml

HadoopQA not running findbugs [Trunk]
-------------------------------------

Key: HBASE-6138
URL: https://issues.apache.org/jira/browse/HBASE-6138
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 0.96.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Fix For: 0.96.0
Attachments: 6138.txt

HadoopQA reports:
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
But no reports link is visible.
When I checked the console output for the build, I saw:
{code}
[INFO] --- findbugs-maven-plugin:2.4.0:findbugs (default-cli) @ hbase-common ---
[INFO] Fork Value is true
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] HBase . SUCCESS [1.890s]
[INFO] HBase - Common FAILURE [2.238s]
[INFO] HBase - Server SKIPPED
[INFO] HBase - Assembly .. SKIPPED
[INFO] HBase - Site .. SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 4.856s
[INFO] Finished at: Thu May 31 03:35:35 UTC 2012
[INFO] Final Memory: 23M/154M
[INFO]
[ERROR] Could not find resource '${parent.basedir}/dev-support/findbugs-exclude.xml'. -> [Help 1]
[ERROR]
{code}
Because of this error, Findbugs is not getting run!
[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic
[ https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287911#comment-13287911 ]

Lars Hofhansl commented on HBASE-5923:
--------------------------------------

That works.
The other problem is o.a.h.h.Filter.WritableByteArrayComparable. I thought I could move this to o.a.h.h.BaseWritableByteArrayComparable and have o.a.h.h.Filter.WritableByteArrayComparable be a no-op subclass, but that would change the wire protocol :(
Initially I thought one could just always use BinaryComparator, but especially for LESS/GREATER type operations it is important to be able to control the sort order (for example for Unicode).
It seems I'm stumped. Either o.a.h.h.Filter.WritableByteArrayComparable has to leak up into HTableInterface, or the wire protocol changes.

Cleanup checkAndXXX logic
-------------------------

Key: HBASE-5923
URL: https://issues.apache.org/jira/browse/HBASE-5923
Project: HBase
Issue Type: Improvement
Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Fix For: 0.96.0, 0.94.1
Attachments: 5923-0.94.txt, 5923-trunk.txt

1. The checkAnd{Put|Delete} method that takes a CompareOP is not exposed via HTable[Interface].
2. There is unnecessary duplicate code in the check{Put|Delete} code in HRegionServer.
[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect
[ https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287916#comment-13287916 ]

Lars Hofhansl commented on HBASE-5974:
--------------------------------------

+1 for V2
[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again
[ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287917#comment-13287917 ]

Lars Hofhansl commented on HBASE-6059:
--------------------------------------

I don't grok the patch in full detail, but it looks good and is the same as the trunk patch. So +1.
@Stack: Maybe you can have a safety look...?

Replaying recovered edits would make deleted data exist again
-------------------------------------------------------------

Key: HBASE-6059
URL: https://issues.apache.org/jira/browse/HBASE-6059
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
Fix For: 0.96.0
Attachments: 6059v6.txt, 6059v7-94.patch, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch

When we replay recovered edits, we use the minSeqId of the Stores, which may cause deleted data to appear again.
Let's see how it happens. Suppose the region has two families (cf1, cf2):
1. Put one piece of data into the region (put r1,cf1:q1,v1).
2. Move the region from server A to server B.
3. Delete the data put in step 1 (delete r1).
4. Flush this region.
5. Run a major compaction on this region.
6. Move the region from server B to server A.
7. Abort server A.
8. After the region is online, we can get the deleted data (r1,cf1:q1,v1).
(When we replay recovered edits, we use the minSeqId of the Stores. Because cf2 has no store files, its seqId is 0, so the edit log entry for the put is replayed into the region.)
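The effect described in step 8 can be sketched in a few lines. This is an illustration only, not the HBase code: the Edit class, both method names and the concrete sequence ids are hypothetical, and comparing against each store's own seqId is just one plausible alternative; the message does not say whether the attached patches work that way.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RecoveredEditsSketch {
    /** Hypothetical WAL edit: a sequence id plus the column family it touches. */
    static final class Edit {
        final long seqId;
        final String family;

        Edit(long seqId, String family) {
            this.seqId = seqId;
            this.family = family;
        }
    }

    /** Replays every edit newer than the smallest seqId across all stores (the reported behavior). */
    static List<Edit> replayUsingRegionMin(List<Edit> edits, Map<String, Long> storeSeqIds) {
        long threshold = Collections.min(storeSeqIds.values()); // assumes at least one store
        List<Edit> replayed = new ArrayList<>();
        for (Edit edit : edits) {
            if (edit.seqId > threshold) {
                replayed.add(edit);
            }
        }
        return replayed;
    }

    /** Replays an edit only if it is newer than the seqId of the store it belongs to. */
    static List<Edit> replayUsingStoreSeqId(List<Edit> edits, Map<String, Long> storeSeqIds) {
        List<Edit> replayed = new ArrayList<>();
        for (Edit edit : edits) {
            long threshold = storeSeqIds.getOrDefault(edit.family, 0L);
            if (edit.seqId > threshold) {
                replayed.add(edit);
            }
        }
        return replayed;
    }

    public static void main(String[] args) {
        Map<String, Long> storeSeqIds = new HashMap<>();
        storeSeqIds.put("cf1", 3L); // cf1 was flushed and compacted after the delete
        storeSeqIds.put("cf2", 0L); // cf2 has no store files
        List<Edit> oldLog = Collections.singletonList(new Edit(1L, "cf1")); // the original put
        System.out.println("region-min replays: " + replayUsingRegionMin(oldLog, storeSeqIds).size());
        System.out.println("per-store replays: " + replayUsingStoreSeqId(oldLog, storeSeqIds).size());
    }
}
```

The empty family cf2 drags the region-wide threshold down to 0, which is why the old put passes the check even though cf1 already holds newer data.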
[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect
[ https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287928#comment-13287928 ]

ramkrishna.s.vasudevan commented on HBASE-5974:
-----------------------------------------------

+1 from me too.
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287932#comment-13287932 ]

ramkrishna.s.vasudevan commented on HBASE-6046:
-----------------------------------------------

If the patch is ok, can I prepare one for trunk also? I can fix the comment on commit.
@Stack
Please review and provide your comments on this.

Master retry on ZK session expiry causes inconsistent region assignments.
-------------------------------------------------------------------------

Key: HBASE-6046
URL: https://issues.apache.org/jira/browse/HBASE-6046
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Attachments: HBASE_6046_0.94.patch, HBASE_6046_0.94_1.patch, HBASE_6046_0.94_2.patch

1. A ZK session timeout in the HMaster leads to bulk assignment even though all the RSs are online.
2. While doing bulk assignment, if the master goes down again and restarts (or a backup comes up), all the nodes created in ZK will be reassigned to the new RSs. This leads to double assignment.
We had 2800 regions; of these, 1900 regions got double assignment, taking the region count to 4700.
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287934#comment-13287934 ]

ramkrishna.s.vasudevan commented on HBASE-6060:
-----------------------------------------------

+1 on v4. Hope the test suite passes.

Regions's in OPENING state from failed regionservers takes a long time to recover
---------------------------------------------------------------------------------

Key: HBASE-6060
URL: https://issues.apache.org/jira/browse/HBASE-6060
Project: HBase
Issue Type: Bug
Components: master, regionserver
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Attachments: 6060-94-v3.patch, 6060-94-v4.patch, HBASE-6060-94.patch

We have seen a pattern in tests: regions are stuck in the OPENING state for a very long time when the region server that is opening the region fails. My understanding of the process:
- The master calls the rs to open the region. If the rs is offline, a new plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign().
- The RegionServer starts opening a region and changes the state in the znode. But that znode is not ephemeral. (See ZkAssign.)
- The rs transitions the zk node from OFFLINE to OPENING. See OpenRegionHandler.process().
- The rs then opens the region and changes the znode from OPENING to OPENED.
- When the rs is killed between the OPENING and OPENED states, zk shows the OPENING state, and the master just waits for the rs to change the region state. Since the rs is down, that won't happen.
- There is an AssignmentManager.TimeoutMonitor, which guards against exactly this kind of condition. It periodically checks (every 10 sec by default) the regions in transition to see whether they have timed out (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 min, which explains what you and I are seeing.
- ServerShutdownHandler in the Master does not reassign regions in the OPENING state, although it handles other states.
Lowering that threshold in the configuration is one option, but I still think we can do better.
Will investigate more.
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287937#comment-13287937 ]

Zhihong Yu commented on HBASE-6060:
-----------------------------------

Test suite passed.
[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect
[ https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287949#comment-13287949 ]

Zhihong Yu commented on HBASE-5974:
-----------------------------------

HRegionInterface.java doesn't exist in trunk, so patch v2 wouldn't apply to trunk. I would suggest creating a patch for trunk and running it through Hadoop QA.
{code}
+ LOG.info("Seq number based scan API not present at RS side! Trying with API:
{code}
I think the above log should be at warn level.
{code}
+} else if (ioe instanceof CallSequenceOutOfOrderException) {
+ // The callSeq from the client not matched with the one expected at the RS side
+ // This means the RS might have done extra scanning of data which is not received by the
+ // client.Throw a DNRE so that we close the current scanner and opens a new one with RS.
+ throw new DoNotRetryIOException("Reset scanner", ioe);
{code}
Should we disclose a little more detail in the message of the DNRIOE? The above is the same as the response to NotServingRegionException and RegionServerStoppedException.
'not matched with' -> 'does not match'
'is not received' -> 'has not been received'
'opens a new' -> 'open a new'
{code}
+// if callSeq do not match throw Exception straight away. This needs to be performed even
{code}
'do not match' -> 'does not match'
{code}
+public class TestClientScannerRPCTimesout {^M
{code}
Please add a short javadoc for the test class. I think it should be called TestClientScannerRPCTimeout.
Please use a utility such as dos2unix to remove the trailing ^M from the patch file.
{code}
+ public static class RegionServerWithScanTimesout extends MiniHBaseClusterRegionServer {^M
{code}
The above class can be made private. It should be named RegionServerWithScanTimeout.
{code}
+ * Thrown by a region server while scan related next() calls. Both client and server maintain a^M
+ * callSequence and if the both do not match, RS will throw this exception.^M
+ */^M
+public class CallSequenceOutOfOrderException extends IOException {^M
{code}
CallSequenceOutOfOrderException should extend DoNotRetryIOException so that we don't need to create a DoNotRetryIOException instance (shown above).
'while scan related next()' -> 'while doing scan related next()'
'the both do not' -> 'they do not'

It would be nice for Todd to take a look at the patch.
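The call-sequence idea discussed in this review can be sketched outside HBase as follows. All class names here are hypothetical stand-ins, not the classes from the patch; the only things taken from the thread are that the client and server each track a sequence number, that a mismatch raises an exception, and the suggestion that this exception be a do-not-retry type.

```java
import java.io.IOException;
import java.util.Arrays;

final class ScannerCallSeqSketch {
    /** Hypothetical stand-in for a "do not retry" I/O exception. */
    static class DoNotRetrySketchException extends IOException {
        DoNotRetrySketchException(String message) {
            super(message);
        }
    }

    /** Hypothetical mismatch exception; it extends the do-not-retry type, as the review suggests. */
    static class CallSeqMismatchSketchException extends DoNotRetrySketchException {
        CallSeqMismatchSketchException(long expected, long got) {
            super("Expected callSeq " + expected + " but got " + got);
        }
    }

    /** Hypothetical server-side scanner that remembers which call it expects next. */
    static final class ServerScanner {
        private final int[] rows;
        private int position = 0;
        private long expectedCallSeq = 0;

        ServerScanner(int[] rows) {
            this.rows = rows.clone();
        }

        /** Returns the next batch, or rejects the call if the client's sequence number is stale. */
        int[] next(long callSeq, int batchSize) throws CallSeqMismatchSketchException {
            if (callSeq != expectedCallSeq) {
                throw new CallSeqMismatchSketchException(expectedCallSeq, callSeq);
            }
            expectedCallSeq++;
            int end = Math.min(position + batchSize, rows.length);
            int[] batch = Arrays.copyOfRange(rows, position, end);
            position = end;
            return batch;
        }
    }

    public static void main(String[] args) throws IOException {
        ServerScanner scanner = new ServerScanner(new int[] {10, 11, 12, 13});
        scanner.next(0L, 2); // pretend the client timed out and never saw this batch
        try {
            scanner.next(0L, 2); // blind retry with the same callSeq
        } catch (DoNotRetrySketchException e) {
            System.out.println("Client should reset the scanner: " + e.getMessage());
        }
    }
}
```

Without the check, the retried call would silently return the second batch, which is outcome 2) in the original report.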
[jira] [Updated] (HBASE-6152) Split abort is not handled properly
[ https://issues.apache.org/jira/browse/HBASE-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6152:
------------------------------------------

Attachment: HBASE-6152_0.94.patch

Test case to reproduce this issue. In fact, this will happen in 0.92.0 and 0.92.1 but not in the latest code in 0.92 or 0.94.
The current code no longer has this line:
{code}
if (rs.isSplit() || rs.isSplitting()) {
{code}
so it should not create a problem here. It was removed as part of HBASE-6070. Prior to this it could have happened. The same has been reproduced in the testcase.

Split abort is not handled properly
-----------------------------------

Key: HBASE-6152
URL: https://issues.apache.org/jira/browse/HBASE-6152
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Devaraj Das
Assignee: Devaraj Das
Attachments: HBASE-6152_0.94.patch

I ran into this:
1. The RegionServer started to split a region (R), but the split was taking a long time, and hence the split was aborted.
2. As part of cleanup, the RS deleted the ZK node that it created initially for R.
3. The master (AssignmentManager) noticed the node deletion and made R offline.
4. The RS recovered from the failure and, at some point, tried to do the split again.
5. The master got an event RS_ZK_REGION_SPLIT but the server gave an error like - Received SPLIT for region R from server RS but it doesn't exist anymore,..
6. The RS apparently did the split successfully this time, but is stuck waiting on the master to delete the znode for the region. It kept on saying - org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for R - and it was stuck there forever.
[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect
[ https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287953#comment-13287953 ]

Zhihong Yu commented on HBASE-5974:
-----------------------------------

W.r.t. keeping RegionScannerHolder, I posted a poll to dev@hbase on the use case of letting [pre,post]ScannerOpen() return a custom RegionScanner implementation.
[jira] [Commented] (HBASE-6151) Master can die if RegionServer throws ServerNotRunningYet
[ https://issues.apache.org/jira/browse/HBASE-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287971#comment-13287971 ]

Zhihong Yu commented on HBASE-6151:
-----------------------------------

ServerNotRunningYetException should be handled in the last catch block of getCachedConnection():
{code}
} catch (IOException ioe) {
{code}

Master can die if RegionServer throws ServerNotRunningYet
---------------------------------------------------------

Key: HBASE-6151
URL: https://issues.apache.org/jira/browse/HBASE-6151
Project: HBase
Issue Type: Bug
Components: ipc
Affects Versions: 0.90.7, 0.92.2, 0.96.0, 0.94.1
Reporter: Gregory Chanan
Assignee: Gregory Chanan

See, for example:
{noformat}
2012-05-23 16:49:22,745 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hbase.ipc.ServerNotRunningException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1240)
at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:444)
at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:343)
at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:540)
at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:474)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:412)
{noformat}
The HRegionServer calls HBaseServer:
{code}
public void start() {
  startThreads();
  openServer();
}
{code}
The server can start accepting RPCs once the threads have been started, but until openServer runs those RPCs throw ServerNotRunningException. We should probably:
1) Catch the remote exception and retry on the master.
2) Look into whether the start() behavior of HBaseServer makes any sense. Why would you start accepting RPCs only to throw back ServerNotRunningException?
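Option 1) above, catching the exception and retrying, could look roughly like the sketch below. It is a generic helper, not the master's actual code: the exception class, the method name callWithRetry and the attempt/pause parameters are all hypothetical.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class RetryOnNotRunningSketch {
    /** Hypothetical stand-in for the "server is not running yet" exception. */
    static class ServerNotRunningYetSketchException extends IOException {
        ServerNotRunningYetSketchException(String message) {
            super(message);
        }
    }

    /**
     * Runs the call, retrying only when the server reports it is not running yet.
     * Any other exception propagates immediately; the last "not running" error is
     * rethrown once maxAttempts is exhausted.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long pauseMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        ServerNotRunningYetSketchException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (ServerNotRunningYetSketchException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis); // give the server time to finish starting
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        String result = callWithRetry(() -> {
            if (calls[0]++ < 2) {
                throw new ServerNotRunningYetSketchException("Server is not running yet");
            }
            return "connected";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

A bounded attempt count keeps the master from waiting forever on a server that never finishes starting; what it should do after the last attempt is a separate decision the thread leaves open.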
[jira] [Updated] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-6067:
------------------------------

Attachment: 6067.txt

Patch v1 introduces reflection to detect the presence of getDefaultBlockSize(Path f).
TestHLog passes.

HBase won't start when hbase.rootdir uses ViewFileSystem
--------------------------------------------------------

Key: HBASE-6067
URL: https://issues.apache.org/jira/browse/HBASE-6067
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Eli Collins
Assignee: Eli Collins
Attachments: 6067.txt

HBase currently doesn't work with HDFS federation (hbase.rootdir with a client that uses viewfs) because HLog#init uses FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an exception because there is no default filesystem in a viewfs client, so there's no way to determine a default block size or replication factor. They could use the versions of these methods that take a path; however, these were introduced in HADOOP-8014 and are not yet available in Hadoop 1.x.
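The reflection approach mentioned for patch v1 can be illustrated with plain Java. This is not the patch itself: OldFs and NewFs are hypothetical stand-ins for file system classes with and without the path-taking overload, a String replaces the real Path type, and the block sizes are arbitrary.

```java
import java.lang.reflect.Method;

final class BlockSizeReflectionSketch {
    /** Hypothetical older API: only the no-argument variant exists. */
    public static class OldFs {
        public long getDefaultBlockSize() {
            return 64L;
        }
    }

    /** Hypothetical newer API: adds a variant that takes a path (a String here, for simplicity). */
    public static class NewFs extends OldFs {
        public long getDefaultBlockSize(String path) {
            return 128L;
        }
    }

    /** Uses the path-taking method when the runtime class has it, else falls back to the no-arg one. */
    static long defaultBlockSize(Object fs, String path) {
        try {
            Method withPath = fs.getClass().getMethod("getDefaultBlockSize", String.class);
            return (Long) withPath.invoke(fs, path);
        } catch (NoSuchMethodException e) {
            // Older API: the path-taking overload is not available at runtime.
            try {
                Method noArg = fs.getClass().getMethod("getDefaultBlockSize");
                return (Long) noArg.invoke(fs);
            } catch (ReflectiveOperationException inner) {
                throw new IllegalStateException(inner);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultBlockSize(new OldFs(), "/hbase"));
        System.out.println(defaultBlockSize(new NewFs(), "/hbase"));
    }
}
```

Looking the method up at runtime lets the same binary run against both API generations, at the cost of losing compile-time checking of the method name and signature.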
[jira] [Updated] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-6067:
------------------------------

Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect
[ https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287974#comment-13287974 ] Anoop Sam John commented on HBASE-5974: --- @Ted I will look into your comments; the 0.94 patch needs a rebase too. I will make a separate patch for trunk as well. With respect to the pre and post CP hooks, a coprocessor may not only create a new custom RegionScanner but may also create a wrapper around the actual RegionScanner. One of our implementations uses this approach: just a wrapper that delegates the calls, with some extra steps for next() calls. Here too, if we add new methods to the RegionScanner interface to handle the check and increment of this seqNo, they will be exposed to the user. I felt this might look odd to them. Scanner retry behavior with RPC timeout on next() seems incorrect - Key: HBASE-5974 URL: https://issues.apache.org/jira/browse/HBASE-5974 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0 Reporter: Todd Lipcon Assignee: Anoop Sam John Priority: Critical Fix For: 0.94.1 Attachments: HBASE-5974_0.94.patch, HBASE-5974_94-V2.patch I'm seeing the following behavior: - set RPC timeout to a short value - call next() for some batch of rows, big enough so the client times out before the result is returned - the HConnectionManager stuff will retry the next() call to the same server. At this point, one of two things can happen: 1) the previous next() call will still be processing, in which case you get a LeaseException, because it was removed from the map during the processing, or 2) the next() call will succeed but skip the prior batch of rows.
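The seqNo check discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the attached patches: ScannerSession, OutOfOrderNextException and the callSeq parameter are hypothetical names, and rows are plain strings. The server-side holder tracks the sequence number it expects, so a retried next() carrying a stale number is rejected instead of silently skipping a batch.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical exception type; the thread discusses a non-retriable exception for this case.
class OutOfOrderNextException extends IOException {
    OutOfOrderNextException(String msg) {
        super(msg);
    }
}

// Hypothetical server-side holder for one open scanner.
final class ScannerSession {
    private final List<String> rows;
    private int cursor = 0;        // index of the next row to hand out
    private long expectedSeq = 0;  // sequence number the next call must carry

    ScannerSession(List<String> rows) {
        this.rows = new ArrayList<>(rows);
    }

    // Returns the next batch, or rejects the call if it is a stale retry.
    synchronized List<String> next(long callSeq, int batch) throws OutOfOrderNextException {
        if (callSeq != expectedSeq) {
            throw new OutOfOrderNextException(
                "expected call sequence " + expectedSeq + " but got " + callSeq);
        }
        expectedSeq++; // advance only after the check passes
        int end = Math.min(cursor + batch, rows.size());
        List<String> out = new ArrayList<>(rows.subList(cursor, end));
        cursor = end;
        return out;
    }
}
```

In this sketch the client would bump its own counter only after it receives a response, so a retry after an RPC timeout resends the old number and gets the exception rather than the following batch. How the client recovers from that (for example by reopening the scanner after the last row it saw) is outside the sketch.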
[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287984#comment-13287984 ] Hadoop QA commented on HBASE-6067: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530653/6067.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2088//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2088//console This message is automatically generated. HBase won't start when hbase.rootdir uses ViewFileSystem Key: HBASE-6067 URL: https://issues.apache.org/jira/browse/HBASE-6067 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Eli Collins Assignee: Eli Collins Attachments: 6067.txt HBase currently doesn't work with HDFS federation (hbase.rootdir with a client that uses viewfs) because HLog#init uses FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an exception because there is no default filesystem in a viewfs client so there's no way to determine a default block size or replication factor. 
They could use the versions of these methods that take a path, however these were introduced in HADOOP-8014 and are not yet available in Hadoop 1.x.
[jira] [Updated] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect
[ https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-5974: -- Attachment: HBASE-5974_94-V3.patch Scanner retry behavior with RPC timeout on next() seems incorrect - Key: HBASE-5974 URL: https://issues.apache.org/jira/browse/HBASE-5974 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0 Reporter: Todd Lipcon Assignee: Anoop Sam John Priority: Critical Fix For: 0.94.1 Attachments: HBASE-5974_0.94.patch, HBASE-5974_94-V2.patch, HBASE-5974_94-V3.patch I'm seeing the following behavior: - set RPC timeout to a short value - call next() for some batch of rows, big enough so the client times out before the result is returned - the HConnectionManager stuff will retry the next() call to the same server. At this point, one of two things can happen: 1) the previous next() call will still be processing, in which case you get a LeaseException, because it was removed from the map during the processing, or 2) the next() call will succeed but skip the prior batch of rows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect
[ https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287993#comment-13287993 ] Anoop Sam John commented on HBASE-5974: --- Patch addressing Ted's comments {quote} The above class can be made private. It should be named RegionServerWithScanTimeout. {quote} The class name is changed. But we can not make this private. If so RS impl class can not get instantiated. Scanner retry behavior with RPC timeout on next() seems incorrect - Key: HBASE-5974 URL: https://issues.apache.org/jira/browse/HBASE-5974 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0 Reporter: Todd Lipcon Assignee: Anoop Sam John Priority: Critical Fix For: 0.94.1 Attachments: HBASE-5974_0.94.patch, HBASE-5974_94-V2.patch, HBASE-5974_94-V3.patch I'm seeing the following behavior: - set RPC timeout to a short value - call next() for some batch of rows, big enough so the client times out before the result is returned - the HConnectionManager stuff will retry the next() call to the same server. At this point, one of two things can happen: 1) the previous next() call will still be processing, in which case you get a LeaseException, because it was removed from the map during the processing, or 2) the next() call will succeed but skip the prior batch of rows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect
[ https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287994#comment-13287994 ] Anoop Sam John commented on HBASE-5974: --- I will provide a patch for trunk tomorrow. Scanner retry behavior with RPC timeout on next() seems incorrect - Key: HBASE-5974 URL: https://issues.apache.org/jira/browse/HBASE-5974 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0 Reporter: Todd Lipcon Assignee: Anoop Sam John Priority: Critical Fix For: 0.94.1 Attachments: HBASE-5974_0.94.patch, HBASE-5974_94-V2.patch, HBASE-5974_94-V3.patch I'm seeing the following behavior: - set RPC timeout to a short value - call next() for some batch of rows, big enough so the client times out before the result is returned - the HConnectionManager stuff will retry the next() call to the same server. At this point, one of two things can happen: 1) the previous next() call will still be processing, in which case you get a LeaseException, because it was removed from the map during the processing, or 2) the next() call will succeed but skip the prior batch of rows.
[jira] [Commented] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287995#comment-13287995 ] Jesse Yates commented on HBASE-6145: A pretty in-depth analysis, hopefully not too much: Patch didn't apply cleanly with git, but I got it to go with the basic patch command (mostly - looks like you might be a little behind in the docbkx?). I'm just going to step through per pom… hbase-server/pom.xml {quote} + <version>${avro.version}</version> {quote} This (and the rest of the dependency info) should be in the parent pom's dependencyManagement section. Just as general style, if another module wants to use avro, they should just declare the dependency, and not worry about making sure they have the right version, excludes, etc (I know we are dropping avro soon, but we should still do the right thing). Also, looks like your spacing is off for the added dependencies - 2 spaces, not tabs in xml. hbase-common/pom.xml {quote} +<plugins> + <plugin> +<artifactId>maven-surefire-plugin</artifactId> {quote} This can/should stay in the pluginManagement section. Surefire is part of the default maven things to run, so it will just pick up the configuration/executions from the management section - this also keeps all the surefire stuff in the same sections in all poms. Also, does anything actually happen if we remove: {quote} + <plugin> +<groupId>org.apache.maven.plugins</groupId> +<artifactId>maven-site-plugin</artifactId> {quote} from this pom? It may create the target/site (side effect of site being a core part of maven, so all modules can respond to it), but is any actual work done? Okay, onto the parent pom (/hbase/pom.xml): Why the removal of the hbase-assembly module? In the official docs, it actually says to use an assembly module. 
This is particularly poignant because the alternative is to use the assembly:assembly descriptor, which is deprecated… The docs say that you can use assembly:assembly from within the parent pom (but doesn't say what that actually means) - any way to tie the assembly:single phase to call the assembly:single/:assembly phase in the children poms? In src/assembly/all.xml, this comment is no longer applicable. {quote} <!-- This is only necessary until maven fixes the intra-project dependency bug in maven 3.0. Until then, we have to include the test jars for sub-projects. When fixed, the below dependencySet stuff is sufficient for pulling in the test jars as well, as long as they are added as dependencies in this project. Right now, we only have 1 submodule to accumulate, but we can copy/paste as necessary until maven is fixed. --> {quote} Also, it would be awesome to move file set matching to a more general regex, rather than tying it to the maven property (which is defined in the main pom). General nit: I prefer having the properties above the build, since the properties are used in the build section, but that's just style. {quote} <!--Pass -DskipJavadoc=true on command-line to skip javadoc building--> {quote} when building the site? Also, can't you just pass in -DskipJavadoc? {quote} <version>${maven.assembly.version}</version> {quote} and {quote} <version>${maven.site.version}</version> {quote} Would be nice to add: {code} <!--$NO-MVN-MAN-VER$--> {code} to the end of the lines to remove the eclipse warning {quote} <plugin> <groupId>org.codehaus.mojo</groupId> <artifactId>xml-maven-plugin</artifactId> {quote} configuration formatting is off. 
{quote} <execution> <id>copy-docbkx</id> <goals> <goal>copy-resources</goal> </goals> <phase>pre-site</phase> <configuration> <outputDirectory>target/site</outputDirectory> <resources> <resource> <directory>${basedir}/target/docbkx</directory> <includes> <include>**/**</include> </includes> </resource> </resources> </configuration> </execution> </executions> <configuration> <escapeString>\</escapeString> </configuration> {quote} Is the escape string new here? What property are you avoiding overriding? Also, why are you moving them in the first place? You can just set the target directory to be ${basedir}/target/docbkx in the docbkx plugin: {code} <groupId>com.agilejava.docbkx</groupId> <artifactId>docbkx-maven-plugin</artifactId> {code} Here, you can also move the common traits to the 'top level' configuration for the plugin, then just put the differences in each execution. For example: {code} <plugin> <groupId>com.agilejava.docbkx</groupId> <artifactId>docbkx-maven-plugin</artifactId> <version>2.0.14</version>
[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288017#comment-13288017 ] Zhihong Yu commented on HBASE-6067: --- @Eli: Do you think the patch is Okay. HBase won't start when hbase.rootdir uses ViewFileSystem Key: HBASE-6067 URL: https://issues.apache.org/jira/browse/HBASE-6067 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Eli Collins Assignee: Eli Collins Attachments: 6067.txt HBase currently doesn't work with HDFS federation (hbase.rootdir with a client that uses viewfs) because HLog#init uses FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an exception because there is no default filesystem in a viewfs client so there's no way to determine a default block size or replication factor. They could use the versions of these methods that take a path, however these were introduced in HADOOP-8014 and are not yet available in Hadoop 1.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288017#comment-13288017 ] Zhihong Yu edited comment on HBASE-6067 at 6/2/12 10:11 PM: @Eli: Do you think the patch is Okay ? was (Author: zhi...@ebaysf.com): @Eli: Do you think the patch is Okay. HBase won't start when hbase.rootdir uses ViewFileSystem Key: HBASE-6067 URL: https://issues.apache.org/jira/browse/HBASE-6067 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Eli Collins Assignee: Eli Collins Attachments: 6067.txt HBase currently doesn't work with HDFS federation (hbase.rootdir with a client that uses viewfs) because HLog#init uses FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an exception because there is no default filesystem in a viewfs client so there's no way to determine a default block size or replication factor. They could use the versions of these methods that take a path, however these were introduced in HADOOP-8014 and are not yet available in Hadoop 1.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288037#comment-13288037 ] stack commented on HBASE-6145: -- On avro, it's deprecated. I moved it out of top-level into the module where it's used as a) encouraging its deprecation, and b) to get rid of some of the noise avro was generating each time mvn dipped into a module. How did you get that Unknown macro message? I don't see it when I run locally. W/ the -X flag? I see a version when avro does its stuff. I moved avro back up to top-level, at least the pluginManagement section for now. I did pretty print on all the poms to fix indent issues: xmllint --format. On hbase-common/pom.xml and surefire '...can/should stay in the pluginManagement section', I don't think our modules should have a pluginManagement section. It makes sense in top level or in a module IF this module had submodules but otherwise a pluginManagement doesn't make sense (as I understand it). I tried removing them from modules. On site goal and the following '...but is any actual work done?' There is. A site dir is made w/ css and images in it. I just checked. Seems like the site dir is made anyways, in spite of these flags saying don't generate a site. I just removed them. On hbase/pom.xml, assembly:assembly is not deprecated. A distinction is made between assembly:single and assembly:assembly. The former is for attaching to the packaging or pre-package phase somewhere. The latter requires explicit invocation on the command-line. In my comment above at 01/Jun/12 23:24, I talk of how I looked at taking both routes and decided against the direction the sonatype manual was encouraging because a) it's an ugly hack (even the manual allows so) and b) 
their technique attaches itself to package phase which means a user who wants to do a basic jar build has to wait on maven copying around fat dependencies and gzipping up packages all the while spewing their console though all they want to do is check their jar builds. A third reason to avoid hbase-assembly is that hbase-assembly would force an hbase-site too since hbase-assembly would want to depend on hbase-site if the tarball was to include documentation (the javadoc and jxr aggregations work fine up in parent, wasn't sure how well they'd work in a submodule and it seemed wrong doing aggregations in a submodule anyways). I think it better that we require you explicitly ask for tarball packaging by adding the assembly:assembly to your command line (I was afraid it would not work, that the dependency facility figuring which jars to include would be broken but it seems fine). Regarding 'In src/assembly/all.xml, this comment is no longer applicable.' I think it is. At least, w/o that section, I can't get the test jar to build in (which is why you added that comment in first place I guess?). bq. Also, it would be awesome to move file set matching to a more general regex, rather than tying it to the maven property (which is defined in the main pom). I suppose. I don't trust mvn to do anything right. Therefore my tendency is to keep it dumb. bq. General nit: I prefer having the properties above the build, since the properties are used in the build section, but that's just style. Yeah. I kept looking for them above ... but this is not my change. This is how it was. We can change it in another issue? bq. ...when building the site? Also, can't you just pass in -DskipJavadoc? You mean instead of -DskipJavadoc=true? I just tried it, and yes, seems to work. Let me change the comment. Again, how do you get those Unknown macro outputs? I tried w/ -X and it doesn't show. I don't know what NO-MVN-MAN-VER does (google'd it but no explanation after clicking ~10 links). 
Escape string is not new. Copied from what currently exists. On docbkx, they are built into target/docbkx, and then on site build, copied under site dir. Similar to javadoc. Keeping it a little independent of site in case someone is working w/ docbkx only, and not interested in site (This is how it used to work). For docbkx configuration, this is how it was. I tried your suggestion of moving common config above the executions and that seems to work. Good. Fixed. bq. With the aggregate goal, you can just have all the javadocs copied into the right directory in this module when you build them; at least that is what it was doing before. This is a change you made, that javadoc was aggregated into site/apidocs. Previous to this, pre-modularization, javadocs were made into target/apidocs and then copied into site when we ran site goal. We could go that way but seemed strange building into site if not interested in 'site': i.e. you are just making javadocs to check them out, etc. The xmllint --format should take care of indents and tabs. On fixing eclipse, can we do that in another
[jira] [Updated] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6145: - Attachment: 6145v4.txt Address Jesse comments. Fix site target post modularization --- Key: HBASE-6145 URL: https://issues.apache.org/jira/browse/HBASE-6145 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Attachments: 6145v4.txt, site.txt, site2.txt, sitev3.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288042#comment-13288042 ] stack commented on HBASE-6145: -- I made HBASE-6154 to address stuff not included in this patch. Jesse, I won't commit, not till I get your blessing (you might have problem w/ some of my response above) Fix site target post modularization --- Key: HBASE-6145 URL: https://issues.apache.org/jira/browse/HBASE-6145 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Attachments: 6145v4.txt, site.txt, site2.txt, sitev3.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288045#comment-13288045 ] Eli Collins commented on HBASE-6067: Zhihong, Approach seems reasonable to me. I'd make it consistent with getNumCurrentReplicas and use a Method member. Also, think you're missing a call to setAccessible? Worth checking out why there's a findbugs warning as well. HBase gang - can someone make Zhihong a contributor and assign this to him? HBase won't start when hbase.rootdir uses ViewFileSystem Key: HBASE-6067 URL: https://issues.apache.org/jira/browse/HBASE-6067 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Eli Collins Assignee: Eli Collins Attachments: 6067.txt HBase currently doesn't work with HDFS federation (hbase.rootdir with a client that uses viewfs) because HLog#init uses FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an exception because there is no default filesystem in a viewfs client so there's no way to determine a default block size or replication factor. They could use the versions of these methods that take a path, however these were introduced in HADOOP-8014 and are not yet available in Hadoop 1.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6155) [copytable] Unexpected behavior if --starttime is not specifed but --endtime is.
Jonathan Hsieh created HBASE-6155: - Summary: [copytable] Unexpected behavior if --starttime is not specifed but --endtime is. Key: HBASE-6155 URL: https://issues.apache.org/jira/browse/HBASE-6155 Project: HBase Issue Type: Improvement Affects Versions: 0.94.0, 0.92.1, 0.90.6, 0.96.0 Reporter: Jonathan Hsieh If one uses copytable and specifies only an endtime, I'd expect it to include all rows from Unix epoch time up to the specified endtime. Instead, it copies all the rows. The workaround for copies with this kind of range is to specify --starttime=1 (note: not --starttime=0), which is also unintuitive.
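The behavior the reporter expects can be sketched as below. This is a minimal illustration, not CopyTable's actual option handling: TimeRangeArgs and resolve are hypothetical names, and a null argument stands for a flag that was not given. The workaround in the report suggests that 0 may double as "not set" in the current code; that is an inference from the report, not something confirmed here. Using null for "absent" avoids that ambiguity.

```java
// Hypothetical helper; not part of CopyTable.
final class TimeRangeArgs {
    private TimeRangeArgs() {}

    /**
     * Returns {start, end} for the copy, or null when neither bound was given.
     * A null argument means the corresponding flag was absent on the command line.
     */
    static long[] resolve(Long startTime, Long endTime) {
        if (startTime == null && endTime == null) {
            return null; // no time restriction requested: copy everything
        }
        // A missing start defaults to the epoch, a missing end to "no upper bound".
        long start = (startTime == null) ? 0L : startTime;
        long end = (endTime == null) ? Long.MAX_VALUE : endTime;
        if (start > end) {
            throw new IllegalArgumentException(
                "starttime " + start + " is after endtime " + end);
        }
        return new long[] {start, end};
    }
}
```

With this shape, giving only an end time yields the range from 0 to that end time, and an explicit start of 0 is treated the same as any other start value.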
[jira] [Commented] (HBASE-6145) Fix site target post modularization
[ https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288050#comment-13288050 ] Hadoop QA commented on HBASE-6145: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530664/6145v4.txt against trunk revision . -1 @author. The patch appears to contain 3 @author tags which the Hadoop community has agreed to not allow in code contributions. +1 tests included. The patch appears to include 16 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSide org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2089//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2089//console This message is automatically generated. Fix site target post modularization --- Key: HBASE-6145 URL: https://issues.apache.org/jira/browse/HBASE-6145 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Attachments: 6145v4.txt, site.txt, site2.txt, sitev3.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288057#comment-13288057 ] Zhihong Yu commented on HBASE-6067: --- fs.getDefaultBlockSize() is only called in one place: {code} this.blocksize = conf.getLong("hbase.regionserver.hlog.blocksize", getDefaultBlockSize()); {code} So I didn't use a Method member. I will upload a new patch with setAccessible() call. HBase won't start when hbase.rootdir uses ViewFileSystem Key: HBASE-6067 URL: https://issues.apache.org/jira/browse/HBASE-6067 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Eli Collins Assignee: Eli Collins Attachments: 6067.txt HBase currently doesn't work with HDFS federation (hbase.rootdir with a client that uses viewfs) because HLog#init uses FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an exception because there is no default filesystem in a viewfs client so there's no way to determine a default block size or replication factor. They could use the versions of these methods that take a path, however these were introduced in HADOOP-8014 and are not yet available in Hadoop 1.x.
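The reflection approach under discussion can be sketched as follows. This is an illustration only and not the contents of 6067.txt: OldFs, NewFs and BlockSizeProbe are hypothetical stand-ins (no Hadoop classes are used), a String stands in for a Path, and the block sizes are made-up values. The probe looks for the path-taking overload once, keeps the result in a Method member as Eli suggests, and falls back to the no-argument method when the overload is absent.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical stand-ins for an older and a newer filesystem API; not Hadoop classes.
class OldFs {
    public long getDefaultBlockSize() {
        return 64L;
    }
}

class NewFs extends OldFs {
    public long getDefaultBlockSize(String path) {
        return 128L;
    }
}

final class BlockSizeProbe {
    private final OldFs fs;
    private final Method pathAware; // null when the path-taking overload is absent

    BlockSizeProbe(OldFs fs) {
        this.fs = fs;
        Method m = null;
        try {
            m = fs.getClass().getMethod("getDefaultBlockSize", String.class);
            m.setAccessible(true); // the implementing class may not be public
        } catch (NoSuchMethodException e) {
            // Older API: keep m as null and use the no-argument method instead.
        }
        this.pathAware = m;
    }

    long defaultBlockSize(String path) {
        if (pathAware == null) {
            return fs.getDefaultBlockSize();
        }
        try {
            return (Long) pathAware.invoke(fs, path);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }
}
```

Looking the method up once and caching it trades a little extra state for not repeating the lookup; with a single call site, as noted in the comment above, an inline lookup would behave the same.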
[jira] [Assigned] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reassigned HBASE-6067: - Assignee: Zhihong Yu (was: Eli Collins) HBase won't start when hbase.rootdir uses ViewFileSystem Key: HBASE-6067 URL: https://issues.apache.org/jira/browse/HBASE-6067 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Eli Collins Assignee: Zhihong Yu Attachments: 6067.txt HBase currently doesn't work with HDFS federation (hbase.rootdir with a client that uses viewfs) because HLog#init uses FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an exception because there is no default filesystem in a viewfs client so there's no way to determine a default block size or replication factor. They could use the versions of these methods that take a path, however these were introduced in HADOOP-8014 and are not yet available in Hadoop 1.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-6067: -- Attachment: 6067-v2.txt Added setAccessible() call. HBase won't start when hbase.rootdir uses ViewFileSystem Key: HBASE-6067 URL: https://issues.apache.org/jira/browse/HBASE-6067 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Eli Collins Assignee: Zhihong Yu Attachments: 6067-v2.txt, 6067.txt HBase currently doesn't work with HDFS federation (hbase.rootdir with a client that uses viewfs) because HLog#init uses FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an exception because there is no default filesystem in a viewfs client so there's no way to determine a default block size or replication factor. They could use the versions of these methods that take a path, however these were introduced in HADOOP-8014 and are not yet available in Hadoop 1.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288059#comment-13288059 ] Zhihong Yu commented on HBASE-6067: --- findbugs functionality hasn't been fixed (https://builds.apache.org/job/PreCommit-HBASE-Build/2088//console): {code} [ERROR] Could not find resource '${parent.basedir}/../dev-support/findbugs-exclude.xml'. - [Help 1] {code} HBase won't start when hbase.rootdir uses ViewFileSystem Key: HBASE-6067 URL: https://issues.apache.org/jira/browse/HBASE-6067 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Eli Collins Assignee: Zhihong Yu Attachments: 6067-v2.txt, 6067.txt HBase currently doesn't work with HDFS federation (hbase.rootdir with a client that uses viewfs) because HLog#init uses FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an exception because there is no default filesystem in a viewfs client so there's no way to determine a default block size or replication factor. They could use the versions of these methods that take a path, however these were introduced in HADOOP-8014 and are not yet available in Hadoop 1.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem
[ https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288061#comment-13288061 ] Hadoop QA commented on HBASE-6067: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530672/6067-v2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.TestRegionRebalancing Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2090//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2090//console This message is automatically generated. HBase won't start when hbase.rootdir uses ViewFileSystem Key: HBASE-6067 URL: https://issues.apache.org/jira/browse/HBASE-6067 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Eli Collins Assignee: Zhihong Yu Attachments: 6067-v2.txt, 6067.txt HBase currently doesn't work with HDFS federation (hbase.rootdir with a client that uses viewfs) because HLog#init uses FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an exception because there is no default filesystem in a viewfs client so there's no way to determine a default block size or replication factor. 
They could use the versions of these methods that take a path; however, these were introduced in HADOOP-8014 and are not yet available in Hadoop 1.x.
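The description above notes that the path-taking overloads exist only in newer Hadoop releases, and the attachment comment mentions a setAccessible() call, which suggests a reflective lookup. The following is a minimal sketch of that general technique, not the contents of 6067-v2.txt: it probes for a path-taking method at runtime and falls back to the no-argument one when the method is absent. The classes OldFs and NewFs, the String path parameter, and the block sizes are hypothetical stand-ins so the example runs without Hadoop on the classpath.

```java
import java.lang.reflect.Method;

public class BlockSizeProbe {
    /** Hypothetical stand-in for an older API that has only the no-argument method. */
    public static class OldFs {
        public long getDefaultBlockSize() {
            return 64L * 1024 * 1024;
        }
    }

    /** Hypothetical stand-in for a newer API that adds a path-taking overload. */
    public static class NewFs extends OldFs {
        public long getDefaultBlockSize(String path) {
            return 128L * 1024 * 1024;
        }
    }

    /**
     * Uses the path-taking overload when the runtime class offers it,
     * otherwise falls back to the no-argument version.
     */
    public static long defaultBlockSize(OldFs fs, String path) {
        try {
            // Look the overload up by name and parameter type; this throws
            // NoSuchMethodException when the class does not declare or inherit it.
            Method m = fs.getClass().getMethod("getDefaultBlockSize", String.class);
            m.setAccessible(true);
            return (Long) m.invoke(fs, path);
        } catch (NoSuchMethodException e) {
            // Older API: only the no-argument method is available.
            return fs.getDefaultBlockSize();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        // The path is an arbitrary illustrative value.
        System.out.println(defaultBlockSize(new OldFs(), "/hbase/.logs"));
        System.out.println(defaultBlockSize(new NewFs(), "/hbase/.logs"));
    }
}
```

Resolving the method at runtime lets one build work against both API generations, at the cost of losing compile-time checking of the method name and signature. In the real viewfs case the no-argument call is the one that throws, so a fallback like this only helps on clients where the path-taking overload is genuinely missing.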